NoSQL Databases: Using the Graph Database Neo4j (Python 3)

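Copyright notice: this is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://shazhenyu.blog.csdn.net/article/details/93116754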

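1. Introduction to Neo4j

1.1 Overview

Neo4j is currently one of the most popular graph databases. It was released in 2010 and has kept up good momentum since.
As a graph database, Neo4j's defining feature is that it stores relationships as data.
Besides storing records much as an ordinary database stores rows, a graph database makes the relationships between the stored records easy to see and query.
It is best suited to graph data that is modified rarely, queried often, and has no super nodes (nodes with an extremely large number of edges).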

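1.2 Use cases

  • Social networks
    Recommend new friends to a user based on the user's relationships with other users. For example, QQ suggests friends of your friends.

  • Recommendation engines
    Infer a user's preferences, and recommend products accordingly, by analyzing relationship data such as who the user's friends are, which products those friends like, and the user's browsing history.

  • Knowledge graphs
    Build a graph from the relationships between pieces of knowledge to help users find related knowledge. For example, searching Baidu for Neo4j also brings up similar content such as MySQL.

  • Malware detection
    Record relationship data about a program's behavior, such as which IPs and which system resources it accessed, and analyze whether that behavior is malicious.

  • Network and data center management
    Networks and data centers are themselves webs of complex relationships. Neo4j makes it easy to model the relationships between devices, which simplifies management of the whole system.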
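
The friend-of-a-friend idea from the social network use case can be sketched in plain Python, without any database. This is only an illustration of the traversal a graph database performs; the FRIENDS data and the function name recommend_friends are made up for this example.

```python
from collections import Counter

# Hypothetical friendship data: each key is a user, each value is the set of
# that user's friends.
FRIENDS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def recommend_friends(graph, user):
    """Rank friends-of-friends of `user` by the number of mutual friends."""
    direct = graph.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone who is already a direct friend.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))


print(recommend_friends(FRIENDS, "alice"))  # [('dave', 2), ('erin', 1)]
```

In a graph database the same two-hop traversal is expressed as a query over stored relationships instead of nested loops over a dictionary.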

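1.3 Advantages

  • Inserting and querying data is intuitive; there is no need to think about the relationships between tables, as in a relational database.
  • The built-in graph search and traversal methods are convenient and fairly fast.

1.4 Disadvantages

  • The most frustrating problem is very slow insertion, possibly because creating nodes and edges stores extra information to serve queries. It may have been my code, but inserting 10,000 nodes and 10,000 edges took nearly 10 minutes.
  • Super nodes. When a node has a very large number of edges (common for celebrity accounts), operations involving it slow down considerably. The problem is long-standing and was officially due to be addressed, but the result is still unsatisfactory.
  • The usual way to speed up a database is to give it more memory. From my reading of the official operations manual, however, there seems to be no single setting for the database's total memory usage; you have to do the calculation yourself and "reserve" memory for it.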
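
One common way to reduce insertion time is to send many rows per request instead of one request per node. The sketch below is an assumption on my part, not something measured for this article: it splits rows into batches and passes each batch as a parameter to a Cypher UNWIND statement. The label Person, the property name, and the helper names are illustrative; run stands for any callable shaped like run(cypher, **parameters), such as the run method of a py2neo Graph.

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Cypher that creates one node per entry of the $rows parameter.
# The label `Person` and the property `name` are illustrative assumptions.
CREATE_NODES = "UNWIND $rows AS row CREATE (n:Person {name: row.name})"


def insert_in_batches(run, rows, batch_size=1000):
    """Send `rows` to the database in batches; return the number of round trips.

    `run` is any callable with the shape run(cypher, **parameters),
    for example the `run` method of a py2neo Graph object.
    """
    round_trips = 0
    for batch in chunked(rows, batch_size):
        run(CREATE_NODES, rows=batch)
        round_trips += 1
    return round_trips
```

With a connected py2neo Graph named graph, a call might look like insert_in_batches(graph.run, [{"name": "user1"}, {"name": "user2"}]). Whether this removes the slowdown described above depends on the setup, so it is worth timing on your own data.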

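2. Working with Neo4j on CentOS

2.1 Installation and startup

2.1.1 Download

Download page: https://neo4j.com/download-center/#community
Package URL: https://neo4j.com/artifact.php?name=neo4j-community-3.5.6-unix.tar.gz

Download version 3.5.6. Quote the URL, and use -o to name the output file; with -O, curl names the file after the last part of the URL (artifact.php?name=...), which does not match the tar command below.

curl -L -o neo4j-community-3.5.6-unix.tar.gz "https://neo4j.com/artifact.php?name=neo4j-community-3.5.6-unix.tar.gz"

As an aside, the Neo4j package downloads very slowly, so I uploaded a copy to CSDN.

CSDN download: https://download.csdn.net/download/u014597198/11250958

Extract the archive

tar -zxvf neo4j-community-3.5.6-unix.tar.gz

Move the directory

mv neo4j-community-3.5.6/ /usr/local/neo4j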

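2.1.2 Edit the configuration file

The configuration file is conf/neo4j.conf under the installation directory.

  • 1. Line 22 sets the directory that LOAD CSV reads from. Comment it out with a leading # so that files can be read from any path (the configuration file notes that this has security implications).
#dbms.directories.import=import

  • 2. Uncomment lines 35 and 36 to set the initial and maximum JVM heap size.
    (A larger maximum heap generally helps, but it must stay below the machine's physical memory and leave room for the page cache and the operating system.)
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1g

If you do not know how much memory is free, check with the Linux command free -m.

  • 3. Line 46 sets the page cache, which can be thought of as the cache for the store files. On a well-equipped machine, larger is better.
dbms.memory.pagecache.size=5g

  • 4. On line 54, remove the leading # so that the Neo4j database can be reached remotely by IP.
dbms.connectors.default_listen_address=0.0.0.0

  • 5. The default Bolt port is 7687, the HTTP port is 7474, and the HTTPS port is 7473. The three settings below can be left unchanged; just remove the comment markers.
dbms.connector.bolt.listen_address=:7687
dbms.connector.http.listen_address=:7474
dbms.connector.https.listen_address=:7473

  • 6. On line 245, remove the # to allow LOAD CSV to load data from file URLs.
dbms.security.allow_csv_import_from_file_urls=true

  • 7. On line 265, remove the comment marker so that Neo4j is both readable and writable.
dbms.read_only=false

  • 8. The full configuration file for version 3.5.6 (note: the configuration file differs between versions)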
#*****************************************************************
# Neo4j configuration
#
# For more details and a complete list of settings, please see
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
#*****************************************************************

# The name of the database to mount
#dbms.active_database=graph.db

# Paths of directories in the installation.
#dbms.directories.data=data
#dbms.directories.plugins=plugins
#dbms.directories.certificates=certificates
#dbms.directories.logs=logs
#dbms.directories.lib=lib
#dbms.directories.run=run

# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the
# `LOAD CSV` section of the manual for details.
# dbms.directories.import=import

# Whether requests to Neo4j are authenticated.
# To disable authentication, uncomment this line
#dbms.security.auth_enabled=false

# Enable this to be able to upgrade a store from an older version.
#dbms.allow_upgrade=true

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size.
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1g

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
dbms.memory.pagecache.size=5g

#*****************************************************************
# Network connector configuration
#*****************************************************************

# With default configuration Neo4j only accepts local connections.
# To accept non-local connections, uncomment this line:
dbms.connectors.default_listen_address=0.0.0.0

# You can also choose a specific network interface, and configure a non-default
# port for each connector, by setting their individual listen_address.

# The address at which this server can be reached by its clients. This may be the server's IP address or DNS name, or
# it may be the address of a reverse proxy which sits in front of the server. This setting may be overridden for
# individual connectors below.
#dbms.connectors.default_advertised_address=localhost

# You can also choose a specific advertised hostname or IP address, and
# configure an advertised port for each connector, by setting their
# individual advertised_address.

# Bolt connector
dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=:7687

# HTTP Connector. There can be zero or one HTTP connectors.
dbms.connector.http.enabled=true
dbms.connector.http.listen_address=:7474

# HTTPS Connector. There can be zero or one HTTPS connectors.
dbms.connector.https.enabled=true
dbms.connector.https.listen_address=:7473

# Number of Neo4j worker threads.
#dbms.threads.worker_count=

#*****************************************************************
# SSL system configuration
#*****************************************************************

# Names of the SSL policies to be used for the respective components.

# The legacy policy is a special policy which is not defined in
# the policy configuration section, but rather derives from
# dbms.directories.certificates and associated files
# (by default: neo4j.key and neo4j.cert). Its use will be deprecated.

# The policies to be used for connectors.
#
# N.B: Note that a connector must be configured to support/require
#      SSL/TLS for the policy to actually be utilized.
#
# see: dbms.connector.*.tls_level

#bolt.ssl_policy=legacy
#https.ssl_policy=legacy

#*****************************************************************
# SSL policy configuration
#*****************************************************************

# Each policy is configured under a separate namespace, e.g.
#    dbms.ssl.policy.<policyname>.*
#
# The example settings below are for a new policy named 'default'.

# The base directory for cryptographic objects. Each policy will by
# default look for its associated objects (keys, certificates, ...)
# under the base directory.
#
# Every such setting can be overridden using a full path to
# the respective object, but every policy will by default look
# for cryptographic objects in its base location.
#
# Mandatory setting

#dbms.ssl.policy.default.base_directory=certificates/default

# Allows the generation of a fresh private key and a self-signed
# certificate if none are found in the expected locations. It is
# recommended to turn this off again after keys have been generated.
#
# Keys should in general be generated and distributed offline
# by a trusted certificate authority (CA) and not by utilizing
# this mode.

#dbms.ssl.policy.default.allow_key_generation=false

# Enabling this makes it so that this policy ignores the contents
# of the trusted_dir and simply resorts to trusting everything.
#
# Use of this mode is discouraged. It would offer encryption but no security.

#dbms.ssl.policy.default.trust_all=false

# The private key for the default SSL policy. By default a file
# named private.key is expected under the base directory of the policy.
# It is mandatory that a key can be found or generated.

#dbms.ssl.policy.default.private_key=

# The private key for the default SSL policy. By default a file
# named public.crt is expected under the base directory of the policy.
# It is mandatory that a certificate can be found or generated.

#dbms.ssl.policy.default.public_certificate=

# The certificates of trusted parties. By default a directory named
# 'trusted' is expected under the base directory of the policy. It is
# mandatory to create the directory so that it exists, because it cannot
# be auto-created (for security purposes).
#
# To enforce client authentication client_auth must be set to 'require'!

#dbms.ssl.policy.default.trusted_dir=

# Client authentication setting. Values: none, optional, require
# The default is to require client authentication.
#
# Servers are always authenticated unless explicitly overridden
# using the trust_all setting. In a mutual authentication setup this
# should be kept at the default of require and trusted certificates
# must be installed in the trusted_dir.

#dbms.ssl.policy.default.client_auth=require

# It is possible to verify the hostname that the client uses
# to connect to the remote server. In order for this to work, the server public
# certificate must have a valid CN and/or matching Subject Alternative Names.

# Note that this is irrelevant on host side connections (sockets receiving
# connections).

# To enable hostname verification client side on nodes, set this to true.

#dbms.ssl.policy.default.verify_hostname=false

# A comma-separated list of allowed TLS versions.
# By default only TLSv1.2 is allowed.

#dbms.ssl.policy.default.tls_versions=

# A comma-separated list of allowed ciphers.
# The default ciphers are the defaults of the JVM platform.

#dbms.ssl.policy.default.ciphers=

#*****************************************************************
# Logging configuration
#*****************************************************************

# To enable HTTP logging, uncomment this line
#dbms.logs.http.enabled=true

# Number of HTTP logs to keep.
#dbms.logs.http.rotation.keep_number=5

# Size of each HTTP log that is kept.
#dbms.logs.http.rotation.size=20m

# To enable GC Logging, uncomment this line
#dbms.logs.gc.enabled=true

# GC Logging Options
# see http://docs.oracle.com/cd/E19957-01/819-0084-10/pt_tuningjava.html#wp57013 for more information.
#dbms.logs.gc.options=-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution

# For Java 9 and newer GC Logging Options
# see https://docs.oracle.com/javase/10/tools/java.htm#JSWOR-GUID-BE93ABDC-999C-4CB5-A88B-1994AAAC74D5
#dbms.logs.gc.options=-Xlog:gc*,safepoint,age*=trace

# Number of GC logs to keep.
#dbms.logs.gc.rotation.keep_number=5

# Size of each GC log that is kept.
#dbms.logs.gc.rotation.size=20m

# Log level for the debug log. One of DEBUG, INFO, WARN and ERROR. Be aware that logging at DEBUG level can be very verbose.
#dbms.logs.debug.level=INFO

# Size threshold for rotation of the debug log. If set to zero then no rotation will occur. Accepts a binary suffix "k",
# "m" or "g".
#dbms.logs.debug.rotation.size=20m

# Maximum number of history files for the internal log.
#dbms.logs.debug.rotation.keep_number=7

#*****************************************************************
# Miscellaneous configuration
#*****************************************************************

# Enable this to specify a parser other than the default one.
#cypher.default_language_version=3.0

# Determines if Cypher will allow using file URLs when loading data using
# `LOAD CSV`. Setting this value to `false` will cause Neo4j to fail `LOAD CSV`
# clauses that load data from the file system.
dbms.security.allow_csv_import_from_file_urls=true


# Value of the Access-Control-Allow-Origin header sent over any HTTP or HTTPS
# connector. This defaults to '*', which allows broadest compatibility. Note
# that any URI provided here limits HTTP/HTTPS access to that URI only.
#dbms.security.http_access_control_allow_origin=*

# Value of the HTTP Strict-Transport-Security (HSTS) response header. This header
# tells browsers that a webpage should only be accessed using HTTPS instead of HTTP.
# It is attached to every HTTPS response. Setting is not set by default so
# 'Strict-Transport-Security' header is not sent. Value is expected to contain
# directives like 'max-age', 'includeSubDomains' and 'preload'.
#dbms.security.http_strict_transport_security=

# Retention policy for transaction logs needed to perform recovery and backups.
dbms.tx_log.rotation.retention_policy=1 days

# Only allow read operations from this Neo4j instance. This mode still requires
# write access to the directory for lock purposes.
dbms.read_only=false

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#dbms.unmanaged_extension_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

# A comma separated list of procedures and user defined functions that are allowed
# full access to the database through unsupported/insecure internal APIs.
#dbms.security.procedures.unrestricted=my.extensions.example,my.procedures.*

# A comma separated list of procedures to be loaded by default.
# Leaving this unconfigured will load all procedures found.
#dbms.security.procedures.whitelist=apoc.coll.*,apoc.load.*

#********************************************************************
# JVM Parameters
#********************************************************************

# G1GC generally strikes a good balance between throughput and tail
# latency, without too much tuning.
dbms.jvm.additional=-XX:+UseG1GC

# Have common exceptions keep producing stack traces, so they can be
# debugged regardless of how often logs are rotated.
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow

# Make sure that `initmemory` is not only allocated, but committed to
# the process, before starting the database. This reduces memory
# fragmentation, increasing the effectiveness of transparent huge
# pages. It also reduces the possibility of seeing performance drop
# due to heap-growing GC events, where a decrease in available page
# cache leads to an increase in mean IO response time.
# Try reducing the heap memory, if this flag degrades performance.
dbms.jvm.additional=-XX:+AlwaysPreTouch

# Trust that non-static final fields are really final.
# This allows more optimizations and improves overall performance.
# NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
# serialization to change the value of final fields!
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields

# Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
dbms.jvm.additional=-XX:+DisableExplicitGC

# Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
# jmx.password files are required.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#     http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access

# Some systems cannot discover host name automatically, and need this line configured:
#dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
# This is to protect the server from any potential passive eavesdropping.
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

# This mitigates a DDoS vector.
dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true

#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
#  using this configuration file has been installed as a service.
#  Please uninstall the service before modifying this section.  The
#  service can then be reinstalled.

# Name of the service
dbms.windows_service_name=neo4j

#********************************************************************
# Other Neo4j system properties
#********************************************************************
dbms.jvm.additional=-Dunsupported.dbms.udc.source=tarball
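
To sanity-check the memory settings from steps 2 and 3 against the machine's RAM, the active lines of neo4j.conf can be read with a few lines of Python. This is a minimal sketch: the function names are my own, the SAMPLE text is an excerpt of the file above, and the parser only handles plain key=value lines with the k, m, and g size suffixes mentioned in the file's comments.

```python
import re

# Multipliers for the size suffixes that the configuration comments mention.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_conf(text):
    """Return {key: [values]} for every uncommented `key=value` line."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Keys such as dbms.jvm.additional may appear several times.
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def size_to_bytes(value):
    """Convert a size such as '512m' or '5g' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def memory_budget(settings):
    """Sum the configured maximum heap and page cache, in bytes."""
    heap = size_to_bytes(settings["dbms.memory.heap.max_size"][-1])
    cache = size_to_bytes(settings["dbms.memory.pagecache.size"][-1])
    return heap + cache


# Excerpt of the configuration shown above.
SAMPLE = """
# Java Heap Size
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1g
dbms.memory.pagecache.size=5g
dbms.jvm.additional=-XX:+UseG1GC
dbms.jvm.additional=-XX:+AlwaysPreTouch
"""

settings = parse_conf(SAMPLE)
print(memory_budget(settings) / 1024 ** 3, "GiB")  # 6.0 GiB
```

With the values used in this article, heap plus page cache come to 6 GiB, so the machine needs more physical memory than that once the operating system's share is counted (compare with the output of free -m).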

2.1.3 Start, stop, and check status

  • Start: from the bin directory, run ./neo4j start
  • Stop: from the bin directory, run ./neo4j stop
  • Check status: from the bin directory, run ./neo4j status
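
Besides ./neo4j status, it can be useful to confirm from another machine that the HTTP and Bolt ports actually accept connections. The following is a small sketch using only the standard library; the function names are my own, and the default ports are the ones listed in the configuration above.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_neo4j_ports(host, ports=(7474, 7687)):
    """Map each port to whether it accepts TCP connections."""
    return {port: is_port_open(host, port) for port in ports}
```

Calling check_neo4j_ports with the server's IP returns a dictionary such as {7474: True, 7687: True} when both connectors are reachable. A False entry can mean the service is stopped, the listen address was not opened up, or a firewall is blocking the port.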

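2.2 Web access

http://<server-ip>:7474/browser/
In a browser, open port 7474 on the machine that hosts the graph database. On first access, log in with username neo4j and password neo4j; you will be prompted to change the initial password.
After setting the password, click the database icon in the upper-left corner to see what is stored in the graph database.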

3. Working with Neo4j from Python 3

3.1 Install py2neo

pip install py2neo

If that fails, install from the repository instead:

pip install git+https://github.com/nigelsmall/py2neo.git

Official site: https://py2neo.org/v3/index.html
See the official documentation for more commands.

3.2 Result

[Figure: the resulting graph as displayed in the Neo4j browser]

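3.3 Brief explanation

The figure above shows the result of this example.
It contains 5 nodes, 3 relationship types (7 relationships in total), and 3 properties.
Properties are attached to nodes here; for example, I gave the node that stands for me (沙师弟) a "博客地址" (blog address) property with the value "https://shazhenyu.blog.csdn.net/".
Relationships can carry properties in the same way; the source code sets a count property on two of them.

3.4 Full source code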

from py2neo import Graph, Node, Relationship

# Replace 'IP_ADDRESS' and the credentials with those of your own server.
graph = Graph(host='IP_ADDRESS', http_port=7474, user='neo4j', password='123456')

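# Clear the database (removes all nodes and relationships)
graph.delete_all()

# Create the nodes: label '西游记' (Journey to the West), one node per character
test_node_0 = Node('西游记', name='唐僧')    # Tang Seng
test_node_1 = Node('西游记', name='孙悟空')  # Sun Wukong
test_node_2 = Node('西游记', name='猪八戒')  # Zhu Bajie
test_node_3 = Node('西游记', name='沙师弟')  # Sha Shidi
test_node_4 = Node('西游记', name='白龙马')  # the White Dragon Horse

# Add a '博客地址' (blog address) property to one node
test_node_3.setdefault("博客地址",'https://shazhenyu.blog.csdn.net/')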

graph.create(test_node_0)
graph.create(test_node_1)
graph.create(test_node_2)
graph.create(test_node_3)
graph.create(test_node_4)

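# Create the relationships
# Tang Seng points to each of his three disciples with '师傅' (master), each disciple
# points back with '徒弟' (disciple), and the horse points to Tang Seng with '坐骑' (mount).
# Two of the relationships also get a count property with the value 1.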
node_0_node_1 = Relationship(test_node_0, '师傅', test_node_1)
node_0_node_2 = Relationship(test_node_0, '师傅', test_node_2)
node_0_node_3 = Relationship(test_node_0, '师傅', test_node_3)
node_1_node_0 = Relationship(test_node_1, '徒弟', test_node_0)
node_2_node_0 = Relationship(test_node_2, '徒弟', test_node_0)
node_3_node_0 = Relationship(test_node_3, '徒弟', test_node_0)
node_4_node_0 = Relationship(test_node_4, '坐骑', test_node_0)
node_0_node_1['count'] = 1
node_4_node_0['count'] = 1

graph.create(node_0_node_1)
graph.create(node_0_node_2)
graph.create(node_0_node_3)
graph.create(node_1_node_0)
graph.create(node_2_node_0)
graph.create(node_3_node_0)
graph.create(node_4_node_0)

print(graph)
print(test_node_0)
print(test_node_1)
print(test_node_2)
print(test_node_3)
print(test_node_4)
print(node_0_node_1)
print(node_0_node_2)
print(node_0_node_3)
print(node_1_node_0)
print(node_2_node_0)
print(node_3_node_0)
print(node_4_node_0)
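
To read the data back, the relationships created by the script above can be queried with Cypher. The sketch below assumes a graph object that behaves like the py2neo Graph used in the script, where run(cypher) returns a cursor whose data() method gives a list of dicts. The function name list_relationships and the column aliases are my own.

```python
# Cypher query returning every (start node, relationship type, end node) triple.
# It relies on the `name` property that the script above sets on each node.
QUERY = (
    "MATCH (a)-[r]->(b) "
    "RETURN a.name AS source, type(r) AS relation, b.name AS target"
)


def list_relationships(graph):
    """Return the stored relationships as (source, relation, target) tuples.

    `graph` is expected to behave like a py2neo Graph: `graph.run(cypher)`
    returns a cursor whose `data()` method yields a list of dicts.
    """
    records = graph.run(QUERY).data()
    return [(rec["source"], rec["relation"], rec["target"]) for rec in records]
```

With the graph object from the script, list_relationships(graph) should return seven tuples, one per relationship, for example ('唐僧', '师傅', '孙悟空'). The same query can also be pasted into the Neo4j browser.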