Neo4j Graph Data Science (GDS)
Installation
Basic steps

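- Create the graph: project a graph from Neo4j into memory, where the algorithms operate on it.
- Choose an algorithm: pick the algorithm that suits the task.
- Store the results: persist the output of the algorithm run.
Introduction
Three tiers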
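- Production-quality: the algorithm is stable and scalable (gds.<algorithm>.)
- Beta: the algorithm is not yet fully stabilized (gds.beta.<algorithm>.)
- Alpha: the algorithm is unstable and may still change (gds.alpha.<algorithm>.)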
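Because the tier is encoded in the procedure name, one rough way to see what an installation offers at each tier is to list the procedures and group them by prefix. The sketch below assumes the gds.list procedure is available in the installed GDS version. The tier column is derived here from the name and is not something the library reports, and catalog or utility procedures without a tier prefix land in the last bucket alongside production-quality algorithms.

```cypher
// List GDS procedures and derive a tier label from the name prefix.
// The 'tier' column is computed in this query; it is an illustration only.
CALL gds.list()
YIELD name, description
RETURN
  CASE
    WHEN name STARTS WITH 'gds.alpha.' THEN 'alpha'
    WHEN name STARTS WITH 'gds.beta.'  THEN 'beta'
    ELSE 'production-quality (no tier prefix)'
  END AS tier,
  name,
  description
ORDER BY tier, name
```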
Two variants
- Named graph variant: the graph to operate on is read from the graph catalog.
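CALL gds.graph.create(
  'persons',            // name under which the graph is stored in the catalog
  'Person',             // node projection: nodes with the Person label
  'KNOWS'               // relationship projection: relationships of type KNOWS
)
YIELD
  graphName AS graph, nodeProjection, nodeCount AS nodes, relationshipProjection, relationshipCount AS rels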
- Anonymous graph variant: the graph to operate on is created and deleted as part of the algorithm execution.
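For comparison with the named graph call, an anonymous call might look like the sketch below. It assumes the GDS 1.x syntax used in these notes, where the projection is passed inside the configuration map through nodeProjection and relationshipProjection, and it uses PageRank only as a convenient example algorithm. Later GDS releases renamed gds.graph.create to gds.graph.project and, as far as I know, no longer support anonymous graphs, so this form may not run there.

```cypher
// Anonymous graph variant: no graph name is passed. The projection is
// described in the configuration map, built for this one call, and
// discarded when the call finishes.
CALL gds.pageRank.stream({
  nodeProjection: 'Person',
  relationshipProjection: 'KNOWS'
})
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
```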
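Four execution modes
- stream: returns the result of the algorithm as a stream of records.
- stats: returns a single record of summary statistics, but does not write to the Neo4j database.
- mutate: writes the result of the algorithm to the in-memory graph and returns a single record of summary statistics. This mode is designed for the named graph variant, because its effects are not visible on an anonymous graph.
- write: writes the result of the algorithm to the Neo4j database and returns a single record of summary statistics.
Finally, appending estimate to an execution mode (for example gds.<algorithm>.write.estimate) estimates the memory required to run the algorithm in that mode.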
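To make the difference between the modes concrete, the sketch below runs one algorithm in each mode against the named graph 'persons' created earlier. PageRank and the property name 'pagerank' are assumptions chosen for illustration, and each statement is meant to be run on its own.

```cypher
// stream: one record per node, nothing is stored anywhere.
CALL gds.pageRank.stream('persons', {})
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 5;

// stats: a single summary record, nothing is stored anywhere.
CALL gds.pageRank.stats('persons', {})
YIELD ranIterations, didConverge;

// mutate: scores are added to the in-memory graph 'persons' only.
CALL gds.pageRank.mutate('persons', {mutateProperty: 'pagerank'})
YIELD nodePropertiesWritten;

// write: scores are written to the Person nodes in the Neo4j database.
CALL gds.pageRank.write('persons', {writeProperty: 'pagerank'})
YIELD nodePropertiesWritten;

// estimate: memory estimate for the write mode, without running it.
CALL gds.pageRank.write.estimate('persons', {writeProperty: 'pagerank'})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory;
```

A property added with mutate can be consumed by a later algorithm on the same in-memory graph, which is why that mode only makes sense for a named graph.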
Utility functions
Commonly used utility functions:
- gds.util.asNode / gds.util.asNodes: return the node, or list of nodes, for the given node id or ids.
Syntax: gds.util.asNode(nodeId) / gds.util.asNodes(nodeIds)
Example:
// Create sample data
CREATE (nAlice:User {name: 'Alice'})
CREATE (nBridget:User {name: 'Bridget'})
CREATE (nCharles:User {name: 'Charles'})
CREATE (nAlice)-[:LINK]->(nBridget)
CREATE (nBridget)-[:LINK]->(nCharles);
// Look up a single node by its id
MATCH (u:User {name: 'Alice'})
WITH id(u) AS nodeId
RETURN gds.util.asNode(nodeId).name AS node;
// Look up several nodes by their ids
MATCH (u:User)
WHERE NOT u.name = 'Charles'
WITH collect(id(u)) AS nodeIds
RETURN [x IN gds.util.asNodes(nodeIds) | x.name] AS nodes;
Algorithms
Syntax:
CALL gds[.<tier>].<algorithm>.<execution-mode>[.<estimate>](
  graphName: String,
  configuration: Map
)
Taking the Dijkstra Source-Target algorithm as an example:
// First, create the sample data in Neo4j
CREATE (a:Location {name: 'A'}),
       (b:Location {name: 'B'}),
       (c:Location {name: 'C'}),
       (d:Location {name: 'D'}),
       (e:Location {name: 'E'}),
       (f:Location {name: 'F'}),
       (a)-[:ROAD {cost: 50}]->(b),
       (a)-[:ROAD {cost: 50}]->(c),
       (a)-[:ROAD {cost: 100}]->(d),
       (b)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 80}]->(e),
       (d)-[:ROAD {cost: 30}]->(e),
       (d)-[:ROAD {cost: 80}]->(f),
       (e)-[:ROAD {cost: 40}]->(f);
// Next, create the graph projection; it is held in memory to speed up the algorithm
CALL gds.graph.create(
    'myGraph',
    'Location',
    'ROAD',
    {
        relationshipProperties: 'cost'
    }
)
// Finally, run the algorithm
// 1. Estimate the memory cost
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write.estimate('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH'
})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
RETURN nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
// 2. Return the algorithm result
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
    index,
    gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
    costs,
    nodes(path) as path
ORDER BY index
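The third basic step, storing the result, can be sketched with the write mode of the same algorithm. The configuration mirrors the estimate call above. The writeNodeIds and writeCosts options and the totalCost and costs relationship properties follow the GDS documentation as I understand it, so treat them as assumptions to check against the installed version. The sourceNode and targetNode values follow the form used in these notes; some GDS 1.x releases document these parameters as node ids, in which case id(source) and id(target) would be needed instead.

```cypher
// 3. Write the shortest path back to Neo4j as a single PATH relationship
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH',
    writeNodeIds: true,
    writeCosts: true
})
YIELD relationshipsWritten
RETURN relationshipsWritten;

// Inspect the relationship that was written from A to F
MATCH (:Location {name: 'A'})-[p:PATH]->(:Location {name: 'F'})
RETURN p.totalCost AS totalCost, p.costs AS costs;

// Release the in-memory projection once it is no longer needed
CALL gds.graph.drop('myGraph') YIELD graphName;
```

With the sample data above, the cheapest route from A to F costs 160 (A to D costs 90 through either B or C, then D to E adds 30 and E to F adds 40), so that is the totalCost to expect from both the stream and the write mode.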










