mneedham/neo4j-graph-algorithms

Efficient Graph Algorithms for Neo4j (pre-Alpha / WIP)

This project aims to develop efficient, well tested graph algorithm implementations for Neo4j 3.1 and 3.2.

You can find the documentation (WIP) here: http://neo4j-contrib.github.io/neo4j-graph-algorithms

The goal is to provide parallel versions of common graph algorithms for Neo4j, exposed as Cypher user-defined procedures:

Centralities:

  • Page Rank

  • Betweenness Centrality

  • Closeness Centrality

Graph Partitioning:

  • Label Propagation

  • (Weakly) Connected Components

  • Strongly Connected Components

Path Finding:

  • Minimum Weight Spanning Tree

  • All-Pairs and Single-Source Shortest Path

These procedures work on a subgraph, optionally filtered by label and relationship type. Future versions will also provide filtering and projection using Cypher queries.

We’d love your feedback, so please try out these algorithms and let us know how well they work for your use case. Please also tell us about anything you find missing from the installation instructions, readme, etc.

Please raise GitHub issues for anything you encounter or join the neo4j-users Slack group and ask in the #neo4j-graph-algorithm channel.

Installation

Just copy the graph-algorithms-algo-*.jar from the matching release into your $NEO4J_HOME/plugins directory and restart Neo4j.

Afterwards, running CALL dbms.procedures(); should also list the algorithm procedures:

CALL dbms.procedures() YIELD name, description, signature
WHERE name STARTS WITH "algo."
RETURN name, description, signature
ORDER BY name

Warning

For safety reasons, in Neo4j 3.2.x you will need to add/enable this line in your $NEO4J_HOME/conf/neo4j.conf:

dbms.security.procedures.unrestricted=algo.*

Introduction

Graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between nodes. A graph is made up of nodes (vertices) which are connected by relationships (edges). A graph may be undirected, meaning that there is no distinction between the two nodes associated with each relationship, or its relationships may be directed from one node to another. Relationships are what a graph is all about: two nodes are joined by a relationship when they are related in a specified way.

We are tied to our friends. Cities are connected by roads and airline routes. Flora and fauna are bound together in a food web. Countries are involved in trading relationships. The World Wide Web is a virtual network of information.

  • Note that Neo4j can only store directed relationships, but we can treat them as though they were undirected when doing the analysis.
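
To make the note above concrete, here is a minimal Python sketch of reading stored directed relationships as undirected ones. It is not how this project implements it; the function name undirected_neighbors and the sample node ids are made up for illustration.

```python
from collections import defaultdict


def undirected_neighbors(directed_edges):
    """Build a symmetric adjacency map from (source, target) pairs.

    Hypothetical helper for illustration only, not part of this library.
    """
    neighbors = defaultdict(set)
    for source, target in directed_edges:
        neighbors[source].add(target)
        neighbors[target].add(source)  # also record the reverse direction
    return dict(neighbors)


# Assumed sample data: relationships stored as (1)->(2) and (3)->(2).
stored = [(1, 2), (3, 2)]
adjacency = undirected_neighbors(stored)
```

Node 2 ends up with both 1 and 3 as neighbors even though neither relationship starts at node 2, which is the view an undirected analysis needs.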

Usage

These algorithms are exposed as Neo4j procedures. You can call them directly from Cypher in your Neo4j Browser, from cypher-shell or your client code.

For most algorithms there are two procedures, one that writes results back to the graph as node-properties and another (named algo.<name>.stream) that returns a stream of data, e.g. node-ids and computed values.

The general call syntax is:

CALL algo.<name>([label],[relType],{config})

For example, for PageRank on DBpedia:

CALL algo.pageRank('Page','Link',{iterations:5, dampingFactor:0.85, write: true, writeProperty:'pagerank'});
// YIELD nodes, iterations, loadMillis, computeMillis, writeMillis, dampingFactor, write, writeProperty

CALL algo.pageRank.stream('Page','Link',{iterations:5, dampingFactor:0.85})
YIELD node, score
RETURN node, score ORDER BY score DESC LIMIT 10;
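
To give a feel for what the iterations and dampingFactor settings control, here is a small Python sketch of the textbook PageRank iteration. It is an illustration only: the parallel implementation in this project may differ (for instance in normalization or in how nodes without outgoing relationships are handled), and the function name page_rank and the sample data are assumptions.

```python
def page_rank(nodes, edges, iterations=5, damping_factor=0.85):
    """Textbook PageRank on a list of (source, target) edges.

    Illustrative sketch, not the library's implementation. Rank held by
    nodes without outgoing edges is simply not redistributed here.
    """
    nodes = list(nodes)
    n = len(nodes)
    out_degree = {node: 0 for node in nodes}
    for source, _target in edges:
        out_degree[source] += 1

    scores = {node: 1.0 / n for node in nodes}  # uniform starting scores
    for _ in range(iterations):
        incoming = {node: 0.0 for node in nodes}
        for source, target in edges:
            # each node splits its current score evenly over its out-edges
            incoming[target] += scores[source] / out_degree[source]
        scores = {
            node: (1.0 - damping_factor) / n + damping_factor * incoming[node]
            for node in nodes
        }
    return scores


# Assumed toy graph standing in for (:Page)-[:Link]->(:Page) data.
pages = ["A", "B", "C"]
links = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
scores = page_rank(pages, links)
top = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

More iterations bring the scores closer to their stable values, while the damping factor sets how much of a node's score comes from incoming links rather than from the uniform baseline. Sorting the result by score, as in the last line, mirrors the ORDER BY score DESC of the streaming call.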

Cypher Loading

If label and relationship type are not selective enough to describe the subgraph you want to run the algorithm on, you can use Cypher statements to load or project subsets of your graph. Pass a node statement in place of the label parameter and a relationship statement in place of the relationship type, and set graph:'cypher' in the config.

In addition to the ids, these statements can also return a property value or weight (according to your config).

CALL algo.pageRank(
'MATCH (p:Page) RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:5, write: true});
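
Conceptually, the two statements produce rows that together define the projected graph. The Python sketch below shows one way to picture that, assuming rows shaped like the id, source and target columns returned above. The function project_graph, the weight column name, the default weight, and the choice to skip relationships whose endpoints were not returned by the node statement are all assumptions made for illustration, not documented behavior of the procedures.

```python
def project_graph(node_rows, relationship_rows, default_weight=1.0):
    """Assemble an adjacency list from node rows and relationship rows.

    Hypothetical illustration of a Cypher projection, not library code.
    """
    node_ids = {row["id"] for row in node_rows}
    adjacency = {node_id: [] for node_id in node_ids}
    for row in relationship_rows:
        source, target = row["source"], row["target"]
        # Assumption: only keep relationships between projected nodes.
        if source in node_ids and target in node_ids:
            weight = row.get("weight", default_weight)
            adjacency[source].append((target, weight))
    return adjacency


# Assumed sample rows, as the two statements might return them.
node_rows = [{"id": 0}, {"id": 1}, {"id": 2}]
relationship_rows = [
    {"source": 0, "target": 1},
    {"source": 1, "target": 2, "weight": 2.5},
    {"source": 2, "target": 7},  # node 7 was not returned by the node statement
]
projected = project_graph(node_rows, relationship_rows)
```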

Details on how to call the individual algorithms can be found in the project’s documentation.

Building

Currently aiming at Neo4j 3.1 and 3.2 (in the 3.2 branch)

git clone https://github.com/neo4j-contrib/neo4j-graph-algorithms
cd neo4j-graph-algorithms
mvn clean install
cp algo/target/graph-algorithms-*.jar $NEO4J_HOME/plugins/
$NEO4J_HOME/bin/neo4j restart
