AI, Machine Learning and Graph Algorithms: Real Life Use Cases with Graph Databases (Neo4j)
MATCH (s:Speaker {name:"Ivan"})-[p:PRESENTS]->(t:Talk {title:"AI"}) RETURN s,p,t
Neo4j - The Graph Company
500+ 7/10 12/25 8/10 53K+ 100+ 250+ 450+ Adoption Top Retail Firms Top Financial Firms Top Software Vendors Customers Partners Ecosystem Startups in program Enterprise customers Partners Meetup members Events per year
• Creator of the Neo4j Graph Platform
• ~200 employees
• HQ in Silicon Valley; other offices include London, Munich, Paris and Malmö (Sweden)
• $160M in funding from Morgan Stanley, Fidelity and others
• 10M+ downloads
• 250+ enterprise subscription customers, over half of them with >$1B in revenue
Industry’s Largest Dedicated Investment in Graphs
What Is A Graph? • Nodes (vertices) • Relationships (links, edges) • Properties • Labels
Neo4j: Changing the World
• Fraud Detection: ICIJ used Neo4j to uncover the world’s largest journalistic leak to date, The Panama Papers, exposing criminals, corruption and extensive tax evasion.
• Knowledge Graph for humans: The US space agency uses Neo4j for its “Lessons Learned” database, connecting information to make search more effective for space missions.
• Knowledge Graph for AI: eBay uses Neo4j to enable machine learning through knowledge graphs powering “conversational commerce”.
The world is a graph – everything is connected • people, places, events • companies, markets • countries, history, politics • sciences, art, teaching • technology, networks, machines, applications, users • software, code, dependencies, architecture, deployments • criminals, fraudsters and their behavior
Property Graph Model Components
• Nodes
  ○ Represent the objects in the graph
  ○ Can be labeled
• Relationships
  ○ Relate nodes by type and direction
• Properties
  ○ Name-value pairs that can go on nodes and relationships
[Diagram: two Person nodes and a Car node connected by DRIVES, OWNS, LOVES and LIVES WITH relationships. Properties shown: brand: “Mini”, model: “Cooper”; name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975; since: Jan 10, 2015]
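To make the three components concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data; the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names invented for illustration. The property values come from the slide, but which nodes each relationship connects, and which relationship carries `since`, are assumptions because the flattened diagram does not preserve that.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                     # e.g. {"Person"}
    properties: dict = field(default_factory=dict)  # name-value pairs


@dataclass
class Relationship:
    start: Node                                     # direction: start -> end
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes carry labels and properties (values taken from the slide).
dan = Node({"Person"}, {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "Dec 5, 1975"})
car = Node({"Car"}, {"brand": "Mini", "model": "Cooper"})

# Relationships have a type and a direction and may carry properties.
# ASSUMPTION: the endpoints and the placement of `since` are guesses.
relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "LIVES_WITH", ann),
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2015"}),
    Relationship(ann, "OWNS", car),
]


def outgoing(node, rel_type):
    """Return the end nodes of all rel_type relationships starting at node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


if __name__ == "__main__":
    print([n.properties["name"] for n in outgoing(dan, "LOVES")])
```

Note that direction matters: `outgoing(car, "DRIVES")` is empty because the relationship points from the person to the car.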
Use Cases
● Fraud Detection
  ○ Anti Money Laundering (AML), e-commerce Fraud, First-Party Bank Fraud, Insurance Fraud, Link Analysis
  ○ Real-time analysis of data relationships is essential to uncovering fraud rings and other sophisticated scams before fraudsters and criminals cause lasting damage.
  ○ https://neo4j.com/use-cases/fraud-detection
● Master Data Management
  ○ 360-Degree View of Customer, Cross Reference Business Objects, Data Ownership, Master Data, Organizational Hierarchies
  ○ Organize and manage your master data with the flexible and schema-free graph database model in order to get real-time insights and a 360° view of your customers.
  ○ https://neo4j.com/use-cases/master-data-management
● Recommendation Engine
  ○ Content & Media Recommendations, Graph-Aided Search Engine, Product Recommendations, Professional Networks, Social Recommendations
  ○ Graph-powered recommendation engines help companies personalize products, content and services by leveraging a multitude of connections in real time.
  ○ https://neo4j.com/use-cases/real-time-recommendation-engine
● Knowledge Graph
  ○ Asset Management, Cataloging, Content Management, Inventory, Workflow Processes
  ○ Tap into the power of graph-based search tools for better digital asset management using the most flexible and scalable solution on the market.
  ○ https://neo4j.com/use-cases/knowledge-graph
● Network and Database Infrastructure Monitoring
  ○ Asset Management, Cybersecurity, Impact Analysis, Quality-of-Service Mapping, Root Cause Analysis
  ○ Graph databases are inherently more suitable than RDBMS for making sense of complex interdependencies central to managing networks and IT infrastructure.
  ○ https://neo4j.com/use-cases/network-and-it-operations
● Social Media and Social Network Graphs
  ○ Community Cluster Analysis, Friend-of-Friend Recommendations, Influencer Analysis, Sharing & Collaboration, Social Recommendations
  ○ Easily leverage social connections or infer relationships based on activity when you use a graph database to power your social network application.
  ○ https://neo4j.com/use-cases/social-network
● Artificial Intelligence and Machine Learning
  ○ Artificial Intelligence (AI) is poised to drive the next wave of technological disruption across nearly every industry. Just like previous technology revolutions in web and mobile, however, there will be winners and losers based on who harnesses this technology for a true competitive advantage.
  ○ https://neo4j.com/use-cases/artificial-intelligence
Neo4j Is a Database
• No Size Limit • Binary & HTTP Protocol • ACID Transactions • 2-4 M ops/s per core • Clustering (Scale & HA) • Official Drivers
Qualities highlighted: Reliability, Performance, Scalability, Availability, Integration
Neo4j Is a Graph Database
• Graph Visualization • Developer Workbench • Extensible Procedures & Functions • Cypher Query Language • Schema-free & Optionally Schema-based • Property Graph Model • Native Graph DB • Graph Storage
Native Graph Storage
• At Write Time: data is connected as it is stored. “We keep the connection lines alive.”
• At Read Time: lightning-fast retrieval of data and relationships via pointer chasing
• Index-free adjacency
• Graph value is found in the traversals and hops
Native Graph Storage
[Diagram: node store, relationship store, property store and dynamic store with IN, OUT and LOOP relationships, shown alongside Tables A-D and their primary keys (PK)]
Causal Clustering
• The Raft Consensus Algorithm
• Equivalent to Paxos in fault-tolerance and performance
• https://raft.github.io/
Schema-free or Schema-based
• Node property existence
• Relationship property existence
• Unique property
• Node and combined properties uniqueness
[Diagram: (Person, Actor) name: Tom Hanks, born: 1956 -ACTED_IN roles: [“Zachry”]-> (Movie) title: Cloud Atlas, released: 2012; (Person, Actor) name: Hugo Weaving, born: 1960 -ACTED_IN roles: [“Bill Smoke”]-> Cloud Atlas and -ACTED_IN roles: [“Agent Smith”]-> (Movie) title: The Matrix, released: 1999; (Person, Director) name: Lana Wachowski, born: 1965 -DIRECTED-> both movies]
Cypher Query Language
CREATE (:Person { name:"Dan"} ) -[:LOVES]-> (:Person { name:"Ann"} )
[Diagram: Dan -LOVES-> Ann, with callouts marking each NODE, its LABEL and PROPERTY, and the Relationship between them]
Cypher Query Language
MATCH (:Person { name:"Dan"} ) -[:LOVES]-> ( whom ) RETURN whom
[Diagram: Dan -LOVES-> ?, with callouts marking the NODE, the Relationship, and the unknown NODE bound to whom]
Neo4j Browser
Neo4j Desktop
Neo4j Bloom
Fun without Fuss! https://neo4j.com/lp/try-neo4j-sandbox
Graph Analytics
• Query (e.g. Cypher/Python): real-time, local decisioning and pattern matching
  ○ Local Patterns: you know what you’re looking for and are making a decision
• Graph Algorithms Libraries: global analysis and iterations
  ○ Global Computation: you’re learning the overall structure of a network, updating data, and predicting
Predict & Prescribe Requires Understanding Relationships and Structures
• Flow & Dynamics
• Interactions & Resiliency
• Propagation Pathways
• Complex and Emerging Behavior
Source: “Systemic delay propagation in the US airport network” – Fleurquin, Ramasco, Eguiluz - https://ifisc.uib-csic.es/~jramasco/text/characterization_delays.pdf
Planning and Least Cost Routing
Bridge Points Languages Telecom Network Source: “Fast unfolding of communities in large networks” – Blondel, Guillaume, Lambiotte, Lefebvre - https://arxiv.org/pdf/0803.0476.pdf
Graph Algorithms
Centrality ● PageRank ● ArticleRank ● Betweenness Centrality ● Closeness Centrality ● Harmonic Centrality ● Eigenvector Centrality ● Degree Centrality
Community Detection ● Louvain ● Label Propagation ● Connected Components ● Strongly Connected Components ● Triangle Counting / Clustering Coefficient ● Balanced Triads
Similarity ● Jaccard Similarity ● Cosine Similarity ● Pearson Similarity ● Euclidean Distance ● Overlap Similarity
Path Finding ● Minimum Weight Spanning Tree ● Shortest Path ● Single Source Shortest Path ● All Pairs Shortest Path ● A* ● Yen’s K-shortest paths ● Random Walk
Link Prediction ● Adamic Adar ● Common Neighbors ● Preferential Attachment ● Resource Allocation ● Same Community ● Total Neighbors
https://neo4j.com/docs/graph-algorithms
https://neo4j.com/graph-algorithms-book
Pathfinding & Search
• Single-Source Shortest Path
  ○ Calculates the “shortest” path between a node and all other nodes
• All-Pairs Shortest Path
  ○ Finds the shortest path between every pair of nodes
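As a rough illustration of the two ideas on this slide, the sketch below uses Dijkstra's algorithm for the single-source case and simply repeats it from every node for the all-pairs case. This is a textbook approach chosen for brevity, not necessarily what the Neo4j library does internally. The function names and the `roads` example graph are made up, and edge weights are assumed to be non-negative.

```python
import heapq


def single_source_shortest_path(graph, source):
    """Dijkstra: cheapest known distance from source to every reachable node.

    graph maps node -> {neighbour: non-negative weight}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


def all_pairs_shortest_path(graph):
    """Run the single-source search once from every node."""
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs}
    return {node: single_source_shortest_path(graph, node) for node in nodes}


# Hypothetical directed road network with travel costs.
roads = {"A": {"B": 4, "C": 1}, "C": {"B": 2}, "B": {"D": 5}}

if __name__ == "__main__":
    print(single_source_shortest_path(roads, "A"))
    print(all_pairs_shortest_path(roads)["C"])
```

Going from A to B via C (cost 3) beats the direct edge (cost 4), which is why "shortest" is in quotes on the slide: it means lowest total weight, not fewest hops.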
https://github.com/johnymontana/osm-routing-app
Centrality Algorithms
Similarity Algorithms
Evaluate how alike nodes are at an individual level, based on their properties or attributes
• Cosine Similarity Recommendations (Movies): https://neo4j.com/graphgist/movie-recommendations-with-k-nearest-neighbors-and-cosine-similarity
• Social similarities (Interests): https://medium.com/neo4j/cosine-similarity-in-neo4j-d617b0442439
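For readers who want to see what cosine similarity computes, here is a small plain-Python sketch. It treats each node's attributes as a sparse vector (a dict of name to number) and counts missing entries as zero. The ratings below are invented, and the linked articles may handle missing values differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors given as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical movie ratings per person.
dan = {"The Matrix": 5, "Cloud Atlas": 3}
ann = {"The Matrix": 4, "Cloud Atlas": 2}
eve = {"Some Other Film": 5}

if __name__ == "__main__":
    print(cosine_similarity(dan, ann))  # close to 1: similar taste
    print(cosine_similarity(dan, eve))  # 0.0: nothing rated in common
```

Because the measure depends on direction rather than magnitude, two people who rate the same films in the same proportions score close to 1 even if one of them is a harsher critic.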
Community Detection Algorithms
Evaluate how a group is clustered or partitioned
Different approaches to define a community
• Label Propagation, Drug-Drug Interaction Prediction: https://neo4j.com/blog/graph-algorithms-neo4j-label-propagation
• Twitter Polarity Classification: https://dl.acm.org/citation.cfm?id=2140465
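The core idea of label propagation can be sketched in a few lines: every node starts with its own label and then repeatedly adopts the label most common among its neighbours. The version below is a simplified, deterministic variant for illustration only. It visits nodes in sorted order and breaks ties by taking the largest label, whereas typical implementations break ties randomly. The example graph is made up.

```python
from collections import Counter


def label_propagation(adjacency, max_iterations=20):
    """Assign community labels on an undirected graph.

    adjacency maps node -> set of neighbours; every neighbour must be a key.
    """
    labels = {node: node for node in adjacency}  # each node starts alone
    for _ in range(max_iterations):
        changed = False
        for node in sorted(adjacency):
            neighbours = adjacency[node]
            if not neighbours:
                continue
            counts = Counter(labels[n] for n in neighbours)
            top = max(counts.values())
            # Simplification: deterministic tie-break on the largest label.
            best = max(label for label, c in counts.items() if c == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # converged: no node wants to switch
    return labels


# Two triangles (1-2-3 and 4-5-6) joined by the single edge 3-4.
graph = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}

if __name__ == "__main__":
    print(label_propagation(graph))
```

On this graph the two triangles end up with different labels. The outcome of label propagation depends on visiting order and tie-breaking, so other choices can merge or split communities differently.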
Link Prediction: A Goal, an Approach & an Algorithm Category
Can we infer which new interactions are likely to occur in the future?
“We formalize this question as the link prediction problem, and develop approaches to link prediction based on measures for analyzing the ‘proximity’ of nodes in a network.” (Jon Kleinberg and David Liben-Nowell)
What can we use this approach for? ● future associations in a terrorist network ● co-authorships in a citation network ● associations between molecules in a biology network ● interest in an artist or artwork
What's common across all these use cases?
Predicting a link means that we are predicting some future behaviour or an unobserved fact. For example, in a citation network, we’re actually predicting the action of two people collaborating on a paper.
Common Neighbours
• Based on the number of potential triangles / closing triangles
• The concept is that if two strangers have a friend or colleague in common, they are more likely to be introduced
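The common-neighbours score is simple enough to sketch directly: for every pair of nodes that is not yet connected, count the neighbours they share and rank the pairs by that count. The function names and the `friends` network below are hypothetical, and the graph is assumed to be undirected.

```python
from itertools import combinations


def common_neighbours(adjacency, a, b):
    """Number of neighbours shared by a and b (triangles a link would close)."""
    return len(adjacency[a] & adjacency[b])


def predict_links(adjacency):
    """Rank unconnected pairs by shared neighbours, highest score first."""
    scores = []
    for a, b in combinations(sorted(adjacency), 2):
        if b in adjacency[a]:
            continue  # already linked, nothing to predict
        score = common_neighbours(adjacency, a, b)
        if score > 0:
            scores.append((score, a, b))
    return sorted(scores, key=lambda t: (-t[0], t[1], t[2]))


# Hypothetical undirected friendship network.
friends = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat", "Dan"},
    "Cat": {"Ann", "Bob", "Dan"},
    "Dan": {"Bob", "Cat", "Eve"},
    "Eve": {"Dan"},
}

if __name__ == "__main__":
    for score, a, b in predict_links(friends):
        print(a, b, score)
```

Ann and Dan share two friends (Bob and Cat), so that pair ranks first. Ann and Eve share none and are left out.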
More Resources https://neo4j.com/sandbox-v2/ https://neo4j.com/graph-algorithms-book https://neo4j.com/graphacademy/online-training/
Thank You!
Graph Algorithms Extract Structure and Infer Behavior
Source: “Communities, modules and large-scale structure in networks”, Mark Newman
Source: “Hierarchical structure and the prediction of missing links in networks”; “Structure and inference in annotated networks”, A. Clauset, C. Moore, and M.E.J. Newman
Centralities • PageRank ○ Which nodes have the most overall influence • Closeness ○ Which nodes are able to reach entire group the fastest • Betweenness ○ Which nodes are the bridges between different clusters (most shortest paths) • Degree ○ The number of connections in/out of a node
Source: Maven 7 Centralities • PageRank ○ Which nodes have the most overall influence • Closeness ○ Which nodes are able to reach entire group the fastest • Betweenness ○ Which nodes are the bridges between different clusters (most shortest paths) • Degree ○ The number of connections in/out of a node
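Two of the centralities listed above are easy to sketch in plain Python. Degree centrality just counts connections in and out of a node. PageRank is shown here as the textbook power iteration with a damping factor, where rank from nodes without outgoing links is spread evenly over all nodes. This is an illustrative variant under those assumptions and may differ from the normalization used by the Neo4j library. The edge list is invented.

```python
def degree_centrality(edges):
    """Count incoming and outgoing connections per node."""
    degree = {}
    for source, target in edges:
        degree.setdefault(source, {"in": 0, "out": 0})["out"] += 1
        degree.setdefault(target, {"in": 0, "out": 0})["in"] += 1
    return degree


def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; the ranks sum to 1."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared by everyone.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical directed graph: A, B and D all point at C; C points back at A.
edges = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]

if __name__ == "__main__":
    print(degree_centrality(edges)["C"])
    ranks = pagerank(edges)
    print(max(ranks, key=ranks.get))
```

In this graph C has the highest PageRank because three nodes link to it, and A comes second even though it has only one incoming link, because that link comes from the influential C. Degree alone would not show this.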
