http://bit.ly/ArangoDBGraphAnalytics
tl;dr
● Graph Analytics: Answer questions from graph data
● Graph Embeddings and Graph Neural Networks: Learning graphs
● Graph-based Machine Learning Metadata: Utilizing graphs for operating ML infrastructure
https://dzone.com/articles/graph-databases-machine-learning
Challenge...
Agenda
● ML Infrastructure & Metadata
● Graphs
● Graph Database
● Graph Analytics
● Graph Embeddings
● Graph Neural Networks
Part 2
Jörg Schad, PhD
● @joerg_schad
This workshop...
… is for you! Please share
● Expectations
● Questions
● Feedback
● Ask for breaks if needed
● …
… is also virtual!
● Let us work together in these times!
Who are you?
● Background
● Expectations
● ...
This workshop... https://github.com/joerg84/Graph_Powered_ML_Workshop
Why should you care? https://towardsdatascience.com/predictions-and-hopes-for-graph-ml-in-2021-6af2121c3e3d
What problems can we solve?
● Graph Analytics: Answer questions from Graph
- Community Detection
- Recommendations
- Centrality
- Path Finding
- Fraud Detection
- Permission Management
- ...
● Graph Embeddings and Graph Neural Networks: Learning Graphs
- Node/Link Classification
- Link Prediction
- Classification of Graphs
- ...
● Graph-based Machine Learning Metadata: Utilizing Graphs for Operating ML Infrastructure
- Data Provenance
- Audit Trails
- Privacy (GDPR/CCPA)
- ...
Agenda
● ML Infrastructure & Metadata
● Graphs
● Graph Database
● Graph Analytics
● Graph Embeddings
● Graph Neural Networks
Graph Analytics with ArangoDB
Graph Data Model
● Connections are first-class citizens
● Vertices and Edges
● Native or built on top of other data models
Graph Analytics with ArangoDB
Graph Properties
● (un)directed
○ Facebook vs Twitter
● weighted
● Sparse/Dense
● (a)cyclic
Graph Queries
● Traversals
● Search
● Graph Algorithms
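The directed/undirected and sparse/dense properties can be made concrete with a few lines of Python. This is only a minimal sketch: the `density` helper and the tiny `follows`/`friends` edge lists are made up for illustration and assume a simple graph without self-loops or parallel edges.

```python
def density(num_vertices, num_edges, directed=True):
    """Fraction of all possible edges that are present (simple graph assumed)."""
    if num_vertices < 2:
        return 0.0
    possible = num_vertices * (num_vertices - 1)
    if not directed:
        possible //= 2  # each undirected pair is counted once
    return num_edges / possible

# Directed "follows" graph (Twitter-style): (a, b) means a follows b.
follows = [("ann", "bob"), ("bob", "cat"), ("cat", "ann")]
# Undirected "friends" graph (Facebook-style): each pair is stored once.
friends = [("ann", "bob"), ("bob", "cat")]

directed_density = density(3, len(follows), directed=True)
undirected_density = density(3, len(friends), directed=False)
```

A density close to 0 indicates a sparse graph, a density close to 1 a dense one.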
Optional Lab: Graphs & Properties https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Graph_properties.ipynb
Graph Databases
AQL - A Query Language That Feels Like Coding
● Common query language for all data models
● Aims to be human-readable
● Same language for all clients, no matter what programming language people use
● Easy to understand for anyone with an SQL background

    FOR c IN company
      FILTER c.name == @companyName
      FOR department IN 1..6 INBOUND c isPartOf
        RETURN {
          c: c.name,
          department: department.name,
          ordered: (
            FOR o IN orders
              FILTER o.contact == department.contact
              RETURN { date: o.date, amount: o.amount }
          )
        }
ArangoSearch
ArangoSearch is a powerful search and similarity ranking engine natively integrated into ArangoDB. Combine search with any other data model.

    FOR d IN v_imdb
      SEARCH ANALYZER(
        d.description IN TOKENS('amazing action world alien sci-fi science documental', 'text_en')
        || BOOST(d.description IN TOKENS('galaxy', 'text_en'), 5),
        'text_en')
      SORT BM25(d) DESC
      LIMIT 10
      FOR vertex, edge, path IN 1..1 INBOUND d imdb_edges
        FILTER path.edges[0].$label == "DIRECTED"
        RETURN DISTINCT { "director": vertex.name, "movie": d.title }
Property Graph Model
● Languages
○ TinkerPop/Gremlin
○ Cypher
○ AQL
○ ...
● Example: Person (name: Max), born_in (year: 1984), City (location:)
---
RDF Triple Store
● subject, predicate, and object
● No internal structure of nodes/edges
● Languages
○ SPARQL
● Ontologies & Logic for Inference
https://w3c.github.io/rdf-star/

    <<:bob foaf:age 23>> ex:certainty 0.9 .

    SELECT ?p ?a ?c WHERE {
      <<?p foaf:age ?a>> ex:certainty ?c .
    }

Support
- Convert to plain RDF (tool)
- Optimized storage/processing
- Conversion to PG (tool)
Diagram labels: Max, Job1, start, end, employer
Lab: SPARQL https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Sparql.ipynb
Graph Modelling
● Edge Attribute: Person (name: Max) -- rated (rating: 5) -- Movie: Free Solo
---
● Vertex Attribute: Person (name: Max) -- gave -- Rating (rating: 5) -- rated_by -- Movie: Free Solo
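The two modelling options can be compared side by side in plain Python. This is a minimal sketch, not ArangoDB code: the dictionary layout, the document keys such as `person/max`, the `type` field, and the direction of the `rated_by` edge are all assumptions made for illustration.

```python
# Option 1 (edge attribute): the rating lives on the edge between Person and Movie.
edge_model = {
    "vertices": {
        "person/max": {"name": "Max"},
        "movie/free_solo": {"title": "Free Solo"},
    },
    "edges": [
        {"_from": "person/max", "_to": "movie/free_solo", "type": "rated", "rating": 5},
    ],
}

# Option 2 (vertex attribute): the rating is its own vertex, linked by two plain edges.
# The direction of "rated_by" (movie -> rating) is an assumption.
vertex_model = {
    "vertices": {
        "person/max": {"name": "Max"},
        "movie/free_solo": {"title": "Free Solo"},
        "rating/1": {"rating": 5},
    },
    "edges": [
        {"_from": "person/max", "_to": "rating/1", "type": "gave"},
        {"_from": "movie/free_solo", "_to": "rating/1", "type": "rated_by"},
    ],
}

def ratings_from_edges(graph, movie):
    """Read ratings directly from 'rated' edges pointing at the movie."""
    return [e["rating"] for e in graph["edges"]
            if e["type"] == "rated" and e["_to"] == movie]

def ratings_from_vertices(graph, movie):
    """Follow 'rated_by' edges from the movie to Rating vertices."""
    return [graph["vertices"][e["_to"]]["rating"] for e in graph["edges"]
            if e["type"] == "rated_by" and e["_from"] == movie]

edge_ratings = ratings_from_edges(edge_model, "movie/free_solo")
vertex_ratings = ratings_from_vertices(vertex_model, "movie/free_solo")
```

Both lookups return the same rating; the vertex variant needs one more hop but lets other vertices link to the rating itself.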
Lab: Property Graph Queries https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Graphs_Queries.ipynb
Graph Analytics with ArangoDB http://btimmermans.com/2017/12/11/machine-learning-overview/
(Graph) Analytics https://research.aimultiple.com/graph-analytics/
Graph Analytics with ArangoDB
Why Graph?
Knowledge Graphs and Machine Learning
Graph Algorithms
● Search/Traversal
○ Find a node/edge
○ BFS/DFS (already covered)
● Pathfinding
○ How to get from a to b
● Centrality
○ What are the important nodes (e.g., influencers) in a network?
● Cycle Detection
○ Deadlock Detection
○ Network Analysis
● Community Detection
○ Are there subgroups?
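As a reminder of the BFS/DFS traversals mentioned above, here is a minimal sketch over an adjacency-list dictionary. The function names and the four-vertex example graph are made up for illustration.

```python
from collections import deque

def bfs_order(adjacency, start):
    """Visit vertices level by level, starting from `start`."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

def dfs_order(adjacency, start):
    """Follow one branch as deep as possible before backtracking."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        vertex = stack.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        order.append(vertex)
        # Push in reverse so the first listed neighbor is explored first.
        for neighbor in reversed(adjacency.get(vertex, [])):
            if neighbor not in visited:
                stack.append(neighbor)
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
bfs_result = bfs_order(graph, "a")
dfs_result = dfs_order(graph, "a")
```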
Shortest Path
● Shortest Path
○ Dijkstra
○ Bellman-Ford
● K Shortest Paths
● Single-Source Shortest Path
● All-Pairs Shortest Path
https://towardsdatascience.com/10-graph-algorithms-visually-explained-e57faa1336f3
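A minimal sketch of Dijkstra's algorithm for the single-source case, assuming non-negative edge weights. The `roads` graph and its weights are invented for illustration.

```python
import heapq

def dijkstra(adjacency, source):
    """Return the shortest distance from `source` to every reachable vertex.

    `adjacency` maps a vertex to a list of (neighbor, weight) pairs;
    weights must be non-negative.
    """
    distances = {source: 0}
    heap = [(0, source)]
    while heap:
        dist, vertex = heapq.heappop(heap)
        if dist > distances.get(vertex, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in adjacency.get(vertex, []):
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return distances

roads = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 7)],
    "d": [],
}
shortest = dijkstra(roads, "a")
```

With negative edge weights this approach is no longer correct, which is where Bellman-Ford comes in.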
Minimal Spanning Tree
● Network Broadcast/routing
● Image segmentation
● Algorithms
○ Prim’s algorithm
■ Extend from random start vertex
○ Kruskal’s algorithm
■ Keep choosing the cheapest edges as long as they don’t create a cycle
https://towardsdatascience.com/10-graph-algorithms-visually-explained-e57faa1336f3
Minimal Spanning Tree https://amortizedminds.wordpress.com/tag/algorithm-2/
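Kruskal's rule of picking the cheapest edge that does not close a cycle can be sketched with a small union-find structure. Vertices are assumed to be numbered 0..n-1, and the example edge list is made up for illustration.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning tree.

    `edges` is a list of (weight, u, v) tuples for an undirected graph.
    """
    parent = list(range(num_vertices))

    def find(vertex):
        # Walk up to the set representative, halving the path on the way.
        while parent[vertex] != vertex:
            parent[vertex] = parent[parent[vertex]]
            vertex = parent[vertex]
        return vertex

    tree = []
    for weight, u, v in sorted(edges):  # cheapest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # different components, so no cycle is created
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree

edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3)]
mst = kruskal(4, edges)
total_weight = sum(weight for weight, _, _ in mst)
```

The edge (3, 0, 2) is skipped because vertices 0 and 2 are already connected through vertex 1.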
Cycle Detection
● Deadlock Detection
● Network Analysis
● Algorithms
○ DFS
○ Floyd’s algorithm
■ tortoise and the hare algorithm
○ Brent’s algorithm
○ Johnson’s algorithm
https://towardsdatascience.com/10-graph-algorithms-visually-explained-e57faa1336f3
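The DFS variant can be sketched for a directed graph by marking vertices as unvisited, in progress, or finished; reaching an in-progress vertex again means a cycle. The wait-for graphs below (which transaction waits for which) are a made-up deadlock example.

```python
def has_cycle(adjacency):
    """Return True if the directed graph contains a cycle (recursive DFS)."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {vertex: WHITE for vertex in adjacency}

    def visit(vertex):
        color[vertex] = GRAY
        for neighbor in adjacency.get(vertex, []):
            state = color.get(neighbor, WHITE)
            if state == GRAY:
                return True  # back edge to a vertex on the current path
            if state == WHITE and visit(neighbor):
                return True
        color[vertex] = BLACK
        return False

    return any(color[vertex] == WHITE and visit(vertex) for vertex in list(adjacency))

# t1 waits for t2, t2 for t3, t3 for t1: a deadlock.
deadlocked = {"t1": ["t2"], "t2": ["t3"], "t3": ["t1"]}
# Same chain without the closing edge: no deadlock.
healthy = {"t1": ["t2"], "t2": ["t3"], "t3": []}

deadlock_found = has_cycle(deadlocked)
healthy_found = has_cycle(healthy)
```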
Community Detection
● Triangle Count
● (Strongly) Connected Components
○ Kosaraju’s algorithm
○ Tarjan’s algorithm
● Label Propagation
● Application
○ Social Networks
○ Clustering
○ …
https://networkx.github.io/documentation/stable/reference/algorithms/community.html
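A minimal sketch of Kosaraju's algorithm for strongly connected components: one DFS pass records finishing order, a second pass on the reversed graph collects the components. The recursive first pass assumes a small graph, and the example graph is made up for illustration.

```python
def strongly_connected_components(adjacency):
    """Return a list of vertex sets, one per strongly connected component."""
    vertices = set(adjacency) | {n for ns in adjacency.values() for n in ns}

    # Pass 1: record vertices in order of DFS completion.
    visited = set()
    finish_order = []

    def visit(vertex):
        visited.add(vertex)
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in visited:
                visit(neighbor)
        finish_order.append(vertex)

    for vertex in sorted(vertices):
        if vertex not in visited:
            visit(vertex)

    # Build the reversed graph.
    reverse = {vertex: [] for vertex in vertices}
    for vertex, neighbors in adjacency.items():
        for neighbor in neighbors:
            reverse[neighbor].append(vertex)

    # Pass 2: explore the reversed graph in decreasing finishing order.
    assigned = set()
    components = []
    for vertex in reversed(finish_order):
        if vertex in assigned:
            continue
        component = set()
        stack = [vertex]
        assigned.add(vertex)
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbor in reverse[current]:
                if neighbor not in assigned:
                    assigned.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components

graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
components = strongly_connected_components(graph)
```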
Topological Sort
● Applications
○ Dependencies
○ Scheduling
■ E.g., Makefiles
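For the Makefile use case, a topological sort yields a build order in which every target comes after its prerequisites. The sketch below uses Kahn's algorithm (one common choice, not necessarily the one presented in the workshop); the target names are made up for illustration.

```python
from collections import deque

def topological_sort(dependencies):
    """Order targets so that every target appears after its prerequisites.

    `dependencies` maps each target to the list of targets it depends on.
    """
    in_degree = {target: 0 for target in dependencies}
    dependents = {target: [] for target in dependencies}
    for target, prerequisites in dependencies.items():
        for prerequisite in prerequisites:
            in_degree.setdefault(prerequisite, 0)
            dependents.setdefault(prerequisite, []).append(target)
            in_degree[target] += 1

    # Start with everything that has no prerequisites.
    ready = deque(sorted(t for t, degree in in_degree.items() if degree == 0))
    order = []
    while ready:
        target = ready.popleft()
        order.append(target)
        for dependent in dependents[target]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                ready.append(dependent)

    if len(order) != len(in_degree):
        raise ValueError("dependency cycle detected")
    return order

makefile = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c"],
    "util.o": ["util.c"],
    "main.c": [],
    "util.c": [],
}
build_order = topological_sort(makefile)
```

A cyclic dependency leaves some targets with a non-zero in-degree, which is why the length check doubles as cycle detection.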
Maximum flow
Centrality
● Degree Centrality
○ How many in/outgoing connections
● Closeness Centrality
○ Average closeness to all nodes
● Betweenness Centrality
○ Connecting subgroups
○ How often is a node on a shortest path
● PageRank
○ Transitive Influence
https://www.arangodb.com/docs/stable/graphs-pregel.html#vertex-centrality
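Degree and closeness centrality are simple enough to sketch directly. The code assumes an undirected, unweighted, connected graph given as adjacency sets, and it uses the common normalisation (n - 1) / (sum of distances) for closeness; the star-shaped example graph is made up.

```python
from collections import deque

def degree_centrality(adjacency):
    """Number of neighbors per vertex."""
    return {vertex: len(neighbors) for vertex, neighbors in adjacency.items()}

def closeness_centrality(adjacency, source):
    """(n - 1) divided by the sum of BFS distances from `source`."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency[vertex]:
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0

# A star: "hub" is connected to every other vertex.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
degrees = degree_centrality(star)
hub_closeness = closeness_centrality(star, "hub")
leaf_closeness = closeness_centrality(star, "a")
```

The hub scores highest on both measures, matching the intuition that it is the most important node in this network.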
https://networkx.github.io/
Graph Toolbox
● Load and store graphs
● Analyze network structure
● Build network models
● Design new network algorithms
● Visualize
● ...
(Optional) Lab: NetworkX https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/NetworkX.ipynb
Lab: Graph Algorithms https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Graph_properties.ipynb
Graph Analytics with ArangoDB
● Fraud Detection
● Panama Papers
● Enterprise Hierarchies
● Permission Management
● Internet of Things
● Bill of Materials
● Representation Learning
● ...
https://blog.dgraph.io/post/recommendation/
https://www.independent.co.uk/arts-entertainment/films/features/films-best-watch-coronavirus-isolation-quarantine-movies-classic-greatest-essential-list-a9394006.html
Graph model: User (vertex), Movie (vertex), Rates (edge)
Graph model: User (vertex), Movie (vertex), Rates (edge)
Collaborative Filtering
“Find highly rated movies, by people who also like movies I rated highly”
1. Find movies I rated with 5 stars
2. Find users who also rated these movies with 5 stars
3. Find additional movies rated 5 stars by those users
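The three steps can be sketched in plain Python over a list of (user, movie, stars) rating edges. This is an illustration only, not the lab's AQL query: the `recommend` helper, the user names, and all ratings are invented sample data, and ranking by the number of similar users is an added assumption.

```python
def recommend(ratings, me):
    """Recommend movies via the three collaborative filtering steps.

    `ratings` is a list of (user, movie, stars) tuples.
    """
    five_star = {}
    for user, movie, stars in ratings:
        if stars == 5:
            five_star.setdefault(user, set()).add(movie)

    # Step 1: movies I rated with 5 stars.
    my_favorites = five_star.get(me, set())

    # Step 2: users who also gave 5 stars to at least one of those movies.
    similar_users = {user for user, movies in five_star.items()
                     if user != me and movies & my_favorites}

    # Step 3: other 5-star movies of those users that I have not rated yet.
    seen = {movie for user, movie, _ in ratings if user == me}
    counts = {}
    for user in similar_users:
        for movie in five_star[user] - seen:
            counts[movie] = counts.get(movie, 0) + 1

    # Rank by how many similar users recommend the movie, then by title.
    return sorted(counts, key=lambda movie: (-counts[movie], movie))

ratings = [
    ("me", "Free Solo", 5), ("me", "Alien", 3),
    ("ann", "Free Solo", 5), ("ann", "Meru", 5), ("ann", "Alien", 5),
    ("bob", "Free Solo", 5), ("bob", "Meru", 5), ("bob", "Dune", 5),
    ("cat", "Dune", 5), ("cat", "Up", 5),
]
recommendations = recommend(ratings, "me")
```

In graph terms, each step is one hop along a Rates edge: outbound from me, inbound to the shared movies, and outbound again from the similar users.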
Lab: Graph Analytics https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Graph_Analytics.ipynb
Fraud Detection
● Bank: Collection
● Branch: Collection
● Customer: Vertex Collection
● Account: Vertex Collection
● Transaction: Edge Collection
● AccountHolder: Edge Collection
Lab: Fraud Detection https://colab.research.google.com/github/joerg84/Graph_Powered_ML_Workshop/blob/master/Fraud_Detection.ipynb
“PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.”
Source: Google, via https://en.wikipedia.org/wiki/PageRank
Goal: How likely is it that a random surfer will end up at a page?
- Random walk across link graph
- Iteratively distributing rank to neighbouring nodes
https://en.wikipedia.org/wiki/PageRank
https://stanford.edu/~rezab/classes/cme323/S15/notes/lec8.pdf
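The iterative rank distribution can be sketched as a plain power iteration. The damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outgoing links, and the four-page `web` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeatedly redistributing rank along links.

    `links` maps a page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
most_important = max(ranks, key=ranks.get)
```

Page "c" collects links from three pages and ends up with the highest rank, while "d", which nobody links to, keeps only the random-jump share.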
https://blog.acolyer.org/2015/05/26/pregel-a-system-for-large-scale-graph-processing/
Lab: Pregel https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Pregel.ipynb
Thanks for listening!
Reach out with Feedback/Questions!
• @arangodb
• https://www.arangodb.com/
• docker pull arangodb
https://www.udemy.com/course/getting-started-with-arangodb/
