Intro to Neo4j and Graph Databases Neo4j Webinar March 2016
ABOUT ME • William Lyon • Developer Relations Engineer @neo4j • http://neo4j.com/developer will@neo4j.com @lyonwj
Agenda • What is a graph [database]? • Use cases - why graphs? • Neo4j product overview • Labeled property graph data model • Cypher query language • RDBMS to graph • Resources • Questions?
Chart vs. Graph
A Graph Is Connected Data • Figure labels: ROAD, TRAFFIC LIGHTS • Figure labels: HOTEL, ROOMS, HAS, AVAILABLE • Figure labels: COMPANY, STANFORD, COLUMBIA, NEO, KNOWS, WORKS_AT, STUDIED_AT
Use of Graphs has created some of the most successful companies in the world • Figure: a graph whose nodes carry percentage scores (A 3.3%, B 38.4%, C 34.3%, D 3.8%, E 8.1%, F 3.9%, and five further nodes at 1.8% each)
Today we see graph projects in virtually every industry • Finance • Social networks • Retail • HR & Recruiting • Manufacturing & Logistics • Health Care • Telco
NEO4J USE CASES • Real Time Recommendations • Master Data Management • Fraud Detection • Identity & Access Management • Graph Based Search • Network & IT-Operations
GRAPH THINKING: Real Time Recommendations (figure labels: VIEWED, BOUGHT)
“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” Marcos Wada, Software Developer, Walmart
GRAPH THINKING: Master Data Management (figure labels: MANAGES, LEADS, REGION, COLLABORATES)
Neo4j is the heart of Cisco HMP: used for governance and single source of truth and a one-stop shop for all of Cisco’s hierarchies.
GRAPH THINKING: Fraud Detection (figure labels: OPENED_ACCOUNT, HAS, IS_ISSUED, LIVES)
“Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.” Gorka Sadowski, Cyber Security Expert
GRAPH THINKING: Graph Based Search (figure labels: PUBLISH, INCLUDE, CREATE, CAPTURE, IN, SOURCE, USES)
Uses Neo4j to manage the digital assets inside of its next generation in-flight entertainment system.
GRAPH THINKING: Network & IT-Operations (figure labels: BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS, QUERIES)
Uses Neo4j for network topology analysis for big telco service providers
GRAPH THINKING: Identity And Access Management (figure labels: TRUSTS, ID, AUTHENTICATES, OWNS, CAN_READ)
A way of representing data
Relational Database • Good for: • Well-understood data structures that don’t change too frequently • Known problems involving discrete parts of the data, or minimal connectivity
Graph Database • Good for: • Dynamic systems: where the data topology is difficult to predict • Dynamic requirements: that evolve with the business • Problems where the relationships in data contribute meaning & value
THE PROPERTY GRAPH DATA MODEL
Ann Loves Dan • NODE: Ann • RELATIONSHIP: LOVES • NODE: Dan
Relationships are Directional • Two LOVES RELATIONSHIPS are drawn between the two NODES, one per direction
Detailed Property Graph • Node: name: “Ann”, born: May 29, 1970, twitter: “@ann” • Node: name: “Dan”, born: Dec 5, 1975 • Node: brand: “Volvo”, model: “V70” • Relationships: LOVES, LOVES, LIVES WITH, OWNS, DRIVES, DRIVES • Relationship property (shown on two relationships): since: Jan 10, 2011
Labeled Property Graph • The same graph with labels added: :Person, :Person, :Car, :Vehicle
Mapping to Languages • :Noun in the position of a node label • VERB in the position of a relationship type • adjective in the position of a node property • adverb in the position of a relationship property
Property Graph Model Components
Nodes • The objects in the graph • Can have name-value properties • Can be labeled
Relationships • Relate nodes by type and direction • Can have name-value properties
Example • PERSON: name: “Dan”, born: May 29, 1970, twitter: “@dan” • PERSON: name: “Ann”, born: Dec 5, 1975 • CAR: brand: “Volvo”, model: “V70” • Relationships: LOVES, LOVES, LIVES WITH, OWNS, DRIVES (since: Jan 10, 2011)
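The components listed above can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions, not how Neo4j stores data: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names, and who owns or drives the car is assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # e.g. {"Person"} or {"Car", "Vehicle"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                      # relationships have a direction: start -> end
    type: str                        # e.g. "LOVES"
    end: Node
    properties: dict = field(default_factory=dict)


ann = Node({"Person"}, {"name": "Ann"})
dan = Node({"Person"}, {"name": "Dan"})
car = Node({"Car", "Vehicle"}, {"brand": "Volvo", "model": "V70"})

# Assumed assignment of relationships, for illustration only.
relationships = [
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "LOVES", ann),
    Relationship(dan, "DRIVES", car, {"since": "2011-01-10"}),
    Relationship(ann, "OWNS", car),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` over relationships of `rel_type`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]
```

Calling `outgoing(dan, "DRIVES")` returns the car node, and the `since` value lives on the relationship itself rather than on either node.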
WHY GRAPHS?
Intuitiveness Speed Agility
Intuitiveness Speed Agility
Intuitiveness
Intuitiveness Speed Agility
Relational Versus Graph Models • Relational Model: Person, Person-Friend, and Friend tables (rows: ANDREAS, DELIA, TOBIAS, MICA) • Graph Model: nodes ANDREAS, TOBIAS, MICA, DELIA connected by three KNOWS relationships • Index free adjacency
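The contrast between a join table and index-free adjacency can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the table contents, the `PersonNode` class, and the choice that ANDREAS knows the other three people are all hypothetical, picked to match the names on the slide.

```python
# Relational style: friendships live in a separate join table, so finding
# friends means searching that table (via a scan or an index lookup).
person = {1: "ANDREAS", 2: "TOBIAS", 3: "MICA", 4: "DELIA"}
person_friend = [(1, 2), (1, 3), (1, 4)]  # (person_id, friend_id) rows


def friends_relational(person_id):
    return [person[f] for (p, f) in person_friend if p == person_id]


# Graph style: each node holds direct references to its neighbours, so a
# traversal follows pointers instead of searching a shared structure.
class PersonNode:
    def __init__(self, name):
        self.name = name
        self.knows = []  # direct references to other PersonNode objects


andreas, tobias, mica, delia = (
    PersonNode(n) for n in ("ANDREAS", "TOBIAS", "MICA", "DELIA")
)
andreas.knows.extend([tobias, mica, delia])


def friends_graph(node):
    return [n.name for n in node.knows]
```

Both functions return the same names; the difference is that the cost of `friends_graph` depends only on how many neighbours the node has, not on the total number of rows in a join table.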
Speed • “We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.” - Volker Pacher, Senior Developer • “Minutes to milliseconds” performance • Queries up to 1000x faster than RDBMS or other NoSQL
Intuitiveness Speed Agility
A Naturally Adaptive Model + A Query Language Designed for Connectedness = Agility
CYPHER SQL for graphs
(Very) Brief Cypher Tutorial
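Creating the Data
CREATE (:Person {name: "Ann"})-[:LOVES]->(:Person {name: "Dan"})
• (:Person {name: "Ann"}) and (:Person {name: "Dan"}) are each a NODE with a LABEL (Person) and a PROPERTY (name) • [:LOVES] is the relationship between them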
Representing Bi-Directionality (FB_FRIENDS)
Match in one direction only: MATCH (:Person {name: "Ann"})-[:FB_FRIENDS]->(:Person {name: "Dan"})
Match in either direction: MATCH (:Person {name: "Ann"})-[:FB_FRIENDS]-(:Person {name: "Dan"})
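The difference between the two patterns can be shown with a tiny Python sketch. It assumes a single stored FB_FRIENDS relationship from Ann to Dan; the `edges` set and both helper functions are hypothetical names used only for illustration.

```python
# One stored relationship, with a direction: Ann -> Dan.
edges = {("Ann", "FB_FRIENDS", "Dan")}


def match_directed(start, rel_type, end):
    """Like (start)-[:TYPE]->(end): the stored direction must agree."""
    return (start, rel_type, end) in edges


def match_undirected(a, rel_type, b):
    """Like (a)-[:TYPE]-(b): the relationship may point either way."""
    return (a, rel_type, b) in edges or (b, rel_type, a) in edges
```

With this data, `match_directed("Dan", "FB_FRIENDS", "Ann")` is false while `match_undirected("Dan", "FB_FRIENDS", "Ann")` is true, which is why a symmetric relationship can be stored once and queried without a direction.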
Cypher • Typical Complex SQL Join (figure) • The Same Query using Cypher:
MATCH (boss)-[:MANAGES*0..3]->(sub), (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
Project Impact
Less time writing queries • More time understanding the answers • Leaving time to ask the next question
Less time debugging queries • More time writing the next piece of code • Improved quality of overall code base
Code that’s easier to read • Faster ramp-up for new project members • Improved maintainability & troubleshooting
http://www.opencypher.org/
Neo4j Graph Database • Property graph data model • Nodes and relationships • Native graph processing • (open)Cypher query language neo4j.com
Neo4j – Key Product Features
• Native Graph Storage: Ensures data consistency and performance
• Native Graph Processing: Millions of hops per second, in real time
• “Whiteboard Friendly” Data Modeling: Model data as it naturally occurs
• High Data Integrity: Fully ACID transactions
• Powerful, Expressive Query Language: Requires 10x to 100x less code than SQL
• Scalability and High Availability: Vertical and horizontal scaling optimized for graphs
• Built-in ETL: Seamless import from other databases
• Integration: Drivers and APIs for popular languages
How do you use Neo4j? • CREATE MODEL + LOAD DATA • QUERY DATA
Language Drivers
Native Server-Side Extensions
Architectural Options • Data Storage and Business Rules Execution • Data Mining and Aggregation • Application • Graph Database Cluster: Neo4j, Neo4j, Neo4j • Ad Hoc Analysis • Bulk Analytic Infrastructure: Hadoop, EDW … • Data Scientist • End User • Databases: Relational, NoSQL, Hadoop
SQL: Day in the Life of an RDBMS Developer
SQL Pains • Complex to model and store relationships • Performance degrades with increases in data • Queries get long and complex • Maintenance is painful
Graph Gains • Easy to model and store relationships • Performance of relationship traversal remains constant with growth in data size • Queries are shortened and more readable • Adding properties and relationships can be done on the fly, with no migrations
FROM RDBMS TO GRAPHS
Northwind
Northwind - the canonical RDBMS Example
( )-[:TO]->(Graph)
( )-[:IS_BETTER_AS]->(Graph)
Starting with the ER Diagram
Locate the Foreign Keys
Drop the Foreign Keys
Find the JOIN Tables
(Simple) JOIN Tables Become Relationships
Attributed JOIN Tables -> Relationships with Properties
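The mapping steps above (drop the foreign keys, turn JOIN tables into relationships, keep JOIN-table attributes as relationship properties) can be sketched in Python. This is a hedged illustration, not the import procedure used in the webinar: the sample rows are made up in the shape of Northwind tables, `to_graph` is a hypothetical helper, and the relationship names SOLD and INCLUDES are borrowed from the cross-selling query later in the deck.

```python
# Illustrative rows in the shape of Northwind tables (values are assumptions).
employees = [{"employeeID": 5, "firstName": "Steven"}]
products = [{"productID": 11, "productName": "Queso Cabrales"}]
orders = [{"orderID": 10248, "employeeID": 5}]                     # employeeID is a foreign key
order_details = [{"orderID": 10248, "productID": 11, "quantity": 12}]  # attributed JOIN table


def to_graph(employees, orders, products, order_details):
    nodes = {}  # (label, id) -> properties
    rels = []   # (start_key, type, end_key, properties)

    for e in employees:
        nodes[("Employee", e["employeeID"])] = dict(e)
    for p in products:
        nodes[("Product", p["productID"])] = dict(p)

    for o in orders:
        # Drop the foreign key from the node and express it as a relationship.
        props = {k: v for k, v in o.items() if k != "employeeID"}
        nodes[("Order", o["orderID"])] = props
        rels.append((("Employee", o["employeeID"]), "SOLD", ("Order", o["orderID"]), {}))

    for d in order_details:
        # Each JOIN-table row becomes one relationship; its extra columns
        # become relationship properties.
        props = {k: v for k, v in d.items() if k not in ("orderID", "productID")}
        rels.append((("Order", d["orderID"]), "INCLUDES", ("Product", d["productID"]), props))

    return nodes, rels


nodes, rels = to_graph(employees, orders, products, order_details)
```

After the conversion the Order node no longer carries `employeeID`, and the `quantity` column sits on the INCLUDES relationship.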
Querying a Subset Today
As a Graph
QUERYING THE GRAPH
using openCypher
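Property Graph Model
CREATE (:Employee {firstName: "Steven"})-[:REPORTS_TO]->(:Employee {firstName: "Andrew"})
• (:Employee {firstName: "Steven"}) and (:Employee {firstName: "Andrew"}) are each a NODE with a LABEL (Employee) and a PROPERTY (firstName) • Steven REPORTS_TO Andrew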
Who do people report to? MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee) RETURN *
Who do people report to? MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee) RETURN e.employeeID AS managerID, e.firstName AS managerName, sub.employeeID AS employeeID, sub.firstName AS employeeName;
Who does Robert report to? MATCH p=(e:Employee)<-[:REPORTS_TO]-(sub:Employee) WHERE sub.firstName = 'Robert' RETURN p
What is Robert’s reporting chain? MATCH p=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee) WHERE sub.firstName = 'Robert' RETURN p
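What the variable-length `REPORTS_TO*` pattern walks can be approximated in Python by following the relationship repeatedly. This is a rough sketch under assumptions: the `reports_to` mapping is sample data (only "Steven reports to Andrew" appears earlier in the deck; "Robert reports to Steven" is assumed), and `reporting_chain` is a hypothetical helper. The Cypher query returns one path per chain length, whereas this sketch returns the managers along the longest chain.

```python
# Sample REPORTS_TO relationships: employee -> manager (partly assumed data).
reports_to = {"Robert": "Steven", "Steven": "Andrew"}


def reporting_chain(name):
    """Follow REPORTS_TO one hop at a time and collect every manager above `name`."""
    chain = []
    seen = {name}
    while name in reports_to:
        name = reports_to[name]
        if name in seen:       # guard against cycles in bad data
            break
        seen.add(name)
        chain.append(name)
    return chain
```

For the sample data, `reporting_chain("Robert")` walks from Robert to Steven and then to Andrew, stopping when it reaches someone with no manager.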
Who’s the Big Boss? MATCH (e:Employee) WHERE NOT (e)-[:REPORTS_TO]->() RETURN e.firstName as bigBoss
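The `WHERE NOT (e)-[:REPORTS_TO]->()` filter keeps employees that have no outgoing REPORTS_TO relationship. A minimal Python sketch of the same idea follows; the employee names and relationship list are assumed sample data, and `big_bosses` is a hypothetical helper.

```python
# Sample data (assumed): each tuple is (employee, manager).
employees = ["Andrew", "Robert", "Steven"]
reports_to = [("Robert", "Steven"), ("Steven", "Andrew")]


def big_bosses(employees, reports_to):
    """Return employees with no outgoing REPORTS_TO relationship."""
    has_manager = {employee for employee, _manager in reports_to}
    return [e for e in employees if e not in has_manager]
```

With this data only Andrew has nobody to report to, so he is the single result.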
Product Cross-Selling MATCH (choc:Product {productName: 'Chocolade'}) <-[:INCLUDES]-(:Order)<-[:SOLD]-(employee), (employee)-[:SOLD]->(o2)-[:INCLUDES]->(other:Product) RETURN employee.firstName, other.productName, COUNT(DISTINCT o2) as count ORDER BY count DESC LIMIT 5;
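The shape of the cross-selling query can be approximated in plain Python. This is a sketch under stated assumptions, not a reimplementation of Cypher: the `sold` and `includes` dictionaries are made-up sample data, `cross_sell` is a hypothetical helper, and the "different order" check is intended to mimic the rule that one relationship is not matched twice within a single pattern, assuming each order has exactly one seller.

```python
from collections import defaultdict

# Sample data (assumed): employee -> orders sold, order -> products included.
sold = {"Nancy": [1, 2, 3], "Janet": [4]}
includes = {
    1: ["Chocolade", "Chai"],
    2: ["Chai", "Tofu"],
    3: ["Chai"],
    4: ["Tofu"],
}


def cross_sell(target, limit=5):
    """For employees who sold `target`, count their other orders per product."""
    orders_by_pair = defaultdict(set)  # (employee, product) -> distinct order ids
    for employee, order_ids in sold.items():
        for first in order_ids:
            if target not in includes[first]:
                continue
            for second in order_ids:
                if second == first:    # the second order must be a different one
                    continue
                for product in includes[second]:
                    orders_by_pair[(employee, product)].add(second)
    # Highest count first; ties broken by name so the output is deterministic.
    ranked = sorted(orders_by_pair.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(emp, prod, len(ids)) for (emp, prod), ids in ranked[:limit]]
```

In the sample data only Nancy sold an order with Chocolade, so only her other orders contribute to the counts.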
LOADING OUR DATA
CSV
CSV files for Northwind
3 Steps to Creating the Graph IMPORT NODES CREATE INDEXES IMPORT RELATIONSHIPS
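The reason indexes come between the two import steps can be illustrated with a short Python sketch: relationships are created by looking up nodes that already exist, and an index makes each lookup cheap. Everything here is an assumption for illustration: the sample rows, the dictionary standing in for a database index, and the PURCHASED relationship name.

```python
# Step 1: import nodes (sample rows, assumed for illustration).
customers = [{"customerID": "ALFKI"}, {"customerID": "ANATR"}]
orders = [{"orderID": "10643", "customerID": "ALFKI"}]

# Step 2: create an index so nodes can be found by key without scanning.
# A dict stands in for a database index on Customer.customerID.
customer_index = {c["customerID"]: c for c in customers}

# Step 3: import relationships by looking up both endpoints.
purchased = [
    (customer_index[o["customerID"]], "PURCHASED", o)
    for o in orders
]
```

Without the index in step 2, every relationship in step 3 would require a search through all customer nodes.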
Importing Nodes
// Create customers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});
// Create products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});
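What the product import does per row can be mirrored in Python with the standard `csv` module. This is a sketch only: the inline CSV text is sample data in the shape of the products file, and `load_products` is a hypothetical helper. As with LOAD CSV, every field arrives as a string, which is why the price is converted explicitly (the role `toFloat` plays in the Cypher statement).

```python
import csv
import io

# Sample rows in the shape of products.csv (values assumed for illustration).
PRODUCTS_CSV = """ProductID,ProductName,UnitPrice
1,Chai,18.00
2,Chang,19.00
"""


def load_products(text):
    """Turn each CSV row into a dict describing one :Product node."""
    nodes = []
    for row in csv.DictReader(io.StringIO(text)):
        nodes.append({
            "label": "Product",
            "productID": row["ProductID"],           # stays a string, as read
            "productName": row["ProductName"],
            "unitPrice": float(row["UnitPrice"]),    # explicit numeric conversion
        })
    return nodes


products = load_products(PRODUCTS_CSV)
```

Leaving `productID` as a string matches the Cypher above, which stores `row.ProductID` without conversion.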
:play northwind graph
High Performance LOADing neo4j-import 4.58 million things and their relationships… Loads in 100 seconds!
POWERING AN APP
Simple App
Simple Python Code
Resources
neo4j.com/download
Simple App http://network.graphdemos.com/
neo4j.com/developer
There Are Lots of Ways to Easily Learn Neo4j
graphdatabases.com
http://neo4j.com/graphgists/
THANK YOU! will@neo4j.com @lyonwj
