Commit b09a5c3

Improve notebook and doc
1 parent a9f25a2 commit b09a5c3

2 files changed: +184 -137 lines changed

doc/modules/ROOT/pages/tutorials/load-data-via-graph-construction.adoc

Lines changed: 73 additions & 9 deletions
@@ -18,25 +18,40 @@ We need an environment where Neo4j and GDS are available, for example
 AuraDS (which comes with GDS preinstalled) or Neo4j Desktop.
 
 Once the credentials to this environment are available, we can install
-the `graphdatascience` package and create the `gds` object.
+the `graphdatascience` package and import the client class.
 
 [source, python, subs=attributes+, role=no-test]
 ----
 !pip install graphdatascience=={docs-version}
 ----
 
-[source, python, role=no-test]
+[source, python, subs=attributes+, role=no-test]
 ----
-# Import the client
 from graphdatascience import GraphDataScience
+----
 
-# Replace with the actual credentials
-AURA_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
-AURA_USERNAME = "neo4j"
-AURA_PASSWORD = ""
+When using a local Neo4j setup, the default connection URI is `bolt://localhost:7687`:
 
-# Configure the client with AuraDS-recommended settings if using AuraDS
-gds = GraphDataScience(AURA_CONNECTION_URI, auth=(AURA_USERNAME, AURA_PASSWORD), aura_ds=True)
+[source, python, subs=attributes+, role=no-test]
+----
+# Replace with the actual connection URI and credentials
+NEO4J_CONNECTION_URI = "bolt://localhost:7687"
+NEO4J_USERNAME = "neo4j"
+NEO4J_PASSWORD = ""
+
+gds = GraphDataScience(NEO4J_CONNECTION_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
+----
+
+When using AuraDS, the connection URI is slightly different as it uses the `neo4j+s` protocol. The client should also include the `aura_ds=True` flag to enable AuraDS-recommended settings.
+
+[source, python, subs=attributes+, role=no-test]
+----
+# Replace with the actual connection URI and credentials
+NEO4J_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
+NEO4J_USERNAME = "neo4j"
+NEO4J_PASSWORD = ""
+
+gds = GraphDataScience(NEO4J_CONNECTION_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD), aura_ds=True)
 ----
 
 We also import `pandas` to create a Pandas `DataFrame` from the original
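
Note on the connection hunk: the two snippets it adds differ only in the URI and in the `aura_ds` flag. A minimal sketch of one way to keep a single code path is to read those values from the environment. The helper name `connection_settings` and the variable `NEO4J_AURA_DS` are assumptions made for illustration and are not part of this commit or of the client; the other variable names simply mirror the constants in the diff.

```python
import os


def connection_settings(env=None):
    """Collect client settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    uri = env.get("NEO4J_CONNECTION_URI", "bolt://localhost:7687")
    auth = (env.get("NEO4J_USERNAME", "neo4j"), env.get("NEO4J_PASSWORD", ""))
    # NEO4J_AURA_DS is an assumed opt-in switch; the client does not read it itself.
    aura_ds = env.get("NEO4J_AURA_DS", "false").strip().lower() == "true"
    return uri, auth, aura_ds


uri, auth, aura_ds = connection_settings()
# With the package installed, the client could then be created as in the diff:
# gds = GraphDataScience(uri, auth=auth, aura_ds=aura_ds)
```

This also keeps the password out of the notebook source, which may matter when the notebook is shared.
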
@@ -110,6 +125,15 @@ Let's check the first 5 rows of the new `DataFrame`:
 nodes.head()
 ----
 
+----
+    nodeId labels  subject                                           features
+0    31336  Paper        0  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
+1  1061127  Paper        1  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ...
+2  1106406  Paper        2  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
+3    13195  Paper        2  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
+4    37879  Paper        3  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
+----
+
 Now we create a new `DataFrame` containing the relationships between the nodes.
 To create the equivalent of an undirected graph, we need to add direct
 and inverse relationships explicitly.
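
Note on the direct and inverse relationships: the idea in the context lines above can be sketched without any dependency. The function name `with_inverse` is hypothetical, and the tutorial's own code (not shown in this hunk) may build the rows differently, for example by concatenating two DataFrames. The column names and the sample pairs are taken from the `relationships.head()` output shown in this diff.

```python
def with_inverse(edges, rel_type="CITES"):
    """Return one row per (source, target) pair plus one row for its inverse."""
    rows = []
    for source, target in edges:
        # Direct relationship
        rows.append({"sourceNodeId": source, "targetNodeId": target,
                     "relationshipType": rel_type})
        # Inverse relationship, so the pair can be traversed in both directions
        rows.append({"sourceNodeId": target, "targetNodeId": source,
                     "relationshipType": rel_type})
    return rows


rows = with_inverse([(35, 1033), (35, 103482)])
print(len(rows))  # 4
```

The resulting list of records could be passed to `pandas.DataFrame(rows)` to obtain a frame with the same three columns.
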
@@ -129,6 +153,15 @@ Again, let's check the first 5 rows of the new `DataFrame`:
 relationships.head()
 ----
 
+----
+   sourceNodeId  targetNodeId relationshipType
+0            35          1033            CITES
+1            35        103482            CITES
+2            35        103515            CITES
+3            35       1050679            CITES
+4            35       1103960            CITES
+----
+
 Finally, we can create the in-memory graph.
 
 [source, python, role=no-test]
@@ -153,6 +186,21 @@ Let's also count the nodes in the graph:
 G.node_count()
 ----
 
+----
+2708
+----
+
+The count matches with the number of rows in the Pandas dataset:
+
+[source, python, role=no-test]
+----
+len(content)
+----
+
+----
+2708
+----
+
 We can stream the value of the `subject` node property for
 each node in the graph, printing only the first 10.
 
@@ -161,9 +209,25 @@ each node in the graph, printing only the first 10.
 gds.graph.streamNodeProperties(G, ["subject"]).head(10)
 ----
 
+----
+    nodeId nodeProperty  propertyValue
+0    31336      subject              0
+1  1061127      subject              1
+2  1106406      subject              2
+3    13195      subject              2
+4    37879      subject              3
+5  1126012      subject              3
+6  1107140      subject              4
+7  1102850      subject              0
+8    31349      subject              0
+9  1106418      subject              4
+----
+
 
 == Cleanup
 
+When the graph is no longer needed, it should be dropped to free up memory:
+
 [source, python, role=no-test]
 ----
 G.drop()
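
Note on the cleanup step: the added sentence says the graph should be dropped when it is no longer needed. One possible way to make sure that happens even when an earlier cell fails is a small context manager. This is only a sketch: `dropping` and `StandInGraph` are hypothetical names, and the stand-in class exists solely so the example runs without a database. It assumes nothing about the graph object beyond the `drop()` method used in the diff.

```python
from contextlib import contextmanager


@contextmanager
def dropping(graph):
    """Yield the graph, then call its drop() method even if the body raises."""
    try:
        yield graph
    finally:
        graph.drop()


class StandInGraph:
    """Hypothetical stand-in so the sketch runs without a database."""

    def __init__(self):
        self.dropped = False

    def drop(self):
        self.dropped = True


with dropping(StandInGraph()) as G:
    pass  # run algorithms on G here

print(G.dropped)  # True
```
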
