Commit 08323d6

Initial push for Dev Days
1 parent 3609ad7 commit 08323d6

1 file changed

+300
-0
lines changed

notebooks/Window.ipynb

Lines changed: 300 additions & 0 deletions
@@ -0,0 +1,300 @@
1+
{
2+
"nbformat": 4,
3+
"nbformat_minor": 0,
4+
"metadata": {
5+
"colab": {
6+
"name": "Window.ipynb",
7+
"provenance": [],
8+
"collapsed_sections": [],
9+
"include_colab_link": true
10+
},
11+
"kernelspec": {
12+
"name": "python3",
13+
"display_name": "Python 3"
14+
}
15+
},
16+
"cells": [
17+
{
18+
"cell_type": "markdown",
19+
"metadata": {
20+
"id": "view-in-github",
21+
"colab_type": "text"
22+
},
23+
"source": [
24+
"<a href=\"https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Window.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
25+
]
26+
},
27+
{
28+
"cell_type": "markdown",
29+
"metadata": {
30+
"id": "islPPnmmfwIY"
31+
},
32+
"source": [
33+
"# WINDOW AQL Operation\n",
34+
"Aggregate adjacent documents or value ranges with a sliding window to calculate running totals, rolling averages, and other statistical properties."
35+
]
36+
},
37+
{
38+
"cell_type": "code",
39+
"metadata": {
40+
"id": "gwdjttO5zcko"
41+
},
42+
"source": [
43+
"%%capture\n",
44+
"!pip3 install pyarango\n",
45+
"!pip3 install \"python-arango>=5.0\""
46+
],
47+
"execution_count": null,
48+
"outputs": []
49+
},
50+
{
51+
"cell_type": "code",
52+
"metadata": {
53+
"id": "J_gNZCCJ5ClS"
54+
},
55+
"source": [
56+
"%%capture\n",
57+
"!git clone https://github.com/arangodb/interactive_tutorials.git -b oasis_connector --single-branch\n",
58+
"!rsync -av interactive_tutorials/ ./ --exclude=.git"
59+
],
60+
"execution_count": null,
61+
"outputs": []
62+
},
63+
{
64+
"cell_type": "markdown",
65+
"metadata": {
66+
"id": "YvkFgzTY_0fg"
67+
},
68+
"source": [
69+
"Here we import the oasis package along with two Python drivers, pyArango and python-arango. Both are imported for illustration, but only one is necessary."
70+
]
71+
},
72+
{
73+
"cell_type": "code",
74+
"metadata": {
75+
"id": "9sbDlV9l_fDp"
76+
},
77+
"source": [
78+
"import oasis\n",
79+
"\n",
80+
"from pyArango.connection import *\n",
81+
"from arango import ArangoClient"
82+
],
83+
"execution_count": null,
84+
"outputs": []
85+
},
86+
{
87+
"cell_type": "markdown",
88+
"metadata": {
89+
"id": "69GLRFL7_-Jn"
90+
},
91+
"source": [
92+
"## Connecting\n",
93+
"\n",
94+
"Be sure to update the `tutorialName` variable with your tutorial's name."
95+
]
96+
},
97+
{
98+
"cell_type": "code",
99+
"metadata": {
100+
"id": "WOXCQc-xACUv"
101+
},
102+
"source": [
103+
"# Retrieve temporary credentials from the ArangoDB Tutorial Service\n",
104+
"\n",
105+
"# ** UPDATE THE FOLLOWING VARIABLE **\n",
106+
"tutorialName = \"Window\"\n",
107+
"login = oasis.getTempCredentials(tutorialName=tutorialName, credentialProvider=\"https://tutorials.arangodb.cloud:8529/_db/_system/tutorialDB/tutorialDB\")\n",
108+
"\n",
109+
"# Here is an example of connecting with python-arango\n",
110+
"database = oasis.connect_python_arango(login)\n",
111+
"\n",
112+
"# These are the credentials\n",
113+
"print(\"https://\"+login[\"hostname\"]+\":\"+str(login[\"port\"]))\n",
114+
"print(\"Username: \" + login[\"username\"])\n",
115+
"print(\"Password: \" + login[\"password\"])\n",
116+
"print(\"Database: \" + login[\"dbName\"])"
117+
],
118+
"execution_count": null,
119+
"outputs": []
120+
},
121+
{
122+
"cell_type": "markdown",
123+
"metadata": {
124+
"id": "zQW-kvq0Eo4A"
125+
},
126+
"source": [
127+
"# Importing Data\n",
128+
"You are free to parse and import your data however you choose, but two simple options for those already familiar with ArangoDB are:\n",
129+
"* [arangorestore](https://www.arangodb.com/docs/stable/programs-arangorestore.html)\n",
130+
"* [arangoimport](https://www.arangodb.com/docs/stable/programs-arangoimport.html)\n",
131+
"\n",
132+
"If you use any of the tools in the `tools` folder, you may first need to adjust the folder's permissions so that they are executable."
133+
]
134+
},
135+
{
136+
"cell_type": "code",
137+
"metadata": {
138+
"id": "q5gXOkinH5GF"
139+
},
140+
"source": [
141+
"!chmod -R 755 ./tools/*\n",
142+
"!mkdir -p data\n",
143+
"!curl -o ./data/sensor_data.csv https://raw.githubusercontent.com/arangodb/interactive_tutorials/master/notebooks/data/2017-07_bme280sof_smaller.csv\n",
144+
"# Complete data located here: https://www.kaggle.com/hmavrodiev/sofia-air-quality-dataset?select=2017-07_bme280sof.csv"
145+
],
146+
"execution_count": null,
147+
"outputs": []
148+
},
149+
{
150+
"cell_type": "code",
151+
"metadata": {
152+
"id": "yPjGeffH9yu3"
153+
},
154+
"source": [
155+
"%%capture\n",
156+
"! ./tools/arangoimport -c none --server.endpoint http+ssl://{login[\"hostname\"]}:{login[\"port\"]} --server.username {login[\"username\"]} --server.database {login[\"dbName\"]} --server.password {login[\"password\"]} --file \"data/sensor_data.csv\" --type \"csv\" --collection \"sensor_data\" --create-collection true"
157+
],
158+
"execution_count": null,
159+
"outputs": []
160+
},
161+
{
162+
"cell_type": "markdown",
163+
"metadata": {
164+
"id": "IcBG76sQBNUW"
165+
},
166+
"source": [
167+
"# Row-Based Aggregation\n",
168+
"* Allows aggregating over a fixed number of rows following or preceding the current row.\n",
169+
"* All preceding or all following rows can also be aggregated by setting the bound to \"unbounded\".\n"
170+
]
171+
},
172+
{
173+
"cell_type": "code",
174+
"metadata": {
175+
"id": "GwTg_NIAevK1"
176+
},
177+
"source": [
178+
"aql = database.aql\n",
179+
"results = aql.execute(\n",
180+
" \"\"\"\n",
181+
" FOR t IN sensor_data\n",
182+
" SORT t.timestamp\n",
183+
" WINDOW { preceding: 1, following: 1 }\n",
184+
" AGGREGATE rollingAvg = AVG(t.temperature), rollingSum = SUM(t.temperature)\n",
185+
" WINDOW { preceding: \"unbounded\", following: 0 }\n",
186+
" AGGREGATE cumulativeSum = SUM(t.temperature)\n",
187+
" LIMIT 10\n",
188+
" RETURN {\n",
189+
" time: t.timestamp,\n",
190+
" temp: t.temperature,\n",
191+
" sensor: t.sensor_id,\n",
192+
" rollingAvg,\n",
193+
" rollingSum,\n",
194+
" cumulativeSum\n",
195+
" \n",
196+
" }\n",
197+
" \"\"\"\n",
198+
")\n",
199+
"for res in results:\n",
200+
" print(res)"
201+
],
202+
"execution_count": null,
203+
"outputs": []
204+
},
205+
{
206+
"cell_type": "markdown",
207+
"metadata": {
208+
"id": "4gREqUxsBS1v"
209+
},
210+
"source": [
211+
"# Duration-Based Aggregation\n",
212+
"* Allows aggregating over all documents whose timestamp falls within a time interval relative to the current document.\n",
213+
"* Timestamp offsets are given as positive ISO 8601 duration strings, such as P1Y6M or PT30M.\n"
214+
]
215+
},
216+
{
217+
"cell_type": "code",
218+
"metadata": {
219+
"id": "7WjxeBpTC6sl"
220+
},
221+
"source": [
222+
"results = aql.execute(\n",
223+
" \"\"\"\n",
224+
" FOR t IN sensor_data\n",
225+
" WINDOW DATE_TIMESTAMP(t.timestamp) WITH { preceding: \"PT30M\" }\n",
226+
" AGGREGATE rollingAverage = AVG(t.temperature), rollingSum = SUM(t.temperature)\n",
227+
" LIMIT 10\n",
228+
" RETURN {\n",
229+
" time: t.timestamp,\n",
230+
" temperature: t.temperature,\n",
231+
" sensor: t.sensor_id,\n",
232+
" rollingAverage,\n",
233+
" rollingSum\n",
234+
" }\n",
235+
" \"\"\"\n",
236+
")\n",
237+
"times = []\n",
238+
"temps = []\n",
239+
"rollingAverages = []\n",
240+
"for res in results:\n",
241+
" times.append(res['time'])\n",
242+
" temps.append(res['temperature'])\n",
243+
" rollingAverages.append(res['rollingAverage'])\n",
244+
" print(res)"
245+
],
246+
"execution_count": null,
247+
"outputs": []
248+
},
249+
{
250+
"cell_type": "code",
251+
"metadata": {
252+
"id": "G4C-i6KqXhaf"
253+
},
254+
"source": [
255+
"import time\n",
256+
"import datetime as dt\n",
257+
"import matplotlib.pyplot as plt\n",
258+
"\n",
259+
"# Create figure for plotting\n",
260+
"fig = plt.figure()\n",
261+
"ax = fig.add_subplot(1, 1, 1)\n",
262+
"ax.plot(times, temps, label=\"Temperatures\")\n",
263+
"plt.ylabel('Temperature',fontsize=18)\n",
264+
"plt.xlabel('Dates',fontsize=18)\n",
265+
"plt.legend(loc=\"upper left\")\n",
266+
"\n",
267+
"\n",
268+
"\n",
269+
"# Draw plot\n",
270+
"# ax.annotate(\"Original Temps\", (5,5), color='red', size=20)\n",
271+
"ax2 = ax.twinx()\n",
272+
"ax2.plot(times, rollingAverages, 'r-', label=\"Rolling average\")\n",
273+
"plt.ylabel('rollingAverage',fontsize=18)\n",
274+
"plt.legend(loc=\"upper right\")\n",
275+
"\n",
276+
"# Format plot\n",
277+
"# plt.xticks(rotation=45, ha='right')\n",
278+
"plt.subplots_adjust(bottom=0.30)\n",
279+
"fig.set_size_inches(20, 15)\n",
280+
"plt.title('Temperature over Time', fontsize=20)\n",
281+
"\n",
282+
"# Draw the graph\n",
283+
"plt.show()\n"
284+
],
285+
"execution_count": null,
286+
"outputs": []
287+
},
288+
{
289+
"cell_type": "markdown",
290+
"metadata": {
291+
"id": "qCt1keEvM5Q6"
292+
},
293+
"source": [
294+
"If you would like to share your notebook, simply place it in the `community_notebooks` folder in the interactive_tutorials repository and open a pull request.\n",
295+
"\n",
296+
"Good luck and we are excited to see what you are working on!"
297+
]
298+
}
299+
]
300+
}
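
To make the row-based frames in the notebook's `WINDOW { preceding: 1, following: 1 }` query concrete, here is a minimal pure-Python sketch of the same arithmetic. It does not call ArangoDB. The helper name `row_window` and the sample temperatures are made up for illustration, and `None` stands in for the `"unbounded"` bound.

```python
def row_window(values, preceding, following):
    """Return a (sum, average) pair per row over a row-based frame.

    `preceding` and `following` count rows around the current one;
    None is this sketch's stand-in for AQL's "unbounded".
    """
    results = []
    n = len(values)
    for i in range(n):
        # Clamp the frame to the available rows at both ends.
        lo = 0 if preceding is None else max(0, i - preceding)
        hi = n if following is None else min(n, i + following + 1)
        frame = values[lo:hi]
        total = sum(frame)
        results.append((total, total / len(frame)))
    return results


# Hypothetical temperatures, already sorted by timestamp.
temps = [20.0, 22.0, 21.0, 25.0]

# Same frame as WINDOW { preceding: 1, following: 1 }
rolling = row_window(temps, preceding=1, following=1)

# Same frame as WINDOW { preceding: "unbounded", following: 0 }
cumulative = row_window(temps, preceding=None, following=0)

for temp, (r_sum, r_avg), (c_sum, _) in zip(temps, rolling, cumulative):
    print(temp, r_sum, round(r_avg, 2), c_sum)
```

The first and last rows have fewer neighbours, so their rolling values are computed from a smaller frame. This is why the rolling average at the edges can differ noticeably from the interior rows.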

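
The duration-based query in the notebook (`WINDOW DATE_TIMESTAMP(t.timestamp) WITH { preceding: "PT30M" }`) can be pictured with a small pure-Python sketch as well. It assumes the frame covers every reading whose timestamp lies in the closed interval from 30 minutes before the current reading up to the current reading. The function name `duration_window` and the sample readings are hypothetical, and the quadratic scan is chosen for clarity, not speed.

```python
from datetime import datetime, timedelta


def duration_window(rows, preceding):
    """Return a (sum, average) pair per reading over a time-based frame.

    `rows` is a list of (timestamp, value) pairs. The frame for a reading
    at time t is assumed to include every reading with a timestamp in
    [t - preceding, t], both ends inclusive.
    """
    results = []
    for t, _ in rows:
        frame = [v for u, v in rows if t - preceding <= u <= t]
        total = sum(frame)
        results.append((total, total / len(frame)))
    return results


# Hypothetical sensor readings: (timestamp, temperature).
readings = [
    (datetime(2017, 7, 1, 10, 0), 20.0),
    (datetime(2017, 7, 1, 10, 10), 22.0),
    (datetime(2017, 7, 1, 10, 45), 24.0),
    (datetime(2017, 7, 1, 11, 0), 26.0),
]

# timedelta(minutes=30) plays the role of the "PT30M" duration string.
windowed = duration_window(readings, preceding=timedelta(minutes=30))

for (t, temp), (w_sum, w_avg) in zip(readings, windowed):
    print(t.isoformat(), temp, w_sum, w_avg)
```

Unlike a row-based frame, the number of readings per frame varies here: the 10:45 reading has no neighbour within the previous 30 minutes, so its frame contains only itself.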