Takes application logs from Elasticsearch (because you do have logs, right?) and visualizes how your data flows through the database, letting you quickly identify which parts of your code insert, update, delete or read data from specific DB tables.
This can be extended to handle:

- message queues (Redis, RabbitMQ, Scribe, ...)
- HTTP services communication (GET, POST requests)
- Amazon's S3 storage operations
- tcpdump / varnishlog traffic between the hosts
- (use your imagination)
data-flow-graph uses the d3.js library to visualize the data flow (heavily inspired by this demo by Neil Atkinson). Alternatively, you can generate a *.gv file and render it using Graphviz.
For easy data flow sharing you can upload graph data in TSV form to a Gist and have it visualized. Specific Gist revisions are also supported.
You can also upload the TSV file to your S3 bucket (with CORS set up there). Navigate to tsv.html or check the example from elecena.pl.
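For the S3 route, the bucket needs a CORS rule so the visualization page can fetch the TSV file from another origin. A minimal sketch in the standard S3 CORS JSON format (the wildcard origin is just for illustration; restrict `AllowedOrigins` to the domain hosting the visualization in practice):

```json
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET"],
      "AllowedHeaders": ["*"]
    }
  ]
}
```

This document can be applied with `aws s3api put-bucket-cors --bucket <your-bucket> --cors-configuration file://cors.json`.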
Visualization is generated for a TSV file with the following format:
```
(source node)\t(edge label)\t(target node)\t(edge weight - optional)\t(optional metadata displayed in edge on-hover tooltip)
```
```
# a comment - will be ignored by the visualization layer
mq/request.php	_update	mysql:shops	0.0148	QPS: 0.1023
sphinx:datasheets	search	Elecena\Services\Sphinx	0.1888	QPS: 1.3053
mysql:products	getImagesToFetch	ImageBot	0.0007	QPS: 0.0050
sphinx:products	search	Elecena\Services\Sphinx	0.0042	QPS: 0.0291
sphinx:products	getIndexCount	Elecena\Services\Sphinx	0.0001	QPS: 0.0007
sphinx:products	products	Elecena\Services\Search	0.0323	QPS: 0.2235
currency.php	_	mysql:currencies	0.0001	QPS: 0.0008
sphinx:products	getLastChanges	StatsController	0.0002	QPS: 0.0014
mysql:suggest	getSuggestions	Elecena\Services\Sphinx	0.0026	QPS: 0.0181
mq/request.php	_delete	mysql:shops_stats	0.0004	QPS: 0.0030
sphinx:parameters	getDatabaseCount	Parameters	0.0002	QPS: 0.0010
```
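Reading such a file back is straightforward. A minimal sketch of a parser for the five-column format above (the sample rows and node names here are made up for illustration):

```python
import csv
from io import StringIO

# Hypothetical sample in the five-column TSV format described above
SAMPLE = (
    "# comments are skipped\n"
    "mysql:products\tselect\tShopAPI\t0.0007\tQPS: 0.0050\n"
    "sphinx:products\tsearch\tShopAPI\t0.0042\tQPS: 0.0291\n"
)

def parse_dataflow_tsv(text):
    """Yield (source, edge, target, weight, metadata) tuples, skipping comments."""
    for row in csv.reader(StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        source, edge, target = row[0], row[1], row[2]
        # weight and metadata columns are optional
        weight = float(row[3]) if len(row) > 3 and row[3] else None
        metadata = row[4] if len(row) > 4 else None
        yield source, edge, target, weight, metadata

edges = list(parse_dataflow_tsv(SAMPLE))
```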
Node names can be categorized by adding a label followed by `:` (e.g. `mysql:foo`, `sphinx:index`, `solr:products`, `redis:queue`).
You can write your own tool to analyze logs; it just needs to emit a TSV file that matches the above format.
`sources/elasticsearch/logs2dataflow.py` is provided as an example - it was used to generate the TSV for a demo of this tool, analyzing 24 hours of logs from elecena.pl (over a million SQL queries).
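A minimal sketch of such a tool, assuming the log entries have already been parsed into (source, operation, table) tuples; the entries and the 60-second window below are made up for illustration (the real logs2dataflow.py pulls its data from Elasticsearch):

```python
from collections import Counter

# Hypothetical pre-parsed log entries: (source script, operation, target table)
LOG_ENTRIES = [
    ("shop.php", "select", "mysql:products"),
    ("shop.php", "select", "mysql:products"),
    ("cron/update.php", "update", "mysql:shops"),
]

PERIOD = 60  # seconds covered by the analyzed logs (assumed)

def to_tsv(entries, period):
    """Aggregate identical flows and emit TSV lines with QPS as edge weight."""
    counts = Counter(entries)
    lines = []
    for (source, edge, target), n in sorted(counts.items()):
        qps = n / period
        lines.append("%s\t%s\t%s\t%.4f\tQPS: %.4f" % (source, edge, target, qps, qps))
    return "\n".join(lines)

tsv = to_tsv(LOG_ENTRIES, PERIOD)
```

Each distinct (source, edge, target) triple becomes one graph edge, so aggregation keeps the TSV small even for millions of log lines.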
```
pip install data_flow_graph
```
Please refer to the /test directory for examples of how to use the helper functions to generate Graphviz- and TSV-formatted data flows.
```python
from data_flow_graph import format_graphviz_lines

lines = [{
    'source': 'Foo "bar" test',
    'metadata': '"The Edge"',
    'target': 'Test "foo" 42',
}]

graph = format_graphviz_lines(lines)
```
```python
from data_flow_graph import format_tsv_lines

lines = [
    {
        'source': 'foo',
        'edge': 'select',
        'target': 'bar',
    },
    {
        'source': 'foo2',
        'edge': 'select',
        'target': 'bar',
        'value': 0.5,
        'metadata': 'test'
    },
]

tsv = format_tsv_lines(lines)
```