|
9 | 9 | **data-diff** is a command-line tool and Python library to efficiently diff |
10 | 10 | rows across two different databases. |
11 | 11 |
|
12 | | -* ⇄ Verifies across many different databases (e.g. Postgres -> Snowflake) |
13 | | -* 🔍 Outputs diff of rows in detail |
14 | | -* 🚨 Simple CLI/API to create monitoring and alerts |
15 | | -* 🔥 Verify 25M+ rows in <10s, and 1B+ rows in ~5min. |
16 | | -* ♾️ Works for tables with 10s of billions of rows |
| 12 | +⇄ Verifies across many different databases (e.g. Postgres -> Snowflake) |
| 13 | + |
| 14 | +🔍 Outputs diff of rows in detail |
| 15 | + |
| 16 | +🚨 Simple CLI/API to create monitoring and alerts |
| 17 | + |
| 18 | +🔥 Verify 25M+ rows in <10s, and 1B+ rows in ~5min. |
| 19 | + |
| 20 | +♾️ Works for tables with 10s of billions of rows |
| 21 | + |
| 22 | +For more information, `see our README <https://github.com/datafold/data-diff#readme>`_ |
| 23 | + |
| 24 | +How to install |
| 25 | +-------------- |
| 26 | + |
| 27 | +Requires Python 3.7+ with pip. |
| 28 | + |
| 29 | +:: |
| 30 | + |
| 31 | + pip install data-diff |
| 32 | + |
| 33 | +or when you need extras like mysql and postgres: |
| 34 | + |
| 35 | +:: |
| 36 | + |
| 37 | + pip install "data-diff[mysql,pgsql]" |
| 38 | + |
| 39 | + |
| 40 | +How to use from Python |
| 41 | +---------------------- |
| 42 | + |
| 43 | +.. code-block:: python |
| 44 | +
|
| 45 | + # Optional: Set logging to display the progress of the diff |
| 46 | + import logging |
| 47 | + logging.basicConfig(level=logging.INFO) |
| 48 | +
|
| 49 | + from data_diff import connect_to_table, diff_tables |
| 50 | +
|
| 51 | + table1 = connect_to_table("postgres:///", "table_name", "id") |
| 52 | + table2 = connect_to_table("mysql:///", "table_name", "id") |
| 53 | +
|
| 54 | + for different_row in diff_tables(table1, table2): |
| 55 | +     plus_or_minus, columns = different_row |
| 56 | +     print(plus_or_minus, columns) |
17 | 57 |
|
18 | | -See the README for more information. |
19 | 58 |
|
20 | 59 | Resources |
21 | 60 | --------- |
22 | 61 |
|
23 | | -- Tutorials |
24 | | - |
25 | | - - TODO |
| 62 | +- Git: `<https://github.com/datafold/data-diff>`_ |
26 | 63 |
|
27 | 64 | - Reference |
28 | 65 |
|
29 | 66 | - :doc:`python-api` |
30 | 67 |
|
| 68 | +- Tutorials |
| 69 | + |
| 70 | + - TODO |
| 71 | + |
| 72 | + |
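
The "How to use from Python" example in this diff prints each differing row as it arrives. As a minimal sketch of one way to post-process those results, the helper below groups ``(plus_or_minus, columns)`` pairs by their sign. It assumes the sign is the string ``"+"`` or ``"-"``, as the variable name suggests; the helper name ``group_diff_rows`` and the sample rows are hypothetical and not part of data-diff. It does not import data-diff, so it runs without a database connection.

```python
from collections import defaultdict


def group_diff_rows(diff_rows):
    """Group (sign, columns) pairs by sign.

    `diff_rows` is any iterable of 2-tuples shaped like the rows unpacked
    in the docs example. Assumption: the sign is "+" or "-".
    """
    grouped = defaultdict(list)
    for sign, columns in diff_rows:
        grouped[sign].append(tuple(columns))
    return dict(grouped)


# Hypothetical sample standing in for the output of diff_tables(table1, table2).
sample = [
    ("-", ("1", "alice")),
    ("+", ("1", "alicia")),
    ("+", ("2", "bob")),
]

grouped = group_diff_rows(sample)
print("rows marked '-':", grouped.get("-", []))
print("rows marked '+':", grouped.get("+", []))
```

In real use, the iterable returned by ``diff_tables(table1, table2)`` could be passed in place of ``sample``. Note that this collects every differing row in memory, which may not suit very large diffs.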