This repository was archived by the owner on May 17, 2024. It is now read-only.

datafold/data-diff

data-diff

[Link to documentation!]

💸💸 Looking for paid contributors! 💸💸

We're looking for developers with a deep understanding of databases and solid Python knowledge. [Apply here!]


What is data-diff?

data-diff enables data professionals to detect differences in values between any two tables. It's fast, easy to use, and reliable, even at massive scale.

How to use

Quickly identify issues when migrating data between databases

Improve code reviews by identifying data problems you don't have tests for

(video is rough draft, screenshot will be replaced with something better)

[Video: Why Cypress Video]

Get started

Installation

First, install data-diff using pip.

pip install data-diff

Note: With Python 3.7+ installed, pip and pip3 can most likely be used interchangeably.

Then, install one or more driver(s) specific to the database(s) you want to connect to.

  • pip install 'data-diff[postgresql]'

  • pip install 'data-diff[snowflake]'

  • We support 10+ other databases. Check out the documentation [TODO: add link] for specifics.

Run your first diff

Once you've installed data-diff, you can run it from the command line:

data-diff DB1_URI TABLE1_NAME DB2_URI TABLE2_NAME [OPTIONS]

Check out the documentation [TODO: add link] for all the options and database-specific configurations.
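For scripted use, the command-line form above can be assembled from Python. The following is a minimal sketch, not part of data-diff itself: `build_diff_command` and `run_diff` are hypothetical helper names, and the connection URIs and table names are placeholders. The exact URI format each driver expects should be taken from the documentation.

```python
import shlex
import subprocess
from typing import List, Sequence


def build_diff_command(
    db1_uri: str,
    table1: str,
    db2_uri: str,
    table2: str,
    options: Sequence[str] = (),
) -> List[str]:
    """Assemble arguments in the documented order:
    data-diff DB1_URI TABLE1_NAME DB2_URI TABLE2_NAME [OPTIONS]
    """
    return ["data-diff", db1_uri, table1, db2_uri, table2, *options]


def run_diff(cmd: Sequence[str]) -> int:
    """Run the assembled command and return its exit code.

    Assumes the data-diff executable is installed and on PATH.
    """
    return subprocess.run(list(cmd), check=False).returncode


if __name__ == "__main__":
    # Hypothetical placeholder URIs and table names, for illustration only.
    cmd = build_diff_command(
        "postgresql://user:password@localhost:5432/source_db",
        "orders",
        "postgresql://user:password@localhost:5432/target_db",
        "orders",
    )
    # Print a shell-safe rendering of the command instead of running it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when a URI contains characters such as `&` or `?`. Call `run_diff(cmd)` once the placeholders point at real databases.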

Reporting bugs and contributing

License

This project is licensed under the terms of the MIT License.