This repository was archived by the owner on May 17, 2024. It is now read-only.

Conversation

leoebfolsom
Contributor

Per my discussion with @williebsweet, I'm putting together some ideas for making the README (and eventually the datafold docs) more accessible and digestible for a less technical first-time user.

@erezsh
Contributor

erezsh commented Sep 28, 2022

Overall I'm okay with these changes, but I think it was good to have the benchmark graph prominent, since data-diff's main feature is its speed. I don't think it's a good idea to push it down to the technical explanation.

Ideally, we should re-run the benchmarks to include the in-db diff (joindiff), and we can re-write the explanation for the benchmark so it's a little less technical.

@leoebfolsom
Contributor Author

Thanks for the feedback @erezsh, I appreciate it!

Just to be clear, I don't think we should merge this yet--it was meant as a work in progress and first draft so I could collect feedback and make updates.

I'd love to schedule some time to meet and chat about this all next week, when I start full-time as a Solutions Engineer.

I agree with you that we should re-run benchmarks with joindiff. Is that something you'd be able to do? Even if it's not presentation-ready, I'd be interested in the results.

In addition to the count(*) comparison that's currently in the benchmark chart, it would be good to see how joindiff compares to cross-db-diff when applied to two tables in the same database. As we know, people have been using cross-db-diff to do within-db-diff (that was my original use case when I first tried out data-diff), and we want to highlight to existing users that we released a new version for in-db-diff, which is faster and designed for that use case.

When I start FT next week, I'll be more than happy to help where I can to develop these benchmarks and messaging.

@erezsh
Contributor

erezsh commented Sep 28, 2022

Sounds good.

From an initial run, joindiff seems to be about 3x to 4x faster than hashdiff on the same db. But I'll run a more exact benchmark soon.

P.S. you can mark your PRs as 'draft', if you don't want them merged yet.

@leoebfolsom leoebfolsom marked this pull request as draft September 28, 2022 17:49
@leoebfolsom
Contributor Author

Thanks @erezsh! Marked it as draft. 🙌

@leoebfolsom
Contributor Author

leoebfolsom commented Oct 7, 2022

Just a reminder that this is very much in flight, so if you take a look, please keep its WIP 🚧 👷 status in mind! 😇

@leoebfolsom leoebfolsom marked this pull request as ready for review October 19, 2022 01:45
@leoebfolsom leoebfolsom changed the title from "WIP: ideas for making the README more digestible" to "README updates for coalesce and pre release" Oct 19, 2022
@diveart diveart requested review from diveart and removed request for KGmajor October 19, 2022 20:21

@diveart diveart left a comment

ok

@leoebfolsom leoebfolsom merged commit 53b6a3c into datafold:master Oct 19, 2022
4 participants