@AkiTheMemeGod

This pull request introduces a snapshot feature for pandas DataFrames that lets users create, restore, list, drop, and clear named snapshots of a DataFrame's state. The implementation includes a per-DataFrame snapshot store that keeps each snapshot as a deep copy, so later mutations of the DataFrame do not alter saved snapshots, along with tests covering the new functionality.

New DataFrame snapshot functionality:

  • Added methods to DataFrame for snapshot management: snapshot, restore, list_snapshots, drop_snapshot, clear_snapshots, and snapshot_info. These allow users to save and revert to previous states of a DataFrame, as well as manage snapshots.
  • Introduced the DataFrameSnapshotStore class in the new file pandas/core/frame_versioning.py, which handles the storage, retrieval, and metadata of DataFrame snapshots using deep copies for safety.
  • Implemented the helper function _ensure_snapshot_store to attach and manage the snapshot store per DataFrame instance.
  • Added a comprehensive test suite in pandas/tests/frame/test_versioning.py to verify snapshot creation, restoration, inplace mutation, deletion, clearing, copy behavior, and error handling for missing snapshots.
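The store behavior described above can be sketched without pandas. This is a minimal illustration only, not the code in pandas/core/frame_versioning.py: the class name `SnapshotStore`, its method signatures, and the dict-of-lists stand-in for a DataFrame are all assumptions made for the example.

```python
import copy
from datetime import datetime, timezone


class SnapshotStore:
    """Hypothetical stand-in for the PR's DataFrameSnapshotStore."""

    def __init__(self):
        # name -> (deep-copied data, creation timestamp)
        self._snapshots = {}

    def snapshot(self, data, name):
        # Deep copy on save so later edits to `data` cannot leak in.
        self._snapshots[name] = (copy.deepcopy(data), datetime.now(timezone.utc))
        return name

    def restore(self, name):
        if name not in self._snapshots:
            raise KeyError(f"no snapshot named {name!r}")
        data, _created = self._snapshots[name]
        # Deep copy on read so the caller cannot mutate the stored state.
        return copy.deepcopy(data)

    def list_snapshots(self):
        return list(self._snapshots)

    def drop_snapshot(self, name):
        if name not in self._snapshots:
            raise KeyError(f"no snapshot named {name!r}")
        del self._snapshots[name]

    def clear_snapshots(self):
        self._snapshots.clear()


# A plain dict of lists stands in for a DataFrame here (assumption).
table = {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]}
store = SnapshotStore()
store.snapshot(table, "before_edit")

table["a"][0] = 99                      # mutate the live data in place
restored = store.restore("before_edit")  # saved state is unaffected
```

Copying on both save and restore is one possible design choice: it costs extra memory, but neither the live object nor a restored object can alias the stored state.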

Integration and usage example:

  • Provided a usage example in asv_bench/benchmarks/bench_snapshot_memory.py demonstrating how to create a snapshot and inspect memory usage, showcasing the new API in practice.
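Because each snapshot is a deep copy, it has a memory cost of its own. The sketch below is not the benchmark file from this PR; it shows one way to estimate that cost with the standard library's `tracemalloc`, using a hypothetical helper `deep_copy_cost` and a dict of lists in place of a DataFrame.

```python
import copy
import tracemalloc


def deep_copy_cost(data):
    """Return (bytes still allocated after deep-copying `data`, the copy).

    Hypothetical helper for illustration; it measures Python-level
    allocations only.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    snapshot = copy.deepcopy(data)
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, snapshot


# Stand-in for a DataFrame column (assumption).
table = {"values": [float(i) for i in range(10_000)]}
cost_bytes, snap = deep_copy_cost(table)
print(f"deep copy retained roughly {cost_bytes} bytes")
```

For an actual DataFrame, the existing `DataFrame.memory_usage(deep=True)` method reports per-column memory and would likely be a more direct measurement.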