We recommend creating an incremental artifact when you need to change only a subset of files in an artifact and the original artifact is significantly larger than that subset.
Create new artifact versions from scratch
There are two ways to create a new artifact version: from a single run and from distributed runs. They are defined as follows:
- Single run: A single run provides all the data for a new version. This is the most common case and is best suited when the run fully recreates the needed data. For example: outputting saved models or model predictions in a table for analysis.
- Distributed runs: A set of runs collectively provides all the data for a new version. This is best suited for distributed jobs which have multiple runs generating data, often in parallel. For example: evaluating a model in a distributed manner, and outputting the predictions.
W&B creates a new artifact and assigns it a v0 alias if you pass a name to the wandb.Artifact API that does not exist in your project. W&B checksums the contents when you log again to the same artifact. If the artifact changed, W&B saves a new version v1. W&B will retrieve an existing artifact if you pass a name and artifact type to the wandb.Artifact API that matches an existing artifact in your project. The retrieved artifact will have a version greater than 1.
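The versioning behavior described above can be illustrated with a minimal sketch. The function name `log_two_versions`, the project name `artifact-demo`, the artifact name `example-dataset`, and the file name `data.txt` are all hypothetical. The sketch assumes you are logged in to W&B and that the artifact name is not yet used in the project. `wandb` is imported inside the function only so that the snippet can be loaded without the package installed.

```python
from pathlib import Path


def log_two_versions(workdir, project="artifact-demo", name="example-dataset"):
    """Log the same artifact name twice with different file contents."""
    import wandb  # lazy import: assumes the wandb package is installed

    data_file = Path(workdir) / "data.txt"
    versions = []
    for contents in ("first batch\n", "first batch\nsecond batch\n"):
        # Change the file so that its checksum differs between the two logs.
        data_file.write_text(contents)
        with wandb.init(project=project) as run:
            artifact = wandb.Artifact(name=name, type="dataset")
            artifact.add_file(local_path=str(data_file))
            logged = run.log_artifact(artifact)
            logged.wait()  # block until W&B has committed this version
            versions.append(logged.version)
    return versions
```

Going by the description above, the first log should produce the v0 version and the second, because the contents changed, a new version. Logging unchanged contents a second time should not create another version.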
Single run
Log a new version of an artifact with a single run that produces all the files in the artifact. You can create the new artifact version inside or outside of a run:
- Inside a run
- Outside of a run
Create an artifact version within a W&B run:
- Create a run with wandb.init.
- Create a new artifact or retrieve an existing one with wandb.Artifact.
- Add files to the artifact with .add_file.
- Log the artifact to the run with .log_artifact.
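The steps above can be sketched as follows. This is a minimal example, not a definitive implementation: the function names, the project name `artifact-demo`, and the artifact name `example-dataset` are hypothetical. The second function shows one possible way to create a version outside of a run, assuming the `Artifact.save` method of the public `wandb` SDK. `wandb` is imported inside each function only so that the snippet can be loaded without the package installed.

```python
def log_artifact_in_run(file_path, project="artifact-demo", name="example-dataset"):
    """Create an artifact version inside a W&B run."""
    import wandb  # lazy import: assumes the wandb package is installed

    # 1. Create a run.
    with wandb.init(project=project) as run:
        # 2. Create a new artifact, or retrieve an existing one with this name and type.
        artifact = wandb.Artifact(name=name, type="dataset")
        # 3. Add files to the artifact.
        artifact.add_file(local_path=str(file_path))
        # 4. Log the artifact to the run.
        run.log_artifact(artifact)
    return artifact


def save_artifact_outside_run(file_path, project="artifact-demo", name="example-dataset"):
    """Create an artifact version without calling wandb.init yourself."""
    import wandb  # lazy import: assumes the wandb package is installed

    artifact = wandb.Artifact(name=name, type="dataset")
    artifact.add_file(local_path=str(file_path))
    # Assumption: save() persists the artifact to the given project.
    artifact.save(project=project)
    return artifact
```

For example, `log_artifact_in_run("data.csv")` would log a hypothetical local file `data.csv` as a new version of `example-dataset`.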
Distributed runs
Allow a collection of runs to collaborate on a version before committing it. This is in contrast to the single run mode described above, where one run provides all the data for a new version.
- Each run in the collection needs to be aware of the same unique ID (called distributed_id) in order to collaborate on the same version. By default, if present, W&B uses the run’s group, as set by wandb.init(group=GROUP), as the distributed_id.
- There must be a final run that “commits” the version, permanently locking its state.
- Use upsert_artifact to add to the collaborative artifact and finish_artifact to finalize the commit.
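For example, Run 1 and Run 2 each add files to the same artifact with upsert_artifact, and Run 3 commits the version with finish_artifact.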
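The three runs might be sketched as follows. This is an illustration under stated assumptions: the function names, the project and artifact names, the file names, and the `distributed_id` value in the usage note are hypothetical. Each call is meant to stand for a separate run, possibly in a separate process. `wandb` is imported inside each function only so that the snippet can be loaded without the package installed.

```python
def add_to_shared_version(file_path, distributed_id,
                          project="artifact-demo", name="example-dataset"):
    """Runs 1 and 2: contribute a file to the uncommitted, shared version."""
    import wandb  # lazy import: assumes the wandb package is installed

    with wandb.init(project=project) as run:
        artifact = wandb.Artifact(name=name, type="dataset")
        artifact.add_file(local_path=str(file_path))
        # Every collaborating run must pass the same distributed_id.
        run.upsert_artifact(artifact, distributed_id=distributed_id)


def commit_shared_version(distributed_id, file_path=None,
                          project="artifact-demo", name="example-dataset"):
    """Run 3: commit the version, permanently locking its state."""
    import wandb  # lazy import: assumes the wandb package is installed

    with wandb.init(project=project) as run:
        artifact = wandb.Artifact(name=name, type="dataset")
        # The committing run can add files of its own, but does not need to.
        if file_path is not None:
            artifact.add_file(local_path=str(file_path))
        run.finish_artifact(artifact, distributed_id=distributed_id)
```

A possible sequence is `add_to_shared_version("image1.png", "my_dist_artifact")`, then `add_to_shared_version("image2.png", "my_dist_artifact")`, and finally `commit_shared_version("my_dist_artifact")` once both earlier runs have completed.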
Run 3 must run after Run 1 and Run 2 complete. The run that calls finish_artifact can include files in the artifact, but does not need to.
Create a new artifact version from an existing version
Add, modify, or remove a subset of files from a previous artifact version without the need to re-index the files that didn’t change. Adding, modifying, or removing a subset of files from a previous artifact version creates a new artifact version known as an incremental artifact. Typical scenarios for each type of incremental change include:
- add: you periodically add a new subset of files to a dataset after collecting a new batch.
- remove: you discovered several duplicate files and want to remove them from your artifact.
- update: you corrected annotations for a subset of files and want to replace the old files with the correct ones.
You can create an incremental artifact within a single run or with a set of runs (distributed mode).
- Obtain the artifact version you want to perform an incremental change on:
- Inside a run
- Outside of a run
- Create a draft with:
- Perform any incremental changes you want to see in the next version. You can either add, remove, or modify an existing entry.
- Add
- Remove
- Modify
Add a file to an existing artifact version with the add_file method. You can also add multiple files by adding a directory with the add_dir method.
- Lastly, log or save your changes, either inside or outside of a W&B run:
- Inside a run
- Outside of a run
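The incremental workflow above can be sketched end to end as follows. This is a minimal illustration, not a definitive implementation. It assumes the `use_artifact`, `new_draft`, `remove`, `add_file`, `log_artifact`, and `save` methods of the public `wandb` SDK, and `wandb.Api().artifact` for fetching a version outside of a run. The function names, the project name, and the file names in the usage note are hypothetical. A modification is expressed here as a removal followed by an addition of the corrected file.

```python
def _apply_changes(draft_artifact, add=(), remove=()):
    """Apply removals first, then additions, to a draft artifact."""
    for artifact_path in remove:
        # Paths are relative to the artifact root, not the local file system.
        draft_artifact.remove(artifact_path)
    for local_path in add:
        draft_artifact.add_file(local_path=str(local_path))


def create_incremental_version(source, add=(), remove=(), project="artifact-demo"):
    """Inside a run: fetch a version, draft changes, and log the result."""
    import wandb  # lazy import: assumes the wandb package is installed

    with wandb.init(project=project) as run:
        saved_artifact = run.use_artifact(source)
        draft_artifact = saved_artifact.new_draft()
        _apply_changes(draft_artifact, add, remove)
        run.log_artifact(draft_artifact)
    return draft_artifact


def create_incremental_version_outside_run(source, add=(), remove=()):
    """Outside of a run: fetch through the public API and save the draft."""
    import wandb  # lazy import: assumes the wandb package is installed

    saved_artifact = wandb.Api().artifact(source)
    draft_artifact = saved_artifact.new_draft()
    _apply_changes(draft_artifact, add, remove)
    draft_artifact.save()
    return draft_artifact
```

For instance, `create_incremental_version("example-dataset:latest", add=["new.txt", "fixed.txt"], remove=["duplicate.txt", "fixed.txt"])` would add `new.txt`, drop `duplicate.txt`, and replace `fixed.txt` with the local copy of the same name.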