Skip to content

Conversation

vbourgin
Copy link

Summary:

Context

Metric names may be included in checkpoint names when specifying a best_checkpoint_config, but no verification is done on the metric name. This may lead to nested directory structures if checkpoint names contain /, e.g.:

f721785233

Here we use top1_accuracy/evaluate as the monitored_metric, which will create checkpoints in a nested directory:

{F1977112918}

Checkpointers won't be able to appropriately restore the checkpoint with the best monitored metric, as each checkpoint will be stored in a different directory.

Proposed change

In this diff, we sanitize the metric name prior to checkpoint saving, replacing / with _. Now, checkpoints are saved in the same directory:

f721793003
{F1977113027}

Differential Revision: D73004419

Summary: # Context Metric names may be included in checkpoint names when specifying a `best_checkpoint_config`, but no verification is done on the metric name. This may lead to nested directory structures if checkpoint names contain `/`, e.g.: f721785233 Here we use `top1_accuracy/evaluate` as the `monitored_metric`, which will create checkpoints in a nested directory: {F1977112918} Checkpointers won't be able to appropriately restore the checkpoint with the best monitored metric, as each checkpoint will be stored in a different directory. # Proposed change In this diff, we sanitize the metric name prior to checkpoint saving, replacing `/` with `_`. Now, checkpoints are saved in the same directory: f721793003 {F1977113027} Differential Revision: D73004419
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73004419

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants