Closed
Description
🐛 Bug
In dfcf306e7 ("Apply precision config env vars in the root process", #6152),
we started running load_benchmark() from experiment_runner's
main process. Unfortunately, for some models load_benchmark()
exits the calling process,
which causes experiment_runner itself to exit prematurely.
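To illustrate the failure mode, here is a minimal sketch of how a loader that calls `sys.exit()` takes its caller down with it, and how a caller could contain that. It is not the actual experiment_runner code: `load_benchmark_stub` and `try_load` are hypothetical names, and the exit code 2 is chosen only to mirror the code reported in this issue.

```python
import sys


def load_benchmark_stub(name):
    """Hypothetical stand-in for a loader that exits the process for some models."""
    if name == "bad_model":
        # Assumption: the real loader ends up calling sys.exit() somewhere.
        sys.exit(2)
    return f"benchmark:{name}"


def try_load(name):
    """Return (benchmark, None) on success, or (None, exit_code) if the loader tried to exit."""
    try:
        return load_benchmark_stub(name), None
    except SystemExit as e:
        # sys.exit() raises SystemExit; catching it keeps the calling process alive.
        return None, e.code


results = {name: try_load(name) for name in ["good_model", "bad_model"]}
for name, (benchmark, exit_code) in results.items():
    print(name, benchmark, exit_code)
```

Catching `SystemExit` only helps when the exit goes through `sys.exit()`; a call to `os._exit()` or a crash in native code would still terminate the process, in which case loading in a child process is the more robust option.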
To Reproduce
Run any of the benchmarks added to the deny list in #6199 under XLA. For example:
python xla/benchmarks/experiment_runner.py --dynamo=openxla --dynamo=openxla_eval --xla=PJRT --test=eval --test=train --accelerator=cuda --output-dirname=/tmp/pix2pix --repeat=5 --print-subprocess --suite-name=torchbench --filter='^pytorch_CycleGAN_and_pix2pix$' --log-level=debug ; echo $?
Note: pytorch_CycleGAN_and_pix2pix also fails early under inductor.
Expected behavior
The above command should print an exit code of 0 whether or not the benchmark itself runs successfully. Instead, it prints 2.
Environment
- Reproducible on XLA backend [CPU/TPU]: GPU
- torch_xla version: dfcf306 and later.