benchmarks/torchbench_model: some benchmarks fail to load and kill experiment_runner's main process

🐛 Bug

Since dfcf306e7 ("Apply precision config env vars in the root process.", #6152),
experiment_runner calls load_benchmark() from its main process.
Unfortunately, for some models load_benchmark() exits the calling process,
which causes experiment_runner itself to exit prematurely.
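
The failure mode can be sketched in isolation. The snippet below is a minimal illustration, not the actual experiment_runner code: `fake_load_benchmark` is a hypothetical stand-in for a load_benchmark() that calls sys.exit(), and `try_load` is a hypothetical guard that catches SystemExit so the caller survives.

```python
import sys


def fake_load_benchmark(name):
    """Hypothetical stand-in for a loader that exits the calling process."""
    if name == "bad_model":
        # Some libraries call sys.exit() on errors (argparse, for example,
        # exits with status 2 on a parsing error).
        sys.exit(2)
    return f"benchmark:{name}"


def try_load(name):
    """Return (benchmark, exit_code); exit_code is None on success."""
    try:
        return fake_load_benchmark(name), None
    except SystemExit as e:
        # sys.exit() raises SystemExit; catching it keeps the caller alive.
        return None, e.code


print(try_load("good_model"))
print(try_load("bad_model"))
```

Catching SystemExit would not help if a loader calls os._exit() or crashes the interpreter; in those cases the load would have to run in a child process.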

To Reproduce

Run any of the benchmarks added to the deny list in #6199 under XLA. For example:

python xla/benchmarks/experiment_runner.py --dynamo=openxla --dynamo=openxla_eval --xla=PJRT --test=eval --test=train --accelerator=cuda --output-dirname=/tmp/pix2pix --repeat=5 --print-subprocess --suite-name=torchbench --filter='^pytorch_CycleGAN_and_pix2pix$' --log-level=debug ; echo $?

Note: pytorch_CycleGAN_and_pix2pix also fails early under inductor.

Expected behavior

The command above should print exit code 0 whether or not the benchmark runs successfully. Instead, it prints 2.
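
One way to get this behavior is to keep the risky load out of the main process. The sketch below is only an illustration under assumptions: `run_isolated` and `main` are hypothetical names, and the child snippet simply simulates a loader that exits with status 2. It shows that the parent can record the child's exit code and still return 0 itself.

```python
import subprocess
import sys


def run_isolated(code):
    """Run a Python snippet in a child process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


def main():
    # Hypothetical child that exits with status 2 while "loading".
    child_rc = run_isolated("import sys; sys.exit(2)")
    print(f"child exited with {child_rc}; runner continues")
    # The runner itself still reports success.
    return 0
```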

Environment

Reproducible on XLA backend [CPU/TPU]: GPU
torch_xla version: dfcf306 and later.
