Error running detectnet_v2.ipynb with my own data

I can run all of the code in detectnet_v2.ipynb with its default data.

However, when I replace the data in cv_sample/data/training with my own data, it does not work.

I ran this cell:

!tao detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

then it prints:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: bboxes class ID out of range [0, 3[, got-1
     [[{{node BboxRasterizer/RasterizeBbox}}]]
     [[resnet18_nopool_bn_detectnet_v2/block_1a_bn_shortcut/AssignMovingAvg_1/_3755]]
  (1) Invalid argument: bboxes class ID out of range [0, 3[, got-1
     [[{{node BboxRasterizer/RasterizeBbox}}]]
0 successful operations.
0 derived errors ignored.

How can I fix it?

Did you generate tfrecord files for your own data?

Yes, of course.

I ran this command beforehand:

!tao detectnet_v2 dataset_convert \
                  -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                  -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

and the result can be seen here:

!ls -rlt $LOCAL_DATA_DIR/tfrecords/kitti_trainval/
total 248
-rw-r--r-- 1 root root 3027 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root 3025 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root 3024 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root 3027 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root 3026 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root 3025 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root 3026 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root 3025 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root 3026 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root 4839 2月 15 16:03 kitti_trainval-fold-000-of-002-shard-00009-of-00010
-rw-r--r-- 1 root root 19362 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root 19366 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root 19362 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root 19365 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root 19364 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root 19359 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root 19364 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root 19362 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root 19364 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root 23598 2月 15 16:03 kitti_trainval-fold-001-of-002-shard-00009-of-00010

I ran detectnet_v2.ipynb step by step.

Could you share the log from running the “!tao detectnet_v2 dataset_convert” command above?

After running the code, it prints:

Converting Tfrecords for kitti trainval dataset
2022-02-16 14:26:31,913 [INFO] root: Registry: ['nvcr.io']
2022-02-16 14:26:32,077 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3
2022-02-16 14:26:32,708 [WARNING] tlt.components.docker_handler.docker_handler: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2022-02-16 06:26:41,630 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a kitti converter
2022-02-16 06:26:41,630 [INFO] root: Instantiating a kitti converter
2022-02-16 06:26:41,631 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Creating output directory /workspace/tao-experiments/data/tfrecords/kitti_trainval
2022-02-16 06:26:41,631 [INFO] root: Generating partitions
2022-02-16 06:26:41,632 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: Num images in Train: 327	Val: 53
2022-02-16 06:26:41,633 [INFO] root: Num images in Train: 327	Val: 53
2022-02-16 06:26:41,633 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2022-02-16 06:26:41,633 [INFO] root: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2022-02-16 06:26:41,633 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 0
2022-02-16 06:26:41,633 [INFO] root: Writing partition 0, shard 0
WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:161: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2022-02-16 06:26:41,633 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:161: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
/usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:297: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2022-02-16 06:26:41,647 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 1
2022-02-16 06:26:41,647 [INFO] root: Writing partition 0, shard 1
2022-02-16 06:26:41,652 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 2
2022-02-16 06:26:41,652 [INFO] root: Writing partition 0, shard 2
2022-02-16 06:26:41,658 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 3
2022-02-16 06:26:41,658 [INFO] root: Writing partition 0, shard 3
2022-02-16 06:26:41,664 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 4
2022-02-16 06:26:41,665 [INFO] root: Writing partition 0, shard 4
2022-02-16 06:26:41,670 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 5
2022-02-16 06:26:41,670 [INFO] root: Writing partition 0, shard 5
2022-02-16 06:26:41,678 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 6
2022-02-16 06:26:41,678 [INFO] root: Writing partition 0, shard 6
2022-02-16 06:26:41,683 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 7
2022-02-16 06:26:41,683 [INFO] root: Writing partition 0, shard 7
2022-02-16 06:26:41,689 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 8
2022-02-16 06:26:41,689 [INFO] root: Writing partition 0, shard 8
2022-02-16 06:26:41,695 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 9
2022-02-16 06:26:41,695 [INFO] root: Writing partition 0, shard 9
2022-02-16 06:26:41,704 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Wrote the following numbers of objects:
b'city': 53
2022-02-16 06:26:41,704 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 0
2022-02-16 06:26:41,704 [INFO] root: Writing partition 1, shard 0
2022-02-16 06:26:41,741 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 1
2022-02-16 06:26:41,741 [INFO] root: Writing partition 1, shard 1
2022-02-16 06:26:41,778 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 2
2022-02-16 06:26:41,778 [INFO] root: Writing partition 1, shard 2
2022-02-16 06:26:41,815 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 3
2022-02-16 06:26:41,815 [INFO] root: Writing partition 1, shard 3
2022-02-16 06:26:41,853 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 4
2022-02-16 06:26:41,853 [INFO] root: Writing partition 1, shard 4
2022-02-16 06:26:41,890 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 5
2022-02-16 06:26:41,890 [INFO] root: Writing partition 1, shard 5
2022-02-16 06:26:41,927 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 6
2022-02-16 06:26:41,927 [INFO] root: Writing partition 1, shard 6
2022-02-16 06:26:41,965 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 7
2022-02-16 06:26:41,966 [INFO] root: Writing partition 1, shard 7
2022-02-16 06:26:42,002 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 8
2022-02-16 06:26:42,002 [INFO] root: Writing partition 1, shard 8
2022-02-16 06:26:42,037 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 1, shard 9
2022-02-16 06:26:42,037 [INFO] root: Writing partition 1, shard 9
2022-02-16 06:26:42,081 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Wrote the following numbers of objects:
b'city': 327
2022-02-16 06:26:42,081 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Cumulative object statistics
2022-02-16 06:26:42,081 [INFO] root: Cumulative object statistics
2022-02-16 06:26:42,081 [INFO] root: { "city": 380 }
2022-02-16 06:26:42,081 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Wrote the following numbers of objects:
b'city': 380
2022-02-16 06:26:42,081 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Class map.
Label in GT: Label in tfrecords file
b'city': b'city'
2022-02-16 06:26:42,081 [INFO] root: Class map.
Label in GT: Label in tfrecords file
b'city': b'city'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2022-02-16 06:26:42,081 [INFO] root: For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2022-02-16 06:26:42,081 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Tfrecords generation complete.
2022-02-16 06:26:42,081 [INFO] root: TFRecords generation complete.
2022-02-16 14:26:43,674 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

So you are training a single class, “city”. Did you set it correctly in detectnet_v2_train_resnet18_kitti.txt?

Yes, only one class: “city”.

This is detectnet_v2_train_resnet18_kitti.txt:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "city"
    value: "city"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 1248
    output_image_height: 384
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "car"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "pedestrian"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "car"
    value: 0.699999988079
  }
  minimum_detection_ground_truth_overlap {
    key: "cyclist"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "pedestrian"
    value: 0.5
  }
  evaluation_box_config {
    key: "car"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "cyclist"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "pedestrian"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "car"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "cyclist"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "pedestrian"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "car"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "cyclist"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "pedestrian"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

Is there something wrong with output_image_width or output_image_height?

There are other settings in the spec that still refer to the sample classes, such as:

postprocessing_config {
  target_class_config {
    key: "car"

The sample spec file trains 3 classes, but you are training only 1.
A simple fix is to replace every “car” with “city” and then remove the other 2 classes (“cyclist” and “pedestrian”) from the original spec file.
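
To double-check that no leftover class names remain after editing the spec, a small script can compare the classes in target_class_mapping with the class names used in the other sections. The [0, 3[ range in the error presumably reflects the three classes configured in the sample spec, so "city" ends up with no valid class ID. The following is only a rough sketch: it uses regular expressions rather than a real protobuf parser, the function name find_class_mismatches is made up for illustration, and it assumes the spec follows the layout of the sample file posted in this thread.

```python
import re

# Values of target_class_mapping entries: the classes the model is trained on.
MAPPING_RE = re.compile(
    r'target_class_mapping\s*\{\s*key:\s*"[^"]*"\s*value:\s*"([^"]*)"'
)
# Per-class sections that are keyed by class name with `key: "..."`.
KEYED_RE = re.compile(
    r'(?:target_class_config|minimum_detection_ground_truth_overlap|'
    r'evaluation_box_config)\s*\{\s*key:\s*"([^"]*)"'
)
# cost_function_config uses `name: "..."` instead of `key: "..."`.
NAMED_RE = re.compile(r'target_classes\s*\{\s*name:\s*"([^"]*)"')


def find_class_mismatches(spec_text):
    """Return (unmapped, unconfigured) class names as sorted lists.

    unmapped:     names used in per-class sections but absent from
                  target_class_mapping (e.g. leftover "car").
    unconfigured: mapped classes that no per-class section mentions.
    """
    mapped = set(MAPPING_RE.findall(spec_text))
    referenced = set(KEYED_RE.findall(spec_text)) | set(NAMED_RE.findall(spec_text))
    return sorted(referenced - mapped), sorted(mapped - referenced)


if __name__ == "__main__":
    # Tiny hypothetical spec fragment, shaped like the one in this thread.
    sample = '''
    dataset_config { target_class_mapping { key: "city" value: "city" } }
    postprocessing_config { target_class_config { key: "car" value { } } }
    cost_function_config { target_classes { name: "cyclist" class_weight: 8.0 } }
    '''
    unmapped, unconfigured = find_class_mismatches(sample)
    print("Leftover class names:", unmapped)
    print("Mapped classes without per-class config:", unconfigured)
```

To run it on the real file, read detectnet_v2_train_resnet18_kitti.txt into a string and pass it to find_class_mismatches; both lists should come back empty once every section only mentions "city".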

As for output_image_width and output_image_height: first, make sure all of your training images have the same resolution. Then set output_image_width and output_image_height to that resolution.

After doing this, it still doesn't work.

When running !tao detectnet_v2 train, it prints:
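
A quick way to confirm that the images share one resolution is to tally the sizes and list any outliers. This is a minimal sketch under stated assumptions: summarize_resolutions is a hypothetical helper, and it expects you to have already collected each image's (width, height) with whatever image library you use; the file names below are made up.

```python
from collections import Counter


def summarize_resolutions(sizes):
    """Find the dominant resolution and the files that deviate from it.

    sizes: mapping of file name -> (width, height) in pixels.
    Returns (most_common_size, outliers), where outliers is a sorted
    list of file names whose size differs from most_common_size.
    """
    if not sizes:
        raise ValueError("no image sizes given")
    counts = Counter(sizes.values())
    most_common_size, _ = counts.most_common(1)[0]
    outliers = sorted(
        name for name, size in sizes.items() if size != most_common_size
    )
    return most_common_size, outliers


if __name__ == "__main__":
    # Hypothetical example data: two images agree, one does not.
    example = {
        "000001.jpg": (4032, 3024),
        "000002.jpg": (4032, 3024),
        "000003.jpg": (1920, 1080),
    }
    (width, height), outliers = summarize_resolutions(example)
    print(f"output_image_width: {width}")
    print(f"output_image_height: {height}")
    print("Images to resize or remove:", outliers)
```

If the outlier list is empty, the printed width and height are the values to put in the preprocessing block of the spec.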

2022-02-16 15:07:22,015 [INFO] root: Registry: ['nvcr.io'] 2022-02-16 15:07:22,180 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3 2022-02-16 15:07:22,822 [WARNING] tlt.components.docker_handler.docker_handler: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal. Using TensorFlow backend. WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them. Using TensorFlow backend. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:43: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead. 2022-02-16 07:07:31,654 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:43: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead. 
2022-02-16 07:07:31,776 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead. 2022-02-16 07:07:31,778 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead. 2022-02-16 07:07:31,779 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead. 
2022-02-16 07:07:31,787 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead. 2022-02-16 07:07:31,787 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead. 2022-02-16 07:07:33,022 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/detectnet_v2/experiment_dir_unpruned/status.json 2022-02-16 07:07:33,023 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt. 2022-02-16 07:07:33,028 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt 2022-02-16 07:07:33,271 [INFO] __main__: Cannot iterate over exactly 327 samples with a batch size of 4; each epoch will therefore take one extra step. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:107: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead. 2022-02-16 07:07:33,274 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:107: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead. 
WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:110: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead. 2022-02-16 07:07:33,274 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:110: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:113: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead. 2022-02-16 07:07:33,277 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:113: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead. 2022-02-16 07:07:33,305 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead. 
2022-02-16 07:07:33,307 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead. 2022-02-16 07:07:33,334 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead. WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead. 2022-02-16 07:07:34,723 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead. 2022-02-16 07:07:35,039 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead. 2022-02-16 07:07:35,040 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead. 
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead. 2022-02-16 07:07:35,419 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead. 2022-02-16 07:07:44,398 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used. __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_1 (InputLayer) (None, 3, 3024, 4032 0 __________________________________________________________________________________________________ conv1 (Conv2D) (None, 64, 1512, 201 9472 input_1[0][0] __________________________________________________________________________________________________ bn_conv1 (BatchNormalization) (None, 64, 1512, 201 256 conv1[0][0] __________________________________________________________________________________________________ activation_1 (Activation) (None, 64, 1512, 201 0 bn_conv1[0][0] __________________________________________________________________________________________________ block_1a_conv_1 (Conv2D) (None, 64, 756, 1008 36928 activation_1[0][0] __________________________________________________________________________________________________ block_1a_bn_1 (BatchNormalizati (None, 64, 756, 1008 256 block_1a_conv_1[0][0] __________________________________________________________________________________________________ block_1a_relu_1 (Activation) (None, 64, 756, 1008 0 block_1a_bn_1[0][0] __________________________________________________________________________________________________ block_1a_conv_2 (Conv2D) (None, 64, 756, 1008 36928 
block_1a_relu_1[0][0] __________________________________________________________________________________________________ block_1a_conv_shortcut (Conv2D) (None, 64, 756, 1008 4160 activation_1[0][0] __________________________________________________________________________________________________ block_1a_bn_2 (BatchNormalizati (None, 64, 756, 1008 256 block_1a_conv_2[0][0] __________________________________________________________________________________________________ block_1a_bn_shortcut (BatchNorm (None, 64, 756, 1008 256 block_1a_conv_shortcut[0][0] __________________________________________________________________________________________________ add_1 (Add) (None, 64, 756, 1008 0 block_1a_bn_2[0][0] block_1a_bn_shortcut[0][0] __________________________________________________________________________________________________ block_1a_relu (Activation) (None, 64, 756, 1008 0 add_1[0][0] __________________________________________________________________________________________________ block_1b_conv_1 (Conv2D) (None, 64, 756, 1008 36928 block_1a_relu[0][0] __________________________________________________________________________________________________ block_1b_bn_1 (BatchNormalizati (None, 64, 756, 1008 256 block_1b_conv_1[0][0] __________________________________________________________________________________________________ block_1b_relu_1 (Activation) (None, 64, 756, 1008 0 block_1b_bn_1[0][0] __________________________________________________________________________________________________ block_1b_conv_2 (Conv2D) (None, 64, 756, 1008 36928 block_1b_relu_1[0][0] __________________________________________________________________________________________________ block_1b_bn_2 (BatchNormalizati (None, 64, 756, 1008 256 block_1b_conv_2[0][0] __________________________________________________________________________________________________ add_2 (Add) (None, 64, 756, 1008 0 block_1b_bn_2[0][0] block_1a_relu[0][0] 
__________________________________________________________________________________________________ block_1b_relu (Activation) (None, 64, 756, 1008 0 add_2[0][0] __________________________________________________________________________________________________ block_2a_conv_1 (Conv2D) (None, 128, 378, 504 73856 block_1b_relu[0][0] __________________________________________________________________________________________________ block_2a_bn_1 (BatchNormalizati (None, 128, 378, 504 512 block_2a_conv_1[0][0] __________________________________________________________________________________________________ block_2a_relu_1 (Activation) (None, 128, 378, 504 0 block_2a_bn_1[0][0] __________________________________________________________________________________________________ block_2a_conv_2 (Conv2D) (None, 128, 378, 504 147584 block_2a_relu_1[0][0] __________________________________________________________________________________________________ block_2a_conv_shortcut (Conv2D) (None, 128, 378, 504 8320 block_1b_relu[0][0] __________________________________________________________________________________________________ block_2a_bn_2 (BatchNormalizati (None, 128, 378, 504 512 block_2a_conv_2[0][0] __________________________________________________________________________________________________ block_2a_bn_shortcut (BatchNorm (None, 128, 378, 504 512 block_2a_conv_shortcut[0][0] __________________________________________________________________________________________________ add_3 (Add) (None, 128, 378, 504 0 block_2a_bn_2[0][0] block_2a_bn_shortcut[0][0] __________________________________________________________________________________________________ block_2a_relu (Activation) (None, 128, 378, 504 0 add_3[0][0] __________________________________________________________________________________________________ block_2b_conv_1 (Conv2D) (None, 128, 378, 504 147584 block_2a_relu[0][0] 
__________________________________________________________________________________________________ block_2b_bn_1 (BatchNormalizati (None, 128, 378, 504 512 block_2b_conv_1[0][0] __________________________________________________________________________________________________ block_2b_relu_1 (Activation) (None, 128, 378, 504 0 block_2b_bn_1[0][0] __________________________________________________________________________________________________ block_2b_conv_2 (Conv2D) (None, 128, 378, 504 147584 block_2b_relu_1[0][0] __________________________________________________________________________________________________ block_2b_bn_2 (BatchNormalizati (None, 128, 378, 504 512 block_2b_conv_2[0][0] __________________________________________________________________________________________________ add_4 (Add) (None, 128, 378, 504 0 block_2b_bn_2[0][0] block_2a_relu[0][0] __________________________________________________________________________________________________ block_2b_relu (Activation) (None, 128, 378, 504 0 add_4[0][0] __________________________________________________________________________________________________ block_3a_conv_1 (Conv2D) (None, 256, 189, 252 295168 block_2b_relu[0][0] __________________________________________________________________________________________________ block_3a_bn_1 (BatchNormalizati (None, 256, 189, 252 1024 block_3a_conv_1[0][0] __________________________________________________________________________________________________ block_3a_relu_1 (Activation) (None, 256, 189, 252 0 block_3a_bn_1[0][0] __________________________________________________________________________________________________ block_3a_conv_2 (Conv2D) (None, 256, 189, 252 590080 block_3a_relu_1[0][0] __________________________________________________________________________________________________ block_3a_conv_shortcut (Conv2D) (None, 256, 189, 252 33024 block_2b_relu[0][0] 
__________________________________________________________________________________________________ block_3a_bn_2 (BatchNormalizati (None, 256, 189, 252 1024 block_3a_conv_2[0][0] __________________________________________________________________________________________________ block_3a_bn_shortcut (BatchNorm (None, 256, 189, 252 1024 block_3a_conv_shortcut[0][0] __________________________________________________________________________________________________ add_5 (Add) (None, 256, 189, 252 0 block_3a_bn_2[0][0] block_3a_bn_shortcut[0][0] __________________________________________________________________________________________________ block_3a_relu (Activation) (None, 256, 189, 252 0 add_5[0][0] __________________________________________________________________________________________________ block_3b_conv_1 (Conv2D) (None, 256, 189, 252 590080 block_3a_relu[0][0] __________________________________________________________________________________________________ block_3b_bn_1 (BatchNormalizati (None, 256, 189, 252 1024 block_3b_conv_1[0][0] __________________________________________________________________________________________________ block_3b_relu_1 (Activation) (None, 256, 189, 252 0 block_3b_bn_1[0][0] __________________________________________________________________________________________________ block_3b_conv_2 (Conv2D) (None, 256, 189, 252 590080 block_3b_relu_1[0][0] __________________________________________________________________________________________________ block_3b_bn_2 (BatchNormalizati (None, 256, 189, 252 1024 block_3b_conv_2[0][0] __________________________________________________________________________________________________ add_6 (Add) (None, 256, 189, 252 0 block_3b_bn_2[0][0] block_3a_relu[0][0] __________________________________________________________________________________________________ block_3b_relu (Activation) (None, 256, 189, 252 0 add_6[0][0] 
__________________________________________________________________________________________________ block_4a_conv_1 (Conv2D) (None, 512, 189, 252 1180160 block_3b_relu[0][0] __________________________________________________________________________________________________ block_4a_bn_1 (BatchNormalizati (None, 512, 189, 252 2048 block_4a_conv_1[0][0] __________________________________________________________________________________________________ block_4a_relu_1 (Activation) (None, 512, 189, 252 0 block_4a_bn_1[0][0] __________________________________________________________________________________________________ block_4a_conv_2 (Conv2D) (None, 512, 189, 252 2359808 block_4a_relu_1[0][0] __________________________________________________________________________________________________ block_4a_conv_shortcut (Conv2D) (None, 512, 189, 252 131584 block_3b_relu[0][0] __________________________________________________________________________________________________ block_4a_bn_2 (BatchNormalizati (None, 512, 189, 252 2048 block_4a_conv_2[0][0] __________________________________________________________________________________________________ block_4a_bn_shortcut (BatchNorm (None, 512, 189, 252 2048 block_4a_conv_shortcut[0][0] __________________________________________________________________________________________________ add_7 (Add) (None, 512, 189, 252 0 block_4a_bn_2[0][0] block_4a_bn_shortcut[0][0] __________________________________________________________________________________________________ block_4a_relu (Activation) (None, 512, 189, 252 0 add_7[0][0] __________________________________________________________________________________________________ block_4b_conv_1 (Conv2D) (None, 512, 189, 252 2359808 block_4a_relu[0][0] __________________________________________________________________________________________________ block_4b_bn_1 (BatchNormalizati (None, 512, 189, 252 2048 block_4b_conv_1[0][0] 
__________________________________________________________________________________________________ block_4b_relu_1 (Activation) (None, 512, 189, 252 0 block_4b_bn_1[0][0] __________________________________________________________________________________________________ block_4b_conv_2 (Conv2D) (None, 512, 189, 252 2359808 block_4b_relu_1[0][0] __________________________________________________________________________________________________ block_4b_bn_2 (BatchNormalizati (None, 512, 189, 252 2048 block_4b_conv_2[0][0] __________________________________________________________________________________________________ add_8 (Add) (None, 512, 189, 252 0 block_4b_bn_2[0][0] block_4a_relu[0][0] __________________________________________________________________________________________________ block_4b_relu (Activation) (None, 512, 189, 252 0 add_8[0][0] __________________________________________________________________________________________________ output_bbox (Conv2D) (None, 4, 189, 252) 2052 block_4b_relu[0][0] __________________________________________________________________________________________________ output_cov (Conv2D) (None, 1, 189, 252) 513 block_4b_relu[0][0] ================================================================================================== Total params: 11,197,893 Trainable params: 11,188,165 Non-trainable params: 9,728 __________________________________________________________________________________________________ 2022-02-16 07:07:44,442 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False 2022-02-16 07:07:44,442 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False 2022-02-16 07:07:44,443 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0) 2022-02-16 07:07:44,443 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 48, io threads: 96, compute threads: 
48, buffered batches: 4 2022-02-16 07:07:44,443 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 327, number of sources: 1, batch size per gpu: 4, steps: 82 WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead. 2022-02-16 07:07:44,499 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead. WARNING:tensorflow:Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b72518>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b72518>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:44,559 [WARNING] tensorflow: Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b72518>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. 
Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b72518>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:44,584 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates. 2022-02-16 07:07:44,892 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 1 2022-02-16 07:07:44,900 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights: 2022-02-16 07:07:44,900 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000 WARNING:tensorflow:Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f07707ec860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f07707ec860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. 
Original error: could not get source code 2022-02-16 07:07:44,918 [WARNING] tensorflow: Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f07707ec860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f07707ec860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:45,357 [INFO] __main__: Found 327 samples in training set WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/rasterizers/bbox_rasterizer.py:347: The name tf.bincount is deprecated. Please use tf.math.bincount instead. 2022-02-16 07:07:45,482 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/rasterizers/bbox_rasterizer.py:347: The name tf.bincount is deprecated. Please use tf.math.bincount instead. 
WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:89: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead. 2022-02-16 07:07:45,608 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:89: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:36: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. 2022-02-16 07:07:45,626 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:36: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_functions.py:17: The name tf.log is deprecated. Please use tf.math.log instead. 
2022-02-16 07:07:45,803 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_functions.py:17: The name tf.log is deprecated. Please use tf.math.log instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:235: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead. 2022-02-16 07:07:45,814 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:235: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py:591: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead. 2022-02-16 07:07:45,818 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py:591: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead. 
2022-02-16 07:07:47,480 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False 2022-02-16 07:07:47,481 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False 2022-02-16 07:07:47,481 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0) 2022-02-16 07:07:47,481 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 48, io threads: 96, compute threads: 48, buffered batches: 4 2022-02-16 07:07:47,481 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 53, number of sources: 1, batch size per gpu: 4, steps: 14 WARNING:tensorflow:Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b724a8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b724a8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:47,498 [WARNING] tensorflow: Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b724a8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. 
When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f06e6b724a8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:47,539 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates. 2022-02-16 07:07:47,992 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1 2022-02-16 07:07:47,998 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights: 2022-02-16 07:07:47,998 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000 WARNING:tensorflow:Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f0754308278>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f0754308278>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. 
If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:48,016 [WARNING] tensorflow: Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f0754308278>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f0754308278>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code 2022-02-16 07:07:48,289 [INFO] __main__: Found 53 samples in validation set WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/validation_hook.py:40: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead. 2022-02-16 07:07:48,900 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/validation_hook.py:40: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead. 
2022-02-16 07:07:50,232 [INFO] __main__: Checkpoint interval: 10 WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:109: The name tf.train.Scaffold is deprecated. Please use tf.compat.v1.train.Scaffold instead. 2022-02-16 07:07:50,233 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:109: The name tf.train.Scaffold is deprecated. Please use tf.compat.v1.train.Scaffold instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:14: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead. 2022-02-16 07:07:50,233 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:14: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:15: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead. 
2022-02-16 07:07:50,234 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:15: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:16: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead. 2022-02-16 07:07:50,235 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:16: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:59: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead. 2022-02-16 07:07:50,238 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:59: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:60: The name tf.train.StopAtStepHook is deprecated. 
Please use tf.estimator.StopAtStepHook instead. 2022-02-16 07:07:50,238 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:60: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:73: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead. 2022-02-16 07:07:50,238 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:73: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead. INFO:tensorflow:Create CheckpointSaverHook. 2022-02-16 07:07:50,238 [INFO] tensorflow: Create CheckpointSaverHook. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:99: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead. 2022-02-16 07:07:50,238 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:99: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead. 
WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py:140: The name tf.train.SingularMonitoredSession is deprecated. Please use tf.compat.v1.train.SingularMonitoredSession instead.

2022-02-16 07:07:52,780 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py:140: The name tf.train.SingularMonitoredSession is deprecated. Please use tf.compat.v1.train.SingularMonitoredSession instead.

INFO:tensorflow:Graph was finalized.
2022-02-16 07:07:53,764 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpe9rr80sp/model.ckpt-0
2022-02-16 07:07:54,389 [INFO] tensorflow: Restoring parameters from /tmp/tmpe9rr80sp/model.ckpt-0
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key cost_sums/city-bbox not found in checkpoint
     [[{{node save/RestoreV2}}]]
  (1) Not found: Key cost_sums/city-bbox not found in checkpoint
     [[{{node save/RestoreV2}}]]
     [[save/RestoreV2/_637]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1290, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key cost_sums/city-bbox not found in checkpoint
     [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Not found: Key cost_sums/city-bbox not found in checkpoint
     [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
     [[save/RestoreV2/_637]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save/RestoreV2': File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 708, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 644, in train_gridbox File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 153, in run_training_loop File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py", line 143, in get_singular_monitored_session File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__ stop_grace_period_secs=stop_grace_period_secs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__ 
self._sess = self._coordinated_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 878, in create_session self.tf_sess = self._session_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 638, in create_session self._scaffold.finalize() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 229, in finalize self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 599, in _get_saver_or_default saver = Saver(sharded=True, allow_empty=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 828, in __init__ self.build() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build build_restore=build_restore) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 502, in _build_internal restore_sequentially, reshape) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 381, in _AddShardedRestoreOps name="restore_shard")) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps restore_sequentially) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore return io_ops.restore_v2(filename_tensor, names, slices, dtypes) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2 name=name) File 
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op attrs, op_def, compute_device) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__ self._traceback = tf_stack.extract_stack() During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1300, in restore names_to_keys = object_graph_key_mapping(save_path) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1618, in object_graph_key_mapping object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 915, in get_tensor return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str)) tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 849, in <module> File 
"/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 708, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 644, in train_gridbox File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 153, in run_training_loop File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py", line 143, in get_singular_monitored_session File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__ stop_grace_period_secs=stop_grace_period_secs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__ self._sess = 
self._coordinated_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 878, in create_session self.tf_sess = self._session_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 647, in create_session init_fn=self._scaffold.init_fn) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/session_manager.py", line 290, in prepare_session config=config) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/session_manager.py", line 204, in _restore_checkpoint saver.restore(sess, checkpoint_filename_with_path) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1306, in restore err, "a Variable name or other graph key that is missing") tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error: 2 root error(s) found. (0) Not found: Key cost_sums/city-bbox not found in checkpoint [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]] (1) Not found: Key cost_sums/city-bbox not found in checkpoint [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]] [[save/RestoreV2/_637]] 0 successful operations. 0 derived errors ignored. 
Original stack trace for 'save/RestoreV2': File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 708, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 644, in train_gridbox File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 153, in run_training_loop File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py", line 143, in get_singular_monitored_session File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__ stop_grace_period_secs=stop_grace_period_secs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__ 
self._sess = self._coordinated_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 878, in create_session self.tf_sess = self._session_creator.create_session() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 638, in create_session self._scaffold.finalize() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 229, in finalize self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 599, in _get_saver_or_default saver = Saver(sharded=True, allow_empty=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 828, in __init__ self.build() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build build_restore=build_restore) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 502, in _build_internal restore_sequentially, reshape) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 381, in _AddShardedRestoreOps name="restore_shard")) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps restore_sequentially) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore return io_ops.restore_v2(filename_tensor, names, slices, dtypes) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2 name=name) File 
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op attrs, op_def, compute_device) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__ self._traceback = tf_stack.extract_stack() ERROR:tensorflow:================================== Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>): <tf.Tensor 'IsVariableInitialized_411:0' shape=() dtype=bool> If you want to mark it as used call its "mark_used()" method. It was originally created here: File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py", line 143, in get_singular_monitored_session File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__ stop_grace_period_secs=stop_grace_period_secs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__ self._sess = self._coordinated_creator.create_session() File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/hooks/hooks.py", line 285, in begin File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/tf_should_use.py", line 198, in wrapped return _add_should_use_warning(fn(*args, **kwargs)) 
================================== 2022-02-16 07:07:55,622 [ERROR] tensorflow: ================================== Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>): <tf.Tensor 'IsVariableInitialized_411:0' shape=() dtype=bool> If you want to mark it as used call its "mark_used()" method. It was originally created here: File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py", line 143, in get_singular_monitored_session File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__ stop_grace_period_secs=stop_grace_period_secs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__ self._sess = self._coordinated_creator.create_session() File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/hooks/hooks.py", line 285, in begin File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/tf_should_use.py", line 198, in wrapped return _add_should_use_warning(fn(*args, **kwargs)) ================================== 2022-02-16 15:07:57,849 [INFO] tlt.components.docker_handler.docker_handler: Stopping container. 

Can you try again with a new result folder?
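The "Key cost_sums/city-bbox not found in checkpoint" error above is consistent with training trying to restore a checkpoint left in the result folder by an earlier run that used different class names, which is presumably why a new folder is suggested. A minimal sketch of picking an unused results directory before training; the helper name fresh_results_dir is hypothetical and not part of TAO:

```python
import os


def fresh_results_dir(base_dir):
    """Return a results directory that holds no files from an earlier run.

    Hypothetical helper (not a TAO API): base_dir is reused when it is
    missing or empty; otherwise base_dir_1, base_dir_2, ... is tried.
    """
    base_dir = base_dir.rstrip(os.sep)
    candidate = base_dir
    suffix = 0
    # Skip every directory that already contains files (e.g. old checkpoints).
    while os.path.isdir(candidate) and os.listdir(candidate):
        suffix += 1
        candidate = "{}_{}".format(base_dir, suffix)
    os.makedirs(candidate, exist_ok=True)
    return candidate
```

The returned path would then be passed to the -r argument of the train command, as seen from inside the container's mount mapping.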

With a new result folder, I ran !tao detectnet_v2 train again.

It now prints:

INFO:tensorflow:Graph was finalized. 2022-02-16 07:28:43,412 [INFO] tensorflow: Graph was finalized. INFO:tensorflow:Running local_init_op. 2022-02-16 07:28:45,823 [INFO] tensorflow: Running local_init_op. INFO:tensorflow:Done running local_init_op. 2022-02-16 07:28:46,707 [INFO] tensorflow: Done running local_init_op. INFO:tensorflow:Saving checkpoints for step-0. 2022-02-16 07:28:55,297 [INFO] tensorflow: Saving checkpoints for step-0. Traceback (most recent call last): File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call return fn(*args) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn target_list, run_metadata) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,512,189,252] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[{{node gradients/resnet18_nopool_bn_detectnet_v2/output_cov/convolution_grad/Conv2DBackpropInput}}]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. 
During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 708, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 644, in train_gridbox File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 155, in run_training_loop File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run run_metadata=run_metadata) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run raise six.reraise(*original_exc_info) File "/usr/local/lib/python3.6/dist-packages/six.py", line 696, in reraise raise value File 
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run return self._sess.run(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run run_metadata=run_metadata) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run return self._sess.run(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run run_metadata_ptr) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run run_metadata) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,512,189,252] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[node gradients/resnet18_nopool_bn_detectnet_v2/output_cov/convolution_grad/Conv2DBackpropInput (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. 
Original stack trace for 'gradients/resnet18_nopool_bn_detectnet_v2/output_cov/convolution_grad/Conv2DBackpropInput': File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 708, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 619, in train_gridbox File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 474, in build_training_graph File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py", line 602, in build_training_graph File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/train_op_generator.py", line 
59, in get_train_op File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/train_op_generator.py", line 74, in _get_train_op_without_cost_scaling File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py", line 419, in minimize grad_loss=grad_loss) File "/usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py", line 253, in compute_gradients gradients = self._optimizer.compute_gradients(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py", line 537, in compute_gradients colocate_gradients_with_ops=colocate_gradients_with_ops) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_impl.py", line 158, in gradients unconnected_gradients) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 703, in _GradientsHelper lambda: grad_fn(op, *out_grads)) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 362, in _MaybeCompile return grad_fn() # Exit early File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 703, in <lambda> lambda: grad_fn(op, *out_grads)) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_grad.py", line 596, in _Conv2DGrad data_format=data_format), File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 1407, in conv2d_backprop_input name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func return func(*args, **kwargs) File 
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op attrs, op_def, compute_device) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__ self._traceback = tf_stack.extract_stack() ...which was originally created as op 'resnet18_nopool_bn_detectnet_v2/output_cov/convolution', defined at: File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> [elided 5 identical lines from previous traceback] File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 474, in build_training_graph File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py", line 576, in build_training_graph File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 457, in __call__ output = self.call(inputs, **kwargs) File "/usr/local/lib/python3.6/dist-packages/keras/engine/network.py", line 564, in call output_tensors, _, _ = self.run_internal_graph(inputs, masks) File "/usr/local/lib/python3.6/dist-packages/keras/engine/network.py", line 721, in run_internal_graph layer.call(computed_tensor, **kwargs)) File "/usr/local/lib/python3.6/dist-packages/keras/layers/convolutional.py", line 171, in call dilation_rate=self.dilation_rate) File "/opt/nvidia/third_party/keras/tensorflow_backend.py", line 113, in conv2d data_format=tf_data_format, File 
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 921, in convolution name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py", line 1032, in convolution_internal name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 1071, in conv2d data_format=data_format, dilations=dilations, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op attrs, op_def, compute_device) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__ self._traceback = tf_stack.extract_stack() During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 841, in <module> AttributeError: module 'logging' has no attribute 'getLoggger' 2022-02-16 15:32:29,408 [INFO] tlt.components.docker_handler.docker_handler: Stopping container. 

I am using just one V100. Should I add more GPUs?

Can you share your training spec?

This is detectnet_v2_train_resnet18_kitti.txt:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "city"
    value: "city"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 4032
    output_image_height: 3024
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "city"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "city"
    value: 0.699999988079
  }
  evaluation_box_config {
    key: "city"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "city"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "city"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

So, all of your training images are 4032x3024, right?
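The question about image size matters because the tensor named in the OOM message, shape[4,512,189,252], lines up with the spec: 3024 / 16 = 189, 4032 / 16 = 252, and 4 is batch_size_per_gpu. A rough sketch of the arithmetic follows; the stride of 16 and the 512 channels are read off the error message and are assumptions about the network, not documented values, and the 1088x1440 size is only an illustrative smaller input:

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; 4 bytes per element for float32."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def feature_map_shape(batch, channels, height, width, stride=16):
    """Shape of a feature map downsampled by `stride` (assumed, see above)."""
    return (batch, channels, height // stride, width // stride)


# The tensor from the OOM message: batch 4, 3024x4032 input.
big = feature_map_shape(4, 512, 3024, 4032)
print(big, tensor_bytes(big) / 2**20, "MiB")

# The same layer for a hypothetical 1440x1088 (height x width) input.
small = feature_map_shape(4, 512, 1440, 1088)
print(small, tensor_bytes(small) / 2**20, "MiB")
```

This is a single activation of about 372 MiB, and training keeps many such tensors alive at once. Since batch_size_per_gpu is a per-GPU setting, adding GPUs would not by itself shrink this tensor; lowering the input resolution or the batch size would.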

I changed detectnet_v2_train_resnet18_kitti.txt to this:

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "city"
    value: "city"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 1080
    output_image_height: 1440
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "city"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 0.0500000007451
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "city"
    value: 0.699999988079
  }
  evaluation_box_config {
    key: "city"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "city"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: true
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "city"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}

and resized all the images to (1440x1080). Now training stops with this:

2022-02-16 17:41:05,156 [INFO] root: Registry: ['nvcr.io'] 2022-02-16 17:41:05,370 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3 2022-02-16 17:41:06,014 [WARNING] tlt.components.docker_handler.docker_handler: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal. Using TensorFlow backend. WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them. Using TensorFlow backend. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:43: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead. 2022-02-16 09:41:14,946 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:43: The name tf.train.SessionRunHook is deprecated. Please use tf.estimator.SessionRunHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead. 
2022-02-16 09:41:15,071 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead. 2022-02-16 09:41:15,073 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead. WARNING:tensorflow:From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead. 2022-02-16 09:41:15,073 [WARNING] tensorflow: From /opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:69: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead. 
2022-02-16 09:41:15,082 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead. 2022-02-16 09:41:15,082 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead. 2022-02-16 09:41:16,572 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt. 2022-02-16 09:41:16,574 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt 2022-02-16 09:41:16,579 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Spec file validation failed. Experiment Spec Setting Error: output_image_width should % 16. Wrong value: 1080 2022-02-16 09:41:16,579 [INFO] __main__: Training was interrupted. Time taken to run __main__:main: 0:00:01.505742. 2022-02-16 17:41:19,300 [INFO] tlt.components.docker_handler.docker_handler: Stopping container. 

Please see DetectNet_v2 - NVIDIA Docs

If all of your images are 4032x3024, you can train a 1008x752 model with the settings below.

output_image_width: 1008
output_image_height: 752
enable_auto_resize: True
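Note that 1008 and 752 are both multiples of 16, which is what the "should % 16" validation message in the log above asks for. As a rough sketch, the check below tests a dimension against that rule and suggests the nearest valid values. The helper name snap_to_multiple is made up for illustration and is not part of TAO. It also assumes the rule applies to the height as well as the width, since the log only names output_image_width.

```python
from typing import Tuple


def snap_to_multiple(value: int, base: int = 16) -> Tuple[int, int]:
    """Return the nearest multiples of `base` at or below and at or above `value`."""
    lower = (value // base) * base
    upper = lower if lower == value else lower + base
    return lower, upper


# Hypothetical check of the sizes discussed in this thread.
for name, value in (("output_image_width", 1440), ("output_image_height", 1080)):
    lower, upper = snap_to_multiple(value)
    if value % 16 == 0:
        print(f"{name}={value}: ok")
    else:
        print(f"{name}={value}: not a multiple of 16; try {lower} or {upper}")
```

With these inputs, 1440 passes and 1080 does not (the neighbouring multiples are 1072 and 1088), so a 1440x1080 output size may still trip the same validation.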

So I changed detectnet_v2_train_resnet18_kitti.txt to this:

random_seed: 42 dataset_config { data_sources { tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*" image_directory_path: "/workspace/tao-experiments/data/training" } image_extension: "jpg" target_class_mapping { key: "city" value: "city" } validation_fold: 0 } augmentation_config { preprocessing { output_image_width: 1440 output_image_height: 1080 enable_auto _resize: True min_bbox_width: 1.0 min_bbox_height: 1.0 output_image_channel: 3 } spatial_augmentation { hflip_probability: 0.5 zoom_min: 1.0 zoom_max: 1.0 translate_max_x: 8.0 translate_max_y: 8.0 } color_augmentation { hue_rotation_max: 25.0 saturation_shift_max: 0.20000000298 contrast_scale_max: 0.10000000149 contrast_center: 0.5 } } postprocessing_config { target_class_config { key: "city" value { clustering_config { clustering_algorithm: DBSCAN dbscan_confidence_threshold: 0.9 coverage_threshold: 0.00499999988824 dbscan_eps: 0.20000000298 dbscan_min_samples: 0.0500000007451 minimum_bounding_box_height: 20 } } } } model_config { pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5" num_layers: 18 use_batch_norm: true objective_set { bbox { scale: 35.0 offset: 0.5 } cov { } } arch: "resnet" } evaluation_config { validation_period_during_training: 10 first_validation_epoch: 30 minimum_detection_ground_truth_overlap { key: "city" value: 0.699999988079 } evaluation_box_config { key: "city" value { minimum_height: 20 maximum_height: 9999 minimum_width: 10 maximum_width: 9999 } } average_precision_mode: INTEGRATE } cost_function_config { target_classes { name: "city" class_weight: 1.0 coverage_foreground_weight: 0.0500000007451 objectives { name: "cov" initial_weight: 1.0 weight_target: 1.0 } objectives { name: "bbox" initial_weight: 10.0 weight_target: 10.0 } } enable_autoweighting: true max_objective_weight: 0.999899983406 min_objective_weight: 9.99999974738e-05 } training_config { batch_size_per_gpu: 4 
num_epochs: 120 learning_rate { soft_start_annealing_schedule { min_learning_rate: 5e-06 max_learning_rate: 5e-04 soft_start: 0.10000000149 annealing: 0.699999988079 } } regularizer { type: L1 weight: 3.00000002618e-09 } optimizer { adam { epsilon: 9.99999993923e-09 beta1: 0.899999976158 beta2: 0.999000012875 } } cost_scaling { initial_exponent: 20.0 increment: 0.005 decrement: 1.0 } checkpoint_interval: 10 } bbox_rasterizer_config { target_class_config { key: "city" value { cov_center_x: 0.5 cov_center_y: 0.5 cov_radius_x: 0.40000000596 cov_radius_y: 0.40000000596 bbox_min_radius: 1.0 } } deadzone_radius: 0.400000154972 } 

And the error becomes this:

2022-02-17 02:03:31,533 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt. 2022-02-17 02:03:31,538 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt Traceback (most recent call last): File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 849, in <module> File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 838, in <module> File "<decorator-gen-2>", line 2, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 827, in main File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py", line 678, in run_experiment File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 124, in load_experiment_spec File 
"/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 101, in load_proto File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 87, in _load_from_file File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 725, in Merge allow_unknown_field=allow_unknown_field) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 793, in MergeLines return parser.MergeLines(lines, message) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 818, in MergeLines self._ParseOrMerge(lines, message) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 837, in _ParseOrMerge self._MergeField(tokenizer, message) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 967, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 1042, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 934, in _MergeField (message_descriptor.full_name, name)) google.protobuf.text_format.ParseError: 18:5 : Message type "AugmentationConfig.Preprocessing" has no field named "enable_auto". 2022-02-17 10:03:34,054 [INFO] tlt.components.docker_handler.docker_handler: Stopping container. 

So I can't enable auto resize? All my images are 1440x1080.

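Sorry, it should be

enable_auto_resize: True

There was a typo: the field name must not contain a space. Your spec has "enable_auto _resize", so the parser reads "enable_auto" as the field name and fails.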
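A stray space like this is easy to miss in a long spec file. As a minimal sketch (not a TAO feature), the heuristic below scans spec text for an identifier followed by whitespace and a fragment starting with an underscore, just before a colon. The function name find_split_fields and the regular expression are assumptions chosen for illustration; it will not catch every kind of typo.

```python
import re

# Heuristic: identifier, whitespace, then a "_fragment" right before a colon,
# e.g. "enable_auto _resize:".
SPLIT_FIELD = re.compile(r"\b([A-Za-z]\w*)[ \t]+(_\w+)\s*:")


def find_split_fields(spec_text):
    """Return (head, tail) pairs that look like one field name split by a space."""
    return [(m.group(1), m.group(2)) for m in SPLIT_FIELD.finditer(spec_text)]


# Excerpt of the preprocessing block from the spec posted above.
spec = "output_image_height: 1080 enable_auto _resize: True min_bbox_width: 1.0"
for head, tail in find_split_fields(spec):
    print(f"suspicious field name '{head} {tail}', did you mean '{head}{tail}'?")
```

Running a check like this over the spec file before tao detectnet_v2 train would flag the split name without waiting for the container to start and the protobuf parser to fail.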