@eqy commented Jul 26, 2024

In the spirit of warming up for JIT compilation, add a warmup iteration in case the very last batch has a different size that may unwittingly trigger recompilation
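As a rough illustration of the idea (not the code in this PR), the sketch below works out which batch sizes a full pass will produce, so that each one can be warmed up once before timing starts. The helper names `distinct_batch_sizes` and `warm_up` are hypothetical, and `run_batch` stands in for whatever would build a dummy input of the given size and run the model on it; the sample count in the demo is just an example figure.

```python
from typing import Callable, List


def distinct_batch_sizes(num_samples: int, batch_size: int) -> List[int]:
    """Return the distinct batch sizes one full pass over the data produces."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    sizes = []
    if num_samples >= batch_size:
        sizes.append(batch_size)  # at least one full-size batch
    remainder = num_samples % batch_size
    if remainder:
        sizes.append(remainder)  # uneven trailing batch
    return sizes


def warm_up(run_batch: Callable[[int], None], sizes: List[int]) -> None:
    """Call the (hypothetical) run_batch once per distinct batch size."""
    for size in sizes:
        run_batch(size)


if __name__ == "__main__":
    seen = []
    # seen.append is a stand-in for a real forward pass on a dummy batch.
    warm_up(seen.append, distinct_batch_sizes(50_000, 256))
    print(seen)  # [256, 80]
```

Warming up both sizes means any shape-specific compilation for the trailing batch happens before measurement rather than during it.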

@eqy (Author) commented Aug 29, 2024

This also has the effect of including autotuning time for the last batch if it is uneven, since torch.backends.cudnn.benchmark = True is set.

@rwightman (Collaborator)

@eqy there's a small issue here that takes a bit to explain, which is why this sat...

The validation script should work in streaming mode without a defined length. I believe it used to work, but I actually made it a bit too strict (I need to fix that). So the partial batch check must handle the situation where the dataset length isn't defined.

Comment out lines 307/308 in reader_wds.py:

    #if not self.num_samples:
    #    raise RuntimeError(f'Invalid split definition, num_samples not specified.')

..and then below should work:

python validate.py --data-dir 'pipe:curl -s -H "Authorization: Bearer $HFT" -f -L https://huggingface.co/datasets/timm/imagenet-1k-wds/resolve/main/' --dataset wds/ --split 'imagenet1k-validation-{00..10}.tar'
Validating in float32. AMP not enabled.
Loading pretrained weights from Hugging Face hub (timm/dpn92.mx_in1k)
[timm/dpn92.mx_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
Model dpn92 created, param count: 37668392
Data processing configuration for current model + dataset:
	input_size: (3, 224, 224)
	interpolation: bicubic
	mean: (0.48627450980392156, 0.4588235294117647, 0.40784313725490196)
	std: (0.23482446870963955, 0.23482446870963955, 0.23482446870963955)
	crop_pct: 0.875
	crop_mode: center
Test: [ 0/0] Time: 2.018s (2.018s, 126.88/s) Loss: 0.9466 (0.9466) Acc@1: 77.734 ( 77.734) Acc@5: 94.141 ( 94.141)
Test: [ 10/0] Time: 0.329s (0.507s, 504.63/s) Loss: 0.7221 (0.8023) Acc@1: 80.469 ( 79.936) Acc@5: 97.266 ( 95.241)
Test: [ 20/0] Time: 0.330s (0.438s, 584.59/s) Loss: 0.7316 (0.8055) Acc@1: 82.812 ( 80.283) Acc@5: 95.703 ( 94.754)
Test: [ 30/0] Time: 0.328s (0.409s, 626.20/s) Loss: 0.6582 (0.7941) Acc@1: 83.203 ( 80.262) Acc@5: 95.703 ( 95.030)
 * Acc@1 80.079 (19.921) Acc@5 95.031 (4.969)
--result
{
    "model": "dpn92",
    "top1": 80.0791,
    "top1_err": 19.9209,
    "top5": 95.0314,
    "top5_err": 4.9686,
    "param_count": 37.67,
    "img_size": 224,
    "crop_pct": 0.875,
    "interpolation": "bicubic"
}
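One way the partial batch check could tolerate a dataset with no defined length is sketched below. This is only an assumption about the shape of such a guard, not the change made in this PR: `trailing_batch_size` is a hypothetical helper, and it relies on the fact that `len()` raises `TypeError` for objects that do not define a length, as a streaming source typically would not.

```python
from typing import Optional


def trailing_batch_size(dataset, batch_size: int) -> Optional[int]:
    """Return the size of an uneven last batch, or None if there is none
    or the dataset length is undefined (e.g. a streaming source)."""
    try:
        num_samples = len(dataset)
    except TypeError:
        # No defined length: skip the partial-batch warmup rather than fail.
        return None
    remainder = num_samples % batch_size
    return remainder or None


if __name__ == "__main__":
    print(trailing_batch_size(range(10), 4))        # 2
    print(trailing_batch_size(range(8), 4))         # None (evenly divisible)
    print(trailing_batch_size(iter(range(10)), 4))  # None (no length)
```

Returning None for both "no remainder" and "unknown length" keeps the caller simple: it warms up an extra batch size only when one is known to exist.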