A 100% Swift version is now available: https://github.com/otmb/TopDownPoseEstimation
TopDown Pose Estimation on iOS
- BBox: Yolov7-tiny
- Pose Estimation: ViTPose
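
In a top-down pipeline, each person box from the detector is cropped and passed to the pose model. The sketch below shows one common way to prepare that crop: grow the box to the pose model's aspect ratio while keeping its center. The 192:256 (width:height) ratio is inferred from the `vitpose-b256x192` model file name. The function name `fit_box_to_aspect` and the 1.25 padding factor are assumptions for illustration, not taken from this repository.

```python
def fit_box_to_aspect(x, y, w, h, aspect=192 / 256, padding=1.25):
    """Grow a detector box (x, y, w, h) to a target width/height ratio.

    The box center is kept fixed. `aspect` is width / height of the pose
    model input (assumed 192x256). `padding` is an assumed margin factor
    that leaves some context around the person.
    """
    cx, cy = x + w / 2, y + h / 2
    if w > aspect * h:
        # Box is too wide for the target ratio: grow the height.
        h = w / aspect
    else:
        # Box is too tall (or already matches): grow the width.
        w = aspect * h
    w, h = w * padding, h * padding
    # Return the new top-left corner and size.
    return cx - w / 2, cy - h / 2, w, h
```

The returned box can extend past the image border, so a real pipeline would still need to pad or clamp before cropping and resizing to the model input.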

```
$ git clone https://github.com/mbotsu/TopDownPoseExample.git
$ cd TopDownPoseExample/TopDownPoseExample
$ curl -OL https://github.com/mbotsu/KeypointDecoder/releases/download/0.0.1/vitpose-b256x192_fp16.mlmodel
$ curl -OL https://github.com/mbotsu/KeypointDecoder/releases/download/0.0.1/yolov7-tiny_fp16.mlmodel
```

- ViTPose to CoreML
- Yolov7 to CoreML

```
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.529
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.679
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.614
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.479
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.614
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.593
Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.702
Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.665
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.528
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.684
```

| Model | AP | AP50 | AP75 | AP(M) | AP(L) |
|---|---|---|---|---|---|
| ViTPose-b + Yolov7-tiny | 52.9 | 67.9 | 61.4 | 47.9 | 61.4 |
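
The table row is the AP part of the evaluation summary, expressed as percentages. As a small illustration of that mapping, the sketch below parses COCO-style summary lines and formats a row. The helper names `parse_summary` and `table_row` are hypothetical, and the regular expression assumes the line layout shown in this README.

```python
import re

# AP lines copied from the evaluation summary in this README.
SUMMARY = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.529
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.679
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.614
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.479
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.614
"""

# Captures metric (AP/AR), IoU setting, area range, and the score.
LINE = re.compile(
    r"\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\|\s*area=\s*(\w+)\s*\|"
    r"\s*maxDets=\s*\d+\s*\]\s*=\s*([\d.]+)"
)

def parse_summary(text):
    """Map (metric, iou, area) to the score as a percentage."""
    return {
        (metric, iou, area): round(float(value) * 100, 1)
        for metric, iou, area, value in LINE.findall(text)
    }

def table_row(label, scores):
    """Format the AP, AP50, AP75, AP(M), AP(L) columns as a markdown row."""
    keys = [
        ("AP", "0.50:0.95", "all"),
        ("AP", "0.50", "all"),
        ("AP", "0.75", "all"),
        ("AP", "0.50:0.95", "medium"),
        ("AP", "0.50:0.95", "large"),
    ]
    cells = [f"{scores[k]:.1f}" for k in keys]
    return "| " + " | ".join([label] + cells) + " |"
```

Calling `table_row("model", parse_summary(SUMMARY))` yields the five percentage columns in the same order as the table header.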
