Markerless tracking of user-defined features with deep learning
- Human-level tracking with as few as 200 labeled frames.
- No dynamics model: each frame is processed independently, so tracking could be even better with a Kalman filter.
- Uses a Google-trained deep convolutional network (ResNet), 50 or 101 layers deep.
- The network has a distinct readout layer per feature, which gives the probability that the body part is at each pixel location.
- Uses the DeeperCut network architecture / algorithm for pose estimation.
- These deep features were pretrained on ImageNet.
- Trained two ways: updating only the readout layers (rest fixed at the pretrained ResNet weights), and end-to-end; the latter performs better, unsurprisingly.
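
The readout and Kalman-filter notes above can be sketched roughly as follows. This is a minimal illustration, not the method from the paper: `peak_location` and `kalman_filter_1d` are hypothetical helper names, the per-feature probability map is assumed to be a plain 2-D list, and the filter assumes a constant-velocity model applied to each coordinate independently, with made-up noise values `q` and `r`.

```python
def peak_location(prob_map):
    """Return (row, col, probability) of the most likely pixel in one 2-D map."""
    best_r, best_c, best_p = 0, 0, prob_map[0][0]
    for r, row in enumerate(prob_map):
        for c, p in enumerate(row):
            if p > best_p:
                best_r, best_c, best_p = r, c, p
    return best_r, best_c, best_p


def kalman_filter_1d(measurements, q=0.01, r=1.0, dt=1.0):
    """Causal constant-velocity Kalman filter over one coordinate.

    q (process noise) and r (measurement noise) are illustrative assumptions.
    State is [position, velocity]; only position is measured.
    """
    x, v = float(measurements[0]), 0.0
    # Initial state covariance (2x2), stored as four scalars.
    p00, p01, p10, p11 = r, 0.0, 0.0, 1.0
    filtered = [x]
    for z in measurements[1:]:
        # Predict: x <- x + dt * v, P <- F P F^T + q * I
        x = x + dt * v
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        # Update with the measured position z (H = [1, 0]).
        s = n00 + r
        k0, k1 = n00 / s, n10 / s
        y = z - x
        x += k0 * y
        v += k1 * y
        p00 = (1 - k0) * n00
        p01 = (1 - k0) * n01
        p10 = n10 - k1 * n00
        p11 = n11 - k1 * n01
        filtered.append(x)
    return filtered


# Hypothetical per-frame probability maps for a single feature.
frames = [
    [[0.1, 0.2, 0.0], [0.7, 0.05, 0.0]],
    [[0.1, 0.6, 0.0], [0.2, 0.05, 0.0]],
    [[0.0, 0.1, 0.8], [0.0, 0.05, 0.0]],
]
peaks = [peak_location(m) for m in frames]
rows = kalman_filter_1d([p[0] for p in peaks])
cols = kalman_filter_1d([p[1] for p in peaks])
```

Filtering each coordinate separately keeps the sketch short. A joint 2-D state, or weighting each measurement by its peak probability, would be natural extensions.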