Discovering hidden factors of variation in deep networks
- Well, they are not really that deep ...
- Use an autoencoder (the paper's model is a plain autoencoder rather than a VAE) to encode both a supervised signal (class labels) and unsupervised latents.
- Penalize a combination of the reconstruction MSE, the classification loss (cross-entropy on the predicted class labels), and a special cross-covariance term that decorrelates the supervised and unsupervised latent vectors.
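- Cross-covariance penalty: half the sum of squared entries of the batch cross-covariance matrix between the class outputs $\hat{y}$ and the unsupervised latents $z$, i.e. $C = \frac{1}{2} \sum_{ij} \left[ \frac{1}{N} \sum_{n} (\hat{y}_i^n - \bar{y}_i)(z_j^n - \bar{z}_j) \right]^2$, where $N$ is the batch size and bars denote batch means.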
- Tested on:
  - MNIST -- discovered style / rotation of the characters.
  - Toronto Face Database -- seven expressions, many individuals; extracted something like eigen-emotions.
  - Multi-PIE -- many faces, many viewpoints; the unsupervised latents could vary camera pose and illumination.
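
The cross-covariance penalty in these notes can be sketched in a few lines. This is a minimal, dependency-free illustration, assuming the penalty is half the sum of squared entries of the batch cross-covariance matrix between the class outputs and the unsupervised latents. The function name `xcov_penalty` and the list-of-rows input layout are hypothetical choices for this sketch, not taken from the paper's code.

```python
def xcov_penalty(y_hat, z):
    """Half the sum of squared entries of the batch cross-covariance of y_hat and z.

    y_hat: list of N rows, each holding K class outputs (assumed layout).
    z:     list of N rows, each holding D unsupervised latents (assumed layout).
    """
    n = len(y_hat)
    if n == 0 or n != len(z):
        raise ValueError("y_hat and z must be non-empty batches of equal size")
    k, d = len(y_hat[0]), len(z[0])

    # Per-dimension means over the batch.
    y_mean = [sum(row[i] for row in y_hat) / n for i in range(k)]
    z_mean = [sum(row[j] for row in z) / n for j in range(d)]

    total = 0.0
    for i in range(k):
        for j in range(d):
            # Entry (i, j) of the cross-covariance matrix, averaged over the batch.
            c_ij = sum(
                (y_hat[s][i] - y_mean[i]) * (z[s][j] - z_mean[j])
                for s in range(n)
            ) / n
            total += c_ij ** 2
    return 0.5 * total


# A latent that copies the class output is correlated with it: positive penalty.
copied = xcov_penalty([[0.0], [1.0]], [[0.0], [1.0]])

# A latent that varies independently of the class output in this batch: zero penalty.
independent = xcov_penalty(
    [[0.0], [1.0], [0.0], [1.0]],
    [[0.0], [0.0], [1.0], [1.0]],
)
print(copied, independent)
```

In training, a term like this would be added to the reconstruction and classification losses with its own weight, and in practice it would be written with an autodiff library's tensor operations so that gradients flow back into the encoder.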