Dimensionality reduction by learning an invariant mapping
 Raia Hadsell, Sumit Chopra, Yann LeCun
 Central idea: learn an invariant mapping of the input by minimizing the mapped distance (i.e. the distance between the network's outputs) when the samples are labeled as similar (e.g. the same digit in MNIST), and maximizing the mapped distance when the samples are labeled as dissimilar.
 Two loss terms, one for similar pairs and one for dissimilar pairs.
 This is an attraction-repulsion spring analogy: similar pairs are pulled together, dissimilar pairs are pushed apart.
 Use gradient descent to change the weights to satisfy these two competing losses.
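 The two competing terms can be sketched as a single contrastive loss. This is a minimal illustration, not the paper's exact code; the margin value and the convention that y=0 marks a similar pair and y=1 a dissimilar pair are assumptions chosen for the sketch:

```python
def contrastive_loss(d, y, margin=1.0):
    """Sketch of a contrastive loss over one pair of samples.

    d: Euclidean distance between the two mapped outputs.
    y: 0 if the pair is similar, 1 if dissimilar (assumed convention).
    margin: radius beyond which dissimilar pairs stop contributing
            (hypothetical default).
    """
    # Attraction term: similar pairs are penalized for being far apart.
    attract = 0.5 * d ** 2
    # Repulsion term: dissimilar pairs are penalized only while
    # closer than the margin, like a spring pushing them apart.
    repel = 0.5 * max(0.0, margin - d) ** 2
    return (1 - y) * attract + y * repel
```

 Gradient descent on this loss over many pairs pulls same-class samples together and pushes different-class samples out to the margin.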
 Resulting convolutional neural nets can extract camera pose information from the NORB dataset.

 Surprising how simple analogies like this, when iterated across a great many samples, pull out intuitively correct invariances.
