m8ta
You are not authenticated, login.
text: sort by
tags: modified
type: chronology
{1564}
hide / / print
ref: -2008 tags: t-SNE dimensionality reduction embedding Hinton date: 01-25-2022 20:39 gmt revision:2 [1] [0] [head]

“Visualizing data using t-SNE”

  • Laurens van der Maaten, Geoffrey Hinton.
  • SNE: stochastic neighbor embedding, Hinton 2002.
  • Idea: model the data conditional pairwise distribution as a gaussian, with one variance per data point, p(x i|x j) p(x_i | x_j)
  • in the mapped data, this pairwise distribution is modeled as a fixed-variance gaussian, too, q(y i|y j) q(y_i | y_j)
  • Goal is to minimize the Kullback-Leibler divergence Σ iKL(p i||q i) \Sigma_i KL(p_i || q_i) (summed over all data points)
  • Per-data point variance is found via binary search to match a user-specified perplexity. This amounts to setting a number of nearest neighbors, somewhere between 5 and 50 work ok.
  • Cost function is minimized via gradient descent, starting with a random distribution of points yi, with plenty of momentum to speed up convergence, and noise to effect simulated annealing.
  • Cost function is remarkably simple to reduce, gradient update: δCδy i=2Σ j(p j|iq ji+p i|jq i|j)(y iy j) \frac{\delta C}{\delta y_i} = 2 \Sigma_j(p_{j|i} - q_{j-i} + p_{i|j} - q_{i|j})(y_i - y_j)
  • t-SNE differs from SNE (above) in that it addresses difficulty in optimizing the cost function, and crowding.
    • Uses a simplified symmetric cost function (symmetric conditional probability, rather than joint probability) with simpler gradients
    • Uses the student’s t-distribution in the low-dimensional map q to reduce crowding problem.
  • The crowding problem is roughly resultant from the fact that, in high-dimensional spaces, the volume of the local neighborhood scales as r m r^m , whereas in 2D, it’s just r 2 r^2 . Hence there is cost-incentive to pushing all the points together in the map -- points are volumetrically closer together in high dimensions than they can be in 2D.
    • This can be alleviated by using a one-DOF student distribution, which is the same as a Cauchy distribution, to model the probabilities in map space.
  • Smart -- they plot the topology of the gradients to gain insight into modeling / convergence behavior.
  • Don’t need simulated annealing due to balanced attractive and repulsive effects (see figure).
  • Enhance the algorithm further by keeping it compact at the beginning, so that clusters can move through each other.
  • Look up: d-bits parity task by Bengio 2007

{1450}
hide / / print
ref: -2015 tags: conjugate light electron tomography mouse visual cortex fluorescent label UNC cryoembedding date: 03-11-2019 19:37 gmt revision:1 [0] [head]

PMID-25855189 Mapping Synapses by Conjugate Light-Electron Array Tomography

  • Use aligned interleaved immunofluorescence imaging follwed by array EM (FESEM). 70nm thick sections.
  • Of IHC, tissue must be dehydrated & embedded in a resin.
  • However, the dehydration disrupts cell membranes and ultrastructural details viewed via EM ...
  • Hence, EM microscopy uses osmium tetroxide to cross-link the lipids.
  • ... Yet that also disrupt / refolds the poteins, making IHC fail.
  • Solution is to dehydrate & embed at cryo temp, -70C, where the lipids do not dissolve. They used Lowicryl HM-20.
  • We show that cryoembedding provides markedly improved ultrastructure while still permitting multiplexed immunohistochemistry.