m8ta
{1527}
ref: -0 tags: inductive logic programming deepmind formal propositions prolog date: 11-21-2020 04:07 gmt revision:0 [head]

Learning Explanatory Rules from Noisy Data

  • From a dense background of inductive logic programming (ILP): given a set of statements ('background knowledge' plus positive and negative examples) and rules for transformation and substitution, generate clauses that are consistent with the background knowledge and entail the examples.
  • Programs like Metagol can do this using the search and logic-simplification machinery built into Prolog.
    • Actually kinda surprising how very dense this program is -- only 330 lines!
  • This task can be transformed into a SAT problem via rules of logic, for which there are many fast solvers.
  • The trick here (instead) is that a neural network -- a set of learned weights over candidate clauses -- is used to turn clauses 'on' or 'off' so that the selected clauses fit the background knowledge and examples.
    • The BK is typically very small, a few examples, consistent with the small size of the learned networks.
  • These weight matrices are represented as the outer product of composed or combined clauses, which makes the weight matrix very large!
  • They then do gradient descent, passing the cross-entropy errors through the nonlinearities (including the clauses themselves? I think this is how recursion is handled) to update the weights.
    • Hence, SGD is used as a means of heuristic search; a minimal sketch of this idea follows the list below.
  • Compare this to Metagol, which is brittle to any noise in the input; unsurprisingly, due to SGD, this is much more robust.
  • Way too many words and symbols in this paper for what it seems to be doing. Just seems to be obfuscating the work (which is perfectly good). Again: Metagol is only 330 lines!
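A minimal sketch of the differentiable clause-selection idea described above (my own toy construction, not the paper's ∂ILP implementation; the random candidate clauses, sizes, and names are made-up illustrations): candidate clauses are enumerated ahead of time, a softmax weighting over them is learned, forward chaining is made fuzzy so the whole pipeline is differentiable, and gradient descent on a cross-entropy loss performs the heuristic search.

```python
# Toy differentiable clause selection (assumption: clauses are pre-enumerated 0/1 matrices).
import numpy as np

rng = np.random.default_rng(0)

n_facts = 8              # size of the ground-atom valuation vector
n_clauses = 5            # candidate clauses enumerated ahead of time, as in ILP
# each candidate clause derives new facts from current ones: derived = clause @ facts
clauses = rng.integers(0, 2, size=(n_clauses, n_facts, n_facts)).astype(float)

target = rng.integers(0, 2, size=n_facts).astype(float)   # desired valuation (toy "examples")
x0 = np.zeros(n_facts); x0[0] = 1.0                        # background knowledge: one known fact

w = np.zeros(n_clauses)                                     # clause logits (the learned weights)

def softmax(z):
    e = np.exp(z - z.max()); return e / e.sum()

def forward(w, steps=3):
    p = softmax(w)                                          # soft 'on/off' over candidate clauses
    x = x0.copy()
    for _ in range(steps):                                  # fuzzy forward chaining
        derived = np.clip(np.einsum('c,cij,j->i', p, clauses, x), 0, 1)
        x = x + derived - x * derived                       # probabilistic OR with prior facts
    return x

def loss(w):
    x = forward(w)
    eps = 1e-6
    return -np.mean(target * np.log(x + eps) + (1 - target) * np.log(1 - x + eps))

# finite-difference gradient descent, standing in for autograd: SGD as heuristic search
for it in range(200):
    g = np.zeros_like(w)
    for i in range(n_clauses):
        dw = np.zeros_like(w); dw[i] = 1e-4
        g[i] = (loss(w + dw) - loss(w - dw)) / 2e-4
    w -= 0.5 * g

print("clause weights:", softmax(w).round(2), " loss:", round(loss(w), 3))
```

Because the loss is smooth in the clause weights, noisy or mislabeled target facts only shift the optimum rather than breaking the search outright, which is the robustness advantage over Metagol noted above.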

{1441}
ref: -2018 tags: biologically inspired deep learning feedback alignment direct difference target propagation date: 03-15-2019 05:51 gmt revision:5 [4] [3] [2] [1] [0] [head]

Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

  • Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap
  • As is known, many algorithms work well on MNIST, but fail on more complicated tasks, like CIFAR and ImageNet.
  • In their experiments, backprop still fares better than any of the biologically inspired / biologically plausible learning rules. This includes:
    • Feedback alignment {1432} {1423}
    • Vanilla target propagation
      • Problem: with convergent networks, layer inverses (top-down) will map all items of the same class to one target vector in each layer, which is very limiting.
      • Hence this algorithm was not directly investigated.
    • Difference target propagation (2015)
      • Uses the per-layer target $\hat{h}_l = g(\hat{h}_{l+1}; \lambda_{l+1}) + [h_l - g(h_{l+1}; \lambda_{l+1})]$
      • Or: $\hat{h}_l = h_l + g(\hat{h}_{l+1}; \lambda_{l+1}) - g(h_{l+1}; \lambda_{l+1})$, where $\lambda_l$ are the parameters of the inverse model and $g()$ is a weighted sum followed by a nonlinearity.
      • That is, the target is modified, a la the delta rule, by the difference between the inverse-propagated higher-layer target and the inverse-propagated higher-layer activity (see the toy sketch after this list).
        • Why? $h_l$ should approach $\hat{h}_l$ as $h_{l+1}$ approaches $\hat{h}_{l+1}$.
        • Otherwise, the parameters in lower layers continue to be updated even when low loss is reached in the upper layers. (from original paper).
      • The penultimate-to-last layer weights are trained via backprop, to prevent the template impoverishment noted above.
    • Simplified difference target propagation
      • They substitute a biologically plausible learning rule for the penultimate layer:
      • $\hat{h}_{L-1} = h_{L-1} + g(\hat{h}_L; \lambda_L) - g(h_L; \lambda_L)$, where there are $L$ layers.
      • It's the same rule as the other layers.
      • Hence it is subject to the same impoverishment problem with low-entropy labels.
    • Auxiliary output simplified difference target propagation
      • Add a vector $z$ to the last layer activation, which carries information about the input vector.
      • $z$ is just a set of random features from the activation $h_{L-1}$.
  • Used both fully connected and locally-connected (i.e. convolution without weight sharing) MLPs.
  • It's not so great:
  • Target propagation seems like a weak learner, worse than feedback alignment; not only is the feedback limited, but it does not take advantage of the statistics of the input.
    • Hence, some of these schemes may work better when combined with unsupervised learning rules.
    • Still, in the original paper they use difference target propagation with autoencoders, and get reasonable stroke features.
  • Their general result that networks and learning rules need to be tested on more difficult tasks rings true, and might well be the main point of this otherwise meh paper.
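A toy sketch of the difference target propagation target computation summarized above (my reading of the update rule, not the authors' code; the layer sizes, the form of the inverse model, and the local squared-error loss are illustrative assumptions, and training of the inverse weights is omitted):

```python
# Difference target propagation, toy version: compute per-layer targets top-down with the
# difference correction, then make purely local weight updates toward those targets.
import numpy as np

rng = np.random.default_rng(1)
sizes = [10, 8, 6, 4]                                                 # layer widths, input -> output
W = [rng.normal(0, 0.5, (sizes[i + 1], sizes[i])) for i in range(3)]  # forward weights
V = [rng.normal(0, 0.5, (sizes[i], sizes[i + 1])) for i in range(3)]  # inverse (feedback) weights

f = np.tanh
def g(h_above, l):                              # inverse model g(.; lambda_l): weighted sum + nonlinearity
    return f(V[l] @ h_above)

x = rng.normal(size=sizes[0])
y_target = rng.normal(size=sizes[-1])

# forward pass, keeping activations h_0 .. h_L
h = [x]
for l in range(3):
    h.append(f(W[l] @ h[l]))

# top-layer target: nudge the output toward the task target
h_hat = [None] * 4
h_hat[3] = h[3] - 0.1 * (h[3] - y_target)

# difference target propagation: hhat_l = h_l + g(hhat_{l+1}) - g(h_{l+1})
for l in range(2, 0, -1):
    h_hat[l] = h[l] + g(h_hat[l + 1], l) - g(h[l + 1], l)

# each layer then does a purely local update pushing h_l toward hhat_l
lr = 0.05
for l in range(1, 4):
    pre = W[l - 1] @ h[l - 1]
    err = (h[l] - h_hat[l]) * (1 - np.tanh(pre) ** 2)   # grad of 0.5*||h_l - hhat_l||^2 w.r.t. pre
    W[l - 1] -= lr * np.outer(err, h[l - 1])
```

The correction term $h_l - g(h_{l+1})$ is what keeps the lower-layer targets from collapsing onto a single per-class vector when the inverse model is imperfect, matching the "why?" point above.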

{1445}
ref: -2018 tags: cortex layer martinotti interneuron somatostatin S1 V1 morphology cell type morphological recovery patch seq date: 03-06-2019 02:51 gmt revision:3 [2] [1] [0] [head]

Neocortical layer 4 in adult mouse differs in major cell types and circuit organization between primary sensory areas

  • Using whole-cell recordings with morphological recovery, we identified one major excitatory and seven inhibitory types of neurons in L4 of adult mouse visual cortex (V1).
  • Nearly all excitatory neurons were pyramidal and almost all Somatostatin-positive (SOM+) neurons were Martinotti cells.
  • In contrast, in somatosensory cortex (S1), excitatory cells were mostly stellate and SOM+ cells were non-Martinotti.
  • These morphologically distinct SOM+ interneurons correspond to different transcriptomic cell types and are differentially integrated into the local circuit with only S1 cells receiving local excitatory input.
  • Our results challenge the classical view of a canonical microcircuit repeated through the neocortex.
  • Instead we propose that cell-type specific circuit motifs, such as the Martinotti/pyramidal pair, are optionally used across the cortex as building blocks to assemble cortical circuits.
  • Note preponderance of axons.
  • Classifications:
    • Pyr: pyramidal cells
    • BC: basket cells
    • MC: Martinotti cells
    • BPC: bipolar cells
    • NFC: neurogliaform cells
    • SC: shrub cells
    • DBC: double bouquet cells
    • HEC: horizontally elongated cells
  • Cell types were linked to transcriptomic types using Patch-seq.