 m8ta
{1517} ref: -2015 tags: spiking neural networks causality inference demixing date: 07-22-2020 18:13 gmt revision:1 [head]

Rubén Moreno-Bote & Jan Drugowitsch

Use linear non-negative mixing plus noise to generate a series of sensory stimuli. Pass these through a one-layer spiking or non-spiking neural network with adaptive global inhibition and adaptive reset voltage to solve this quadratic programming problem with non-negative constraints.

N causes, one observation: $\mu = \sum_{i=1}^{N} u_i r_i + \epsilon$, $r_i \geq 0$ -- causes can be present or not present, but not negative. Cause coefficients are drawn from a truncated (positive-only) Gaussian.

Linear spiking network with symmetric weight matrix $J = -U^TU - \beta I$ (see figure above). That is ... J looks like a correlation matrix! $U$ is M x N; its columns are the mixing vectors $u_i$. U is known beforehand and not learned. That said, as a quasi-correlation matrix, it might not be so hard to learn. See ref .

Can solve this problem by minimizing the negative log-posterior function: $$L(\mu, r) = \frac{1}{2}(\mu - Ur)^T(\mu - Ur) + \alpha 1^T r + \frac{\beta}{2} r^T r$$ That is, want to maximize the joint probability of the causes and observations given the probabilistic model $p(\mu, r) \propto \exp(-L(\mu, r)) \prod_{i=1}^{N} H(r_i)$, where the step function $H(r_i)$ enforces $r_i \geq 0$. The first term quadratically penalizes the difference between prediction and measurement; the second term, with $\alpha$, is an L1 regularizer; the third, with $\beta$, is an L2 regularizer.

The negative log-posterior is then converted to an energy function (linear algebra): with $W = -U^T U$ and $h = U^T \mu$, expanding $L$ and dropping the constant $\frac{1}{2}\mu^T\mu$ gives $E(r) = -0.5 r^T W r - r^T h + \alpha 1^T r + 0.5 \beta r^T r$. This is where they get the weight matrix J or W. W is always negative semidefinite; if the columns of U are linearly independent, it is negative definite. The dynamics of individual neurons w/ global inhibition and variable reset voltage serve to minimize this energy -- hence, solve the problem. (They gloss over this derivation in the main text).
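To make the optimization target concrete, here is a minimal non-spiking sketch that minimizes the objective above under $r \geq 0$ with plain projected gradient descent. This is a generic solver swapped in for illustration, not the network dynamics from the paper; the function name `solve_nnqp`, the synthetic data, and the values of $\alpha$ and $\beta$ are all assumptions.

```python
import numpy as np

def solve_nnqp(U, mu, alpha, beta, n_iter=5000):
    """Minimize 0.5*|mu - U r|^2 + alpha*sum(r) + 0.5*beta*|r|^2 subject to r >= 0.

    Hypothetical helper: projected gradient descent, not the paper's method.
    """
    G = U.T @ U                       # equals -W in the notes' notation
    h = U.T @ mu                      # feedforward term h = U^T mu
    # Step size 1/L, where L bounds the curvature of the quadratic objective.
    step = 1.0 / (np.linalg.eigvalsh(G).max() + beta)
    r = np.zeros(U.shape[1])
    for _ in range(n_iter):
        grad = G @ r - h + alpha + beta * r    # gradient of the objective in r
        r = np.maximum(0.0, r - step * grad)   # project onto the constraint r >= 0
    return r

# Synthetic example (assumed values): 4 candidate causes, 5-dimensional observation.
rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, size=(5, 4))         # non-negative mixing vectors as columns
r_true = np.array([1.0, 0.0, 0.5, 0.0])        # two causes present, two absent
mu = U @ r_true + 0.01 * rng.standard_normal(5)
alpha, beta = 0.1, 0.5

r_hat = solve_nnqp(U, mu, alpha, beta)
grad = U.T @ (U @ r_hat - mu) + alpha + beta * r_hat
print("estimate:", np.round(r_hat, 3))
print("gradient at estimate:", np.round(grad, 6))
```

At the solution the gradient should vanish for causes with $r_i > 0$ and be non-negative for causes pinned at zero (the KKT conditions of the non-negative quadratic program). Because of the $\alpha$ and $\beta$ penalties, the estimate is shrunk relative to `r_true` rather than equal to it.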
Next, show that a spike-based network can similarly 'relax' or descend the objective gradient to arrive at the quadratic programming solution. Network is N leaky integrate-and-fire neurons, with variable synaptic integration kernels. $\alpha$ then translates to global inhibition, and $\beta$ to a lowered reset voltage. Yes, it can solve the problem .. and do so in the presence of firing noise in a finite period of time .. but a little bit meh, because the problem is not that hard, and there is no learning in the network.
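A toy sketch of how spikes can relax toward the same solution, under heavy simplifying assumptions that are mine and not the paper's: no leak, no synaptic kernels, no firing noise, constant rather than adaptive global inhibition, and a threshold set to half the self-reset. Here $\alpha$ enters as a constant inhibitory drive to every neuron, and $\beta$ deepens the voltage drop after each spike. The function name `simulate_spiking_qp` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_spiking_qp(U, mu, alpha, beta, T=1000.0, dt=0.02):
    """Non-leaky integrate-and-fire sketch (hypothetical); returns spike-count rates."""
    N = U.shape[1]
    h = U.T @ mu                       # feedforward drive, h = U^T mu
    J = -U.T @ U - beta * np.eye(N)    # lateral weights; the diagonal acts as the reset
    theta = -0.5 * np.diag(J)          # threshold = half the self-reset (assumption)
    V = np.zeros(N)                    # membrane voltages
    counts = np.zeros(N)               # spike counts per neuron
    steps = int(round(T / dt))
    for _ in range(steps):
        V += dt * (h - alpha)          # integrate drive minus global inhibition alpha
        while True:                    # resolve spikes one at a time within the step
            i = int(np.argmax(V - theta))
            if V[i] <= theta[i]:
                break
            counts[i] += 1
            V += J[:, i]               # inhibit the others; beta lowers neuron i's reset
    return counts / (steps * dt)

# Same kind of synthetic data as a non-spiking solver would use (assumed values).
rng = np.random.default_rng(0)
U = rng.uniform(0.0, 1.0, size=(5, 4))         # non-negative mixing vectors as columns
r_true = np.array([1.0, 0.0, 0.5, 0.0])
mu = U @ r_true + 0.01 * rng.standard_normal(5)
alpha, beta = 0.1, 0.5

rates = simulate_spiking_qp(U, mu, alpha, beta)
grad = U.T @ (U @ rates - mu) + alpha + beta * rates
print("spike-count rates:", np.round(rates, 3))
print("objective gradient at rates:", np.round(grad, 4))
```

In this simplification each voltage equals minus the elapsed time multiplied by the objective gradient evaluated at the running spike-count rates, so keeping voltages below threshold keeps the gradient from going more negative than roughly threshold / T. The KKT conditions are therefore met approximately, with an error that shrinks as the simulation runs longer.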