PMID 26621426: Causal Inference and Explaining Away in a Spiking Network
 Rubén Moreno-Bote & Jan Drugowitsch
 Use linear nonnegative mixing plus noise to generate a series of sensory stimuli.
 Pass these through a one-layer spiking or non-spiking neural network with adaptive global inhibition and an adaptive reset voltage to solve the resulting quadratic programming problem with nonnegativity constraints.

 N causes, one observation: $\mu = \sum_{i=1}^{N} u_i r_i + \epsilon$ ,
 $r_i \geq 0$  causes can be present or not present, but not negative.
 Cause coefficients are drawn from a truncated (positive-only) Gaussian.
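The generative model above can be sketched in a few lines of NumPy; the dimensions and noise level here are arbitrary choices for illustration, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 10, 5  # observation dimension, number of causes (arbitrary)

# Mixing matrix U (M x N); its columns are the mixing vectors u_i.
U = np.abs(rng.normal(size=(M, N)))

# Cause coefficients r_i >= 0, drawn from a truncated (positive-only) Gaussian.
r_true = np.abs(rng.normal(size=N))

# Observation: linear nonnegative mixing plus Gaussian noise.
sigma = 0.05
mu = U @ r_true + sigma * rng.normal(size=M)
```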
 Linear spiking network with symmetric weight matrix $J = U^T U - \beta I$ (see figure above).
 That is ... J looks like a correlation matrix!
 $U$ is M x N; columns are the mixing vectors.
 U is known beforehand and not learned
 That said, as a quasi-correlation matrix, it might not be so hard to learn. See ref [44].
 Can solve this problem by minimizing the negative log-posterior function: $$ L(\mu, r) = \frac{1}{2}(\mu - Ur)^T(\mu - Ur) + \alpha 1^T r + \frac{\beta}{2}r^T r $$
 That is, we want to maximize the joint probability of the observations and causes under the probabilistic model $p(\mu, r) \propto \exp(-L(\mu, r)) \prod_{i=1}^{N} H(r_i)$, where $H$ is the Heaviside step function enforcing $r_i \geq 0$.
 The first term quadratically penalizes the difference between prediction and measurement; the second term ($\alpha$) is an L1 regularizer, and the third ($\beta$) an L2 regularizer.
 The negative log-posterior is then converted to an energy function (linear algebra): $W = U^T U$, $h = U^T \mu$, then $E(r) = 0.5 r^T W r - r^T h + \alpha 1^T r + 0.5 \beta r^T r$
 This is where they get the weight matrix J (equivalently W). If the columns of U are linearly independent, then $W = U^T U$ is positive definite, so the energy is strictly convex and has a unique minimum.
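As a non-spiking baseline, projected gradient descent on $E(r)$ solves the same QP. This is a generic stand-in for the network dynamics, not the paper's derivation; the step size and regularizer values are my arbitrary choices:

```python
import numpy as np

def energy(r, W, h, alpha, beta):
    """E(r) = 0.5 r^T W r - r^T h + alpha 1^T r + 0.5 beta r^T r."""
    return 0.5 * r @ W @ r - r @ h + alpha * r.sum() + 0.5 * beta * r @ r

def solve_qp(U, mu, alpha=0.1, beta=0.1, eta=0.01, steps=5000):
    """Minimize E(r) subject to r >= 0 by projected gradient descent."""
    W, h = U.T @ U, U.T @ mu
    r = np.zeros(U.shape[1])
    for _ in range(steps):
        grad = W @ r - h + alpha + beta * r   # dE/dr
        r = np.maximum(0.0, r - eta * grad)   # project onto the constraint r >= 0
    return r
```

Since $E$ is convex, this converges for any step size $\eta$ below $2 / \lambda_{max}(W + \beta I)$.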
 The dynamics of individual neurons with global inhibition and a variable reset voltage serve to minimize this energy, and hence solve the problem. (They gloss over this derivation in the main text.)
 Next, they show that a spike-based network can similarly 'relax', or descend the objective's gradient, to arrive at the quadratic programming solution.
 The network is N leaky integrate-and-fire neurons, with variable synaptic integration kernels.
 $\alpha$ then translates to global inhibition, and $\beta$ to a lowered reset voltage.
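A toy discretization of such a network might look like the following. This is my sketch, not the paper's exact dynamics: the threshold, time constants, and the scale factor relating filtered spike counts to $r$ are all glossed over, and $\beta$ is implemented as extra self-inhibition rather than a literal reset-voltage change:

```python
import numpy as np

def spiking_net(U, mu, alpha=0.1, beta=0.1, dt=1e-3, T=5.0, tau=0.1):
    """N leaky integrate-and-fire neurons whose recurrent weights -W implement
    explaining away; alpha acts as global inhibition, beta as self-inhibition
    standing in for the lowered reset voltage."""
    W, h = U.T @ U, U.T @ mu
    N = U.shape[1]
    v = np.zeros(N)      # membrane potentials
    s = np.zeros(N)      # leaky spike counts (~ rates r, up to a scale factor)
    theta = 1.0          # spike threshold (arbitrary)
    for _ in range(int(T / dt)):
        # Drive is -dE/dr evaluated at the current rate estimate s, minus a leak.
        v += dt * (h - W @ s - alpha - beta * s - v / tau)
        spiked = v >= theta
        v[spiked] = 0.0  # reset to zero; beta mimics lowering this further
        s = s * (1.0 - dt / tau) + spiked  # leaky integration of the spike train
    return s
```

At a fixed point of the rate estimate, the drive to each active neuron balances the gradient terms, which is how the full treatment in the paper connects spiking dynamics to the QP solution; this toy version only illustrates the mechanism.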

 Yes, it can solve the problem, and do so in the presence of firing noise in a finite period of time, but it's a little bit meh: the problem is not that hard, and there is no learning in the network.
