m8ta

{1511}  
 
{1423}  
PMID27824044 Random synaptic feedback weights support error backpropagation for deep learning.
Our proof says that weights W_0 and W evolve to equilibrium manifolds, but simulations (Fig. 4) and analytic results (Supplementary Proof 2) hint at something more specific: that when the weights begin near 0, feedback alignment encourages W to act like a local pseudoinverse of B around the error manifold. This fact is important because if B were exactly W⁺ (the Moore-Penrose pseudoinverse of W), then the network would be performing Gauss-Newton optimization (Supplementary Proof 3). We call this update rule for the hidden units pseudobackprop and denote it by Δh_PBP = W⁺e. Experiments with the linear network show that the angle Δh_FA ∠ Δh_PBP quickly becomes smaller than Δh_FA ∠ Δh_BP (Fig. 4b, c; see Methods). In other words feedback alignment, despite its simplicity, displays elements of second-order learning.
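To make the three hidden-unit updates in the excerpt concrete, here is a minimal NumPy sketch. It only computes Δh_BP = Wᵀe, Δh_FA = Be and Δh_PBP = W⁺e for random matrices and reports the angles between them; it does not train a network or reproduce the paper's Fig. 4. The function names, layer sizes and random seed are my own assumptions, not anything from the paper.

```python
import numpy as np

def angle_deg(a, b):
    """Angle in degrees between two vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def hidden_updates(W, B, e):
    """Return (dh_bp, dh_fa, dh_pbp) for output error e.

    W: (n_out, n_hidden) forward weights.
    B: (n_hidden, n_out) fixed feedback weights.
    """
    dh_bp = W.T @ e                  # backprop: transpose of the forward weights
    dh_fa = B @ e                    # feedback alignment: fixed random matrix B
    dh_pbp = np.linalg.pinv(W) @ e   # pseudobackprop: Moore-Penrose pseudoinverse of W
    return dh_bp, dh_fa, dh_pbp

# Hypothetical sizes and seed, chosen only for illustration.
rng = np.random.default_rng(0)
n_out, n_hidden = 3, 8
W = rng.standard_normal((n_out, n_hidden))
B = rng.standard_normal((n_hidden, n_out))
e = rng.standard_normal(n_out)

dh_bp, dh_fa, dh_pbp = hidden_updates(W, B, e)
print("angle(FA, BP)  =", angle_deg(dh_fa, dh_bp))
print("angle(FA, PBP) =", angle_deg(dh_fa, dh_pbp))
```

With an untrained random W and B the two angles carry no particular meaning. The paper's claim concerns how they change during learning, which would need a training loop on top of this.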
{806}  
I've recently tried to determine the bitrate conveyed by one Gaussian random process about another in terms of the signal-to-noise ratio between the two. Assume $x$ is the known signal to be predicted, and $y$ is the prediction. Let's define $SNR(y) = \frac{Var(x)}{Var(err)}$ where $err = x - y$. Note this is a ratio of powers; for the conventional SNR, $SNR_{dB} = 10 \log_{10} \frac{Var(x)}{Var(err)}$. $Var(err)$ is also known as the mean-squared error (MSE).
Now, $Var(err) = \frac{1}{N} \sum (x - y - \bar{err})^2 = Var(x) + Var(y) - 2 Cov(x,y)$; assume $x$ and $y$ have unit variance (or scale them so that they do). Then $SNR(y)^{-1} = Var(err) = 2 - 2 Cov(x,y)$, so $\frac{2 - SNR(y)^{-1}}{2} = Cov(x,y)$
We need the covariance because the mutual information between two jointly Gaussian zero-mean variables can be defined in terms of their covariance matrix (see http://www.springerlink.com/content/v026617150753x6q/ ). Here $Q$ is the covariance matrix,
$Q = \left[ \array{Var(x) & Cov(x,y) \\ Cov(x,y) & Var(y)} \right]$
$MI = \frac{1}{2} \log_2 \frac{Var(x) Var(y)}{det(Q)}$
$det(Q) = 1 - Cov(x,y)^2$
Then $MI = -\frac{1}{2} \log_2 \left[ 1 - Cov(x,y)^2 \right]$
or $MI = -\frac{1}{2} \log_2 \left[ SNR(y)^{-1} - \frac{1}{4} SNR(y)^{-2} \right]$
This agrees with intuition. If we have an SNR of 10 dB, or 10 (power ratio), then we would expect to be able to break a random variable into about 10 different categories or bins (recall stdev is the sqrt of the variance), with the probability of the variable being in the estimated bin to be 1/2. (This, at least in my mind, is where the 1/2 constant comes from: if there is Gaussian noise, you won't be able to determine exactly which bin the random variable is in, hence $\log_2$ is an overestimator.) Here is a table with the respective values, including the amplitude (not power) ratio representations of SNR.
Now, to get the bitrate, you take the SNR, calculate the mutual information, and multiply it by the bandwidth (not the sampling rate in a discrete time system) of the signals. In our particular application, I think the bandwidth is between 1 and 2 Hz, hence we're getting 1.6-3.2 bits/second/axis, hence 3.2-6.4 bits/second for our normal 2D tasks. If you read this blog regularly, you'll notice that others have achieved 4 bits/sec with one neuron and 6.5 bits/sec with dozens {271}.
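The SNR to mutual information to bitrate chain above can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the derivation (jointly Gaussian, zero-mean, unit-variance $x$ and $y$; SNR given as a power ratio); the function names and the list of example SNR values are mine.

```python
import math

def snr_to_covariance(snr):
    """Cov(x, y) for unit-variance x and y, given SNR = Var(x)/Var(err) as a power ratio."""
    return (2.0 - 1.0 / snr) / 2.0

def mutual_information_bits(snr):
    """MI in bits per sample: -1/2 * log2(1 - Cov(x,y)^2)."""
    if snr <= 0.25:
        # With unit variances Var(err) is at most 4, so SNR cannot drop to 1/4 or below
        # without |Cov| reaching 1 and the log diverging.
        raise ValueError("SNR must be greater than 0.25 for unit-variance signals")
    cov = snr_to_covariance(snr)
    return -0.5 * math.log2(1.0 - cov ** 2)

def bitrate(snr, bandwidth_hz):
    """Bits per second: MI per sample times the signal bandwidth in Hz."""
    return mutual_information_bits(snr) * bandwidth_hz

# Example SNR values in dB (arbitrary choices for illustration).
for snr_db in (0, 3, 10, 20, 30):
    power_ratio = 10 ** (snr_db / 10)
    amplitude_ratio = math.sqrt(power_ratio)
    mi = mutual_information_bits(power_ratio)
    print(f"{snr_db:>3} dB  power={power_ratio:8.2f}  amplitude={amplitude_ratio:6.2f}  MI={mi:.3f} bits")
```

For example, `bitrate(10, 2)` takes an SNR of 10 (power ratio) and a 2 Hz bandwidth and returns bits per second for one axis.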
{984}  
IEEE6114258 (pdf) Towards a Brain–Machine–Brain Interface: Virtual Active Touch Using Randomly Patterned Intracortical Microstimulation.
____References____ O'Doherty, J. and Lebedev, M. and Li, Z. and Nicolelis, M. Towards a Brain–Machine–Brain Interface: Virtual Active Touch Using Randomly Patterned Intracortical Microstimulation. Neural Systems and Rehabilitation Engineering, IEEE Transactions on PP 99 1 (2011)