Learning Explanatory Rules from Noisy Data
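 Builds on a dense background of inductive logic programming (ILP): given background knowledge (a set of statements), positive and negative examples, and rules for transformation and substitution, generate clauses that, together with the background knowledge, entail the positive examples and none of the negative ones.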
 Programs like Metagol can do this using the search and simplification logic built into Prolog.
 Actually kinda surprising how very dense this program is: only 330 lines!
 Via rules of logic, this task can be transformed into a SAT problem, for which there are many fast solvers.
 The trick here (instead) is that a neural network is used to turn clauses that fit the background knowledge 'on' or 'off'.
 The background knowledge (BK) is typically very small, just a few examples, consistent with the small size of the learned networks.
 The weight matrices are laid out over the outer product of the clauses being composed or combined (one entry per combination), which makes them very large!
 They then do gradient descent, passing the cross-entropy errors back through the nonlinearities (including the clauses themselves? I think this is how recursion is handled) to update the weights.
 Hence, SGD is used as a means of heuristic search.
 Compare this to Metagol, which is brittle to any noise in the input; unsurprisingly, the SGD-based approach is much more robust.
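 To make the "SGD as heuristic search" idea concrete for myself, here is a minimal sketch, not the paper's actual architecture: one softmax weight per candidate clause (rather than per pair of clauses), a single deduction step (no recursion or forward chaining), and plain gradient descent on cross-entropy. The family facts, the three candidate clauses, the labels, and the names `CANDIDATES`, `EXAMPLES`, and `train` are all made up for illustration. One label is deliberately wrong to mimic noise.

```python
import math

# Hypothetical toy problem: recover target(X, Y) from parent(X, Y) facts.
BACKGROUND = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}
PEOPLE = ["ann", "bob", "cat", "dan"]

# Candidate clause bodies for target(X, Y); each returns True or False.
CANDIDATES = {
    "parent": lambda x, y: (x, y) in BACKGROUND,
    "child": lambda x, y: (y, x) in BACKGROUND,
    "grandparent": lambda x, y: any(
        (x, z) in BACKGROUND and (z, y) in BACKGROUND for z in PEOPLE
    ),
}
NAMES = list(CANDIDATES)

# Labelled examples of target/2. The intended concept is grandparent;
# the third label is deliberately wrong to simulate noise.
EXAMPLES = [
    (("ann", "cat"), 1), (("bob", "dan"), 1),
    (("ann", "bob"), 1),  # noisy label
    (("bob", "cat"), 0), (("cat", "dan"), 0),
    (("bob", "ann"), 0), (("cat", "bob"), 0), (("ann", "dan"), 0),
]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=500, lr=0.1, eps=1e-6):
    """Gradient descent on the clause logits; returns clause -> weight."""
    logits = [0.0] * len(NAMES)
    for _ in range(steps):
        s = softmax(logits)
        grad = [0.0] * len(NAMES)
        for (x, y), label in EXAMPLES:
            fires = [float(CANDIDATES[n](x, y)) for n in NAMES]
            p = sum(si * fi for si, fi in zip(s, fires))  # soft truth value
            pc = min(max(p, eps), 1.0 - eps)  # clamp to avoid dividing by 0
            dloss_dp = -label / pc + (1 - label) / (1.0 - pc)  # cross-entropy
            for j in range(len(NAMES)):
                # d p / d logit_j for a softmax-weighted mixture of clauses
                grad[j] += dloss_dp * s[j] * (fires[j] - p)
        logits = [w - lr * g for w, g in zip(logits, grad)]
    return dict(zip(NAMES, softmax(logits)))

weights = train()
best = max(weights, key=weights.get)

# Noise-intolerant stand-in: keep only clauses that match every label exactly.
exact = [n for n in NAMES
         if all(CANDIDATES[n](x, y) == bool(label) for (x, y), label in EXAMPLES)]

print(best, weights)
print(exact)
```

 On this toy data the exact-match filter rejects all three candidates because of the single bad label, while the soft weights should end up favouring the grandparent clause and leave a little mass on parent to absorb the noise. That is the robustness argument in miniature, under the simplifications stated above.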

 Way too many words and symbols in this paper for what it seems to be doing. Just seems to be obfuscating the work (which is perfectly good). Again: Metagol is only 330 lines!

Inductive Rule Learning on the Knowledge Level.
 2011.
 v2 of their IGOR inductive-synthesis program.
 Quote: The general idea of learning domain specific problem solving strategies is that first some small sample problems are solved by means of some planning or problem solving algorithm and that then a set of generalized rules are learned from this sample experience. This set of rules represents the competence to solve arbitrary problems in this domain.
 My take is that, rather than using heuristic search to discover programs by testing specifications, they use memories of the output to select programs directly (?)
 This is allegedly a compromise between the generate-and-test and analytic strategies.
 Description is couched in CS-lingo, which I am inexperienced in, and is perhaps too high-level, a sin I too am at times guilty of.
 It seems like a good idea, though the examples are rather unimpressive as compared to MagicHaskeller.
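 For reference, the generate-and-test end of that spectrum can be sketched in a few lines. This is not IGOR's method (as I read it, the analytic side derives rules from the examples instead of enumerating blindly); it is only the brute-force baseline, with a made-up primitive set, made-up input/output examples, and hypothetical names `PRIMITIVES`, `run`, and `synthesize`.

```python
from itertools import product

# Hypothetical primitives over integer lists.
PRIMITIVES = {
    "reverse": lambda xs: xs[::-1],
    "tail": lambda xs: xs[1:],
    "sort": lambda xs: sorted(xs),
    "double": lambda xs: [2 * x for x in xs],
}

# Input/output specification (made up for illustration).
EXAMPLES = [([3, 1, 2], [4, 2]), ([5, 4], [8]), ([1], [])]

def run(program, xs):
    """Apply the named primitives to xs, left to right."""
    for name in program:
        xs = PRIMITIVES[name](xs)
    return xs

def synthesize(examples, max_len=3):
    """Enumerate pipelines by increasing length; return the first that fits."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):  # generate
            if all(run(program, i) == o for i, o in examples):  # test
                return program
    return None

program = synthesize(EXAMPLES)
print(program)
```

 The search space grows as (number of primitives)^(program length), which is the cost that heuristics, or an analytic use of the examples, are meant to cut down.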
