Learning Explanatory Rules from Noisy Data
- From a dense background of inductive logic programming (ILP): given background knowledge (a set of ground facts) plus positive and negative examples, generate clauses that, together with the background knowledge, entail all the positive examples and none of the negatives (a tiny worked instance is sketched at the end of these notes).
- Programs like Metagol can do this via meta-interpretive learning, using the search and unification machinery built into Prolog.
- Actually kinda surprising how very dense this program is -- only 330 lines!
- This task can also be transformed into a SAT problem via the rules of propositional logic, for which there are many fast solvers (toy encoding below).
- The trick here (instead) is that a neural-network-style weighting is used to turn candidate clauses 'on' or 'off' until the selected program fits the examples and background knowledge (see the differentiable sketch below).
- The BK is typically very small -- a few facts and examples -- consistent with the small size of the learned networks.
- These weights are represented per *pair* of candidate clauses -- effectively the outer product of the two candidate sets -- which makes the weight matrix very large!
- They then do gradient descent on a cross-entropy loss, passing errors back through the nonlinearities of the repeated forward-chaining steps (through the clauses themselves? I think this is how recursion is handled) to update the weights.
- Hence, SGD is used as a means of heuristic search.
- Compare this to Metagol, which is brittle to any noise in the input; unsurprisingly, thanks to SGD, this approach is much more robust.
-
- Way too many words and symbols in this paper for what it seems to be doing. Just seems to be obfuscating the work (which is perfectly good). Again: Metagol is only 330 lines!
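
A tiny worked instance of the ILP setup above, as a generate-and-test search in the spirit of (but far cruder than) Metagol. The grandparent task, the clause encoding, and the search are my own illustration, not taken from the paper:

```python
from itertools import product

# Background knowledge: ground facts for parent/2.
parent = {("alice", "bob"), ("bob", "carol"), ("bob", "dave")}
people = {p for pair in parent for p in pair}

# Positive and negative examples for the target predicate grandparent/2.
pos = {("alice", "carol"), ("alice", "dave")}
neg = {("bob", "alice"), ("carol", "alice")}

def derive(body):
    """One forward-chaining step: grandparent(X,Y) holds whenever both
    body literals (pairs of variables among X, Y, Z) hold in parent."""
    out = set()
    for x, y, z in product(people, repeat=3):
        env = {"X": x, "Y": y, "Z": z}
        if all((env[a], env[b]) in parent for a, b in body):
            out.add((x, y))
    return out

# Candidate clause bodies: all pairs of parent/2 literals over X, Y, Z.
lits = [(a, b) for a in "XYZ" for b in "XYZ" if a != b]

# Generate-and-test: keep bodies that entail all positives, no negatives.
for body in product(lits, repeat=2):
    if pos <= derive(body) and not (neg & derive(body)):
        print("grandparent(X,Y) :- parent(%s,%s), parent(%s,%s)." %
              (body[0] + body[1]))
```

This prints (both orderings of) grandparent(X,Y) :- parent(X,Z), parent(Z,Y), the only body in this space consistent with the examples.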
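And a toy version of the SAT reduction: one boolean variable per candidate clause, true iff that clause is switched on. This assumes the python-sat package and a non-recursive, single-step notion of derivation -- a simplification of mine, not the paper's encoding:

```python
from pysat.solvers import Glucose3

# derives[i] = set of example atoms that candidate clause i derives in
# one step of forward chaining (precomputed, as in the sketch above).
derives = [
    {"gp(alice,carol)", "gp(alice,dave)"},   # the correct clause
    {"gp(alice,carol)", "gp(bob,alice)"},    # overgeneral clause
    {"gp(carol,alice)"},                     # wrong clause
]
pos = {"gp(alice,carol)", "gp(alice,dave)"}
neg = {"gp(bob,alice)", "gp(carol,alice)"}

# Boolean variable i+1 <=> candidate clause i is switched on.
with Glucose3() as solver:
    # Each positive example must be derived by some switched-on clause.
    for e in pos:
        solver.add_clause([i + 1 for i, d in enumerate(derives) if e in d])
    # No switched-on clause may derive a negative example.
    for e in neg:
        for i, d in enumerate(derives):
            if e in d:
                solver.add_clause([-(i + 1)])
    if solver.solve():
        model = solver.get_model()
        print("selected clauses:", [i for i in model if i > 0])
```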
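Finally, a minimal sketch of the differentiable clause-selection idea: a softmax over clause weights mixes each candidate clause's one-step consequences into fuzzy truth values, and SGD on a cross-entropy loss picks out the right clause. The real ∂ILP weights *pairs* of clauses and backpropagates through many forward-chaining steps; I use single clauses and finite-difference gradients here to keep it dependency-free:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground atoms for the target predicate, with labels (1 = positive).
atoms = ["gp(a,c)", "gp(a,d)", "gp(b,a)", "gp(c,a)"]
labels = np.array([1.0, 1.0, 0.0, 0.0])

# consequences[i, j] = 1 if candidate clause i derives atom j in one
# step of forward chaining over the background facts (precomputed).
consequences = np.array([
    [1, 1, 0, 0],   # the correct clause
    [1, 0, 1, 0],   # overgeneral clause
    [0, 0, 0, 1],   # wrong clause
], dtype=float)

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def loss(w):
    # Soft clause selection: mix each clause's consequences by its
    # softmax weight, giving a fuzzy truth value per atom.
    v = softmax(w) @ consequences
    v = np.clip(v, 1e-6, 1 - 1e-6)
    return -np.mean(labels * np.log(v) + (1 - labels) * np.log(1 - v))

# SGD with finite-difference gradients (the paper backpropagates
# through the forward-chaining nonlinearities instead).
w = rng.normal(size=3)
eps, lr = 1e-5, 1.0
for step in range(500):
    grad = np.array([(loss(w + eps * np.eye(3)[i]) - loss(w)) / eps
                     for i in range(3)])
    w -= lr * grad

print("clause weights:", np.round(softmax(w), 3))  # clause 0 should dominate
```

Flipping one of the labels raises the loss but typically still leaves the correct clause with the highest weight -- which is the robustness-to-noise point above.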