Back in 1993, AI pioneer Jürgen Schmidhuber published the paper A Self-Referential Weight Matrix, which he described as a “thought experiment… intended to make a step towards self-referential machine learning by showing the theoretical possibility of self-referential neural networks whose weight matrices (WMs) can learn to implement and improve their own weight change algorithm.” A lack of subsequent practical studies in this area, however, left this potentially impactful meta-learning ability unrealized until now.
In the new paper A Modern Self-Referential Weight Matrix That Learns to Modify Itself, a research team from The Swiss AI Lab, IDSIA, University of Lugano (USI) & SUPSI, and King Abdullah University of Science and Technology (KAUST) presents a scalable self-referential WM (SRWM) that leverages outer products and the delta update rule to update and improve itself, achieving both practical applicability and impressive performance in game environments.
The proposed model is built upon fast weight programmers (FWPs), a scalable and effective method dating back to the ’90s in which a slow neural net learns to program the weights of a fast neural net. FWPs can learn to memorize past data and compute fast weight changes via programming instructions that are additive outer products of self-invented activation patterns, today known as the keys and values of self-attention. In light of their connection to linear variants of today’s popular transformer architectures, FWPs are now witnessing a revival. Recent studies have advanced conventional FWPs with improved elementary programming instructions, or update rules, that the slow neural net invokes to reprogram the fast neural net. One such rule is the “delta update rule,” which corrects the value currently stored for a key rather than simply adding to it.
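To make the difference between the two update rules concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`matvec`, `outer`, `additive_update`, `delta_update`) are hypothetical, the key is used without any feature map, and the learning rate `beta` is a fixed scalar rather than something generated by the slow net. Under those assumptions, the purely additive rule computes W ← W + v kᵀ, while the delta rule computes W ← W + β (v − W k) kᵀ.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]


def additive_update(W, k, v):
    """Purely additive fast weight update: W <- W + v k^T."""
    delta = outer(v, k)
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def delta_update(W, k, v, beta):
    """Delta rule update: W <- W + beta * (v - W k) k^T."""
    v_old = matvec(W, k)  # value currently retrieved for key k
    err = [beta * (vi - oi) for vi, oi in zip(v, v_old)]
    return additive_update(W, k, err)


# Write two different values under the same unit-norm key (illustrative data).
W0 = [[0.0, 0.0], [0.0, 0.0]]
k = [1.0, 0.0]

W_add = additive_update(additive_update(W0, k, [1.0, 2.0]), k, [3.0, 4.0])
W_delta = delta_update(delta_update(W0, k, [1.0, 2.0], 1.0), k, [3.0, 4.0], 1.0)

print(matvec(W_add, k))    # [4.0, 6.0]: the two values have accumulated
print(matvec(W_delta, k))  # [3.0, 4.0]: the old value has been replaced
```

With a unit-norm key and `beta` set to 1, the delta rule overwrites the stored association, whereas the additive rule keeps summing values under the same key. Smaller values of `beta` would move the stored value only part of the way toward the new one.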