Friday, July 13, 2018

Local learning rules to attenuate forgetting in neural networks

Another paper from our work on Restricted Boltzmann Machines (RBMs) in the Hennig lab.

Main points:
  • We noticed that measures of synaptic importance are available from local firing statistics (at least in Boltzmann machines).
  • We looked at an artificial neural network that stores memories and is easy to analyze: the Hopfield network, which is the zero-temperature limit of a Boltzmann machine.
  • We evaluated whether this local measure of synaptic importance could help stabilize important weights when a network sequentially learns multiple things that interfere with each other (a minimal sketch follows this list).
  • Intuition: biological variables, like synapse size, can correlate with useful statistical quantities. This provides tricks for biologically plausible approximations of algorithms.
  • Intuition: in systems that learn, if a parameter takes on an unusual or surprising value, that value was likely set through learning, and you might want to leave it fixed.
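To make the bullets above concrete, here is a minimal sketch, not the rule derived in the paper: it stores ±1 patterns in a Hopfield network with the standard Hebbian outer-product update, and gates each update by an importance proxy based on how unusually large a weight's magnitude already is, which is something a synapse could track locally. Both the magnitude-based proxy and the gating function are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 100                                      # binary +/-1 units
    patterns = rng.choice([-1, 1], size=(8, N))  # stored one after another

    W = np.zeros((N, N))                         # synaptic weights
    eta = 1.0 / N                                # Hebbian learning rate

    for xi in patterns:
        hebb = np.outer(xi, xi).astype(float)    # standard Hebbian outer product
        np.fill_diagonal(hebb, 0.0)
        # Importance proxy: a weight with unusually large magnitude has seen
        # consistent co-activation across earlier patterns, so protect it.
        # This proxy and the gating function below are illustrative choices.
        importance = np.abs(W) / (np.abs(W).mean() + 1e-12)
        gate = 1.0 / (1.0 + importance)
        W += eta * gate * hebb

    def recall(W, cue, steps=20):
        """Synchronous recall: iterate x <- sign(W x) starting from a noisy cue."""
        x = cue.copy()
        for _ in range(steps):
            x = np.sign(W @ x)
            x[x == 0] = 1
        return x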

Figure 3b: The average Dice coefficient (a measure of pattern retention) for each pattern as a function of the training epoch, for different learning rules. The simulations show that augmented learning rules have improved retention, compared to the normal Hebb rule, but their behavior differs with increasing network loading.
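For reference, the Dice coefficient compares the set of active units in a stored pattern with the set of active units in the recalled pattern; 1 means perfect retention and 0 means no overlap. A small implementation, assuming ±1 patterns with +1 treated as active (the paper's exact conventions may differ):

    import numpy as np

    def dice_coefficient(stored, recalled):
        """Dice overlap between the active (+1) units of two +/-1 patterns."""
        a, b = stored > 0, recalled > 0
        denom = a.sum() + b.sum()
        return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0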

Abstract

Hebbian synaptic plasticity inevitably leads to interference and forgetting when different, overlapping memory patterns are sequentially stored in the same network. Recent work on artificial neural networks shows that an information-geometric approach can be used to protect important weights to slow down forgetting. This strategy however is biologically implausible as it requires knowledge of the history of previously learned patterns. In this work, we show that a purely local weight consolidation mechanism, based on estimating energy landscape curvatures from locally available statistics, prevents pattern interference. Exploring a local calculation of energy curvature in the sparse-coding limit, we demonstrate that curvature-aware learning rules reduce forgetting in the Hopfield network. We further show that this method connects information-geometric global learning rules based on the Fisher information to local spike-dependent rules accessible to biological neural networks. We conjecture that, if combined with other learning procedures, it could provide a building-block for content-aware learning strategies that use only quantities computable in biological neural networks to attenuate pattern interference and catastrophic forgetting. Additionally, this work clarifies how global information-geometric structure in a learning problem can be exposed in local model statistics, building a deeper theoretical connection between the statistics of single units in a network, and the global structure of the collective learning space.
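One way to see the "locally available statistics" claim, stated here as a generic exponential-family identity rather than the paper's specific construction: in a Boltzmann machine the weight w_ij is the natural parameter attached to the statistic s_i s_j, so its diagonal Fisher information, which measures curvature, is the variance of that statistic. With 0/1 units this is p_ij(1 - p_ij), where p_ij is the co-firing probability, and in the sparse-coding limit it reduces to the co-firing rate itself, a quantity each synapse can estimate from local spiking. A sketch of that estimate (how the paper carries it over to the Hopfield limit is spelled out in the preprint):

    import numpy as np

    def local_curvature_estimate(samples):
        """Per-weight curvature (diagonal Fisher) from local co-firing statistics.

        samples: (T, N) array of 0/1 unit activities sampled from the model.
        Var(s_i s_j) = p_ij * (1 - p_ij), with p_ij the empirical co-firing rate;
        for sparse activity this is approximately p_ij itself.
        """
        p = samples.T @ samples / samples.shape[0]   # empirical co-firing rates
        return p * (1.0 - p)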

Many thanks to Michael, Martino, and Matthias. It's still a preprint for now, but it can be referenced as

Deistler, M., Sorbaro, M., Rule, M.E. and Hennig, M.H., 2018. Local learning rules to attenuate forgetting in neural networks. arXiv preprint arXiv:1807.05097.

