[get paper PDF]
Main points:
- We noticed that measures of synaptic importance can be estimated from local firing statistics (at least in Boltzmann machines).
- We looked at an artificial neural network that stores memories and is easy to analyze (Hopfield nets are the zero-temperature limit of a Boltzmann machine).
- We evaluated whether this local measure of synaptic importance could help stabilize important weights when networks learn multiple things that interfere with each other (a minimal sketch of the idea follows this list).
- Intuition: biological variables, like synapse size, can correlate with useful statistical quantities. This provides tricks for biologically plausible approximations of learning algorithms.
- Intuition: in systems that learn, if a parameter takes on an unusual or surprising value, that value was likely set through learning, and you might want to leave it fixed.
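To make the mechanism concrete, here is a minimal Python sketch (a toy construction of mine, not code from the paper): a Hopfield network stores patterns sequentially with a Hebbian rule, and each synapse damps its own updates in proportion to a locally computed importance proxy. The proxy used here, the squared running average of the pre/post co-activation, is a hypothetical stand-in for the paper's curvature estimate, and the damping constant `lam` is likewise my own choice.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 10  # units, patterns learned one after another
patterns = rng.choice([-1.0, 1.0], size=(P, N))

def recall(W, x, steps=20):
    """Synchronous sign updates; descends the Hopfield energy."""
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1.0, -1.0)
    return x

def store_sequentially(lam):
    """Hebbian storage with updates damped on 'important' synapses.

    coact**2 is a hypothetical local importance proxy: the squared
    running average of the pre/post co-activation s_i * s_j. Synapses
    whose co-activation has been consistent across past patterns get a
    large proxy value and are changed less. lam = 0 recovers plain
    Hebbian learning.
    """
    W = np.zeros((N, N))
    coact = np.zeros((N, N))  # running estimate of <s_i s_j>
    for xi in patterns:
        dW = np.outer(xi, xi) / N
        np.fill_diagonal(dW, 0.0)
        W += dW / (1.0 + lam * coact**2)  # damp where importance is high
        coact = 0.8 * coact + 0.2 * np.outer(xi, xi)
    return W

# Sanity check: recall the first stored pattern from a 15%-corrupted cue.
for lam in (0.0, 50.0):
    W = store_sequentially(lam)
    cue = patterns[0] * np.where(rng.random(N) < 0.15, -1.0, 1.0)
    overlap = recall(W, cue) @ patterns[0] / N
    print(f"lambda={lam:5.1f}  overlap with pattern 0: {overlap:+.2f}")
```

Since `lam = 0` reduces to plain Hebbian storage, the same function doubles as the unconsolidated baseline for side-by-side comparison; for the actual experiments and results, see the paper.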
Abstract
Hebbian synaptic plasticity inevitably leads to interference and forgetting when different, overlapping memory patterns are sequentially stored in the same network. Recent work on artificial neural networks shows that an information-geometric approach can be used to protect important weights to slow down forgetting. This strategy, however, is biologically implausible as it requires knowledge of the history of previously learned patterns. In this work, we show that a purely local weight consolidation mechanism, based on estimating energy landscape curvatures from locally available statistics, prevents pattern interference. Exploring a local calculation of energy curvature in the sparse-coding limit, we demonstrate that curvature-aware learning rules reduce forgetting in the Hopfield network. We further show that this method connects information-geometric global learning rules based on the Fisher information to local spike-dependent rules accessible to biological neural networks. We conjecture that, if combined with other learning procedures, it could provide a building block for content-aware learning strategies that use only quantities computable in biological neural networks to attenuate pattern interference and catastrophic forgetting. Additionally, this work clarifies how global information-geometric structure in a learning problem can be exposed in local model statistics, building a deeper theoretical connection between the statistics of single units in a network and the global structure of the collective learning space.
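To unpack the Fisher-information connection mentioned in the abstract, here is a short derivation for a textbook Boltzmann machine parameterization (my notation; the paper's derivation may differ in its details):

```latex
% Boltzmann machine with binary units s_i \in \{0,1\}:
\[
  E(s) = -\tfrac{1}{2}\textstyle\sum_{i \neq j} w_{ij}\, s_i s_j
         - \textstyle\sum_i b_i s_i,
  \qquad p(s) \propto e^{-E(s)}.
\]
% Gradient of the log-likelihood for a single weight:
\[
  \partial_{w_{ij}} \log p(s) = s_i s_j - \langle s_i s_j \rangle.
\]
% Diagonal of the Fisher information; s_i s_j \in \{0,1\} is Bernoulli:
\[
  F_{ij} = \operatorname{Var}(s_i s_j)
         = \langle s_i s_j \rangle \bigl(1 - \langle s_i s_j \rangle\bigr)
  \;\approx\; \langle s_i s_j \rangle
  \quad \text{in the sparse limit } \langle s_i s_j \rangle \ll 1.
\]
```

In this setting, the information-geometric importance of a weight collapses to a pairwise co-firing statistic that is, in principle, available at the synapse, which is the sense in which global geometry shows up in local statistics.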
Many thanks to Michael, Martino, and Matthias. It's still a preprint for now, but it can be referenced as:
Deistler, M., Sorbaro, M., Rule, M.E. and Hennig, M.H., 2018. Local learning rules to attenuate forgetting in neural networks. arXiv preprint arXiv:1807.05097.