Going Beyond the Point Neuron: Active Dendrites and Sparse Representations for Continual Learning


Abstract:

Dendrites of pyramidal neurons demonstrate a wide range of linear and non-linear integrative properties [1]. However, the majority of artificial neural networks (ANNs) ignore the structural complexity of biological neurons and use simplified point neurons. Most ANNs today also suffer from catastrophic forgetting, i.e., they are unable to learn new information without erasing what they previously learned. We propose that active dendrites can help ANNs learn continually. The proposed model is inspired by 1) the biophysics of sustained depolarization following dendritic NMDA spikes, and 2) highly sparse representations. In this model, dendritic segments recognize task-specific contextual patterns and modulate the activity of their neuron. A winner-take-all circuit gives preference to up-modulated neurons and activates a highly sparse subset of neurons. These task-specific subnetworks interfere minimally with each other, and as a result, the network does not forget previous tasks as easily as standard ANNs do.
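The mechanism described above can be illustrated with a minimal sketch. The abstract does not specify the exact equations, so everything here is an assumption made for illustration: each neuron's feedforward activation is scaled by a sigmoid of its strongest dendritic segment response to the context vector, and a k-winner-take-all step keeps only the k largest modulated activations. The function name `active_dendrite_layer` and all parameter names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def active_dendrite_layer(x, context, weights, biases, segments, k):
    """Illustrative layer: dendritic modulation followed by k-winner-take-all.

    weights[i] and biases[i] are the feedforward parameters of neuron i;
    segments[i] is a list of dendritic weight vectors for neuron i.
    """
    modulated = []
    for w, b, segs in zip(weights, biases, segments):
        feedforward = dot(w, x) + b
        # Assumption: the segment with the largest-magnitude response
        # to the context determines the modulation of this neuron.
        responses = [dot(u, context) for u in segs]
        strongest = max(responses, key=abs)
        modulated.append(feedforward * sigmoid(strongest))
    # k-winner-take-all: keep the k largest activations, zero the rest.
    order = sorted(range(len(modulated)), key=lambda i: modulated[i], reverse=True)
    winners = set(order[:k])
    return [v if i in winners else 0.0 for i, v in enumerate(modulated)]

# Toy example: the same input activates different neurons under different contexts.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
segments = [
    [[5.0, 0.0], [0.0, -5.0]],
    [[0.0, 5.0], [-5.0, 0.0]],
    [[-5.0, -5.0]],
]
x = [1.0, 1.0]
out_a = active_dendrite_layer(x, [1.0, 0.0], weights, biases, segments, k=1)
out_b = active_dendrite_layer(x, [0.0, 1.0], weights, biases, segments, k=1)
```

In this toy setup, the first context leaves only neuron 0 active and the second leaves only neuron 1 active, which is the kind of non-overlapping, context-dependent subnetwork the abstract describes.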

This approach was compared to context-dependent gating (XdG) [2], a method that turns individual units on/off based on task ID. In XdG, a hardcoded task ID and associated network subset must be provided. The model was also compared to Synaptic Intelligence (SI) [3], a complementary technique that encourages synapses to simultaneously represent multiple tasks.
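For contrast, the gating idea behind XdG can be sketched as a fixed binary mask per task that is selected by the supplied task ID. This is a simplified illustration, not the implementation from [2]; the helper names `make_task_masks` and `gate`, the random mask assignment, and the `keep_fraction` parameter are assumptions.

```python
import random

def make_task_masks(num_tasks, num_units, keep_fraction, seed=0):
    """Assign each task a fixed random subset of units that stay on."""
    rng = random.Random(seed)
    num_kept = max(1, round(keep_fraction * num_units))
    masks = []
    for _ in range(num_tasks):
        kept = set(rng.sample(range(num_units), num_kept))
        masks.append([1.0 if i in kept else 0.0 for i in range(num_units)])
    return masks

def gate(activations, task_id, masks):
    """Turn units on or off according to the mask of the given task ID."""
    return [a * m for a, m in zip(activations, masks[task_id])]

masks = make_task_masks(num_tasks=3, num_units=10, keep_fraction=0.2)
gated = gate([1.0] * 10, task_id=1, masks=masks)
```

Note that `gate` cannot run without `task_id`, which reflects the limitation mentioned above: the task identity and its unit subset must be provided externally.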

The dendritic model was tested on permutedMNIST, a standard continual learning scenario (Fig. 1). Instead of relying on a hardcoded task ID, the model attempts to infer a task-specific context signal, which is a significantly more challenging problem. Dendritic segments learn to recognize different context signals, leading to the emergence of task-dependent subnetworks. This model achieves 94.6% and 81.4% accuracy on 10 and 100 consecutive tasks, respectively. When combined with SI, it improves to 97.2% and 91.6% accuracy on 10 and 100 consecutive tasks, respectively. This compares favorably to XdG, without the limitations of XdG mentioned above. Further analysis demonstrates that the sparsity of representations and the number of dendrites positively correlate with overall accuracy. These results suggest that incorporating the structural properties of active dendrites and sparse representations can help improve the accuracy of ANNs in continual learning scenarios.
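As a rough illustration of how permutedMNIST tasks are typically constructed, each task applies its own fixed random permutation to the pixels of every image, so all tasks share the same labels but differ in input layout. The sketch below uses a flat list of 784 values as a stand-in for a 28x28 image; the helper names and the seeding scheme are assumptions, and no dataset loading is shown.

```python
import random

def make_permutations(num_tasks, num_pixels, seed=0):
    """Create one fixed pixel permutation per task."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_tasks):
        perm = list(range(num_pixels))
        rng.shuffle(perm)
        perms.append(perm)
    return perms

def permute_image(pixels, perm):
    """Reorder a flattened image according to a task's permutation."""
    return [pixels[j] for j in perm]

perms = make_permutations(num_tasks=3, num_pixels=784)
image = list(range(784))  # placeholder for a flattened 28x28 image
task_1_view = permute_image(image, perms[1])
```

Because every task reuses the same permutation for all of its images, a network can in principle learn each task, but the tasks look unrelated at the pixel level, which is what makes sequential training prone to forgetting.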

References
1. Poirazi, P., Papoutsi, A. Illuminating dendritic function with computational models. Nature Reviews Neuroscience 21, 303–321. 2020.
2. Masse, N. Y., Grant, G. D., Freedman, D. J. Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization. Proceedings of the National Academy of Sciences 115(44). 2018.
3. Zenke, F., Poole, B., Ganguli, S. Continual learning through synaptic intelligence. Proceedings of the 34th International Conference on Machine Learning. 2017.

Poster Walkthrough