Abstract: Deep networks notoriously suffer from performance deterioration on previous tasks when learning from sequential tasks, i.e., catastrophic forgetting. Recent methods of gradient projection ...
Abstract: In this paper, we study the convergence properties of the natural gradient methods. By reviewing the mathematical condition for the equivalence between the Fisher information matrix and the ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results