Deep Neural Networks obtain outstanding performance on many benchmarks, yet the key ingredient of their success remains unknown. This is mainly due to the high-dimensional nature of these models, which have many parameters and very large inputs. They tend to generalize well on new test sets, which implies that these architectures have memorized important attributes of the dataset. During this PhD, we propose to study those attributes from both a theoretical and a numerical point of view: what is their nature, how are they learned, and how are they stored? We aim to study two types of mechanisms, which can be addressed independently while being closely connected: memorization through the symmetries of a supervised or unsupervised task, and memorization through the data. Our objective is to derive a low-complexity class of models (complexity in the sense of generalization, sometimes measured via the number of parameters) able to reach state-of-the-art performance on ImageNet: how do deep neural networks memorize the relevant attributes of the data, and what is the underlying simplified model of this memorization phenomenon?

Our theoretical work will focus on the notion of symmetry in data and models, and on the link between the intrinsic dimensionality of data structures and the complexity classes of deep learning models. Beyond these new theoretical advances, addressing such issues can be useful for at least two major applications of machine learning, small-data settings and interpretability, because both rely significantly on the complexity of the models used. Particular attention will be given to obtaining a theory that narrows the gap between numerical experiments and theoretical results, in order to avoid vacuous bounds.
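As a purely illustrative sketch (not part of the original proposal), the idea of symmetry memorization can be probed empirically by measuring how much a model's output changes when a candidate symmetry transformation, here a horizontal flip, is applied to its input. The model, sizes, and transform below are hypothetical placeholders, assuming a standard PyTorch setup.

```python
import torch
import torch.nn as nn

# Hypothetical, untrained CNN classifier used only to illustrate the idea;
# the architecture and sizes are placeholders, not the models studied in the PhD.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

def invariance_gap(model, x, transform):
    """Mean absolute difference between f(x) and f(g(x)) for a candidate symmetry g."""
    with torch.no_grad():
        return (model(x) - model(transform(x))).abs().mean().item()

# Horizontal flip as an example symmetry of the input space.
x = torch.randn(8, 3, 32, 32)
gap = invariance_gap(model, x, lambda t: torch.flip(t, dims=[-1]))
print(f"invariance gap under horizontal flip: {gap:.4f}")
```

A gap close to zero on real data would indicate that the network has memorized the corresponding symmetry of the task; tracking this quantity during training is one simple way to connect the numerical experiments with the theoretical questions raised above.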


PhD student: Louis Fournier

PhD supervisor: Sylvain Lamprier

Research laboratory: MLIA team (LIP6)