In the past three years, I have been studying the extrapolation performed by machine learning and deep learning models, and my collaborators and I have written several papers on this subject.
First, we showed that image classification is an extrapolation task, i.e., the functional task performed by image classification models involves extrapolation. In 2019 and 2020, the research community's thinking on this subject was not clear. Many people thought that image classification is predominantly interpolation. When I mentioned that testing samples are entirely outside the convex hull of training sets, I usually heard two responses: (1) the distance of testing samples to the convex hull of the training set is negligible, and therefore the extrapolation performed by the models is negligible; (2) deep learning models learn some low-dimensional manifold from images in which all testing samples are contained in the convex hull of the training set, so deep learning models interpolate in that low-dimensional space. In the paper below, I empirically showed that both notions are wrong.
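The underlying question is easy to state precisely: a point lies in the convex hull of a training set if and only if it can be written as a convex combination of the training points, which is a linear feasibility problem. The following is a minimal sketch (not the code from the paper) of that membership test using scipy's linear-programming solver:

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(point, training_set):
    """Check whether `point` lies in the convex hull of the rows of
    `training_set` by solving the LP feasibility problem:
    find weights a >= 0 with sum(a) == 1 and training_set.T @ a == point."""
    n = training_set.shape[0]
    c = np.zeros(n)  # feasibility only; the objective is irrelevant
    A_eq = np.vstack([training_set.T, np.ones(n)])  # convex-combination constraints
    b_eq = np.append(point, 1.0)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success  # feasible iff the point is inside the hull

# Toy example: the unit square in 2-D
square = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
print(in_convex_hull(np.array([0.5, 0.5]), square))  # inside  -> True
print(in_convex_hull(np.array([2.0, 2.0]), square))  # outside -> False
```

In pixel space, running a test of this form on standard image datasets is what reveals that testing samples fall outside the hull of the training set.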
After this paper, I continued to have discussions with more senior researchers, but the feedback I received did not make sense to me. I heard that I should not use the word “extrapolation”. I also heard that “extrapolation” and “learning” do not go together. I heard that testing samples falling outside the convex hull of the training set is just an uninteresting artifact of high-dimensional spaces and that I should consider moving on from this subject. These arguments seemed like nonsense to me, so I focused on demonstrating that extrapolation and learning can indeed go together. In this quest, I studied the pure and applied mathematics literature as well as the cognitive science and psychology literature on extrapolation and learning. This was a collaboration with Jessica Mollick, a researcher in cognitive psychology. The result was a paper that we presented at the NeurIPS Workshop on Human and Machine Decisions:
The above paper was well received by the reviewers and set the stage for the project to move forward.
In the meantime, throughout 2021, I continued to work on different aspects of extrapolation for image classification. I also drew on the pure and applied mathematics literature on extrapolation and approximation theory. I gave talks at three venues:
Workshop on Theory of Over-parameterized Machine Learning, April 2021. Abstract – Video (5 minutes)
Deep Learning Generalization, Extrapolation, Over-parameterization, and Decision Boundaries, Laboratory for Applied Mathematics, Numerical Software, and Statistics, Argonne National Lab. July 2021. (1 hour)
Deep Learning Generalization, Extrapolation, Over-parameterization, and Decision Boundaries, Department of Mathematics, University of Nottingham, August 2021. Video (1 hour)
Over-parameterization: A Necessary Condition for Models that Extrapolate
A mathematical piece arguing (through a set of lemmas) that over-parameterization is a necessary condition for models that extrapolate. This was recently presented at the 2022 Workshop on Theory of Over-parameterized Machine Learning.
Should Machine Learning Models Report to Us When They Are Clueless?
A policy piece suggesting that AI regulations should include articles to make extrapolation performed by the models transparent. By default, extrapolation performed in social applications of AI is kept hidden.
A Sketching Method for Finding the Closest Point on a Convex Hull
An algorithm for projecting query points onto a convex hull. The sketching aspect of the algorithm, along with its use of the gradient projection method, may solve the problem faster than off-the-shelf algorithms.
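The projection problem this paper addresses can be stated as minimizing the distance from a query point to any convex combination of the training points. The paper's sketching method is not reproduced here; as a baseline for comparison, a minimal sketch of the classic Frank-Wolfe algorithm for the same problem looks like this:

```python
import numpy as np

def project_to_hull(query, points, iters=20000):
    """Approximate the closest point to `query` in the convex hull of the
    rows of `points`, i.e. minimize ||alpha @ points - query||^2 over the
    probability simplex, using the Frank-Wolfe algorithm. (Illustrative
    baseline only; not the paper's sketching + gradient-projection method.)"""
    n = points.shape[0]
    alpha = np.full(n, 1.0 / n)  # start at the centroid
    for t in range(iters):
        grad = points @ (alpha @ points - query)  # gradient w.r.t. alpha
        i = int(np.argmin(grad))                  # best vertex of the simplex
        gamma = 2.0 / (t + 2.0)                   # standard diminishing step size
        alpha *= (1.0 - gamma)                    # move toward that vertex
        alpha[i] += gamma
    return alpha @ points

# Toy example: the unit square in 2-D; the closest point to (2, 0.5)
# in the square is approximately (1, 0.5).
square = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
print(project_to_hull(np.array([2.0, 0.5]), square))
```

Frank-Wolfe converges slowly near faces of the hull, which is part of why faster projection algorithms, such as the one proposed here, are of practical interest.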