Extrapolation Project

Over the past three years, I have been studying the extrapolation performed by machine learning and deep learning models, and my collaborators and I have written several papers on the subject.

First, we showed that image classification is an extrapolation task, i.e., the functional task performed by image classification models involves extrapolation. In 2019 and 2020, the research community's thinking on this subject was not clear, and many people considered image classification to be predominantly interpolation. When I mentioned that testing samples lie entirely outside the convex hull of the training sets, I usually heard two responses: (1) the distance of testing samples to the convex hull of the training set is negligible, and therefore the extrapolation performed by the models is negligible; (2) deep learning models learn a low-dimensional manifold from the images in which all testing samples are contained in the convex hull of the training set, so the models interpolate in that low-dimensional space. In the paper below, I empirically showed that both notions are wrong.

Deep Learning Generalization and the Convex Hull of Training Sets, presented at the NeurIPS 2020 Workshop on Information Geometry and invited to the Springer journal Information Geometry.
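For readers who want to try this claim on their own data, below is a minimal sketch of a convex-hull membership test, written with NumPy and SciPy. The function name in_convex_hull and the random stand-in data are mine, chosen only for illustration; this is the generic linear-programming feasibility formulation, not necessarily the exact procedure used in the paper.

    import numpy as np
    from scipy.optimize import linprog

    def in_convex_hull(query, train):
        """Return True if `query` is a convex combination of the rows of `train`."""
        n_points = train.shape[0]
        # Feasibility LP: find w >= 0 with sum(w) = 1 and train.T @ w = query.
        A_eq = np.vstack([train.T, np.ones((1, n_points))])
        b_eq = np.concatenate([query, [1.0]])
        result = linprog(c=np.zeros(n_points), A_eq=A_eq, b_eq=b_eq,
                         bounds=[(0, None)] * n_points, method="highs")
        return result.success

    # Toy illustration with random vectors standing in for flattened images:
    # in high dimensions, a new sample almost never lands inside the hull
    # of a modest number of other samples.
    rng = np.random.default_rng(0)
    train = rng.standard_normal((1000, 512))
    query = rng.standard_normal(512)
    print(in_convex_hull(query, train))   # typically False

The same test applies unchanged to raw pixel vectors or to feature vectors extracted from a trained network.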

After this paper, I continued to have discussions with more senior researchers, but the feedback I received did not make sense to me. I heard that I should not use the word “extrapolation”. I also heard that “extrapolation” and “learning” do not go together. I heard that testing samples falling outside the convex hull of the training set is just an uninteresting phenomenon related to high-dimensional spaces and that I should consider moving on from this subject. These arguments did not seem sound to me, and I focused on demonstrating that extrapolation and learning can indeed go together. In this quest, I studied the literature on extrapolation and learning in pure and applied mathematics as well as in cognitive science and psychology. This was a collaboration with Jessica Mollick, a researcher in cognitive psychology. The result was a paper that we presented at the NeurIPS Workshop on Human and Machine Decisions:

Extrapolation Frameworks in Cognitive Psychology Suitable for Study of Image Classification Models.

Poster - Extrapolation Frameworks in Cognitive Psychology Suitable for Study of Image Classification Models

The above paper was well-received by the reviewers and set the stage for the project to move forward.

In the meantime, throughout 2021, I continued to work on different aspects of extrapolation for image classification. I also studied the literature in pure and applied mathematics on extrapolation and approximation theory. I gave talks at three venues:

Workshop on Theory of Over-parameterized Machine Learning, April 2021. Abstract; Video (5 minutes)

Deep Learning Generalization, Extrapolation, Over-parameterization, and Decision Boundaries, Laboratory for Applied Mathematics, Numerical Software, and Statistics, Argonne National Lab, July 2021. (1 hour)

Deep Learning Generalization, Extrapolation, Over-parameterization, and Decision Boundaries, Department of Mathematics, University of Nottingham, August 2021. Video (1 hour)

More recent papers on this topic are:

Decision Boundaries and Convex Hulls in the Feature Space that Deep Learning Functions Learn from Images. A paper providing methods and formulations to study the feature space learned by image classification models; it demonstrates in detail that the models still extrapolate in the feature space they learn from images. A minimal sketch of such a feature-space check appears after this list.

Over-parameterization: A Necessary Condition for Models that Extrapolate. A mathematical piece arguing (through a set of lemmas) that over-parameterization is a necessary condition for models that extrapolate. This was recently presented at the 2022 Workshop on Theory of Over-parameterized Machine Learning.

To What Extent Should We Trust AI Models When They Extrapolate? We show that in many applications of AI, extrapolation is neither negligible nor predominant. We argue that the extrapolation performed by the models is worthy of investigation by human experts.

Should Machine Learning Models Report to Us When They Are Clueless? A policy piece suggesting that AI regulations should include articles that make the extrapolation performed by models transparent. By default, the extrapolation performed in social applications of AI is kept hidden.

A Sketching Method for Finding the Closest Point on a Convex Hull. An algorithm for projecting query points onto a convex hull. The sketching aspect of the algorithm, along with its use of the gradient projection method, may solve the problem faster than off-the-shelf algorithms. A plain gradient-projection baseline (without the sketching) is shown below for comparison.
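Here is the feature-space sketch promised in the list above. It uses a pretrained ResNet-18 from a recent torchvision release, with its classification head replaced by the identity, as a stand-in feature extractor, and random tensors as stand-in images to keep the snippet self-contained (real images would also go through the preprocessing transforms of the chosen weights). It assumes the in_convex_hull function from the earlier sketch is already defined in the same session; it illustrates the general idea only and is not the formulation developed in the paper.

    import torch
    from torchvision import models

    # Pretrained network used as a feature extractor: replacing the final
    # classification layer with the identity makes the forward pass return
    # the 512-dimensional penultimate-layer features.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = torch.nn.Identity()
    model.eval()

    # Stand-in "images" (random tensors of the expected shape).
    train_images = torch.randn(50, 3, 224, 224)
    test_image = torch.randn(1, 3, 224, 224)

    with torch.no_grad():
        train_features = model(train_images).numpy()   # shape (50, 512)
        test_feature = model(test_image).numpy()[0]    # shape (512,)

    # Reuse the LP-based membership test from the earlier sketch.
    print(in_convex_hull(test_feature, train_features))   # typically False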
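And here is the plain gradient-projection baseline mentioned in the last item: it projects a query point onto the convex hull of a point set by minimizing the squared distance between the query and a convex combination of the points, over the probability simplex. The function names, step size, and iteration count are my own choices; this is a generic textbook baseline for comparison, not the sketching algorithm from the paper.

    import numpy as np

    def project_to_simplex(v):
        """Euclidean projection of a vector v onto the probability simplex."""
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def closest_point_on_hull(X, q, steps=2000):
        """Gradient projection for min_{w in simplex} ||w @ X - q||^2.
        Returns w @ X, the projection of q onto the convex hull of the rows of X."""
        n = X.shape[0]
        w = np.full(n, 1.0 / n)                       # start from uniform weights
        lr = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
        for _ in range(steps):
            grad = 2.0 * X @ (w @ X - q)              # gradient with respect to w
            w = project_to_simplex(w - lr * grad)
        return w @ X

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 64))    # rows are training points
    q = 3.0 * rng.standard_normal(64)     # a query point, likely outside the hull
    p = closest_point_on_hull(X, q)
    print(np.linalg.norm(q - p))          # distance from the query to the hull

Writing points on the hull as convex combinations of the training points turns the projection into a simplex-constrained least-squares problem, which is what the gradient projection steps exploit.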

to be continued.