TL;DR: How can we build algorithms that learn without human labels, so that we can use AI to help solve scientific challenges humans don't yet know how to solve?
Abstract
How does the human mind make sense of raw information without being taught how to see or hear? This thesis presents a unifying theory that describes how algorithms can learn and discover structure in complex systems, such as natural images, audio, language, and video, without human input. This class of algorithms could extend our own understanding of the world by helping us see previously unseen patterns in nature and science. At the core of this thesis' unified theory is the notion that relationships between deep network representations hold the key to discovering the structure of the world without human input. This work will begin with a few examples of this principle in action: discovering hidden connections that span cultures and millennia in the visual arts, discovering visual objects in large image corpora, classifying every pixel of our visual world, and rediscovering the meaning of words from raw audio, all without human labels. In the latter half of this thesis, we will present two unifying mathematical theories of unsupervised learning. The first will explain why relationships between deep features can rediscover the semantic structure of the natural world by connecting model explainability, cooperative game theory, and deep feature relationships. The second will show that relationships between representations can be used to unify over 20 common machine learning algorithms spanning 100 years of progress in the field. In particular, we introduce a single equation that unifies classification, regression, large language modeling, dimensionality reduction, clustering, contrastive learning, and spectral methods. This thesis uses this unified equation as the basis for a "periodic table of representation learning" that predicts the existence of new types of algorithms. We show that one of these predicted algorithms is a state-of-the-art unsupervised image classification technique.
Finally, this work will summarize the key findings and share ongoing and future directions.
Papers in This Thesis
MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval
We introduce a unifying framework for understanding search engines and other unsupervised algorithms. The framework formally connects game theory, model explainability, and feature relationships, explaining why the relationships used in this thesis work so well.
I-Con: A Unifying Framework for Representation Learning
We introduce a single equation that unifies 23+ machine learning algorithms. We use this equation to introduce a periodic table of machine learning. Several of the works in this thesis appear in the table.
Contact
For feedback, questions, or press inquiries, please contact Mark Hamilton.