by Rob Toews
1. Unsupervised Learning
At a deeper level, supervised learning represents a narrow and circumscribed form of learning. Rather than being able to explore and absorb all the latent information, relationships and implications in a given dataset, supervised algorithms orient only to the concepts and categories that researchers have identified ahead of time.
In contrast, unsupervised learning is an approach to AI in which algorithms learn from data without human-provided labels or guidance.
Unsupervised learning more closely mirrors the way that humans learn about the world: through open-ended exploration and inference, without a need for the “training wheels” of supervised learning. One of its fundamental advantages is that there will always be far more unlabeled data than labeled data in the world (and the former is much easier to come by).
Unsupervised learning is already having a transformative impact in natural language processing. NLP has seen incredible progress recently thanks to a new unsupervised learning architecture known as the Transformer, which originated at Google about three years ago.
2. Federated Learning
The concept of federated learning was first formulated by researchers at Google in early 2017. The standard approach to building machine learning models today is to gather all the training data in one place, often in the cloud, and then to train the model on the data. But this approach is not practicable for much of the world’s data, which for privacy and security reasons cannot be moved to a central data repository. Rather than requiring one unified dataset to train a model, federated learning leaves the data where it is, distributed across numerous devices and servers on the edge. Instead, many versions of the model are sent out—one to each device with training data—and trained locally on each subset of data. The resulting model parameters, but not the training data itself, are then sent back to the cloud. When all these “mini-models” are aggregated, the result is one overall model that functions as if it had been trained on the entire dataset at once.
The original federated learning use case was to train AI models on personal data distributed across billions of mobile devices. More recently, healthcare has emerged as a particularly promising field for the application of federated learning. Beyond healthcare, federated learning may one day play a central role in the development of any AI application that involves sensitive data: from financial services to autonomous vehicles, from government use cases to consumer products of all kinds. Paired with other privacy-preserving techniques like differential privacy and homomorphic encryption, federated learning may provide the key to unlocking AI’s vast potential while mitigating the thorny challenge of data privacy.
Transformers were introduced in a landmark 2017 research paper. Previously, state-of-the-art NLP methods had all been based on recurrent neural networks (e.g., LSTMs). By definition, recurrent neural networks process data sequentially—that is, one word at a time, in the order that the words appear.
Transformers’ great innovation is to make language processing parallelized: all the tokens in a given body of text are analyzed at the same time rather than in sequence. In order to support this parallelization, Transformers rely heavily on an AI mechanism known as attention. Attention enables a model to consider the relationships between words regardless of how far apart they are and to determine which words and phrases in a passage are most important to “pay attention to.”
Transformers have been associated almost exclusively with NLP to date, thanks to the success of models like GPT-3. But just this month, a groundbreaking new paper was released that successfully applies Transformers to computer vision. Many AI researchers believe this work could presage a new era in computer vision. (As well-known ML researcher Oriol Vinyals put it simply, “My take is: farewell convolutions.”)