DARPA posts new LwLL BAA
On August 6, the Defense Advanced Research Projects Agency (DARPA) posted a new broad agency announcement for its Learning with Less Labels (LwLL) program (Solicitation Number: HR001118S0044). Abstracts are due August 21 at 12:00 p.m. ET, and proposals are due by 12:00 p.m. ET on October 2.
DARPA is soliciting innovative research proposals in the area of machine learning and artificial intelligence. Proposed research should investigate innovative approaches that enable revolutionary advances in science, devices, or systems. Specifically excluded is research that primarily results in evolutionary improvements to the existing state of practice.
In supervised machine learning (ML), the ML system learns by example to recognize things, such as objects in images or speech. Humans provide these examples to ML systems during their training in the form of labeled data. With enough labeled data, we can generally build accurate pattern recognition models.
The problem is that training accurate models currently requires lots of labeled data. For tasks like machine translation, speech recognition, or object recognition, deep neural networks (DNNs) have emerged as the state of the art, due to the superior accuracy they can achieve. To gain this advantage over other techniques, however, DNN models need more data, typically requiring 10^9 or 10^10 labeled training examples to achieve good performance.
The commercial world has harvested and created large sets of labeled data for training models. These datasets are often created via crowdsourcing, a cheap and efficient way to create labeled data. Unfortunately, crowdsourcing is often not possible for proprietary or sensitive data. Creating datasets for these sorts of problems can cost 100x more and take 50x longer to label.
To make matters worse, machine learning models are brittle, in that their performance can degrade severely with small changes in their operating environment. For instance, the performance of computer vision systems degrades when data is collected from a new sensor or from new collection viewpoints. Similarly, dialog and text understanding systems are very sensitive to changes in formality and register. As a result, additional labels are needed after initial training to adapt these models to new environments and data collection conditions. For many problems, the labeled data required to adapt models to new environments approaches the amount required to train a new model from scratch.
The goal of this program is to make the process of training machine learning models more efficient in two ways: by reducing the amount of labeled data required to build a model by six or more orders of magnitude, and by reducing the amount of data needed to adapt a model to a new environment to between tens and hundreds of labeled examples.
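To make the scale of that target concrete, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only and not part of the BAA: the function name reduced_label_count is hypothetical, and the baselines assume the billion-scale label counts (10^9 to 10^10) that the announcement cites for DNNs.

```python
def reduced_label_count(baseline_labels: int, orders_of_magnitude: int) -> int:
    """Labels remaining after cutting a baseline by N orders of magnitude.

    Hypothetical helper for illustration; never returns fewer than one label.
    """
    return max(1, baseline_labels // 10 ** orders_of_magnitude)


# Assumed baselines: the 10^9 and 10^10 labeled examples cited for DNNs.
for baseline in (10**9, 10**10):
    remaining = reduced_label_count(baseline, 6)
    print(f"{baseline:,} labels -> {remaining:,} labels")
```

Under these assumptions, a six-order-of-magnitude reduction would leave on the order of thousands to tens of thousands of labeled examples for initial training.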
To achieve these massive reductions in the labeled data needed to train accurate models, the LwLL program will divide the effort into two technical areas (TAs). TA1 will focus on the research and development of learning algorithms that learn and adapt efficiently, and TA2 will formally characterize machine learning problems and prove the limits of learning and adaptation.
Full information is available in the solicitation (HR001118S0044).