IARPA to hold Proposers’ Day conference for MATERIAL program
On August 1, the Intelligence Advanced Research Projects Activity (IARPA) released a notice publicizing its upcoming Proposers’ Day Conference for the Machine Translation for English Retrieval of Information in Any Language (MATERIAL) Program.
The Intelligence Advanced Research Projects Activity (IARPA) will host a Proposers’ Day Conference for the MATERIAL Program on September 27, 2016, in anticipation of the release of a new solicitation in support of the program. The Conference will be held from 9:00 AM to 5:00 PM EDT in the Washington, DC metropolitan area. The purpose of the conference will be to provide information on the MATERIAL Program and the research problems the program aims to address, to answer questions from potential proposers, and to provide a forum for potential proposers to present their capabilities for teaming opportunities.
This announcement serves as a pre-solicitation notice and is issued solely for information and planning purposes. The Proposers’ Day Conference does not constitute a formal solicitation for proposals or proposal abstracts. Conference attendance is voluntary and is not required to propose to future solicitations (if any) associated with this program. IARPA will not provide reimbursement for any costs incurred to participate in this Proposers’ Day.
The MATERIAL performers will develop an “English-in, English-out” information retrieval system that, given a domain-sensitive English query, will retrieve relevant data from a large multilingual repository and display the retrieved information in English as query-biased summaries. MATERIAL queries will consist of two parts: a domain specification and an English word (or string of words) that captures the information need of an English-speaking user, e.g., “zika virus” in the domain of GOVERNMENT vs. “zika virus” in the domain of HEALTH, or “asperger’s syndrome” in the domain of EDUCATION vs. “asperger’s syndrome” in the domain of SCIENCE. The English summaries produced by the system should convey the relevance of the retrieved information to the domain-limited query, enabling an English-speaking user to determine whether each retrieved document meets the information need behind the query.
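As a rough illustration of the two-part query described above, the Python sketch below pairs a domain with English query terms. The class name `MaterialQuery` and its field names are hypothetical, chosen only for this example; the notice does not specify any query format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialQuery:
    """Hypothetical two-part query: a domain plus English query terms."""

    domain: str  # e.g. "HEALTH"; domain labels are taken from the notice's examples
    terms: str   # an English word or string of words


# The same terms issued under two different domains form two distinct queries.
q_gov = MaterialQuery(domain="GOVERNMENT", terms="zika virus")
q_health = MaterialQuery(domain="HEALTH", terms="zika virus")

print(q_gov == q_health)              # compares both domain and terms
print(q_gov.terms == q_health.terms)  # compares the terms only
```

Keeping the domain as a separate field, rather than folding it into the query string, mirrors the notice's point that identical terms can express different information needs depending on the domain.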
Current methods to produce similar technologies require a substantial investment in training data and/or language-specific development and expertise, entailing many months or years of development. A goal of this program is to drastically decrease the time and data needed to field systems capable of fulfilling an English-in, English-out task. Limited machine translation and automatic speech recognition training data will be provided from multiple low-resource languages to enable performers to learn how to quickly adapt their methods to a wide variety of materials in various genres and domains. As the program progresses, performers will apply and adapt these methods to new languages in increasingly short time frames. Program data will include formal and informal genres of text and speech, which will not be fully captured by the training data. Image and video are out of scope for this program.
Performers will be evaluated, relative to a baseline system, on their ability to accurately retrieve materials relevant to an English domain-specific query from a database of multi-domain, multi-genre documents in a low-resource language, and on their ability to convey the relevance of those documents through summaries presented to English-speaking domain experts.
Full information is available here.