BAE Systems FAST Labs moves to DARPA SafeDocs phase 3

As announced in December 2019, BAE Systems’ FAST Labs research and development organization was contracted under the SafeDocs program to develop new cyber tools designed to help prevent vulnerabilities in electronic files that can lead to cyberattacks.

Phase 1 and Phase 2 of the SafeDocs program set the lofty goal of dramatically improving software’s ability to detect and reject invalid or maliciously crafted input data. Based on its performance in these two phases, FAST Labs was recently awarded a Phase 3 option to collaborate with defense and industry partners to refine its toolset.

“As is often the case with disruptive early technology R&D programs, when we first began work on the Phase 1 contract, our approach was all conjecture,” said David Woolrich, technical director at BAE Systems’ FAST Labs. “At the start, we didn’t have any fully built tools, proven mathematical theorems, or ideas tested against actual data. Now, we have done all three with really solid results.”

FAST Labs’ R&D team created a tool suite to understand and identify safe features of electronic data formats using a Language-Theoretic Security (LangSec) approach developed by DARPA program manager Sergey Bratus and his collaborators. LangSec offers a systematic approach to parser design and input validation. A parser, which breaks data inputs down into manageable objects for further processing, can itself contain exploitable flaws and behaviors. The team tied its approach back to the underlying format grammar to classify files and identify areas for improvement in existing parsers.
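To make the LangSec idea concrete, here is a minimal sketch in which input is fully recognized against a grammar before any processing takes place. The toy "name=digits" format and the function names `recognize` and `process` are assumptions made up for illustration; they are not part of the SafeDocs tools.

```python
import re

# Hypothetical toy format (an assumption for illustration only):
# one or more "name=digits" fields separated by ";".
# The grammar is regular, so one anchored regex serves as the recognizer.
_TOY_GRAMMAR = re.compile(r"[a-z]+=[0-9]+(;[a-z]+=[0-9]+)*")


def recognize(data: str) -> bool:
    """Return True only if the entire input matches the toy grammar."""
    return _TOY_GRAMMAR.fullmatch(data) is not None


def process(data: str) -> dict[str, int]:
    """Reject invalid input up front; only then build objects from it."""
    if not recognize(data):
        raise ValueError("input rejected: does not match the format grammar")
    return {key: int(value)
            for key, value in (field.split("=") for field in data.split(";"))}
```

The design choice is that `process` never touches input the recognizer has not accepted in full, so malformed data is rejected rather than partially interpreted.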

The FAST Labs R&D team developed a technique that runs multiple existing file parsers, which together look for a wide range of features, to detect malicious payloads in files. For a complex format such as PDF, which can contain text processing, image rendering, links, and JavaScript, no single parser can analyze all of these features or detect every possible flaw or malicious feature. Having a diversity of parsers leaves little room for attackers to hide. FAST Labs’ tools exploit this diversity both theoretically and practically.
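The value of parser diversity can be sketched with a much simpler stand-in: run several independent parsers on the same bytes and flag any file on which they disagree. The two toy parsers and the length-prefixed format below are assumptions for illustration only; FAST Labs’ actual tooling is not described at this level of detail.

```python
from typing import Callable, Optional

# A parser returns the extracted payload, or None if it rejects the input.
Parser = Callable[[bytes], Optional[bytes]]


def strict_parser(data: bytes) -> Optional[bytes]:
    """Toy parser: the first byte must equal the exact payload length."""
    if not data or data[0] != len(data) - 1:
        return None
    return data[1:]


def lenient_parser(data: bytes) -> Optional[bytes]:
    """Toy parser: trusts the length byte and ignores any trailing bytes."""
    if not data or data[0] > len(data) - 1:
        return None
    return data[1:1 + data[0]]


def classify(data: bytes, parsers: list[Parser]) -> str:
    """Compare parser verdicts; any disagreement marks the file as suspicious."""
    results = [parse(data) for parse in parsers]
    if all(result is None for result in results):
        return "rejected"
    if any(result is None for result in results) or len(set(results)) > 1:
        return "suspicious"
    return "consistent"
```

In this sketch, bytes appended after the declared payload are invisible to the lenient parser but cause the strict one to reject the file, so the disagreement itself exposes the hidden content.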

Additionally, the approach combines topological methods and statistics, which are not commonly used in formal computer language design or analysis, making this work the first attempt of its kind, according to the company. The results have been extremely successful: in a recent exercise analyzing one million files, all but 44 were correctly classified, a misclassification rate of 0.0044 percent. This far exceeds the expectations for a tool at this stage of development.

“These are extraordinary results with far-reaching impact. The files we’re testing are used by everyone in defense and commercial settings, including PDF, JPEG, MPEG, and CSV files,” added Woolrich. “All of these formats are in daily use, so being able to reliably determine the safety of these files helps everyone.”

Source: BAE Systems
