Digital Data Structure Understanding Tools


Digital forensic investigators often encounter structured data in an unknown format. For an investigator to present accurate and complete evidence to the court, the data needs to be well understood. There has been at least one court case [1] where an incomplete understanding of data recovered from a web browser log led to misleading testimony in court. A better understanding of the data structure involved could have prevented the presentation of inaccurate testimony in this case.

There are several scenarios where such data can be encountered, e.g., a non-standard file system, a memory acquisition of an unknown OS version (or custom kernel), or a database with an unknown schema. An investigator has few automated options for analyzing digital objects in such an unstructured environment. File carving, for example, is often productive, but it applies to individual files with a known structure and a known signature identifying the file type. An investigator can try to unravel the structure of an unknown object manually, but this is a tedious, error-prone process.

NIST would like to see development of one or more tools that can aid the understanding of the structure of a digital object. Such a tool would be useful to developers of forensic tools used by law enforcement and to the digital forensics research currently conducted at NIST.

There are many different ways to approach this problem. One way such a tool could work is to take a baseline image of the object, apply a series of operations, and analyze the changes each operation produces. The tool would be given a description of each applied operation and a list of information to look for within the changes; it could then compare the result against the state before the operation was applied to infer some parameter of the object.
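As a rough illustration of the baseline-and-compare idea, the following Python sketch lists the byte ranges that differ between two images of the same object. It is not part of the solicitation: the function name `changed_ranges` is hypothetical, and the sketch assumes both images are the same size and fit in memory.

```python
def changed_ranges(before: bytes, after: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) byte ranges where the two images differ.

    Assumption for this sketch: both images have the same length.
    """
    if len(before) != len(after):
        raise ValueError("this sketch assumes equal-size images")

    ranges = []
    start = None  # start offset of the run of differing bytes being tracked
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            if start is None:
                start = offset
        elif start is not None:
            ranges.append((start, offset))
            start = None
    if start is not None:
        # The final run of differences extends to the end of the image.
        ranges.append((start, len(before)))
    return ranges


# Toy example: a 16-byte "image" modified in two places.
baseline = bytes(16)
modified = bytearray(baseline)
modified[4:7] = b"abc"
modified[12] = 1
print(changed_ranges(baseline, bytes(modified)))  # [(4, 7), (12, 13)]
```

Each reported range is only a candidate location. Deciding what a range means still depends on the description of the operation that produced it.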

Depending on the type of data structure being reverse engineered, a set of questions can be posed. For example, reverse engineering a file system might proceed as follows:

• Create a single file, making note of the time. The differences between the state before creating the file and the state after creating it give a set of candidate locations for the file name and for any metadata, such as recorded time values or file size.

• Append data to the file created above. This operation may reveal the location of a file size field.

• Create some more files. This could reveal the basic structure of directory entries and general layout.

In general, the tools would make a small modification to a digital object with known data and then examine the raw object for changes including the known data.
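The file system steps above can be sketched in Python as a search for known values that appear only after an operation. This is a minimal illustration under stated assumptions, not a prescribed method: the helper names (`candidate_encodings`, `find_all`, `locate_known_values`) are hypothetical, and the encodings tried (UTF-8 and UTF-16LE names, unsigned 16/32/64-bit sizes in both byte orders) are guesses at what an unknown format might use.

```python
import struct


def candidate_encodings(value: int):
    """Yield (label, bytes) for a few assumed fixed-width integer encodings."""
    for width, code in ((2, "H"), (4, "I"), (8, "Q")):
        if 0 <= value < 1 << (8 * width):
            yield f"u{width * 8}-le", struct.pack("<" + code, value)
            yield f"u{width * 8}-be", struct.pack(">" + code, value)


def find_all(haystack: bytes, needle: bytes):
    """Yield every offset at which needle occurs in haystack."""
    pos = haystack.find(needle)
    while pos != -1:
        yield pos
        pos = haystack.find(needle, pos + 1)


def locate_known_values(before: bytes, after: bytes,
                        file_name: str, file_size: int):
    """Return sorted (offset, label) pairs for known values that are present
    in the image after the operation but were not at that offset before it."""
    probes = [
        ("name-utf8", file_name.encode("utf-8")),
        ("name-utf16le", file_name.encode("utf-16-le")),
    ]
    probes += [("size-" + label, enc)
               for label, enc in candidate_encodings(file_size)]

    hits = []
    for label, needle in probes:
        for offset in find_all(after, needle):
            # Keep only matches introduced by the operation.
            if before[offset:offset + len(needle)] != needle:
                hits.append((offset, label))
    return sorted(hits)


# Toy example: a 64-byte "image" in which creating a 1234-byte file wrote
# the name at offset 8 and a little-endian 32-bit size at offset 20.
baseline = bytes(64)
image = bytearray(baseline)
image[8:16] = b"EVID.TXT"
image[20:24] = struct.pack("<I", 1234)
for offset, label in locate_known_values(baseline, bytes(image), "EVID.TXT", 1234):
    print(offset, label)
```

A small size value can match at several integer widths when the neighbouring bytes happen to be zero. Repeating the append step with a larger size is one way to narrow the candidates. Time values could be probed the same way once an encoding is assumed.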

This is just one of many possible approaches. A successful applicant is expected to be creative and innovative.

Phase I expected results:
An architecture of and development plan for an automated method to discover the layout and structure of an unknown data object of interest to a digital forensics investigator.

Phase II expected results:
A demonstration version of a tool that, given an unknown data object, deduces the parts of the object's structure that are relevant to a forensic examination. This tool will be marketable to law enforcement and forensic science entities.

NIST may be available for collaboration, consultation, input, and discussion.
