Binary Markup Toolkit website
Welcome to the Binary Markup Toolkit website.
Binary Markup Toolkit is a suite of software tools for use in Digital Forensics. It was developed by James Clark whilst studying
for a Master’s degree in Forensic Computing at the Cyber Technology Institute of De Montfort University, Leicester. The software is available free-of-charge
to bona fide forensic practitioners and researchers. Extracts from the Master’s degree dissertation describing applications for the technology are also available.
James also maintains several related tools:
If you would like to use the software or find out more, please use the contact form below.
What is Binary Markup Toolkit?
Binary Markup Toolkit (BMTK) processes data into Binary Markup Language (BML). It includes tools to:
- Generate BML from binary data, "raw" image files, storage devices or filesystems
- Convert BML to other convenient forms such as CSV and SQLite3 databases
- Display BML as an annotated hexadecimal dump
- Generate forensic timelines from BML
The currently supported BMTK workflows are shown below:
What is Binary Markup Language?
BML is an XML-based language for describing the provenance of arbitrary binary data. It is human readable and can be authored by hand or generated automatically by software. It describes the location and size of fields within the underlying data. It is data agnostic and can represent a filesystem such as FAT or NTFS, or an application file format such as JPEG. Optionally, it can also describe hierarchical data relationships, field names, interpreted data values/types and descriptions.
BML is designed for forensic computing, protocol debugging, reverse engineering and similar scenarios. Typically, a single BML file is associated with a single binary file, forensic "image" or protocol "dump".
BML can be used like a "microscope" to dig deep into binary data and put artefacts into accurate context. Practitioners can use BML to aid investigations and share findings.
An example of BML, from a Windows shortcut file, is shown below:
Is BML the same as DFXML?
DFXML is another XML-based language for use in digital forensics. It was created by Simson Garfinkel and uses specific XML elements to describe certain file system metadata, file locations and Windows Registry values. With some exceptions, DFXML does not describe the actual location of binary data such as metadata.
BML takes a different, lower-level, approach and is designed to record the detailed content of arbitrary binary data. BML is therefore typically more verbose than DFXML. In some cases BML and DFXML may be complementary.
BMTK Technical Requirements
BMTK has the following technical requirements:
- Microsoft Windows XP/2003 or later (32-bit or 64-bit)
- 256MB of RAM (4GB or more recommended)
- Approximately 10MB of free storage for BMTK software and documentation
- Sufficient free storage for BML processing (application dependent) *
There is no functional difference between the 32-bit and 64-bit software and the 32-bit build may be used on most 64-bit systems. The 64-bit build may provide enhanced performance on 64-bit systems and is recommended where available. Please contact the author if you need to use BMTK on an operating system prior to Windows XP.
* BML can be verbose and a BML description of binary data can be several times the size of the original data. BML files larger than 10GB are quite common and we suggest a typical system should have at least 100GB of free storage for practical use with BMTK.
Annotated hexadecimal dump
BMTK can generate annotated hexadecimal dumps from binary data. The format of these may be very familiar to practitioners who have attended a certain popular UK forensic course. An example for a Windows shortcut (.LNK) file is below:
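The following is not BMTK output. It is a minimal Python sketch, independent of BMTK, of the general idea of annotating a hexadecimal dump with field names. The field offsets and sizes are those of the 76-byte ShellLinkHeader in Microsoft's [MS-SHLLINK] specification; the function name `annotate`, the output layout and the synthetic sample header are assumptions made for illustration only.

```python
# Layout of the 76-byte ShellLinkHeader per [MS-SHLLINK]: (offset, size, name).
LNK_HEADER_FIELDS = [
    (0, 4, "HeaderSize"),
    (4, 16, "LinkCLSID"),
    (20, 4, "LinkFlags"),
    (24, 4, "FileAttributes"),
    (28, 8, "CreationTime"),
    (36, 8, "AccessTime"),
    (44, 8, "WriteTime"),
    (52, 4, "FileSize"),
    (56, 4, "IconIndex"),
    (60, 4, "ShowCommand"),
    (64, 2, "HotKey"),
    (66, 2, "Reserved1"),
    (68, 4, "Reserved2"),
    (72, 4, "Reserved3"),
]


def annotate(data, fields):
    """Return one text line per field: offset, hex bytes, field name.

    Hypothetical helper; the layout is illustrative, not BMTK's format.
    """
    lines = []
    for offset, size, name in fields:
        chunk = data[offset:offset + size]
        if len(chunk) < size:
            break  # input is shorter than the described layout
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        # 47 characters is the width of 16 bytes printed as "XX XX ..."
        lines.append(f"{offset:08X}  {hex_bytes:<47}  {name}")
    return lines


# Synthetic header: HeaderSize (0x4C), the Shell Link CLSID, then zero bytes.
sample = bytes.fromhex("4C000000" "0114020000000000C000000000000046") + bytes(56)

for line in annotate(sample, LNK_HEADER_FIELDS):
    print(line)
```

To try this on a real shortcut, read the first 76 bytes of the .LNK file in binary mode and pass them in place of `sample`.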
Binary Markup Toolkit Components
BMTK currently includes the following tools:
Tool | Purpose |
BMCONSOLE | Process a "raw" image file, storage device or file system into a BML document |
BML2CSV | Convert a BML document to industry standard CSV format suitable for further processing using other tools |
BML2DB | Convert a BML document to a SQLite3 database suitable for direct querying or further processing using other tools |
BML2DUMP | Convert a BML database file (produced by BML2DB) to an annotated hexadecimal dump. This may be used to visualise the data descriptions provided by BML. |
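Because BML2DB produces a standard SQLite3 database, it can be opened with any SQLite client. The database schema is not described on this page, so the Python sketch below assumes nothing about table or column names: it opens a database read-only and lists whatever tables and columns it finds. The function name `list_tables` and the file name in the usage note are hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: [column_name, ...]} for a SQLite3 database.

    The file is opened read-only so that the evidence file is not
    created or modified by accident.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
            quoted = '"' + table.replace('"', '""') + '"'
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

For example, `list_tables("case001.db")` (a hypothetical file name) returns a dictionary that can be printed to see which tables are available before writing any queries.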
BMTK Agents (plug-ins)
BMTK is extensible and uses agents (or “plug-ins”) to provide format specific processing. For example, an agent may target the JPEG file format or a complete file system such as FAT. BMTK agents are recursive and data identified by one agent (for instance in a filesystem) can be automatically routed to the most appropriate format specific agent. The following agents are currently available:
Agent | Purpose |
MBRDiskAgent | Partition table agent supporting MBR-style partitions (i.e. not GPT) and extended partitions. This agent allows complete disk images to be conveniently processed in a single session. |
FATFSAgent | FAT file system agent supporting the Microsoft implementation of FAT12/16/32 and long file name extensions. |
NTFSAgent | NTFS file system agent supporting Microsoft NTFS v1.2 (Windows NT v3.51) and later |
MFTAgent | NTFS Master File Table agent for detailed processing of NTFS MFT records. |
INDXAgent | NTFS INDX stream agent for detailed processing of non-resident NTFS indexes (typically directories) |
UsnJrnlAgent | NTFS agent for detailed processing of filesystem USN Change Journal ($UsnJrnl:$J). The complete source code for this agent is in the SDK. |
WinShellLinkAgent | Windows Shell Link (shortcut) agent |
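The recursive routing described above can be pictured with a small Python sketch. This is a conceptual illustration only and not the BMTK agent interface (real agents are Windows DLLs): the function names, the routing table and the returned (offset, size, description) tuples are all hypothetical, and a 512-byte sector size is assumed. Only the MBR layout itself (partition table at offset 446, 16-byte entries, 0x55 0xAA signature at offset 510) is standard.

```python
import struct

SECTOR = 512  # assumed sector size


def fat_agent(data, base):
    # Placeholder: a real agent would describe the boot sector, FATs, etc.
    return [(base, len(data), "FAT volume (details omitted)")]


def ntfs_agent(data, base):
    # Placeholder: a real agent would describe the boot sector, $MFT, etc.
    return [(base, len(data), "NTFS volume (details omitted)")]


# Hypothetical routing table: MBR partition type byte -> agent.
# Type 0x07 is shared by several filesystems, so a real agent would
# also check the volume boot sector before committing to NTFS.
AGENTS_BY_TYPE = {
    0x01: fat_agent, 0x04: fat_agent, 0x06: fat_agent,
    0x0B: fat_agent, 0x0C: fat_agent, 0x0E: fat_agent,
    0x07: ntfs_agent,
}


def mbr_agent(data, base=0):
    """Describe an MBR and route each primary partition to another agent.

    Extended partitions are not followed in this sketch.
    """
    if len(data) < SECTOR or data[510:512] != b"\x55\xAA":
        return []
    fields = [(base + 446, 64, "Partition table"),
              (base + 510, 2, "Boot signature")]
    for i in range(4):
        entry_offset = 446 + 16 * i
        entry = data[entry_offset:entry_offset + 16]
        ptype = entry[4]
        start_lba, sectors = struct.unpack_from("<II", entry, 8)
        if ptype == 0 or sectors == 0:
            continue  # unused slot
        fields.append((base + entry_offset, 16,
                       f"Partition entry {i} (type 0x{ptype:02X})"))
        agent = AGENTS_BY_TYPE.get(ptype)
        if agent is not None:
            start = start_lba * SECTOR
            # Hand the partition's bytes to the more specific agent.
            fields += agent(data[start:start + sectors * SECTOR], base + start)
    return fields
```

Each agent returns descriptions with absolute offsets, so the output of nested agents can be merged into a single description of the whole image.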
BMTK Software Development Kit
The BMTK Software Development Kit (SDK) describes how additional BMTK agents can be created to process new data formats. Agents are implemented using standard Windows DLLs. The interface is primarily designed for the C/C++ languages but is compatible with any language that can produce standard DLLs. If you would like to develop a new BMTK agent, or need help developing an agent, please use the contact form below.
BMTK Documentation
The BMTK Quick Start Guide explains how to install BMTK and perform basic BML operations. It includes a number of walkthroughs introducing the software. These include:
- Using BMTK to process a "raw" FAT image file to BML
- Using BMTK to process a "raw" $MFT image file to BML
- Using BMTK to process a complete NTFS file system to BML
- Using BMTK to process a disk image (multiple partitions) to BML
- Using BMTK to convert BML to CSV format
- Using BMTK to convert BML to SQLite3 format
- Using BMTK to convert BML to an annotated hex dump
- Using BMTK to process NTFS USN Change Journal to BML / activity timeline
Binary Markup Toolkit (BMTK) Quick Start Guide
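As background to the USN Change Journal walkthrough, the Python sketch below decodes a single version 2 journal record and converts its timestamp, which is the basic step behind any timeline built from $UsnJrnl:$J. It is independent of BMTK and does not reflect how UsnJrnlAgent is implemented. The field layout follows Microsoft's documented USN_RECORD_V2 structure; the function names, the returned dictionary keys and the synthetic record are assumptions for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# Fixed-size part of USN_RECORD_V2 (60 bytes, little-endian):
# RecordLength, MajorVersion, MinorVersion, FileReferenceNumber,
# ParentFileReferenceNumber, Usn, TimeStamp, Reason, SourceInfo,
# SecurityId, FileAttributes, FileNameLength, FileNameOffset
USN_V2 = struct.Struct("<IHHQQqqIIIIHH")

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_usn_v2(buf, offset=0):
    """Decode one USN_RECORD_V2 starting at `offset` in `buf`."""
    (length, major, minor, file_ref, parent_ref, usn, timestamp,
     reason, source_info, security_id, attributes,
     name_len, name_off) = USN_V2.unpack_from(buf, offset)
    if major != 2:
        raise ValueError(f"unsupported USN record version {major}.{minor}")
    name_start = offset + name_off
    name = buf[name_start:name_start + name_len].decode("utf-16-le")
    return {
        "length": length,
        "file_ref": file_ref,
        "parent_ref": parent_ref,
        "usn": usn,
        "timestamp": filetime_to_datetime(timestamp),
        "reason": reason,
        "attributes": attributes,
        "name": name,
    }


# Synthetic record for "a.txt", padded to an 8-byte boundary (72 bytes).
_name = "a.txt".encode("utf-16-le")
record = USN_V2.pack(72, 2, 0, 5, 6, 7, 116444736000000000,
                     0x100, 0, 0, 0x20, len(_name), 60) + _name + b"\x00\x00"
print(parse_usn_v2(record))
```

Sorting decoded records by their `timestamp` value gives a simple activity timeline. Each record's length field gives the offset of the next record, although a real journal also contains zero-filled regions that a parser has to skip.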
BMTK Research Applications
Two chapters in the final Master’s degree report describe practical research applications for BMTK. The chapters and associated appendices are available from the links below:
Chapter | Summary |
Chapter 7 | This chapter discusses how BML and the BMTK software can be used to investigate and explain the behaviour of timestamps in the Microsoft NTFS filesystem. This work clarifies a long-standing uncertainty in the digital forensic literature and also illustrates how a similar approach could be used to investigate other filesystem and file format behaviour using human-readable BML data descriptions. |
Chapter 8 | This chapter takes this concept further and explores how BML and BMTK can be used to investigate other forensic artefacts generated by typical user activity on a computer running Microsoft Windows. An experiment was conducted to simulate a realistic forensic case scenario, and BML and BMTK were then used to develop a timeline of past user actions. The scenario incorporated several distinct elements and illustrated how BML data descriptions can be used to establish links between them. The detailed analysis procedure and findings are presented in a form similar to a forensic practitioner log. This experiment identified several features of the NTFS USN Change Journal and Windows Shell Link (shortcut) files that can be used to investigate historic activity using filesystem metadata alone. The discussion highlights these and also describes suggestions for further work that could be carried out. |
Who can use BMTK? What's the catch?
BMTK is available free-of-charge to bona fide forensic practitioners working in law enforcement, academia or similar. You are free to use the software for any purpose. I would welcome bug reports or suggestions for new features. Please tell me if you think the software is useful or useless. I won’t be offended!
Download BMTK / Ask Question
If you would like to use BMTK, report a bug, make a suggestion or just ask a question, please use the form below: