Data Formats

While DICOM is a widespread standard for storing and transmitting image data in many medical applications, no standardized format currently exists for raw USCT acquisition data and its meta information.

There are four main reasons to agree on a standardized format for USCT data, each of which fosters exchange and facilitates collaboration:

Data Exchange

A well-defined, community-driven data format facilitates the exchange of complex data sets.
The format needs to provide APIs for all common programming languages (MATLAB, Python, C/C++, Fortran) as well as tools to validate the correctness of a file.

Efficiency
Storage might be cheap, but data access and processing can become bottlenecks for large acquisition systems.

Data Organization
Including meta information (e.g., water temperature, transducer locations, …) in addition to raw data (e.g., A-scans) is crucial for all imaging methods.
Without a flexible data container, this easily results in formats that are hard to maintain, reproduce, and exchange.

Provenance
A critical aspect of science is the ability to reproduce results, and modern data formats should enable and encourage this.

A custom-tailored USCT data format will help to address these challenges.
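
To make the validation requirement mentioned under Data Exchange more concrete, the following minimal sketch checks a set of metadata against a list of required fields. The field names (water_temperature_celsius, sampling_rate_hz, transducer_positions) and the function validate_metadata are hypothetical and chosen only for illustration; no USCT schema has been agreed on yet.

```python
from numbers import Real

# Hypothetical required fields and their expected types; an actual USCT
# schema would have to be agreed on by the community.
REQUIRED_FIELDS = {
    "water_temperature_celsius": Real,
    "sampling_rate_hz": Real,
    "transducer_positions": list,
}


def validate_metadata(metadata):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in metadata:
            problems.append(f"missing field: {name}")
        elif not isinstance(metadata[name], expected_type):
            problems.append(f"wrong type for field: {name}")
    return problems
```

Returning a list of problems rather than raising on the first error lets a validation tool report everything that is wrong with a file in a single pass.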

HDF5

As an example, we propose following an approach similar to that of the Adaptable Seismic Data Format (ASDF), which is based on HDF5.

HDF5 is a versatile data model that can represent very complex data objects and a wide variety of metadata. It is used as the standard data container in many scientific disciplines. For instance, MATLAB .mat files (version 7.3) are based on HDF5.
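
As a rough illustration of how such a container can hold raw data and meta information side by side, the sketch below uses the h5py Python package to store a block of synthetic A-scans together with acquisition metadata and then reads a single trace back. The group, dataset, and attribute names (acquisition, a_scans, transducer_positions, water_temperature_celsius, sampling_rate_hz) and all values are hypothetical and not part of any agreed USCT layout. The file is kept in memory, so nothing is written to disk.

```python
import h5py
import numpy as np

# Synthetic stand-in for raw data: 8 A-scans with 1024 samples each.
rng = np.random.default_rng(seed=0)
a_scans = rng.standard_normal((8, 1024)).astype(np.float32)

# driver="core" with backing_store=False keeps the file purely in memory.
with h5py.File("usct_example.h5", "w", driver="core", backing_store=False) as f:
    acq = f.create_group("acquisition")

    # One chunk per A-scan, so single traces can be read without
    # loading (or decompressing) the rest of the dataset.
    dset = acq.create_dataset(
        "a_scans", data=a_scans, chunks=(1, 1024), compression="gzip"
    )
    dset.attrs["sampling_rate_hz"] = 10.0e6  # hypothetical value

    # Meta information lives next to the raw data in the same file.
    acq.attrs["water_temperature_celsius"] = 36.5  # hypothetical value
    acq.create_dataset("transducer_positions", data=np.zeros((8, 3)))

    # Partial read: fetch only the first A-scan and one metadata entry.
    first_trace = f["acquisition/a_scans"][0, :]
    temperature = f["acquisition"].attrs["water_temperature_celsius"]
    stored_shape = f["acquisition/a_scans"].shape
```

Chunking by A-scan is only one possible choice; the best chunk shape depends on how reconstruction codes typically access the data.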

Examples for the organization of such files for USCT applications will follow.