Raw data files
Of the many image formats in use, HDF5, CBF and SMV (ADSC) are the most popular and standardized. Where possible, any other formats should be converted to CBF before archiving. CBF logically separates into two standards: one for describing the image encoding, and one for the metadata.
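As a rough illustration of telling the three formats apart before archiving, the sketch below inspects the first bytes of a file. The function name `sniff_image_format` is hypothetical, and the check rests on assumptions: the HDF5 signature is expected at offset 0 (no user block), a CBF file is expected to begin with the `###CBF: VERSION` line, and an SMV file is expected to open with a `{` followed by a `HEADER_BYTES` entry. Files that deviate from these conventions are reported as unknown.

```python
# Leading bytes assumed to identify each format in a quick check.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # HDF5 signature (assumes no user block)
CBF_MAGIC = b"###CBF: VERSION"     # opening line of a CBF file
SMV_KEY = b"HEADER_BYTES"          # first entry of an SMV (ADSC) text header


def sniff_image_format(path):
    """Return 'HDF5', 'CBF', 'SMV' or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(HDF5_MAGIC):
        return "HDF5"
    if head.startswith(CBF_MAGIC):
        return "CBF"
    if head.startswith(b"{") and SMV_KEY in head:
        return "SMV"
    return "unknown"
```

A file reported as unknown would be a candidate for conversion to CBF before archiving, although this check says nothing about whether the rest of the file is valid.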
Archiving raw data
As part of ensuring that raw data are Findable, Accessible, Interoperable and Re-usable (FAIR), raw data files should be deposited in open-access persistent archives that assign a Digital Object Identifier (DOI) to a data set (for example, Zenodo).
For raw image data, the recommended archive formats are zip, tar+gzip, bzip2, xz and HDF5 (with standard compression filters).
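As one possible way to produce a tar+gzip archive, the sketch below bundles a directory of image files using Python's standard `tarfile` module. The function name `archive_images`, the default `*.cbf` pattern and the flat layout inside the archive are illustrative assumptions, not requirements of any particular repository.

```python
import tarfile
from pathlib import Path


def archive_images(image_dir, archive_path, pattern="*.cbf"):
    """Bundle matching image files into a gzip-compressed tar archive.

    Returns the list of file names stored in the archive.
    """
    files = sorted(Path(image_dir).glob(pattern))
    with tarfile.open(archive_path, "w:gz") as tar:
        for image in files:
            # Store each file under its bare name (assumed flat layout).
            tar.add(image, arcname=image.name)
    return [image.name for image in files]
```

For example, `archive_images("scan01", "scan01.tar.gz")` would collect every `.cbf` file in a hypothetical `scan01` directory. Sorting the file list keeps the frame order in the archive predictable.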
Documenting metadata – imgCIF
Using tools that can read a collection of raw data files, a description of the associated metadata, including references to the locations of the raw data (binary) files, can be captured in plain text as an 'imgCIF'. An imgCIF file provides data in CIF format, which is both machine-readable and human-readable, with a comprehensive set of tags (defined in the imgCIF dictionary) for describing detector geometries and other experimental parameters, thus facilitating FAIR 're-usability'. The archived raw data, imgCIF metadata, and ultimately the published Raw Data Letter are all linked via DOIs.
Publishing raw data
A Word template for writing a Raw Data Letter is available. Submission of the Raw Data Letter to IUCrData should be accompanied by the imgCIF describing the archived raw data. The submitted Raw Data Letter along with a checkCIF validation report will be subject to peer review before publication.