File formats and software used in research data depend on how researchers collect and analyse their data. They are often discipline-specific and established within the respective domain. In general, this leads to a very heterogeneous file format environment.
The use of open formats instead of vendor-specific, proprietary formats is mandatory for long-term digital preservation, as it ensures accessibility and reuse. This means that data may need to be converted from the working format used in research to a format appropriate for preservation.
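As a small illustration of such a conversion, the sketch below re-encodes a plain text file as UTF-8, which is among the preferred text formats listed further below. It is a minimal example under stated assumptions, not a description of the repository's own tooling: the function names `convert_to_utf8` and `is_valid_utf8` are hypothetical, and the default source encoding (Latin-1) is only a guess that has to be adjusted to the actual file.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "latin-1") -> None:
    """Re-encode a text file as UTF-8, leaving the original file untouched.

    The source encoding is an assumption supplied by the caller; it is not
    detected automatically.
    """
    text = src.read_text(encoding=source_encoding)
    dst.write_text(text, encoding="utf-8")


def is_valid_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The original file is kept, so the working copy and the converted copy can both be retained.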
The mdw Repository preserves deposited data in their incoming formats (open or closed) as master renditions. In addition, an open rendition must be available for preview and indexing and to ensure long-term accessibility of the data, since it can be migrated to current formats over time.
In general, data deposited in the mdw Repository should be
Metadata support in file formats becomes an issue when the metadata must be embedded in the actual research data so that it 'travels' with the file itself and the data can be better understood by the designated community (self-documentation).
The following section provides an overview of the preferred formats for long-term preservation in the mdw Repository.
|Adobe Portable Document Format (PDF/A), OpenDocument Text (.odt); eXtensible Mark-up Language (XML) text (.xml) - according to an appropriate Document Type Definition (DTD) or schema (XSD); Hypertext Mark-up Language (HTML) (.html); plain text data, ASCII (.txt) - UTF-8 encoding||MS Word (.doc, .docx); Rich Text Format (.rtf); Adobe Portable Document Format (.pdf) - only if no PDF/A can be produced|
|OpenDocument Presentation (.odp)||MS PowerPoint (.ppt, .pptx)|
|TIFF version 6 uncompressed (.tif)||JPEG (.jpeg, .jpg) - only if created in this format; TIFF (other versions) (.tif, .tiff) - only if required by specific analysis software as master files; Adobe Portable Document Format (PDF/A) (.pdf); standard applicable RAW image format (.raw) - as master files; Photoshop files (.psd) - only if required by specific analysis software as master files|
|Scalable vector graphics format (.svg)||Encapsulated Postscript files (.eps); Adobe Illustrator files (.ai) - only if required by specific software as master files|
|Waveform Audio Format (WAV) (.wav)||MPEG-1 Audio Layer 3 (.mp3) - only if created in this format|
The formats for audio editing project files are still to be defined.
Please contact firstname.lastname@example.org in case other audio formats are required.
Please contact email@example.com for current video formats.
These software packages in particular tend to use proprietary file formats by default, though most current products offer export facilities to open formats (e.g. XML-based data export options). MAXQDA, for example, can export the whole project, including the raw data, coding tree, coded data, and associated data (mainly memos and notes), in open formats. Researchers should therefore deposit the working (closed) MAXQDA master project file together with open variants of the data in order to ensure long-term accessibility.
|SPSS portable format (.por)||proprietary SPSS format (.sav)|
|MAXQDA XML export||proprietary MAXQDA format|
|comma-separated values (CSV) file (.csv), OpenDocument Spreadsheet (.ods)||MS Excel (.xls/.xlsx)|
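The format tables above can be turned into a simple pre-deposit check. The following sketch classifies a file by its extension; it is only an illustration, not the repository's validation logic. The labels "preferred" and "accepted" are an assumed reading of the first and second table columns, the function name `classify` is hypothetical, and the extension sets are transcribed from the tables and are not exhaustive. Extensions that appear in both columns (`.pdf`, `.tif`) cannot be decided from the name alone and are flagged for manual inspection.

```python
from pathlib import Path

# Extensions transcribed from the tables above (illustrative, not exhaustive).
# Assumption: first table column = "preferred", second column = "accepted".
PREFERRED = {".odt", ".xml", ".html", ".txt", ".odp", ".svg",
             ".wav", ".por", ".csv", ".ods"}
ACCEPTED = {".doc", ".docx", ".rtf", ".ppt", ".pptx", ".jpeg", ".jpg",
            ".tiff", ".raw", ".psd", ".eps", ".ai", ".mp3", ".sav",
            ".xls", ".xlsx"}
# PDF/A vs. plain PDF and TIFF 6 vs. other TIFF versions share an extension,
# so the file content has to be inspected to decide.
NEEDS_INSPECTION = {".pdf", ".tif"}


def classify(filename: str) -> str:
    """Classify a file name by extension according to the tables above."""
    ext = Path(filename).suffix.lower()
    if ext in NEEDS_INSPECTION:
        return "needs inspection"
    if ext in PREFERRED:
        return "preferred"
    if ext in ACCEPTED:
        return "accepted"
    return "unlisted"
```

A result of "unlisted" (for example for audio or video formats not named above) is a cue to contact the repository rather than a rejection.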
In general, the following archive types are supported: TAR, GZIP, ZIP.
Please contact firstname.lastname@example.org regarding supported ISO images of entire computer systems.
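The supported archive types named above (TAR, GZIP, ZIP) can be produced with standard tooling. The sketch below uses Python's standard library to package a folder either as a gzip-compressed TAR archive or as a ZIP archive. Combining TAR and GZIP into a single `.tar.gz` file is an assumption made for this example, as are the function names `make_tar_gz` and `make_zip`. The archive should be written outside the folder being packaged.

```python
import tarfile
import zipfile
from pathlib import Path


def make_tar_gz(source_dir: Path, archive_path: Path) -> None:
    """Package source_dir (recursively) as a gzip-compressed TAR archive."""
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps only the folder name, not the full local path.
        tar.add(str(source_dir), arcname=source_dir.name)


def make_zip(source_dir: Path, archive_path: Path) -> None:
    """Package all files below source_dir as a ZIP archive."""
    with zipfile.ZipFile(str(archive_path), "w",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the folder name is kept.
                zf.write(str(path),
                         arcname=path.relative_to(source_dir.parent).as_posix())
```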