Turku University Library

Research Data

Research Data Archives

Storing research data ensures that the data remains available, understandable and usable in the future. In general, researchers are advised to use discipline-specific publication channels and archives in their research projects (such as Horizon2020 projects) whenever they are available, because in those channels the data has the correct context and structure. When choosing an archive, assess whether it serves the openness and reuse of the data well enough. Some repositories are also suitable for long-term preservation even if there is no intention to open the data.

Good storage services include, for example:

Discipline-specific data archives search:

Choosing the File Format

File formats and programs change constantly. How do you choose a file format that will still be accessible after several years and that helps to avoid additional conversions?

The Finnish Social Science Data Archive encourages researchers to save at least one copy of their files in a format that is commonly used. This way, it is more likely that the files can be accessed in the future even though programs change and are updated. MIT Libraries provide instructions on selecting a file format on their research data webpages.

Table of great, not bad and avoidable file formats by format genre.

Great text file formats from the point of view of accessibility include .txt, .odt, .xml, and .html. Formats that are not too bad include .pdf, .rtf, and .docx. A text file format to avoid is .doc.

Great audio file formats include .flac and .wav. Not bad ones include .ogg and .mp3. Audio file formats to avoid are .wma, .ra, .ram, and any compressed audio file formats.

Great video file formats include .mp2, .mp4, and MKV. Video file formats to avoid include .wmv, .mov, .avi and any compressed file formats.

For image files, great formats include .tif, .png, .svg, and .jpg. As an image file format, .gif is not bad. Image file formats to avoid include .psd and any compressed formats.

Great data file formats include .sql, .csv and .xml. A format that is not bad is .xlsx. Data file formats to avoid are .xls and any proprietary database formats.
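The format ratings above can be turned into a quick check for a project folder. The following Python sketch is only an illustration, not an official tool: the name rate_format and the tier labels are hypothetical, the extension .mkv is assumed for the MKV format, and the general rules about "any compressed" or "any proprietary" formats are not modelled because they cannot be read from a file extension alone.

```python
from pathlib import Path

# Tiers copied from the format lists above. Extensions are lowercase.
# Assumption: ".mkv" stands in for "MKV formats"; the blanket rules about
# compressed or proprietary formats are deliberately left out.
FORMAT_TIERS = {
    "great": {
        ".txt", ".odt", ".xml", ".html",   # text
        ".flac", ".wav",                   # audio
        ".mp2", ".mp4", ".mkv",            # video
        ".tif", ".png", ".svg", ".jpg",    # image
        ".sql", ".csv",                    # data (.xml is listed under text)
    },
    "not bad": {".pdf", ".rtf", ".docx", ".ogg", ".mp3", ".gif", ".xlsx"},
    "avoid": {".doc", ".wma", ".ra", ".ram", ".wmv", ".mov", ".avi", ".psd", ".xls"},
}


def rate_format(filename: str) -> str:
    """Return the tier for a file name, or "unknown" if it is not listed."""
    # Only the last extension is considered, compared case-insensitively.
    ext = Path(filename).suffix.lower()
    for tier, extensions in FORMAT_TIERS.items():
        if ext in extensions:
            return tier
    return "unknown"


if __name__ == "__main__":
    for name in ["notes.txt", "survey.XLS", "report.docx", "scan.bmp"]:
        print(f"{name}: {rate_format(name)}")
```

A file rated "avoid" or "unknown" is a candidate for saving an additional copy in one of the commonly used formats, in line with the recommendation above.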

Persistent Identifiers (PIDs)

The University of Turku recommends that each researcher create their own ORCID identifier. It's useful in situations where, for example, the researcher changes their name, there are different spellings of the name, or there are several researchers with the same name. For more information, see the UTUCris guide.

Identifiers that can be used for published research data include, for example:

Archiving/Disposing of Research Data

The value of the research data, and what will be done with the data after the research project ends, should be decided at the beginning of the project. When considering permanent storage of the data, you have to take into account the Data Protection Regulation and the requirements of the research funders.

According to the Data Protection Regulation, data that includes personal information cannot be retained any longer than is required by its purpose of use. When the purpose for using the data ends, i.e. when the research project ends, you have to consider what should be done with the data. Is there a reason for storing the data, or should it be disposed of?