Turku University Library

Research Council of Finland DMP

1. General description of data

What kinds of data is your research based on? What data will be collected, produced or reused? What file formats will the data be in? Additionally, give a rough estimate of the size of the data produced/collected.

The Research Council of Finland recommends using a table format to present different datasets.

It is crucial to demonstrate an understanding of the different types of data used in your research and the various data management actions required by each type. Ensure that the description matches your research plan.

Then detail each dataset separately in the table, covering the following points:

  • Estimated file size
  • Origin of the data
    • Previously collected data that is being reused in this project
    • Data collected specifically for this project
    • Data produced in the research project
  • Storage format
    • Possible file formats are .csv, .txt, .docx, .xlsx, .tif. It is important that the file formats used for data storage support the potential reuse of the data.
  • Storage location during the research
    • Seafile, Taltio, GitLab
  • Data owner
    • In contract research, UTU
  • Metadata (i.e., how to ensure the description of this particular dataset)
    • Readme files, hardware metadata, electronic logs
  • Can the data be opened after the research is completed?
    • At least metadata should be opened
  • Where will the dataset be stored after the research?
    • Discipline-specific data repository (repo)
  • Does the dataset contain sensitive or confidential information? If you collect and process personal data or otherwise confidential or protected data, list these in the table or list according to sensitivity.

Special or uncommon software should also be described in this section, especially if the software is coded or produced in your project.

Examples of dataset types:

| Dataset name and type | Source | Size | File format | Storage during the project | Owner | Metadata | Can the data be opened and where | Does the data include sensitive/personal information |
| 1 Mass cytometry | Produced for the project | 1 GB | .fcs | Seafile | UTU | readme | Yes, specific repository | No |
| 2 Lab notes | Produced during the project | < 10 MB | .doc, .pdf, .txt | Electronic lab notebook | UTU / PI | Program generates | No | No |
| 3 Interview transcripts | Produced for the project | < 6 MB | .txt | Seafile (encrypted) | UTU | readme, DDI | Yes, Finnish Social Science Data Archive | No, anonymised |
| 4 Interview recordings | Produced for the project | < 5 GB | .mp4 | Seafile (encrypted) | UTU | readme, DDI | No | Yes, identified personal information |
| 5 | | | | | | | | |

Columns marked in green are the ones the Academy requires. Adding the data marked in yellow to the table in section 1.1 makes it easier to fill in the rest of the DMP.


Section 1.2 explains how the data is kept high-quality and error-free. In research data management, quality refers to technical and external factors rather than to how well the data content fits the research question.

Write down what the research group has agreed on, for example:

  • Common practices to avoid errors resulting from different recording and analysis methods.
  • Version control processes, such as naming conventions or the use of git platforms.
  • Transcription verification process.
  • How data integrity is ensured when sharing information.
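
One common way to check data integrity when files are shared or moved is to record a checksum for each file and verify it at the receiving end. The following is a minimal sketch in Python, not a prescribed procedure: the function names and the manifest file name checksums.sha256 are assumptions made for illustration, and your storage service may already provide an equivalent check.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir, manifest_name="checksums.sha256"):
    """Write one '<digest>  <relative path>' line per file under data_dir.

    The manifest name is a hypothetical choice for this example.
    """
    data_dir = Path(data_dir)
    manifest = data_dir / manifest_name
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        if path == manifest:
            continue  # do not checksum the manifest itself
        rel_path = path.relative_to(data_dir).as_posix()
        lines.append(f"{sha256_of_file(path)}  {rel_path}")
    manifest.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return manifest


def verify_manifest(data_dir, manifest_name="checksums.sha256"):
    """Return relative paths that are missing or whose checksum has changed."""
    data_dir = Path(data_dir)
    manifest = data_dir / manifest_name
    mismatches = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        expected, rel_path = line.split("  ", 1)
        target = data_dir / rel_path
        if not target.is_file() or sha256_of_file(target) != expected:
            mismatches.append(rel_path)
    return mismatches
```

In this sketch the sender would run write_manifest on the dataset folder before sharing it, and the recipient would run verify_manifest on the received copy. An empty list means every listed file arrived unchanged.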

Example sentences that can be customised with your own details:

  • Our research group/I have established common procedures to ensure data integrity, minimising errors arising from inconsistent recording and analysis methods. We/I only use standardised protocols and formats.
  • Our research consortium/I promote(s) the use of standardised procedures for data management and analysis to foster consistency and accuracy across all studies, thereby reducing the potential for errors resulting from differing methodologies.
  • Our research group/I employ(s) version control processes, utilising Git as the platform of choice, along with standardised naming conventions, to ensure traceability and minimise errors in code development.
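
A naming convention is easier to follow when it can be checked automatically. The sketch below assumes a hypothetical convention of the form project_dataset_YYYY-MM-DD_vNN.ext, which is only an illustration and not a convention required by the funder or the university. Adapt the pattern to whatever your group has agreed on.

```python
import re

# Hypothetical convention: <project>_<dataset>_<YYYY-MM-DD>_v<NN>.<ext>
# Example: cytof_interviews_2024-03-01_v02.txt
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<dataset>[a-z0-9-]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def parse_name(filename):
    """Return the named parts of a conforming file name, or None otherwise."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def nonconforming(filenames):
    """Return the file names that do not follow the assumed convention."""
    return [name for name in filenames if parse_name(name) is None]
```

For example, nonconforming(["cytof_interviews_2024-03-01_v02.txt", "final FINAL.docx"]) returns only the second name, which could then be renamed before the file is committed or shared. Note that the pattern checks only the shape of the date, not whether it is a valid calendar date.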