A researcher should consider, even before starting the research, how they will handle and use the research data, and what will happen to it after the research is completed. Careful planning is worthwhile, as it improves the quality and consistency of the data, saves time and effort during the research process, and prevents data loss and unintentional changes. Good management of research data helps the researcher meet the requirements of funders, enhances the reliability and reproducibility of the research, and facilitates the opening and publishing of the data.
When handling research data, consider at least the following aspects:
- What kind of data is it, and in what formats does it exist?
- How is the data documented and described? What metadata standards are used for its description? How can the data be found and accessed?
- Quality and consistency of the data: careful documentation of the collection and handling phases.
- Legal and ethical issues: copyright and other intellectual property rights, ownership and usage rights, licenses, transfer of rights, authorship, consent from research subjects, and sustainability considerations.
- Data protection: personal data and other confidential information.
- Who is responsible for the data? Who has the right to use the data and make decisions regarding it?
- What resources, both financial and human, does good data management require?
- Data storage, preservation, and backup: storage and transfer solutions, file formats that enable long-term use, logical and clear naming of files and data variables.
- Protecting the data: access rights and data security.
- Data storage after the research: retention period, storage location, and data deletion.
- Sharing, opening, and reuse of the data: will the entire dataset, parts of it, or only the metadata be opened? Where will the data be published?
- Measures related to opening the data: e.g., anonymization of sensitive information, consent from research subjects for data opening, and assignment of a persistent identifier (PID).
- Using existing research data requires a license defined by the original producer of the data, as well as high-quality metadata. Using well-known data repositories helps in finding reliable research data.
- By citing data, you give credit to the original producer of the data and make the data you used discoverable and your research verifiable.
- Possible requirements from funders, publishers, and data repositories regarding research data.
- The stages of data handling can be documented in a document called a Data Management Plan (DMP).
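One practical way to guard against the data loss and unintentional changes mentioned above is to record a checksum for every file and compare the records later, for example after a transfer or before archiving. The following is only a minimal sketch in Python under stated assumptions: the function names (`sha256_of`, `build_manifest`, `changed_files`) are illustrative, not part of any particular tool or funder requirement, and SHA-256 is just one common choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two manifests."""
    return sorted(
        name
        for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )
```

A manifest built when the data is collected can be stored alongside the documentation (but outside the data directory itself, so that it is not included in its own checksums) and compared with a fresh manifest whenever the data is copied, backed up, or deposited in a repository.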
The result of good research data management is a high-quality dataset that can be reused and cited in the future.