As open science practices become more widespread, it is increasingly possible to reuse materials collected and created during the research process, such as research data. For a student, existing ready-made data can, for example, speed up the completion of a thesis. Always follow good scientific practice when using available research data.

Why must research data be cited?

Research data is a source of information just like any other. To acknowledge the researcher who originally collected and produced the data, it has to be cited in the same way as other sources, such as scientific articles and other publications. By citing the data, you give credit to its original producer, make the data you have used easy to find, and ensure that your research is verifiable.

How should research data be cited?

Data is cited using the same practices as research publications. A reference must include at least the author, the name of the dataset, the publication year, the edition or version, and the retrieval location, i.e. the permanent identifier of the data. Note that many archives have their own citation instructions, and a publisher's requirements, for example, may also dictate the citation format.
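If you manage your references programmatically, the minimum elements listed above can be treated as a checklist. The following is only a minimal sketch in Python: the class name DatasetReference, the output layout, and the sample values are all hypothetical, and the citation instructions of the archive or publisher always take precedence over this layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetReference:
    """Minimum elements of a data reference (hypothetical helper)."""
    author: str      # person or organisation that produced the data
    title: str       # name of the dataset
    year: int        # publication year
    version: str     # edition or version
    identifier: str  # permanent identifier of the data

    def format(self) -> str:
        # Refuse to build a reference if any required element is empty.
        for name, value in vars(self).items():
            if not str(value).strip():
                raise ValueError(f"missing required element: {name}")
        # Assumed layout for illustration only, not an official style.
        return (f"{self.author} ({self.year}). {self.title} [dataset]. "
                f"Version {self.version}. {self.identifier}")


# Placeholder values, not a real dataset.
example = DatasetReference(
    author="Example, Author",
    title="Example Survey 2020",
    year=2021,
    version="1.0",
    identifier="doi:10.0000/placeholder",
)
print(example.format())
```

Leaving any element empty raises a ValueError, which makes an incomplete reference visible before it ends up in a bibliography.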

Examples of Citing Data

Citations when the dataset has an author:

Nystedt, Ursula (Jyväskylän yliopisto): Erityislapsiperheiden vertaistukiverkostot 2014 [sähköinen tietoaineisto]. Versio 1.0 (2015-12-18). Yhteiskuntatieteellinen tietoarkisto [jakaja].

Citations when the dataset has been created by an organisation:

Nuorisoasiain neuvottelukunta (Nuora) & Nuorisotutkimusseura: Nuorisobarometri 2015 [sähköinen tietoaineisto]. Versio 1.0 (2016-01-22). Yhteiskuntatieteellinen tietoarkisto [jakaja].

In in-text citations, include the year the data was collected together with the author or, when there is no author, the name of the dataset:

Over 80% of Finns considered climate change to be entirely or mainly caused by humans (Ekholm et al. 2006).

About a quarter of all Finns do not walk or cycle during their commute or when running errands (ISSP 2007: Vapaa-aika ja urheilu).

More examples are available on the website of the Finnish Social Science Data Archive (in English and in Finnish).

A general rule is to include enough information for the reader to locate the dataset, part of a database, or object to which the research refers.

The UK Data Service recommends using a title that indicates not only the subject matter but also the geography and the time period that the data covers.

Example: Richardson, Elizabeth A. (2009). Carstairs deprivation scores for Scotland by CATT2, 1981, 1991, 2001 [Dataset]. University of Edinburgh. School of GeoSciences.


Research Data Archives

Good services for finding ready-made public datasets on different topics include, for example:

Search for discipline-specific data archives:

Data Citation Index