Compiling and disclosing data
After the appropriate permits have been granted, the research data may be compiled. Health and welfare data is disclosed with or without identifiers, depending on the research needs and the data used. In many cases, direct identifiers are replaced with pseudonymous identifiers. The data may also be anonymised to prevent the research subjects from being identified directly or indirectly. Pseudonymisation or anonymisation affects whether additions or changes can be made to the data later.

The sections below give further information on compiling and disclosing research data.


Data requests

The permit decision determines the terms of disclosure and states whether the data is disclosed with identifiers, pseudonymised or anonymised. It also specifies whether the data is disclosed for use in the applicant’s research environment or in a remote access environment. The permit decision does not usually start data collection directly; data requests are made separately after the permit has been granted.

Data requests are straightforward when data is sought from only one authority. When data is requested from several authorities, it is often beneficial to prepare a description of the data compiling process. The description can also indicate whether the applicant would like to combine the data from different sources themselves or to receive data that has already been combined.

An accurate process description is most beneficial when the data must be collected in phases: first the data needed to identify the research subjects, and then the data about those subjects. A good process description also makes it easier for the authorities to cooperate and helps ensure that the collection succeeds. Further discussions between the applicant and the authorities are often needed during the data compiling process to clarify details.
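
The phased collection described above can be illustrated with a minimal sketch. It assumes two hypothetical in-memory sources and hypothetical field names (`person_id`, `diagnosis`, `benefit`); in practice each phase is carried out by the authorities under the terms of the permit, not by a single script.

```python
# Hypothetical two-phase compilation. The sources, field names and values
# are illustrative assumptions, not any authority's actual interface.


def identify_subjects(source, criterion):
    """Phase 1: return the identifiers of subjects that meet the inclusion criterion."""
    return {row["person_id"] for row in source if criterion(row)}


def collect_for_subjects(source, subject_ids):
    """Phase 2: return only the rows that concern the identified subjects."""
    return [row for row in source if row["person_id"] in subject_ids]


# Phase 1 source: used only to find out who belongs to the study population.
care_register = [
    {"person_id": "A", "diagnosis": "I10"},
    {"person_id": "B", "diagnosis": "E11"},
    {"person_id": "C", "diagnosis": "I10"},
]

# Phase 2 source: assumed to be held by a different authority.
benefit_register = [
    {"person_id": "A", "benefit": "x"},
    {"person_id": "B", "benefit": "y"},
    {"person_id": "C", "benefit": "z"},
]

subjects = identify_subjects(care_register, lambda row: row["diagnosis"] == "I10")
phase_two_rows = collect_for_subjects(benefit_register, subjects)
```

Writing the phases down in this order, even informally, shows each authority which data it needs to receive before it can produce its own part.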

The pseudonymisation and anonymisation of data

Anonymisation and pseudonymisation both refer to processing data to protect the identity of individuals. Anonymisation completely removes the information that allows individuals to be identified, whereas pseudonymisation modifies that information so that it cannot be interpreted correctly without a separate code key.

In pseudonymised data, direct identifiers such as personal identity codes are replaced with observation identifiers specific to the dataset. Individuals can then be identified directly only with the code key, but they may still be identifiable indirectly from other data. Under personal data protection regulations, pseudonymised data is still personal data and must be handled and protected with the same care as identifiable personal data.
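
As a minimal sketch of the replacement described above, the following example swaps a direct identifier for a random observation identifier and returns the code key separately. The field names, the `OBS-` prefix and the placeholder identifiers are assumptions made for illustration; the authorities' actual procedure is not described in this text.

```python
import secrets


def pseudonymise(records, id_field="personal_identity_code"):
    """Replace the direct identifier with a dataset-specific observation identifier.

    Returns the pseudonymised records and the code key. The code key maps
    original identifiers to observation identifiers and is meant to be
    stored separately from the data.
    """
    code_key = {}
    used = set()
    pseudonymised = []
    for record in records:
        original = record[id_field]
        if original not in code_key:
            observation_id = "OBS-" + secrets.token_hex(8)
            while observation_id in used:  # regenerate on an (unlikely) collision
                observation_id = "OBS-" + secrets.token_hex(8)
            used.add(observation_id)
            code_key[original] = observation_id
        # Copy every field except the direct identifier.
        row = {key: value for key, value in record.items() if key != id_field}
        row["observation_id"] = code_key[original]
        pseudonymised.append(row)
    return pseudonymised, code_key


# Placeholder identifiers, not real personal identity codes.
records = [
    {"personal_identity_code": "PLACEHOLDER-1", "municipality": "Helsinki", "visits": 3},
    {"personal_identity_code": "PLACEHOLDER-2", "municipality": "Utsjoki", "visits": 1},
    {"personal_identity_code": "PLACEHOLDER-1", "municipality": "Helsinki", "visits": 5},
]

pseudonymised_records, code_key = pseudonymise(records)
```

The same person receives the same observation identifier on every row, so the rows can still be linked within the dataset. Fields such as `municipality` are left untouched, which is why indirect identification remains possible.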

Anonymous data contains no identifiers, and individuals cannot be identified from one or more data items in the dataset or available elsewhere. For example, removing personal identity codes or exact addresses alone is not sufficient anonymisation. Anonymisation is always permanent, and no new individual-level data can be added to the dataset after it has been compiled.
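
Why removing direct identifiers is not sufficient can be illustrated by counting how many records share each combination of indirectly identifying fields. The sketch below uses group size in the spirit of k-anonymity, a common heuristic that this text does not prescribe; the field names, values and threshold are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(record[field] for field in quasi_identifiers) for record in records)


def risky_combinations(records, quasi_identifiers, k=2):
    """Return the combinations shared by fewer than k records."""
    sizes = group_sizes(records, quasi_identifiers)
    return sorted(combo for combo, count in sizes.items() if count < k)


# Direct identifiers have already been removed from these illustrative rows.
records = [
    {"municipality": "Helsinki", "birth_year": 1980, "profession": "nurse"},
    {"municipality": "Helsinki", "birth_year": 1980, "profession": "teacher"},
    {"municipality": "Utsjoki", "birth_year": 1952, "profession": "doctor"},
]

risky = risky_combinations(records, ["municipality", "birth_year"], k=2)
```

Here one combination of municipality and birth year occurs only once, so that record could single out an individual even though it carries no identity code or address. Passing such a check does not by itself make data anonymous.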

The authorities may set terms for the pseudonymisation or anonymisation of the data they disclose.

Disclosing and verifying data

The research data compiled from health and welfare data from different sources is disclosed to the permit holder either directly or through a remote access system in which the data can be handled securely.

The research data must be reviewed carefully before analysis begins. Erroneous data must be returned for correction as soon as possible to minimise delays to the analysis.
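
Part of such a review can be automated. The following is a minimal sketch, assuming the data is available as a list of rows with hypothetical field names; it flags missing values in required fields and exact duplicate rows, and it does not replace a substantive review against the permit and the data request.

```python
def review_dataset(records, required_fields):
    """Return a list of (row index, problem description) pairs."""
    problems = []
    first_seen = {}  # row content -> index of its first occurrence
    for index, record in enumerate(records):
        # Flag required fields that are absent or empty.
        for field in required_fields:
            if record.get(field) in (None, ""):
                problems.append((index, f"missing value for '{field}'"))
        # Flag rows whose content exactly repeats an earlier row.
        fingerprint = tuple(sorted(record.items()))
        if fingerprint in first_seen:
            problems.append((index, f"duplicate of row {first_seen[fingerprint]}"))
        else:
            first_seen[fingerprint] = index
    return problems


# Illustrative rows only; the field names are assumptions.
received = [
    {"observation_id": "OBS-1", "visits": 3},
    {"observation_id": "OBS-2", "visits": None},
    {"observation_id": "OBS-1", "visits": 3},
]

problems = review_dataset(received, required_fields=["observation_id", "visits"])
```

A list of row numbers and problem descriptions like this can be sent back with the correction request, which helps the disclosing party locate the errors quickly.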