Curation levels for research data

Data curation involves quality assurance, documentation, standardization, and formatting of data. The research data archive in Sikt operates with five curation levels, where levels 1 to 4 meet the FAIR principles.

Data curation usually involves adding metadata, creating and having control over different versions of data, aggregating or recoding data, and forming new data collections based on different data sources.

Curation is also about ensuring data quality and ensuring that data handling and processing take place in accordance with laws, rules and guidelines.

Curation level 1 

  • Data selected for value-adding curation in a long-term perspective, with a focus on creating relationships within and across time series/data collections at the variable level. 
  • Data is prepared for secondary use in research. 
  • Data is converted into long-term preservation formats and guaranteed to be available for at least 50 years. 
  • Quantitative matrix data, with relevant geographical coverage, included in long consistent time series. 
  • Data in this category is part of the main data collection of the research data archive in Sikt, and undergoes extensive curation at the variable level. 

Curation level 2 

  • Data selected for descriptive curation in a long-term perspective, with a focus on creating relationships within and across time series/data collections and between datasets with the same theme at the study level. 
  • Data is prepared for secondary use in research. 
  • Data is converted into long-term preservation formats and guaranteed to be available for at least 50 years. 
  • Quantitative matrix data with relevant geographical coverage, usually cross-sectional studies or short time series. 
  • Data in this category undergoes basic curation at the variable level. 

Curation level 3 

  • Data selected for descriptive curation for replication and reproducibility. Metadata is preserved in a long-term perspective. 
  • Data undergoes necessary quality checks but is transmitted in the same format as it was deposited. 
  • Data will not (in principle) be curated and archived for long-term preservation. Sikt guarantees that this data is available for at least 10 years. The data is backed up (at the bit level only) and made available and visible in the Sikt Data Catalogue. 
  • Data in this category is primarily qualitative data in the form of video, images, sound, and text, or matrix data with insufficient documentation. 
  • Data in this category is curated at project level. 

Curation level 4 

  • Data selected solely for distribution ("delivery only"). 
  • For example, data that is retrieved from third parties via APIs/web services and delivered to end users via a Sikt interface. 

Curation level 5 

  • Data that is made searchable in the Sikt Data Catalogue ("discovery only"). 
  • Data is not formally archived in the research data archive in Sikt; it is archived only elsewhere, for example in institutional archives. 
  • Sikt can create or harvest metadata records to make this data more discoverable. 
  • Data in this category often has special access conditions related to legal and ethical frameworks. 
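
The five levels above can be summarised as a small lookup table. The following is a minimal sketch in Python, not an official Sikt schema: the class name, field names, and helper function are hypothetical and chosen only for illustration. Values are taken from the descriptions above, and `None` marks anything the text does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CurationLevel:
    """Illustrative record for one curation level (field names are hypothetical)."""
    level: int
    selected_for: str
    # Minimum guaranteed availability in years; None where the text gives no figure.
    min_years_available: Optional[int]
    # True if data is converted to long-term preservation formats,
    # False if kept in the deposited format, None where the text does not say.
    converted_to_preservation_format: Optional[bool]


LEVELS = {
    1: CurationLevel(1, "value-adding curation", 50, True),
    2: CurationLevel(2, "descriptive curation, long-term", 50, True),
    3: CurationLevel(3, "descriptive curation for replication and reproducibility", 10, False),
    4: CurationLevel(4, "delivery only", None, None),
    5: CurationLevel(5, "discovery only", None, None),
}


def meets_fair(level: int) -> bool:
    """Per the introduction, levels 1 to 4 meet the FAIR principles."""
    if level not in LEVELS:
        raise ValueError(f"unknown curation level: {level}")
    return level <= 4


if __name__ == "__main__":
    for lvl in LEVELS.values():
        years = lvl.min_years_available
        guarantee = f"at least {years} years" if years is not None else "not stated"
        print(f"Level {lvl.level}: {lvl.selected_for}; availability: {guarantee}; "
              f"FAIR: {meets_fair(lvl.level)}")
```

A table like this makes the main differences easy to compare: levels 1 and 2 share the 50-year guarantee and format conversion, level 3 keeps the deposited format with a 10-year guarantee, and levels 4 and 5 cover distribution and discovery only.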

Archiving data with Sikt