ungeFAIR – Developing an UNspecific and GEneric subset of FAIR criteria

Description:

FAIR is a framework for describing the findability, accessibility, interoperability and reusability of research data. It is widely claimed that research data should be FAIR, and there are multiple projects, initiatives and standards, as well as FAIR screening tools and workflows. However, so far none of these tools and workflows has been comprehensively applied to the datasets produced by the researchers of a particular research institution. Indeed, there are hardly any collections of datasets for which information on their FAIRness is available at all. Importantly, no validation of existing screening tools and workflows has been undertaken so far. This might be due to the conceptual and practical complexity of the task:

  • FAIR criteria can be fulfilled at the level of human-readability, machine-readability, or both, and a validation of one by the other is conceptually impossible (see footnote)
  • FAIR criteria were originally only loosely defined, and convergence on definitions which can be operationalized is still ongoing; in the process, individual criteria can be broken up further
  • As long as criteria are not highly granular, their practical assessment requires multiple design and prioritization decisions
  • For multiple FAIR criteria, only researchers from the respective field can reasonably assess compliance

The ungeFAIR project aims to overcome these obstacles and define a subset of FAIR criteria which is so unspecific and generic that any person with higher education can check compliance reliably and within a viable amount of time.
For each of these criteria, the goals of the project are:

  1. Draft a standard operating procedure (SOP) which details the steps to take for assessing the criterion
  2. Refine the SOPs through rounds of feedback from test assessors
  3. Validate the SOPs by calculating interrater reliability (see the sketch after this list)
  4. Implement the SOPs as workflows in the Numbat software
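
Interrater reliability in step 3 could, for example, be quantified with Cohen's kappa for two assessors (with Fleiss' kappa or Krippendorff's alpha as alternatives for more raters). The project description does not fix a statistic, so the following Python sketch is illustrative only, with hypothetical ratings:

```python
# Minimal sketch of step 3: Cohen's kappa as an interrater reliability
# measure for two assessors rating the same datasets on one binary
# criterion (1 = fulfilled, 0 = not fulfilled). Ratings are hypothetical.
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n               # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[c] * cb[c] for c in set(a) | set(b)) / n**2  # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

rater_1 = [1, 0, 1, 1, 0, 1, 0, 1]
rater_2 = [1, 0, 1, 0, 0, 1, 1, 1]
print(f"Cohen's kappa: {cohens_kappa(rater_1, rater_2):.2f}")
```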

The resulting Numbat workflows are then to be shared with the community and could be applied to assess the FAIRness of datasets from Charité and/or other researchers.

In addition to ungeFAIR proper, the project also aims to apply the FAIR screening tool F-UJI to datasets of Charité researchers, to quantify FAIRness for machines, to compare the outcome to the manual screening, and to validate F-UJI insofar as this is conceptually possible (see footnote).
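
A scripted machine screening could look like the following Python sketch. It assumes a locally running F-UJI server exposing its REST API on the default port; the endpoint path, credentials and DOI are placeholders to be adapted to the actual deployment:

```python
# Minimal sketch: scoring a single dataset with a locally running F-UJI
# server. Host, port, endpoint path, and credentials are assumptions
# based on the F-UJI documentation; the DOI is a placeholder.
import requests

FUJI_ENDPOINT = "http://localhost:1071/fuji/api/v1/evaluate"

payload = {"object_identifier": "https://doi.org/10.5281/zenodo.0000000"}
response = requests.post(FUJI_ENDPOINT, json=payload,
                         auth=("username", "password"),  # placeholder credentials
                         timeout=300)
response.raise_for_status()

report = response.json()
# The response contains per-metric results and a summary block; its exact
# structure may vary between F-UJI versions.
print(report.get("summary"))
```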

Footnote:

Validation can be conceptualized as high agreement between human raters for human-readability. Strictly speaking, however, data from a human check cannot serve to validate machine-readability. Nevertheless, the human readout can for the moment be used to assess, if not formally validate, the FAIRness of data for machines, at least for the specific screening tool applied.

This holds true given two conditions:

  • human extraction of information on FAIRness is more complete than extraction by machines, and
  • datasets are typically selected for reuse by humans directly, not through algorithms.

Both conditions are currently fulfilled, but this might change in the future.
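
As a concrete illustration of such an assessment (not a formal validation), the human readout and a tool's per-criterion verdicts could be compared by simple percentage agreement. The sketch below uses hypothetical labels and illustrative criterion names:

```python
# Minimal sketch: percentage agreement between a human checklist readout
# and a tool's per-criterion pass/fail output. All labels are hypothetical
# placeholders, and the criterion names are illustrative.
human = {"F1": 1, "F2": 1, "A1": 0, "I1": 1, "R1": 0}
tool  = {"F1": 1, "F2": 0, "A1": 0, "I1": 1, "R1": 1}

shared = human.keys() & tool.keys()
agreement = sum(human[c] == tool[c] for c in shared) / len(shared)
print(f"Agreement on {len(shared)} criteria: {agreement:.0%}")
```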