The aims of the QUEST work package are two-fold. First, we aim to expand the open science dashboard currently developed at QUEST [intranet link] to include indicators of data reusability based on the FAIR criteria. For this, we will screen datasets published by Charité researchers using existing, externally developed tools, adapting these tools to our needs where necessary; some criteria will be assessed manually where needed. We will then validate the results and include the resulting indicators in the Charité open science dashboard. Second, we will take up ideas for open science indicators for other fields of scholarship, as developed by the OABB and disciplinary stakeholders. We will use, and possibly adapt, available screening tools to collect these indicators as well. Ultimately, we will support the OABB in generating pilot dashboards for the two participating disciplines, based on the indicators thus collected.

FAIR is a framework for describing the findability and reusability of research data. It is widely claimed that research data should be FAIR (and thus reusable) for both humans and machines, and there are multiple projects, initiatives and standards, as well as FAIR screening tools and workflows. However, so far none of these tools and workflows has been comprehensively applied to the datasets produced by researchers of a particular research institution. Indeed, there are hardly any collections of datasets for which information on their FAIRness is available at all. Our goal is to assess FAIR compliance (‘FAIRness’) for the whole body of datasets shared by Charité researchers, which makes it necessary to automate assessments where possible. Correspondingly, the prime goal of the project BUA Open Science Dashboards is the application of screening tools that report the compliance of datasets with the FAIR criteria. In this case, FAIRness, and thus data reusability, is understood from the point of view of automated or programmatic access, i.e., “FAIR for machines”. Where necessary, these screening tools can be adapted to our needs, which might make manual steps necessary.
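To illustrate what such automated screening can look like in practice, below is a minimal sketch that queries F-UJI, one of the existing automated FAIR assessment tools, for a single dataset. The service URL, credentials and response fields are assumptions for a hypothetical local deployment and may differ between F-UJI versions; this is not a description of our final screening pipeline.

```python
import requests

# Hypothetical local deployment of the F-UJI automated FAIR assessment service.
FUJI_URL = "http://localhost:1071/fuji/api/v1/evaluate"


def assess_dataset(identifier: str) -> dict:
    """Ask a running F-UJI instance to score one dataset (DOI or landing-page URL)."""
    response = requests.post(
        FUJI_URL,
        json={"object_identifier": identifier, "use_datacite": True},
        auth=("username", "password"),  # placeholder credentials for the demo instance
        timeout=300,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    # Example identifier; in the project, such a loop would run over all
    # datasets shared by Charité researchers.
    result = assess_dataset("https://doi.org/10.5281/zenodo.3678277")
    # The exact response layout depends on the F-UJI version; recent versions
    # report aggregate percentages under "summary" -> "score_percent".
    summary = result.get("summary", {}).get("score_percent", {})
    print(summary)
```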

At the same time, the project also builds on the ungeFAIR pilot project, which started in 2020. The goal of ungeFAIR was to define a subset of FAIR criteria sufficiently unspecific and generic (hence, ungeFAIR) to allow a person without disciplinary knowledge to assess reusability against this subset. In this case, FAIRness is understood as reusability for humans, who access individual datasets directly. To include criteria for the FAIRness of datasets for humans in the Charité dashboard, it is necessary to define a smaller subset of FAIR criteria that can be assessed without disciplinary knowledge, and to develop and validate assessment workflows for these criteria. These workflows will ultimately be implemented as browser-based extraction forms using Numbat, used to assess datasets shared by Charité researchers (see also Open Data LOM), and shared with others for adoption and reuse. In this process, we want to overcome the following obstacles, which have so far impeded the development of reliable, validated FAIR assessments by humans (see the sketch after this list):

  • FAIR criteria were originally only loosely defined, and convergence on definitions that can be operationalized is still ongoing; in the process, individual criteria may be split up further
  • As long as criteria are not sufficiently granular, their practical assessment requires multiple design and prioritization decisions
  • For several FAIR criteria, only researchers from the respective field can reasonably assess compliance
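
As a purely illustrative sketch of how such a generic criteria subset could be represented for manual assessment, consider the following; the criterion IDs, questions and answer options are invented for illustration and do not reproduce the actual ungeFAIR criteria:

```python
from dataclasses import dataclass


@dataclass
class ManualCriterion:
    """One FAIR criterion operationalized for assessment by a non-expert rater."""
    fair_id: str          # reference to the FAIR principles, e.g. "F2"
    question: str         # phrased so that no disciplinary knowledge is required
    answers: tuple[str, ...]  # a closed answer set keeps ratings comparable across raters


# Illustrative subset only; the actual criteria and wording differ.
UNGEFAIR_SUBSET = [
    ManualCriterion("F2", "Does the record contain a description of the dataset?", ("yes", "no")),
    ManualCriterion("A1", "Can the data files be downloaded without registration?", ("yes", "no", "unclear")),
    ManualCriterion("R1.1", "Is a licence for data reuse stated?", ("yes", "no")),
]
```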

Importantly, wherever we have both automated and manual assessments of FAIR criteria at our disposal, we will use them to gauge the quality of the automated screening tools. A mutual validation of machine-readability and human-readability is conceptually impossible (see footnote), but given certain boundary conditions, manual assessment can nevertheless help to gauge the quality of automated assessments.
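A minimal sketch of how such a comparison could be quantified is shown below, using standard agreement and error metrics; the ratings are toy values, and treating the human ratings as the reference follows the reasoning laid out in the footnote:

```python
from sklearn.metrics import cohen_kappa_score, precision_score, recall_score

# Toy binary ratings ("criterion fulfilled?") for the same datasets; real data
# would come from manual extractions (e.g. via Numbat) and tool outputs.
human = [1, 1, 0, 1, 0, 0, 1, 1]
machine = [1, 0, 0, 1, 0, 1, 1, 1]

# Chance-corrected agreement between the two assessments:
print("kappa:    ", cohen_kappa_score(human, machine))

# Treating the (more complete) human extraction as the reference standard:
print("precision:", precision_score(human, machine))
print("recall:   ", recall_score(human, machine))
```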

Footnote:

Validation can be conceptualized as high agreement between human raters for human-readability. However, strictly speaking, data from a human check cannot serve to validate machine-readability. Nevertheless, the human readout can for the moment be used to assess, if not formally validate, the FAIRness of data for machines, at least for the specific screening tool applied.

This holds true given two conditions:

  • human extraction of information on FAIRness is more complete than by machines, and
  • datasets are typically selected for reuse by humans directly and not through algorithms.

These two conditions are currently fulfilled, but this might change in the future.