Open data in the institutions' body of publications are identified semi-automatically using an algorithm developed by QUEST (ODDPub; see the OSF project page). Publications flagged as relevant are then reviewed manually. The basis for determining whether publications qualify as open data for LOM/IOM 2024 is the set of confirmed publications by lead and last authors who are eligible for LOM (Charité) or IOM (BIH) (cut-off date: June 30th, 2023). In addition, the full text of the publication must be available via institutional access. The criteria for qualifying as open data are freely accessible original data, a machine-readable format, and an explicit reference to open data in the full text. The inclusion and exclusion criteria are valid for the LOM/IOM allocation in 2024 (publication period 2020–2022).
Details on the criteria
For the allocation of the open data incentive as an additional indicator of the LOM or IOM, the following criteria regarding data and publications must be fulfilled. Additional Charité- or BIH-specific criteria for the respective researcher’s eligibility to receive LOM or IOM are described elsewhere.
The criteria for the open data incentive as of 2024 are as follows:
Research data have been made freely accessible by researchers of the Charité/BIH
OR the data have been shared with restricted access and meet the following requirements:
- Data is stored in an external repository (or archive, database, registry)
- A standardized access route is named, i.e. the access requirements, the procedure for a request and the responsible persons or offices are described
- The reason for the restricted access is stated or is directly evident from the data being subject to data protection
- Access is possible for all academic researchers – at least from the European Economic Area
- Co-authorship of articles is not a condition for the provision of the data
- Access to the data is free of charge or requires at most the reimbursement of expenses
The data can be raw, primary, or secondary data (e.g. from analyses of freely available datasets, meta-analyses, or health technology assessments); the data must allow analytical replication (retracing of the analysis steps) for at least a part of the study’s results; reporting of statistical values (means, standard deviations, p-values etc.) is not sufficient.
Data have been shared in the context of an article publication; thus, stand-alone datasets without reference to an article are not considered.
The data can also be found independently of the publication; thus, Supplementary Materials are only permissible if they are stored in a repository (archive) and can also be found via this repository.
The publication contains an explicit reference to the dataset(s); a reference to e.g. supplementary materials without further explanation is not sufficient, nor is a reference to a database without naming the dataset, accession code or exact search settings.
The data are indeed available and can be accessed at the time of checking (for data under embargo, this must expire no later than July 31st).
Data have been shared in a machine-readable format; for tables e.g. CSV, Excel or Word files, but not PDFs or image formats.
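To illustrate how the criteria above combine, here is a minimal Python sketch of a reviewer checklist. It is not the QUEST workflow or part of ODDPub; all class, field, and function names are hypothetical, and the list of file extensions is an illustrative assumption based only on the examples named above (CSV, Excel, Word). The boolean fields stand for judgments made during manual review.

```python
from dataclasses import dataclass, astuple
from pathlib import Path
from typing import List, Optional

# Assumption: illustrative extensions only, derived from the examples
# "CSV, Excel or Word files"; PDFs and image formats are not included.
MACHINE_READABLE_EXTENSIONS = {".csv", ".xls", ".xlsx", ".doc", ".docx"}


@dataclass
class RestrictedAccess:
    """Hypothetical record of the six requirements for restricted-access data."""
    external_repository: bool
    standardized_access_route: bool
    reason_stated_or_evident: bool
    open_to_academic_researchers: bool  # at least from the European Economic Area
    no_coauthorship_condition: bool
    free_or_expenses_only: bool

    def meets_requirements(self) -> bool:
        # All six requirements must hold at the same time.
        return all(astuple(self))


@dataclass
class DatasetRecord:
    """Hypothetical record of a reviewer's judgments for one publication."""
    freely_accessible: bool
    restricted_access: Optional[RestrictedAccess]
    allows_analytical_replication: bool
    shared_with_article: bool
    findable_independently: bool
    explicitly_referenced: bool
    accessible_at_check: bool
    file_names: List[str]


def unmet_criteria(record: DatasetRecord) -> List[str]:
    """Return the names of the criteria that the record does not meet."""
    restricted_ok = (
        record.restricted_access is not None
        and record.restricted_access.meets_requirements()
    )
    # Assumption: one file in a machine-readable format is enough.
    machine_readable = any(
        Path(name).suffix.lower() in MACHINE_READABLE_EXTENSIONS
        for name in record.file_names
    )
    checks = {
        "access": record.freely_accessible or restricted_ok,
        "analytical_replication": record.allows_analytical_replication,
        "shared_with_article": record.shared_with_article,
        "findable_independently": record.findable_independently,
        "explicit_reference": record.explicitly_referenced,
        "accessible_at_check": record.accessible_at_check,
        "machine_readable_format": machine_readable,
    }
    return [name for name, passed in checks.items() if not passed]
```

An empty list from `unmet_criteria` means that every criterion was judged as met; otherwise the returned names show which criteria the reviewer found lacking. The exclusions listed below would still have to be checked separately.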
The open data definition applied does not include:
- Analysis scripts, computer programs, and other methods, materials, and protocols, even if their development was the goal of the research project and/or their presentation was the focus of the publication; data that have been collected and shared for development or validation can, however, fall under the open data definition.
- Data contained within the article text itself, unless these are embedded tables that can also be accessed as digital objects in their own right.
- Image, audiovisual, and other data which primarily serve illustrative purposes.
- Data supporting case reports, unless these were shared in repositories (archives) of the respective discipline.
- For systematic reviews and meta-analyses: lists of sources or other general information on the studies, such as survey method or number of participants (eligible for LOM/IOM, however, are datasets newly compiled from the original literature that ensure traceability of the analysis, such as extracted text passages or statistical values).
- Data which is only accessible on request or when fulfilling certain requirements, unless it is personal or otherwise sensitive data that is available via a standardized access route (also see above; this does not include, for example, availability "upon request").
- Data from data collections of consortia ("data pools"), if it is unclear whether the authors themselves have contributed to the pool.
- Data for which only a "private link" is shared, so that it cannot be found in the repository, but is only accessible via the publication.
- Data shared before the LOM/IOM period under consideration (this is to ensure that only a limited number of articles can be rewarded for sharing of a particular dataset).
To avoid misunderstandings, we also ask you not to confuse open data with open access (i.e., the free availability of article publications).
Applying the aforementioned criteria always yields some borderline cases. If you believe that your or your department's publication has mistakenly not been classified as an open data publication, please send a short explanatory note to quest@bih-charite.de, and we will check this and contact you. In addition, the semi-automated search for open data only covers the English-language body of literature. If you have shared data supporting articles in other languages, please let us know.
The criteria require continuous adjustment, and will be developed further in coming years. Further criteria for the re-usability of data in the sense of the FAIR criteria (Findable, Accessible, Interoperable, Re-usable) could be applied and/or the sharing of research software ("open code") could be added.