Automated screening of scientific manuscripts can help authors identify and fix common problems, such as failing to state whether experiments were blinded or randomized, using potentially misleading bar graphs to present continuous data, or failing to acknowledge study limitations. Tools can screen a manuscript and provide authors with customized feedback in seconds. This makes automated screening a valuable strategy for improving transparency and reproducibility on a large scale, across many fields.

At QUEST, we have developed several new screening tools and are founding members of an international working group that combines many different tools into a powerful screening pipeline (ScreenIT).


ODDPub is a text-mining algorithm that parses a set of publications and detects which publications disseminated Open Data or Open Code along with the paper. This tool is tailored towards biomedical science.

GitHub link: https://github.com/quest-bih/oddpub

Paper Link: https://doi.org/10.5334/dsj-2020-042
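
To make the idea of text-mining for data and code availability concrete, here is a minimal Python sketch. It is not ODDPub's actual rule set (ODDPub is a separate tool with its own, more extensive rules); the keyword lists, the naive sentence splitter, and the function names are all assumptions chosen for illustration.

```python
import re

# Hypothetical keyword patterns for illustration only; a real screening
# tool such as ODDPub uses a larger and more carefully tuned set of rules.
REPOSITORY = re.compile(r"\b(zenodo|figshare|dryad|osf|github|genbank)\b", re.IGNORECASE)
AVAILABILITY = re.compile(r"\b(available|deposited|accessible|archived)\b", re.IGNORECASE)
DATA_OR_CODE = re.compile(r"\b(data|datasets?|code|scripts?)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_open_data_statements(text):
    """Return sentences that mention data or code, availability, and a known repository."""
    return [
        sentence
        for sentence in split_sentences(text)
        if DATA_OR_CODE.search(sentence)
        and AVAILABILITY.search(sentence)
        and REPOSITORY.search(sentence)
    ]


example = (
    "We measured blood pressure in 40 mice. "
    "All data are available on Zenodo. "
    "Code is available upon request."
)
print(find_open_data_statements(example))
```

In this sketch, "available upon request" is not flagged because no repository is named, which mirrors the distinction between sharing data openly and merely offering it.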


Barzooka is a deep convolutional neural network that screens publication PDFs and checks for bar graphs of continuous data and other common graphing issues. Many different data distributions can lead to the same bar graph, and the actual data may suggest different conclusions from the summary statistics alone. Barzooka also detects more informative alternatives to bar graphs, like dot plots, box plots and histograms.

Tool page: https://quest-barzooka.bihealth.org/

Why you shouldn’t use bar graphs of continuous data, and what to use instead:

1. https://www.ahajournals.org/doi/10.1161/CIRCULATIONAHA.118.037777

2. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128
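
The point that different distributions can produce the same bar graph can be checked with a few lines of Python. The two samples below are made-up numbers chosen for illustration: they share the same mean and standard deviation, so a bar graph with error bars would look identical for both, yet the underlying data differ.

```python
from statistics import mean, median, stdev

# Made-up example data: one roughly symmetric sample, one with a single outlier.
symmetric = [3, 5, 6, 7, 9]
with_outlier = [5, 5, 5, 5, 10]


def bar_summary(values):
    """Return the (mean, sample SD) pair that a bar graph with error bars would show."""
    return round(mean(values), 3), round(stdev(values), 3)


print(bar_summary(symmetric))     # (6, 2.236)
print(bar_summary(with_outlier))  # (6, 2.236)

# The medians differ, which a dot plot or box plot would make visible.
print(median(symmetric), median(with_outlier))  # 6 5
```

A dot plot of the second sample would immediately show four identical values and one outlier, which the bar graph hides.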

ScreenIT Pipeline

The Automated Screening Working Group is an international group of tool creators working to improve scientific manuscripts. This group was co-founded by QUEST members. Group members have combined their tools into the ScreenIT pipeline, which screens for common problems that can affect transparency or reporting and provides feedback to authors. Throughout the COVID-19 pandemic, we have been using our automated ScreenIT pipeline to screen COVID-19 preprints on medRxiv and bioRxiv. Public reports are automatically posted via hypothes.is (https://hypothes.is/users/sciscore) and tweeted out via @SciScoreReports.

For more details on the international working group and the ScreenIT pipeline, see:

Automated Screening Working Group: https://scicrunch.org/ASWG

Correspondence article on COVID-19 preprint screening: https://www.nature.com/articles/s41591-020-01203-7

The ScreenIT pipeline includes the following tools:

Blinding, randomization, sample-size calculations, sex/gender, ethics and consent statements, resources, RRIDs
http://sciscore.com; RRID:SCR_016251

Open data, open code
https://github.com/quest-bih/oddpub; RRID:SCR_018385

Limitation-Recognizer
Author-acknowledged limitations
https://github.com/kilicogluh/limitation-recognizer; RRID:SCR_018748

Bar graphs of continuous data
https://quest-barzooka.bihealth.org; RRID:SCR_018508

Rainbow color maps (these color maps create visual artifacts and aren’t colorblind safe)
https://jetfighter.ecrlife.org; RRID:SCR_018498

Seek and Blastn
Correct identification of nucleotide sequences
http://scigendetection.imag.fr/TPD52/; RRID:SCR_016625

Checks clinical trial registration numbers from ClinicalTrials.gov
https://github.com/bgcarlisle/TRNscreener; RRID:SCR_019211

Statements on conflicts of interest, funding, or protocol registration
https://github.com/serghiou/rtransparent; RRID:SCR_019276

Citations of retracted publications, or of papers with errata or corrections
http://www.scite.ai/; RRID:SCR_018568
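
Several of the checks listed above rely on recognizing structured identifiers in manuscript text. The following Python sketch shows one simplified way to pull out RRIDs and ClinicalTrials.gov registration numbers. It is not the implementation used by any tool in the pipeline: the regular expressions and the function name are assumptions, the RRID pattern covers only the common `PREFIX_identifier` form, and the trial number in the example is invented.

```python
import re

# Simplified patterns, assumed for illustration.
# RRIDs are written as "RRID:" followed by a prefixed identifier, e.g. RRID:SCR_016251.
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+_\w+)")
# ClinicalTrials.gov identifiers consist of "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")


def extract_identifiers(text):
    """Return the RRIDs and ClinicalTrials.gov numbers found in the text."""
    return {
        "rrids": RRID_PATTERN.findall(text),
        "trial_ids": NCT_PATTERN.findall(text),
    }


# The trial number below is a made-up example, not a real registration.
sample = (
    "Manuscripts were screened with SciScore (RRID:SCR_016251). "
    "The trial was registered as NCT01234567."
)
print(extract_identifiers(sample))
```

A real screener would go further, for example by looking each trial number up in the registry to confirm that it exists and matches the study described.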