Architecture of the Virtual Research Environment

How is the Virtual Research Environment (VRE) structured? Which other infrastructures does the platform interface with, and how does it enable researchers to access and use data securely? Here you will find a schematic representation of the VRE and an explanation of its key elements.

Key elements

Research Portal

The primary interface for researchers to access VRE functions and resources, including data capture tools, interactive dashboards and viewers, query tools and analysis workspaces. 

 

Data Gateway API

An application programming interface that enables the Research Portal to exchange data and metadata with VRE systems, and allows the VRE to be interoperable with other data platforms, data sources and systems.

 

Green Room

An environment in which hospital data, such as data derived from electronic health records (EHRs), picture archiving and communication systems (PACS), laboratories, and other sources, can be de-identified, transformed and prepared prior to being transferred to VRE systems and made available for research use.
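
As a rough illustration of the kind of de-identification step described above, the following minimal Python sketch removes direct identifiers from a record and generalises the birth date to a birth year. The field names, the `DIRECT_IDENTIFIERS` set and the `deidentify` function are assumptions made for this example; they do not describe the actual Green Room pipelines.

```python
from datetime import date

# Hypothetical field names; a real hospital export will use its own schema.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` without direct identifiers and with the
    birth date generalised to a birth year."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    birth_date = cleaned.pop("birth_date", None)
    if birth_date is not None:
        cleaned["birth_year"] = birth_date.year
    return cleaned


# Invented example record, not real data.
record = {
    "name": "Erika Mustermann",
    "address": "Musterstr. 1, Berlin",
    "phone": "030 000000",
    "birth_date": date(1980, 5, 17),
    "diagnosis_code": "I10",
}
deidentified = deidentify(record)
```

The function returns a new dictionary and leaves the input untouched, so the identifiable source record never has to leave the environment in which it was prepared.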

 

Data Lake

A zone within the VRE in which data of any type can be received, stored, catalogued, and ultimately ingested into platform databases. Standard data models and ontologies are applied to allow datasets to be aggregated and processed. Data can be further de-identified here for broader sharing. Quality assurance, quality control and preprocessing pipelines prepare the data for visualisation and analysis. 

 

Data Warehouse

A set of databases and services which transform diverse data into a unified, federated context. Federation is critical for harmonising data and metadata so that information about participants and datasets can be queried, visualised and analysed across studies, data sources and modalities. The Metadata Repository and Knowledge Graph implement standardised and extensible schemas to represent metadata derived from datasets as well as annotations generated by researchers.

 

Shared Services

Automated services that are required by VRE components. The Participant Registry includes systems for generating and storing unique pseudonymised identifiers for all research participants contributing data to the VRE. Identity and access management systems allow the Research Portal and other VRE components to assign or validate the identity and permissions of VRE users.
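
To make the idea of generating and storing unique identifiers more concrete, here is a minimal in-memory sketch in Python. The `ParticipantRegistry` class, its `pseudonym_for` method, the `P-` prefix and the source ID format are all hypothetical and chosen only for illustration; a production registry would rely on persistent, access-controlled storage.

```python
import secrets


class ParticipantRegistry:
    """In-memory stand-in for a registry that issues and stores pseudonyms."""

    def __init__(self):
        self._by_source_id = {}  # source identifier -> pseudonym
        self._issued = set()     # every pseudonym handed out so far

    def pseudonym_for(self, source_id: str) -> str:
        """Return the stored pseudonym for `source_id`, creating one if needed."""
        existing = self._by_source_id.get(source_id)
        if existing is not None:
            return existing
        # Draw random identifiers until an unused one is found, so that
        # pseudonyms stay unique across participants.
        while True:
            candidate = "P-" + secrets.token_hex(8)
            if candidate not in self._issued:
                break
        self._issued.add(candidate)
        self._by_source_id[source_id] = candidate
        return candidate


registry = ParticipantRegistry()
first = registry.pseudonym_for("hospital-A:12345")
again = registry.pseudonym_for("hospital-A:12345")
other = registry.pseudonym_for("hospital-A:67890")
```

Because the pseudonym is random rather than derived from the source identifier, it cannot be recomputed from the identifier alone; the stored mapping is what makes repeated requests for the same participant return the same value.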

 

Workspaces and Analytics Resources

Workspaces are flexible and interactive environments in which users can access, visualise and analyse their data with a range of analysis and visualisation tools. These are supported by underlying computing infrastructure as well as privacy-preserving linkage systems that allow de-identified datasets to be linked and compared.
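
One common way to link de-identified datasets is to compare keyed hashes of normalised identifying fields instead of the fields themselves. The Python sketch below shows this exact-match approach under stated assumptions: the `linkage_token` and `link` functions, the chosen fields and the shared key are hypothetical, and the linkage systems used by the VRE may work differently (for example, with encodings that tolerate typing errors).

```python
import hashlib
import hmac


def linkage_token(given_name: str, family_name: str, birth_date: str, key: bytes) -> str:
    """Derive a keyed hash (HMAC-SHA256) from normalised identifying fields."""
    normalised = "|".join(
        part.strip().lower() for part in (given_name, family_name, birth_date)
    )
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link(tokens_a: dict, tokens_b: dict) -> list:
    """Pair record IDs from two datasets whose linkage tokens are identical."""
    by_token = {token: record_id for record_id, token in tokens_b.items()}
    return [
        (record_id, by_token[token])
        for record_id, token in tokens_a.items()
        if token in by_token
    ]


# Placeholder key and invented records; a real key would be managed securely.
key = b"example-shared-secret"
site_a = {
    "a1": linkage_token("Erika", "Mustermann", "1980-05-17", key),
    "a2": linkage_token("Max", "Muster", "1975-01-02", key),
}
site_b = {"b7": linkage_token(" erika ", "MUSTERMANN", "1980-05-17", key)}
matches = link(site_a, site_b)
```

Only the tokens are exchanged for matching, so neither dataset needs to expose names or birth dates. Normalising case and whitespace before hashing is what allows the two differently formatted entries for the same person to produce the same token.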

 

Charité/BIH Infrastructure and Services

Various IT systems and services are integrated to provide the networking, computing, storage and other infrastructure required by the VRE. This includes services that transfer data from existing research and clinical data sources (e.g. REDCap, PACS) into the Green Room; Hadoop and other systems deployed within the Health Data Platform; high-performance computing clusters for running preprocessing pipelines; and backup, recovery and data archival systems that ensure that data are kept safe and available at all times.