ailslab: Artificial Intelligence in the Life Sciences
The ailslab develops innovative methods from the broader field of artificial intelligence for digital health applications. Roland Eils, head of the research group, has assembled a small team of students and scientists to work on various aspects of digital health. At the heart of the lab is the integration of vast and deep datasets from both healthcare and research to address pressing problems in health and disease. Our goal is to deepen our understanding of disease mechanisms while improving the diagnosis and treatment of patients with cardiovascular disease and cancer.
Cardiovascular risk modelling
In this project, we work towards improved prevention and treatment for patients with established coronary heart disease (CHD). CHD is a major cause of morbidity and mortality. The risk landscape in patients with coronary disease is highly heterogeneous, depending on their genetic background, clinical characteristics, cardiovascular risk factors and atherosclerotic disease status. With the availability of effective but costly novel treatment options (such as PCSK9 inhibitors), there is a great need for advanced cardiovascular risk prediction tools that stratify the use of novel treatments, support personalised monitoring windows and increase the efficiency of clinical trial designs by allowing for the selection of individuals at high risk of recurrent events. Here, we aim to improve and personalise prevention and treatment of established coronary heart disease by developing data-driven, neural-network-based tools for risk modelling and multi-modal data integration. We investigate representation learning techniques to identify latent factors across data modalities that are associated with risk, and we examine lifetime risk more closely. The clinical environment at Charité, in synergy with data-collection and patient management platforms, allows for prospective validation and clinical integration of our technologies.
Omni-genetic phenotype models
Genetic contributions to many health-related phenotypes, such as blood pressure and body mass index, often arise from complex interactions between multiple genes. Understanding how variants of different genes interact is therefore of great interest. We plan to use recent advances in explainable machine learning to analyse data from large cohort studies. Starting with variants of individual genes, we want to predict how they influence the activity of the systems they are involved in, for example DNA repair. Based on these predictions, we want to predict the activity of more complex systems, and to repeat this process until the information from all genes is combined in one system describing the effect of all variants. At that point, the information on thousands of genes will be available in a highly compressed representation, which will serve as the basis for predicting phenotypes like blood pressure.
Additionally, this approach can be used to predict which cellular systems are affected by a given set of variants that result in a specific phenotype. For example, for an individual with increased blood pressure and variants in multiple genes, the model could be used to predict whether the DNA repair system is involved in the increased blood pressure. Such information can be used to check whether the model arrives at its predictions in an intuitive way, and, if strong connections are found, it can also be the starting point for new research.
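The bottom-up aggregation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the hierarchy, the gene names, the per-gene scores, the tanh aggregation and the linear read-out are hypothetical placeholders, not the lab's actual model. In practice the weights would be learned from cohort data rather than set by hand.

```python
import math

# Hypothetical hierarchy: each system lists its children, which are
# either genes (leaves) or other systems. All names are illustrative.
HIERARCHY = {
    "dna_repair": ["GENE_A", "GENE_B"],
    "cell_cycle": ["GENE_C", "dna_repair"],
    "cell": ["cell_cycle", "GENE_D"],
}


def system_activity(node, gene_scores, weights, cache):
    """Return the activity of a gene or system, computed bottom-up."""
    if node in cache:
        return cache[node]
    if node not in HIERARCHY:
        # Leaf: a per-gene variant score, assumed to be precomputed.
        value = gene_scores.get(node, 0.0)
    else:
        # Inner node: squash the weighted sum of the child activities.
        total = sum(
            weights.get((node, child), 1.0)
            * system_activity(child, gene_scores, weights, cache)
            for child in HIERARCHY[node]
        )
        value = math.tanh(total)
    cache[node] = value
    return value


def predict_phenotype(gene_scores, weights, root="cell",
                      slope=10.0, intercept=120.0):
    """Map the root activity to a phenotype with a toy linear read-out.

    Returns the prediction and all intermediate activities, so that the
    contribution of each system can be inspected afterwards.
    """
    activities = {}
    root_activity = system_activity(root, gene_scores, weights, activities)
    return intercept + slope * root_activity, activities


if __name__ == "__main__":
    prediction, activities = predict_phenotype({"GENE_A": 1.0}, weights={})
    print(f"predicted phenotype: {prediction:.1f}")
    print(f"DNA repair activity: {activities['dna_repair']:.3f}")
```

Because every intermediate activity is kept, the same pass that produces the phenotype prediction also shows which systems, such as the hypothetical `dna_repair` node here, deviate from their baseline for a given individual.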
Open medical data
Machine learning methods hold the promise of great benefits for patients, physicians and researchers, but they require vast amounts of data. While this data typically exists in large research hospitals, it is generally inaccessible due to legal, ethical and privacy concerns. By building a secure computing framework that follows the model-to-data approach, we aim to open medical data for research purposes.
The core idea is that the sensitive data stays on the hospital’s servers, pseudonymised and protected by existing safeguards. Researchers work with the data by sending in containerised models, which are then optimised on the data using on-site high-performance computing resources. While metrics are generally sent back to the researchers, models are only sent back after thorough privacy checks.
In this framework, the hosting hospital retains full control over its patient data and does not disclose personalised or otherwise sensitive data to researchers, while still enabling research for a wide scientific community.
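The model-to-data flow can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the framework itself: a plain Python function stands in for the containerised model, and `run_on_site`, `fit_mean` and `min_cohort_check` are hypothetical names. The cohort-size rule is only a placeholder for the thorough privacy checks a real deployment would require.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class RunResult:
    metrics: dict          # aggregate metrics, returned to the researcher
    model: Optional[dict]  # trained parameters, set only if release is approved


def run_on_site(
    train_fn: Callable[[Sequence[float]], Tuple[dict, dict]],
    records: Sequence[float],
    privacy_check: Callable[[dict, Sequence[float]], bool],
) -> RunResult:
    """Train a submitted model inside the hospital.

    The records never leave this function. Only the metrics, and the
    model if the privacy check approves it, are handed back.
    """
    params, metrics = train_fn(records)
    released = params if privacy_check(params, records) else None
    return RunResult(metrics=metrics, model=released)


def fit_mean(records):
    """Toy stand-in for a researcher's model: fit a single mean value."""
    mean = sum(records) / len(records)
    mse = sum((x - mean) ** 2 for x in records) / len(records)
    return {"mean": mean}, {"n": len(records), "mse": mse}


def min_cohort_check(params, records, min_records=50):
    """Placeholder privacy rule: release only for sufficiently large cohorts."""
    return len(records) >= min_records


if __name__ == "__main__":
    small = run_on_site(fit_mean, [1.0, 2.0, 3.0], min_cohort_check)
    print(small.metrics, small.model)  # metrics are returned, model is withheld
```

The researcher's side only ever sees a `RunResult`. Keeping the release decision in a separate callable reflects the idea that the hospital, not the researcher, decides what leaves its servers.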
*these authors contributed equally