About me

I am a computer scientist and fourth-year Ph.D. student at the Digital Health Center of the Hasso Plattner Institute in Potsdam. My main research field is biomedical natural language processing in low-resource settings, in particular German-language clinical NLP and information extraction for medical evidence synthesis.

I investigate domain-specific weak supervision signals to enable the application of machine-learning-based NLP in these low-resource scenarios. Moreover, I am interested in applications of language technology in the delivery of evidence-based medicine, with a particular focus on (precision) oncology.

Since open data is a key enabler for research progress, I am happy to have contributed publicly available datasets through our ongoing efforts in the GGPONC project and previously as part of the BPI Challenge 2018.

As a professional software engineer, I have been working with large-scale Java enterprise systems and 3D computer graphics on modern Mixed Reality HMDs and mobile devices. I am passionate about software design and architecture and their role in AI-enabled components of future software systems.


News

2022

Our paper on the new release, GGPONC 2.0, has been accepted at LREC! The new dataset is currently the largest freely distributable annotated corpus of German medical text (1.87M tokens, 250K annotations). We also created baseline NER models with HuggingFace transformers.

2021

I organized a workshop with experts from both the German clinical NLP and the clinical guideline communities to deepen the dialogue that we started with our GGPONC project. Check out the event website for details.

Our paper “Knowledge bases and software support for variant interpretation in precision oncology” has been published in Briefings in Bioinformatics! Thanks to all collaborators from the HiGHmed consortium.

2020