About me
I am a computer scientist and fourth-year Ph.D. student at the Digital Health Center of the Hasso Plattner Institute in Potsdam. My main research field is biomedical natural language processing in low-resource settings, in particular German-language clinical NLP and information extraction for medical evidence synthesis. I am also interested in applications of language technology in the delivery of evidence-based medicine, with a particular focus on (precision) oncology.
Since open data is a key enabler for research progress, I am happy to have contributed publicly available datasets through our ongoing efforts in the GGPONC project and previously as part of the BPI Challenge 2018.
As a professional software engineer, I have been working with large-scale Java enterprise systems and 3D computer graphics on modern Mixed Reality HMDs and mobile devices. I am passionate about software design and architecture and their role in AI-enabled components of future software systems.
News
2023
We have released xMEN, a toolkit for cross-lingual medical entity normalization, on GitHub.
I am happy to announce that two papers have been accepted at the International Conference on Artificial Intelligence in Medicine (AIME 2023).
2022
We have published an implementation of FairEval on HuggingFace, which provides a detailed breakdown of errors for sequence labeling tasks (we mostly consider NER).
I gave a talk at the GIN 2022 Conference in Toronto about “Continuous Surveillance of Clinical Practice Guidelines through Natural Language Processing - Experiences from the GGPONC Project”. It was a very inspiring event with great conversations about the role NLP can play in guideline development and evidence synthesis.
Our contribution to the DisTEMIST shared task achieved 1st place in subtrack 2 (entity linking) and 2nd place in subtrack 1 (entity recognition). These are the results.
Our paper for the new release of GGPONC 2.0 has been accepted at LREC! The new dataset is currently the largest freely distributable annotated corpus of German medical text (1.87M tokens, 250K annotations). We also created baseline NER models with HuggingFace transformers.
2021
I organized a workshop with experts from both the German clinical NLP and the clinical guideline communities to deepen the dialogue that we started with our GGPONC project. Check out the event website for details.
I am excited that our paper “Controversial Trials First: Identifying Disagreement Between Clinical Guidelines and New Evidence”, presented at the AMIA Annual Symposium, received a Distinguished Paper Award.
Our paper “Controversial Trials First: Identifying Disagreement Between Clinical Guidelines and New Evidence”, featuring the Next Generation Evidence Browser, will be presented at the AMIA Annual Symposium in November 2021.
Our paper “Knowledge bases and software support for variant interpretation in precision oncology” has been published in Briefings in Bioinformatics! Thanks to all collaborators from the HiGHmed consortium.
2020
We have published an article in the German digital health magazine “Gesundhyte.de” about the role of NLP in evidence-based medicine and our GGPONC corpus.
Our paper describing version 1.0 of the GGPONC corpus will be presented at the LOUHI workshop at EMNLP.
Version 1.0 of the GGPONC medical text corpus is now available to researchers: Link