Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

news

Paper published in Briefings in Bioinformatics

Published:

Our paper “Knowledge bases and software support for variant interpretation in precision oncology” has been published in Briefings in Bioinformatics! Thanks to all collaborators from the HiGHmed consortium.

1st GGPONC User Meeting

Published:

I have organized a workshop with experts from both the German clinical NLP and the clinical guideline communities, to deepen the dialogue that we have started with our GGPONC project. Check out the event website for details.

GGPONC 2.0 @ LREC

Published:

Our paper for the new release of GGPONC 2.0 has been accepted at LREC! The new dataset is currently the largest, freely distributable annotated corpus of German medical text (1.87M tokens, 250K annotations). We also created baseline NER models with Hugging Face Transformers.

Talk at GIN Conference in Toronto

Published:

I gave a talk at the GIN 2022 Conference in Toronto about “Continuous Surveillance of Clinical Practice Guidelines through Natural Language Processing - Experiences from the GGPONC Project”. It was a very inspiring event with great conversations about the role NLP can play in guideline development and evidence synthesis.

Two papers accepted at AIME

Published:

I am happy to announce that two papers have been accepted at the International Conference on Artificial Intelligence in Medicine (AIME 2023):

xMEN Toolkit

Published:

We have released xMEN - a Toolkit for Cross-Lingual Medical Entity Normalization on GitHub.

projects

Mixed Reality

Development of interactive XR applications for HoloLens and other HMDs / mobile devices

xMEN

Cross-lingual Medical Entity Normalization

publications

GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines

Published in LOUHI@EMNLP, 2020

Recommended citation: Florian Borchert*, Christina Lohr*, Luise Modersohn*, Thomas Langer, Markus Follmann, Jan Philipp Sachs, Udo Hahn, Matthieu-P. Schapranow. GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, pp. 38–48. Online: Association for Computational Linguistics, 2020. (* equal contribution) [Data Access] [Code]

Knowledge bases and software support for variant interpretation in precision oncology

Published in Briefings in Bioinformatics, 2021

Recommended citation: Florian Borchert*, Andreas Mock*, Aurelie Tomczak*, Jonas Hügel, Samer Alkarkoukly, Alexander Knurr, Anna-Lena Volckmar, Albrecht Stenzinger, Peter Schirmacher, Jürgen Debus, Dirk Jäger, Thomas Longerich, Stefan Fröhling, Roland Eils, Nina Bougatf, Ulrich Sax, Matthieu-P Schapranow. Knowledge Bases and Software Support for Variant Interpretation in Precision Oncology, Briefings in Bioinformatics, Volume 22, Issue 6, November 2021, bbab134 (* equal contribution) IF = 11.6 https://doi.org/10.1093/bib/bbab134.

An Engineering Approach towards Multi-Site Virtual Molecular Tumor Board Software Support

Published in 1st Conference on ICT for Health, Accessibility and Wellbeing, 2021

Recommended citation: Richard Henkenjohann, Benjamin Bergner, Florian Borchert, Nina Bougatf, Hauke Hund, Roland Eils, and Matthieu-P. Schapranow. An Engineering Approach towards Multi-Site Virtual Molecular Tumor Board Software Support. Proceedings of the 1st Conference on ICT for Health, Accessibility and Wellbeing. Springer International Publishing, 2021

GGPONC 2.0 - The German Clinical Guideline Corpus for Oncology: Curation Workflow, Annotation Policy, Baseline NER Taggers

Published in LREC, 2022

Recommended citation: Florian Borchert, Christina Lohr, Luise Modersohn, Jonas Witt, Thomas Langer, Markus Follmann, Matthias Gietzelt, Bert Arnrich, Udo Hahn and Matthieu-P. Schapranow. GGPONC 2.0 - The German Clinical Guideline Corpus for Oncology: Curation Workflow, Annotation Policy, Baseline NER Taggers. LREC 2022 — Proceedings of the Language Resources and Evaluation Conference, pp. 3650‑3660. Marseille, France, European Language Resources Association, 2022 [Data Access] [Code]

HPI-DHC @ BioASQ DisTEMIST: Spanish Biomedical Entity Linking with Pre-trained Transformers and Cross-lingual Candidate Retrieval

Published in CLEF, 2022

Recommended citation: Florian Borchert and Matthieu-P. Schapranow. HPI-DHC @ BioASQ DisTEMIST: Spanish Biomedical Entity Linking with Pre-trained Transformers and Cross-lingual Candidate Retrieval. Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, pp. 244-258. Bologna, Italy. 🏆 1st place DisTEMIST shared task (entity linking subtrack) [Link] [Code]

Machine Learning Based Prediction of Incident Cases of Crohn’s Disease Using Electronic Health Records from a Large Integrated Health System

Published in AIME, 2023

Recommended citation: Julian Hugo, Susanne Ibing, Florian Borchert, Jan Philipp Sachs, Judy Cho, Ryan C. Ungaro and Erwin P. Böttinger. Machine Learning Based Prediction of Incident Cases of Crohn’s Disease Using Electronic Health Records from a Large Integrated Health System. In: Juarez, J.M., Marcos, M., Stiglic, G., Tucker, A. (eds) Artificial Intelligence in Medicine. AIME 2023. Lecture Notes in Computer Science, vol 13897. Springer, Cham 🏆 Best Student Paper

A distributable German clinical corpus containing cardiovascular clinical routine doctor’s letters

Published in Nature Scientific Data, 2023

Recommended citation: Phillip Richter-Pechanski, Philipp Wiesenbach, Dominic M. Schwab, Christina Kiriakou, Mingyang He, Michael M. Allers, Anna S. Tiefenbacher, Nicola Kunz, Anna Martynova, Noemie Spiller, Julian Mierisch, Florian Borchert, Charlotte Schwind, Norbert Frey, Christoph Dieterich & Nicolas A. Geis. A distributable German clinical corpus containing cardiovascular clinical routine doctor’s letters. Scientific Data 10, 207 (2023) [Data Access] IF = 10.8

Programming techniques for improving rule readability for rule-based information extraction natural language processing pipelines of unstructured and semi-structured medical texts

Published in Health Informatics Journal, 2023

Recommended citation: Nektarios Ladas, Florian Borchert, Stefan Franz, Alina Rehberg, Natalia Strauch, Kim Katrin Sommer, Michael Marschollek, Matthias Gietzelt. Programming techniques for improving rule readability for rule-based information extraction natural language processing pipelines of unstructured and semi-structured medical texts. Health Informatics Journal; 29(2) (2023)

HPI-DHC @ BC8 SympTEMIST Track: Detection and Normalization of Symptom Mentions with SpanMarker and xMEN

Published in BioCreative, 2023

Recommended citation: Florian Borchert and Matthieu-P. Schapranow. HPI-DHC @ BC8 SympTEMIST Track: Detection and Normalization of Symptom Mentions with SpanMarker and xMEN. Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the Era of Generative Models. New Orleans, USA (2023) 🏆 1st place SympTEMIST shared task (entity linking subtrack) [Code]

medBERT.de: A Comprehensive German BERT Model for the Medical Domain

Published in Expert Systems with Applications, 2024

Recommended citation: Keno K. Bressem, Jens-Michalis Papaioannou, Paul Grundmann, Florian Borchert, Lisa C. Adams, Leonhard Liu, Felix Busch, Lina Xu, Jan P. Loyen, Stefan M. Niehues, Moritz Augustin, Lennart Grosser, Marcus R. Makowski, Hugo JWL. Aerts, Alexander Löser. medBERT.de: A Comprehensive German BERT Model for the Medical Domain. Expert Systems with Applications (2024): 121598 [Hugging Face Model] IF = 8.5