ClinCode - Computer-Assisted Clinical ICD-10 Coding for improving efficiency and quality in healthcare

Description

This project will develop a Computer-Assisted Coding (CAC) tool for ICD-10 coding of Norwegian electronic health records, specifically discharge summaries. The Norwegian version of ICD-10 contains over 20 000 diagnosis codes, divided into 22 chapters. The codes are organized hierarchically in three levels, and each code has a textual description. The physician assigns one or several of these ICD-10 codes to the patient's discharge summary, for both medical and administrative purposes. Assigning codes is difficult and time-consuming, and it has been shown that up to 41 percent of manually assigned main diagnoses may be wrong or missing.

The CAC tool will learn from previously manually coded discharge summaries and patient notes (both free text and structured information such as laboratory results, blood values, etc.) and assign ICD-10 codes to unseen discharge summaries. The CAC tool will use Artificial Intelligence methods such as Natural Language Processing and Deep Learning to learn and predict codes. Ranked ICD-10 code suggestions will be presented to the physician, who can then select among them and assign the correct code.
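
As an illustration of the ranking step only, the following minimal sketch sorts per-code scores and keeps the top suggestions. It is not the project's implementation: the function name `rank_code_suggestions`, the `top_k` parameter, and the example scores are all assumptions made for illustration, and a real model would produce the scores.

```python
from typing import Dict, List, Tuple


def rank_code_suggestions(
    scores: Dict[str, float], top_k: int = 5
) -> List[Tuple[str, float]]:
    """Return the top_k (code, score) pairs, highest score first.

    Ties are broken alphabetically by code so the order is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical model output for one discharge summary (illustrative values).
example_scores = {"K35.8": 0.71, "K37": 0.12, "R10.4": 0.55, "K52.9": 0.08}

suggestions = rank_code_suggestions(example_scores, top_k=3)
for code, score in suggestions:
    print(f"{code}\t{score:.2f}")
```

The physician would see such a short list and confirm or reject each suggestion, so the final decision stays with the human coder.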

This will enable fast, high-quality semi-automatic ICD-10 coding. The CAC tool can also be used by hospital management and health authorities to assess coding quality on historical data.

The CAC tool will reduce coders' workload and improve overall coding quality. High-quality codes enable efficient data reuse, promoting fast knowledge generation in healthcare, thereby laying foundations for personalized medicine, more efficient health management, and, subsequently, higher quality of care.

The project builds on the clinical text mining research activities started in the incubator project, NorKlinTekst (HNF1395-18), funded by Helse Nord in 2017.

Related publications

Ngo, Phuong Dinh, Miguel Tejedor, Therese Olsen Svenning, Taridzo Chomutare, Andrius Budrionis and Hercules Dalianis. 2024. Deidentifying a Norwegian clinical corpus - An effort to create a privacy-preserving Norwegian large clinical language model. In the Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, Malta, pdf.

Chomutare, Taridzo, Anastasios Lamproudis, Andrius Budrionis, Therese Olsen Svenning, Lill Irene Hind, Phuong Dinh Ngo, Karl Øyvind Mikalsen and Hercules Dalianis. 2024. Improving Quality of ICD-10 (International Statistical Classification of Diseases, Tenth Revision) Coding Using AI: Protocol for a Crossover Randomized Controlled Trial. JMIR Research Protocols, 13(1), e54593, pdf.

Lamproudis, Anastasios, Sara Mora, Therese Olsen Svenning, Torbjørn Torsvik, Taridzo Chomutare, Phuong Dinh Ngo and Hercules Dalianis. 2023. De-identifying Norwegian Clinical Text using Resources from Swedish and Danish. Proceedings of AMIA 2023, Annual Symposium, November 11-15. New Orleans, LA, USA, pdf.

Lamproudis, Anastasios, Therese Olsen Svenning, Torbjørn Torsvik, Taridzo Chomutare, Andrius Budrionis, Phuong Dinh Ngo, Thomas Vakili, and Hercules Dalianis. 2023. Using a Large Open Clinical Corpus for Improved ICD-10 Diagnosis Coding. Proceedings of AMIA 2023, Annual Symposium, November 11-15. New Orleans, LA, USA, pdf.

Dalianis, Hercules, Taridzo Chomutare, Andrius Budrionis and Therese Olsen Svenning. 2022. ClinCode - Computer-Assisted Clinical ICD-10 Coding for improving efficiency and quality in healthcare. Poster presented at The Patient Classification Systems International (PCSI) conference in Reykjavik, Iceland, 27-29 Sept 2022. Best poster award.

Dolk, Alexander, Hjalmar Davidsen, Hercules Dalianis and Thomas Vakili. 2022. Evaluation of LIME and SHAP in Explaining Automatic ICD-10 Classifications of Swedish Gastrointestinal Discharge Summaries, in Proceedings from the 18th Scandinavian Conference on Health Informatics - SHI 2022 in Tromsø, Norway on August 22-24, pp. 166-173, pdf.

Dolk, Alexander and Hjalmar Davidsen. 2022. Evaluation of Post Hoc XAI Models in Explaining Automatic ICD-10 Classifications of Swedish Discharge Summaries, Master Thesis, Stockholm University, pdf.

Budrionis, Andrius, Taridzo Chomutare, Therese Olsen Svenning and Hercules Dalianis. 2022. The Influence of NegEx on ICD-10 Code Prediction in Swedish: How is the Performance of BERT and SVM Models Affected by Negations? In Proceedings from the 18th Scandinavian Conference on Health Informatics - SHI 2022 in Tromsø, Norway on August 22-24, pp. 174-178, pdf.

Chomutare, Taridzo, Andrius Budrionis and Hercules Dalianis. 2022, July. Combining deep learning and fuzzy logic to predict rare ICD-10 codes from clinical notes. In Proceedings from the 2022 IEEE International Conference on Digital Health (ICDH), pp. 163-168, pdf.

Blanco, Alberto, Sonja Remmer, Alicia Pérez, Hercules Dalianis and Arantza Casillas. 2022. Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish. Journal of Biomedical Informatics, html.

Lamproudis, Anastasios, Aron Henriksson and Hercules Dalianis. 2021. Developing a Clinical Language Model for Swedish: Continued Pretraining of Generic BERT with In-Domain Data. In the Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria, pdf.

Remmer, Sonja, Anastasios Lamproudis and Hercules Dalianis. 2021. Multi-label Diagnosis Classification of Swedish Discharge Summaries – ICD-10 Code Assignment Using KB-BERT. In the Proceedings of RANLP 21: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria, pdf.

Blanco, Alberto, Sonja Remmer, Alicia Pérez, Hercules Dalianis and Arantza Casillas. 2021. On the contribution of per-ICD attention mechanisms to classify health records in languages with fewer resources than English. In the Proceedings of RANLP 21: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria, pdf.

Remmer, Sonja. 2021. Automatic Diagnosis Code Assignment with KB-BERT — ICD Classification Using Swedish Discharge Summaries, Master Thesis, Stockholm University, pdf.

News

The Second ClinCode Conference in Stockholm, Sept 20, 2022, Twitter and LinkedIn

The First ClinCode Conference in Tromsø, Sept 21, 2021

Goals

This project has two broad objectives. Firstly, the project's outcome, the Computer-Assisted Coding (CAC) tool for ICD-10, will increase coding quality. Secondly, the project aims to minimize the time required for ICD-10 diagnosis and procedure code assignment.
The improvement in coding quality will be measured in terms of precision, recall and F-score against a gold standard of ICD-10 codes. The gold standard will be created through double annotation by at least two expert coders, and the inter-annotator agreement will be calculated.
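
The evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's evaluation code: it assumes micro-averaged precision, recall and F-score over sets of codes per document, and it uses Cohen's kappa on one main code per document as the agreement measure, although the source does not say which averaging or agreement statistic will be used. The function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def micro_prf(
    gold: Dict[str, Set[str]], predicted: Dict[str, Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-document code sets."""
    tp = fp = fn = 0
    for doc_id, gold_codes in gold.items():
        pred_codes = predicted.get(doc_id, set())
        tp += len(gold_codes & pred_codes)   # codes both assigned and correct
        fp += len(pred_codes - gold_codes)   # predicted but not in gold
        fn += len(gold_codes - pred_codes)   # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def cohen_kappa(coder_a: List[str], coder_b: List[str]) -> float:
    """Cohen's kappa for two coders who each assign one code per document."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty documents")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both coders always used the same single code
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (illustrative only).
gold = {"d1": {"K35.8", "R10.4"}, "d2": {"K37"}}
predicted = {"d1": {"K35.8"}, "d2": {"K37", "K52.9"}}
precision, recall, f1 = micro_prf(gold, predicted)

coder_a = ["K35.8", "K37", "K35.8", "R10.4"]
coder_b = ["K35.8", "K37", "R10.4", "R10.4"]
kappa = cohen_kappa(coder_a, coder_b)
```

On this toy data, precision, recall and F1 are all 2/3, and kappa is 7/11 (about 0.64). With more than two coders or several codes per document, a different agreement statistic would be needed.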

The CAC tool will minimize the time required for navigating complex code hierarchies and selecting correct codes. Finally, the combination of the points above brings improved healthcare to the patient and better data quality for secondary use. This is achieved both by minimizing the time clinicians spend on administrative tasks and by better coding quality, which improves healthcare planning.

Conclusion

The combination of primary and secondary objectives translates well into benefits for patients in both the short and the long term. The CAC tool will minimize the time clinicians spend on documentation, leaving more time for clinical work. Enhanced coding quality leads to better planning of healthcare. Hospital management and the health authorities can also use the CAC tool to validate previously manually assigned ICD-10 codes.

Assessing manually assigned ICD-10 codes in this way makes it possible to correct the statistics and improve healthcare planning.
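
One way such an assessment of historical codes could work is sketched below. This is an assumption for illustration rather than the project's method: a record is flagged for human review when its manually assigned main code does not appear among the tool's top suggestions. The function name `flag_for_review`, the `top_k` threshold, and the sample records are hypothetical.

```python
from typing import Dict, List


def flag_for_review(
    assigned: Dict[str, str],
    suggestions: Dict[str, List[str]],
    top_k: int = 3,
) -> List[str]:
    """Return IDs of records whose assigned code is not in the top_k suggestions.

    `suggestions` maps a record ID to codes already ranked best-first.
    A flag is a prompt for a human coder to re-check, not proof of an error.
    """
    flagged = []
    for record_id, code in assigned.items():
        top = suggestions.get(record_id, [])[:top_k]
        if code not in top:
            flagged.append(record_id)
    return flagged


# Illustrative historical records and ranked suggestions.
assigned = {"r1": "K35.8", "r2": "K37", "r3": "R10.4"}
suggestions = {
    "r1": ["K35.8", "K37"],
    "r2": ["K35.8", "R10.4", "K52.9"],
    "r3": ["K52.9", "R10.4"],
}

flagged = flag_for_review(assigned, suggestions)
print(flagged)
```

Here only record r2 is flagged, because its assigned code K37 is absent from the three suggestions for that record.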

Publishing the results in scientific as well as popular science publications will spread the methods and results to a broader audience. This will make it possible to reproduce the methods for languages other than Norwegian and Swedish.