The rise of NLP in healthcare
Natural language processing (NLP) has achieved significant success in applications such as translation, speech recognition, text generation, virtual assistants, and chatbots. These applications span industrial, creative, and lifestyle domains and, more recently, the healthcare sector. With a growing number of patients, rising costs, and increasing volumes of data, there is high demand for automated processing of health-related documents.
Hospitals struggle to provide high-quality care because patient histories are complex and hospital stays generate a high volume of medical documents, including reports from pathology, radiology, laboratory, and surgery as well as care documentation. This information is crucial for any decision on diagnostics, therapy, or subsequent care. Clinicians devote significant effort to writing, filing, sorting, searching, retrieving, issuing, and managing medical records, yet it is nearly impossible for them to process this volume of information.
Therefore, it is highly desirable to supply healthcare professionals as well as patients with information contained in these full texts by extracting data, mapping it onto clinical guidelines, or otherwise informing their decisions. Hence, almost all Clinical Decision Support Systems (CDSS) depend on a continuous and reliable processing of clinical text.
Despite the promising capabilities of NLP for enhancing clinical decision-making and operational efficiency, its integration into real-world healthcare settings remains limited due to challenges such as data quality, lack of standardization, and inadequate alignment with clinical workflows. This study aims to address these challenges and provide solutions to facilitate the integration of NLP in clinical environments.
Mapping the patient journey
To better understand the document-intensive nature of the patient journey in a hospital, we used a case study of a lung cancer patient. Lung cancer is one of the most commonly diagnosed types of cancer, and cancer is the second leading cause of death in the Western world.
Typically, a suspicion of lung carcinoma leads to admission to the pneumology department. Multiple tests are conducted to reach a diagnosis. The treatment options for the patient are discussed by a multidisciplinary tumor board and the patient is transferred to the oncology department to undergo the chosen therapy. Once completed, the patient is discharged from the hospital, but may continue to visit for follow-up checks to ensure effective treatment.
The steps in this example correspond to the phases of the general patient journey:
- Admission: The patient is hospitalized for further diagnostic procedures, usually in the pneumology department.
- Diagnosis: The diagnostic pathway starts with the anamnesis and a physical examination by the physician as well as laboratory tests, followed by a tumor biopsy and lymph node sampling for histological examination and staging.
- Treatment: If chemotherapy is recommended, the patient is transferred to the oncology department. Once the consent discussion has been completed, a systemic anticancer therapy such as chemotherapy and/or immunotherapy is administered.
- Discharge: Following the systemic therapy, the patient is discharged.
- After Care: Approximately one week after the therapy has been administered, an outpatient laboratory test is recommended. If necessary, the patient returns for the second cycle of chemotherapy after a typical waiting period of two to three weeks.
In addition to these five main stages, we have also included the initial stage of the journey, Smart Home, as Internet of Things (IoT) applications are becoming increasingly relevant in the healthcare sector. Patients may, for example, bring heart rate measurements monitored using their smartwatches, which could be used as an additional diagnostic tool. Another part of the journey that does not directly influence the patient care is Billing, which is a source of multiple unstructured documents.
A patient's hospital journey generates a large number of structured and unstructured documents. Recognizing this document-intensive nature and the potential for unstructured data to impede care, we can begin to explore how NLP technologies could streamline document handling and improve patient outcomes.
Methodology: Systematic review of clinical NLP research
To analyze the transfer of NLP research into the clinical domain and map the actual use of NLP throughout the patient journey, we conducted a systematic review of 8,527 papers. Our selection criteria were publication venue, date, and title, combined with a keyword search.
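A selection step of this kind can be pictured as a filter over bibliographic records. The sketch below is illustrative only: the venue names, the year window, the keywords, and the `Paper` record are hypothetical placeholders, not the criteria actually used in the review.

```python
from dataclasses import dataclass

# Hypothetical criteria for illustration; the review's actual venues,
# date range, and keywords are not listed in this article.
VENUES = {"ACL", "EMNLP", "JAMIA"}
YEARS = range(2018, 2023)  # 2018 through 2022
KEYWORDS = ("clinical", "medical", "health", "patient")


@dataclass(frozen=True)
class Paper:
    title: str
    venue: str
    year: int


def is_candidate(paper: Paper) -> bool:
    """Keep a paper if venue and year match and the title contains a keyword."""
    title = paper.title.lower()
    return (
        paper.venue in VENUES
        and paper.year in YEARS
        and any(keyword in title for keyword in KEYWORDS)
    )


papers = [
    Paper("Clinical note summarization with transformers", "ACL", 2021),
    Paper("Parsing legal contracts", "EMNLP", 2020),   # no keyword in title
    Paper("Patient triage from chat logs", "JAMIA", 2015),  # outside year window
]
selected = [paper for paper in papers if is_candidate(paper)]
```

Only the first toy record passes all three checks. In practice the keyword match would likely need stemming or a curated term list, which this sketch leaves out.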
The tagging was performed for two dimensions. The first dimension concentrated on NLP-related tags to map the papers to relevant NLP tasks, models, datasets, and data languages. The second dimension focused on clinical tags, such as general patient journey and patient journey documents.
The final list of publications was then screened using the NLP-related and patient-journey-related tags. A team of four reviewers annotated the papers, which were split equally among them. Each paper was annotated by two reviewers; in case of doubt, a third reviewer served as tie-breaker.
To visualize the results, we used Python with packages including Seaborn, Matplotlib, Pandas, Plotly, and Sankey.
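The double-annotation scheme can be sketched in a few lines. This is a minimal illustration under our own assumptions: the reviewer names are placeholders, pairs are assigned by cycling through all reviewer combinations, and a disagreement is resolved by taking the third reviewer's tag. The review does not specify how pairs were formed or how tie-breaks were recorded.

```python
from itertools import combinations, cycle
from typing import Dict, Hashable, Iterable, Optional, Tuple

REVIEWERS = ("R1", "R2", "R3", "R4")  # placeholder names for the four reviewers


def assign_pairs(paper_ids: Iterable[Hashable]) -> Dict[Hashable, Tuple[str, str]]:
    """Give every paper two reviewers by cycling through all reviewer pairs."""
    pairs = cycle(combinations(REVIEWERS, 2))  # 6 distinct pairs, repeated
    return {paper_id: next(pairs) for paper_id in paper_ids}


def resolve_tag(first: str, second: str, third: Optional[str] = None) -> str:
    """Keep the tag when both reviewers agree; otherwise defer to the third."""
    if first == second:
        return first
    if third is None:
        raise ValueError("reviewers disagree and no tie-break tag was given")
    return third


# With 12 toy papers, each of the 6 pairs is used twice.
assignment = assign_pairs(range(12))
```

Cycling through the six possible pairs balances the workload exactly whenever the number of papers is a multiple of six; for other counts the load differs by at most a few papers per reviewer.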
Insights from the systematic review
In our review, we mapped every screened paper to several NLP-related tags in order to identify which models, tasks, datasets, and data languages are most commonly used in healthcare NLP research.
Dataset language
Several studies used datasets covering more than one language, so a single paper can count toward multiple languages. Through the analysis of 487 papers, we observed that English was the most frequently used dataset language (419). The second and third most used dataset languages were Spanish (36) and Chinese (25). The remaining 237 languages were classified under the ‘Other’ category.
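To put these counts in proportion, the short calculation below expresses each language as a share of the 487 analyzed papers. It assumes that each number in parentheses is the count of papers using that language; because a paper may use several languages, the shares need not sum to 100%. The function name is our own.

```python
PAPERS_ANALYZED = 487

# Counts reported above, assumed to be papers per language.
papers_per_language = {"English": 419, "Spanish": 36, "Chinese": 25}


def share_of_papers(count: int, total: int = PAPERS_ANALYZED) -> float:
    """Percentage of analyzed papers, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {lang: share_of_papers(n) for lang, n in papers_per_language.items()}
# shares == {"English": 86.0, "Spanish": 7.4, "Chinese": 5.1}
```

Under this reading, roughly 86% of the analyzed papers involve English data, compared with about 7% for Spanish and 5% for Chinese.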
Dataset type
In terms of datasets, we found that patient-related data, such as electronic health records, were the most commonly used data source (27%), followed by clinical studies (20.7%) and forum posts, chat logs, and social media datasets (19.1%).
Model type
Transformer-based models were the most commonly used type of NLP model across a variety of tasks (44.94%), followed by recurrent neural networks (RNN) (20.39%). As shown in Figure 3, the use of transformer-based models increased over a four-year period, culminating in a peak in 2021 and 2022.
NLP task
Some tasks were studied more frequently than others: classification accounts for almost 30% of papers, information extraction for 26.81%, and text generation/text summarization for 12.52%.
Patient journey mapping
Analyzing the clinical patient journey, we observe that most clinical NLP papers focus on applications in the Diagnosis, Admission, and Discharge phases, drawing on admission notes, radiology reports, and discharge letters. In contrast, patient journey steps such as Smart Home, After Care, and Billing are less represented in the clinical NLP literature.
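A coverage analysis of this kind amounts to tallying stage tags across papers and listing the stages that receive none. The sketch below uses the seven journey stages named in this article; the paper identifiers and their tags are toy data invented for illustration and are not results from the review.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List, Set


class Stage(Enum):
    SMART_HOME = "Smart Home"
    ADMISSION = "Admission"
    DIAGNOSIS = "Diagnosis"
    TREATMENT = "Treatment"
    DISCHARGE = "Discharge"
    AFTER_CARE = "After Care"
    BILLING = "Billing"


# Toy annotations (hypothetical): paper id -> stages the paper addresses.
tagged: Dict[str, Set[Stage]] = {
    "paper-1": {Stage.DIAGNOSIS},
    "paper-2": {Stage.ADMISSION, Stage.DIAGNOSIS},
    "paper-3": {Stage.DISCHARGE},
}


def stage_coverage(tags: Dict[str, Set[Stage]]) -> Dict[Stage, int]:
    """Count papers per stage, including stages with zero papers."""
    counts = Counter(stage for stages in tags.values() for stage in stages)
    return {stage: counts[stage] for stage in Stage}


def uncovered_stages(tags: Dict[str, Set[Stage]]) -> List[Stage]:
    """Stages that no paper in the tagged set addresses."""
    return [stage for stage, n in stage_coverage(tags).items() if n == 0]


coverage = stage_coverage(tagged)
gaps = uncovered_stages(tagged)
```

Reporting zero counts explicitly, rather than only the stages that appear, is what makes gaps in the literature visible.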
Challenges and opportunities in clinical NLP
Our findings reveal several key challenges and opportunities in the field of clinical NLP:
Data language diversity: The dominance of English datasets indicates that other languages are under-researched in the medical domain, creating an imbalance between English and non-English medical applications. Extending research to other languages may uncover patterns and structures that are not present in English.
Narrowed data sources: Current NLP research tends to focus on specific types of documents, such as radiology reports, discharge letters, and admission notes. There is a significant opportunity in analyzing a wider range of medical documents produced throughout the patient journey, such as care and disease progression documentation.
Algorithmic transparency: Over 70% of papers rely on transformer variants, CNNs or RNNs, which are notoriously hard to interpret. Explainable AI (XAI) methods can address the lack of transparency and build trust in clinical NLP systems.
Bias and fairness: Datasets used for training NLP models may contain biases, leading to unfair and ineffective outcomes for certain demographics. Rigorous validation across diverse patient populations is essential to ensure fairness and equity.
Underexplored patient journey stages: While Diagnosis and Admission are well-researched areas, patient journey steps such as Treatment, Billing, After Care, and Smart Home remain largely underexplored, despite the significant amounts of documents produced in these stages.
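To make the transparency point concrete, here is a minimal sketch of occlusion, one simple model-agnostic form of feature attribution: remove one token at a time and record how much the model's score changes. The keyword-weight "model" and its weights are a toy stand-in chosen so that the example runs without any dependencies; a real clinical model would replace `score`.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy model: a score for "suspicious finding" from keyword weights.
WEIGHTS = {"mass": 1.5, "nodule": 1.0, "no": -2.0}


def score(tokens: Sequence[str]) -> float:
    """Sum the weights of known tokens; unknown tokens contribute 0."""
    return sum(WEIGHTS.get(token.lower(), 0.0) for token in tokens)


def occlusion_attribution(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float] = score,
) -> List[Tuple[str, float]]:
    """Attribute to each token the score drop caused by removing it."""
    tokens = list(tokens)
    base = score_fn(tokens)
    return [
        (token, base - score_fn(tokens[:i] + tokens[i + 1:]))
        for i, token in enumerate(tokens)
    ]


attributions = occlusion_attribution("no mass in left lung".split())
```

In this toy case "no" receives a negative attribution and "mass" a positive one, while the remaining tokens receive zero. For a linear scorer like this one, occlusion simply recovers the weights; its value lies in working unchanged with models whose internals are not inspectable.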
Addressing the challenges: Towards trustworthy clinical NLP
To address the challenges identified in our review, we propose the following strategies:
Out-of-distribution generalization: Techniques like subpopulation shift analysis, domain adaptation, and label-aware domain transfer can help NLP models perform well across diverse healthcare settings and patient demographics.
Explainability and interpretability: Explainable AI (XAI) methods, such as feature attribution and interpretable-by-design components, can provide insights into how NLP models arrive at their conclusions, fostering trust among healthcare professionals and patients.
Bias mitigation: Detecting and reducing biases through statistical fairness metrics, debiased word embeddings, and post-processing techniques can ensure non-discrimination of protected groups in clinical decision-making.
Interdisciplinary collaboration: Close collaboration between NLP researchers, clinicians, and healthcare administrators is crucial to ensure that NLP innovations are both technically sound and practically useful in clinical settings.
User-friendly applications: Developing intuitive and easy-to-use NLP applications can facilitate quicker adoption into clinical practice, empowering healthcare professionals to leverage these tools effectively.
By addressing these challenges and harnessing the full potential of clinical NLP, we can unlock new possibilities for improving patient outcomes, streamlining clinical workflows, and advancing medical research.
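Two of the strategies above, subpopulation analysis and statistical fairness metrics, can be illustrated with a simple per-group audit. The sketch below computes accuracy and positive-prediction rate for each group and then the largest gap between groups; the gap in positive rates is commonly called the demographic parity difference. The group labels and records are toy data, and the function names are our own.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int, int]  # (group, true label, predicted label), binary labels


def per_group_rates(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Accuracy and positive-prediction rate for each group."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for group, y_true, y_pred in records:
        entry = stats[group]
        entry["n"] += 1
        entry["correct"] += int(y_true == y_pred)
        entry["positive"] += int(y_pred == 1)
    return {
        group: {
            "accuracy": entry["correct"] / entry["n"],
            "positive_rate": entry["positive"] / entry["n"],
        }
        for group, entry in stats.items()
    }


def largest_gap(rates: Dict[str, Dict[str, float]], metric: str) -> float:
    """Difference between the best and worst group on one metric."""
    values = [group_rates[metric] for group_rates in rates.values()]
    return max(values) - min(values)


# Toy predictions for two hypothetical patient groups.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]
rates = per_group_rates(records)
accuracy_gap = largest_gap(rates, "accuracy")
parity_gap = largest_gap(rates, "positive_rate")
```

In this toy data both groups have the same accuracy (0.75), yet the model predicts the positive class three times as often for group A as for group B, giving a demographic parity difference of 0.5. The example shows why a single aggregate metric is not sufficient for validating a model across patient populations.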
Conclusion
This systematic review has highlighted the significant potential of NLP in revolutionizing healthcare delivery, from enhancing clinical decision-making to optimizing hospital operations and improving patient care. However, it has also unveiled critical challenges that must be addressed to ensure the responsible and effective integration of NLP in clinical settings.
Key challenges include the need for diverse and representative datasets, algorithmic transparency and explainability, mitigation of biases, and the seamless integration of NLP tools into existing healthcare workflows. By addressing these challenges through interdisciplinary collaboration, user-centric design, and a focus on trustworthy AI, the healthcare sector can fully harness the transformative power of NLP to deliver better patient outcomes and improve the overall efficiency of the system.
As AI and NLP technologies continue to evolve, the future holds immense promise for personalized medicine, advanced drug discovery, and enhanced global health monitoring. By proactively addressing the ethical and practical concerns, the healthcare industry can position itself at the forefront of this technological revolution, ultimately empowering clinicians, patients, and researchers to collaborate in driving meaningful and impactful change.