Background: Identifying bleeding events in the electronic health record (EHR) is challenging and often requires laborious manual chart review. While natural language processing (NLP) models are commonly used to identify thrombotic events, they are less frequently employed to identify major bleeding (MB) or clinically relevant non-major bleeding (CRNMB) events. We sought to compare the sensitivity and specificity of an NLP pipeline and graphical user interface, the Clinical Event Detection and Recording System (CEDARS), with those of manual chart review and International Classification of Diseases, 10th Revision (ICD-10) codes for the identification of bleeding events in patients with cancer.

Methods: CEDARS is an NLP pipeline and interface that uses predetermined keywords to identify potential clinical events in the EHR. For a bleeding endpoint, the user first uploads an EHR corpus into the web interface and enters a Boolean query containing relevant keywords (e.g., “bleed* OR hemorrhage OR epistaxis, …”). CEDARS automatically processes individual patient records by splitting documents into sentences and presents only documents with at least one sentence matching the query. Sentences in which the keywords are negated are excluded. The user then reviews the selected documents within the interface and enters the required labels (i.e., bleeding type, bleeding site, and date of occurrence).
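The filtering steps described above (sentence splitting, keyword matching, and exclusion of negated mentions) can be illustrated with a minimal Python sketch. This is not the CEDARS implementation: the regular expressions, the small list of negation cues, and the function names `split_sentences`, `matching_sentences`, and `select_documents` are all assumptions made for illustration, and the keyword pattern covers only the three example terms quoted in the query.

```python
import re

# Hypothetical keyword pattern covering only the example terms from the query;
# "bleed\w*" stands in for the wildcard "bleed*".
KEYWORDS = re.compile(r"\b(bleed\w*|hemorrhage|epistaxis)\b", re.IGNORECASE)

# Assumed, deliberately small set of negation cues. A cue counts only if it
# appears before the keyword with no intervening clause punctuation.
NEGATION = re.compile(
    r"\b(no|not|denies|denied|without|negative for)\b[^.;,]*$", re.IGNORECASE
)


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def matching_sentences(text):
    """Return sentences containing at least one non-negated keyword mention."""
    hits = []
    for sentence in split_sentences(text):
        for match in KEYWORDS.finditer(sentence):
            preceding = sentence[: match.start()]
            if not NEGATION.search(preceding):
                hits.append(sentence)
                break  # one qualifying mention is enough to keep the sentence
    return hits


def select_documents(documents):
    """Keep only documents with at least one matching sentence."""
    selected = {}
    for doc_id, text in documents.items():
        hits = matching_sentences(text)
        if hits:
            selected[doc_id] = hits
    return selected


notes = {
    "a": "Patient denies any bleeding. Vitals stable.",
    "b": "Presented with epistaxis overnight. No fever.",
    "c": "Follow-up visit. No complaints.",
}
selected = select_documents(notes)
```

In this toy corpus only note "b" would be presented for review; note "a" mentions a keyword but only in negated form, and note "c" contains no keyword at all.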

This study included a subgroup of patients from a retrospective cohort study of bleeding events in patients with lung cancer receiving therapeutic anticoagulation (AC) and chemotherapy, EGFR inhibitors, or VEGF inhibitors at Memorial Sloan Kettering Cancer Center (MSKCC) from 2012 to 2022. Patients were indexed on the first day of concurrent AC and antineoplastic therapy and were censored at MB, discontinuation of AC or antineoplastic therapy (60 days after last prescription), end of 12-month follow-up, or death.

All documentation in the EHR within the index period was manually reviewed by a physician to identify MBs and CRNMBs, as defined by the International Society on Thrombosis and Haemostasis (ISTH). All patients were then reviewed by a second physician using CEDARS to identify MBs and CRNMBs. Finally, a third physician, who was blinded to the previous event determinations and was given only the dates of bleeding events identified by both the manual and CEDARS methods, reviewed each chart to adjudicate each bleeding event and create a “gold standard” record of bleeding events. Previously validated ICD codes indicative of bleeding events (not stratified by MB or CRNMB) were also extracted from the EHR (Joos et al., 2019).

Results: The study included 293 patients with an average follow-up of 8.2 months and an average of 112 notes per patient. CEDARS flagged an average of 43 sentences containing keywords per patient. The gold standard identified 64 (22%) total bleeding events, including 33 (11%) MBs and 31 (11%) CRNMBs, and 229 (78%) patients experienced no bleeding events. ICD-10 codes identified a total of 30 bleeding events for a sensitivity of 47% (95% CI 34-60%) and specificity of 90% (95% CI 85-94%).
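As a worked illustration of how such a sensitivity estimate is obtained, the sketch below recomputes the ICD-10 figure from the counts reported above, assuming that all 30 ICD-identified events were true positives among the 64 gold-standard events. The abstract does not state which confidence interval method was used; the Wilson score interval here is an assumption chosen because it needs only the standard library, so its bounds may differ slightly from the reported 34-60%. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the Results: 30 ICD-identified events, 64 gold-standard events.
# Treating all 30 as true positives is an assumption.
true_positives = 30
gold_standard_events = 64

sensitivity = true_positives / gold_standard_events
lower, upper = wilson_interval(true_positives, gold_standard_events)

print(f"sensitivity = {sensitivity:.1%}, 95% CI ~ {lower:.1%} to {upper:.1%}")
```

The point estimate rounds to the reported 47%; the interval is approximate and depends on the method assumed.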

Manual review identified a total of 40 bleeding events for a sensitivity of 63% (95% CI 50-93%) and specificity of 96% (95% CI 93-98%). The manual review correctly identified 9 MBs for a sensitivity of 27% (95% CI 13-46%) and specificity of 100% and correctly identified 23 CRNMBs for a sensitivity of 74% (95% CI 55-88%) and specificity of 94% (95% CI 90-96%). The NLP-assisted review identified a total of 53 bleeding events for a sensitivity of 83% (95% CI 71-91%) and specificity of 97% (95% CI 93-99%). NLP review correctly identified 23 MBs for a sensitivity of 70% (95% CI 51-84%) and specificity of 99% (95% CI 96-100%) and correctly identified 20 CRNMBs for a sensitivity of 65% (95% CI 45-81%) and specificity of 95% (95% CI 91-97%).

Conclusions: Processing EHR corpora with NLP has the potential to improve the accuracy of chart extraction to identify hemorrhagic events. This NLP-assisted review was effective in identifying overall bleeding events and major bleeding events, as well as in reducing the amount of research labor required. The NLP-assisted review was also superior to ICD-10 codes alone. However, identification of clinically relevant non-major bleeding events by retrospective chart review remains challenging regardless of modality.
