An Artificial Intelligence Model for Predicting Hospital Readmission Using Electronic Health Records Data
Abstract
This study investigates the application of a machine learning model, specifically the Light Gradient Boosting Machine (LightGBM), to predict 30-day hospital readmissions using structured electronic health record (EHR) data. Hospital readmissions remain a critical challenge in healthcare systems, often indicating gaps in continuity of care and contributing to higher costs. Using demographic, clinical, and diagnostic variables from 350 anonymized patient records, the model was trained to identify individuals at high risk of readmission. Key features included age, number of previous admissions, length of stay, number of medications, chronic disease status, and gender. Data preprocessing, model training, and evaluation were conducted using Python-based libraries to support reproducibility and scalability. The model achieved an ROC AUC of 0.89, precision of 0.78, recall of 0.65, and F1 score of 0.71, indicating strong discrimination and a reasonable balance between precision and recall. A confusion matrix analysis confirmed high accuracy in both positive and negative predictions. SHapley Additive exPlanations (SHAP) values were used to enhance interpretability by quantifying the contribution of each variable to the model’s output. Feature importance analysis revealed that age and previous hospitalizations had the greatest impact on prediction, which aligns with prior clinical evidence. The study's results are consistent with existing research and highlight the potential of explainable AI models in healthcare risk prediction. These findings support the integration of machine learning into hospital decision-support systems to enable early intervention strategies, reduce preventable readmissions, and improve overall patient outcomes. Recommendations include adopting predictive models for discharge planning, prioritizing high-risk patient groups, and further validation across broader datasets.
The study emphasizes transparency, clinical relevance, and operational feasibility, demonstrating how data-driven tools can be effectively deployed in real-world clinical environments.
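As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall, since F1 is their harmonic mean. The following is a minimal sketch rather than the authors' evaluation code; the function name `f1_from_precision_recall` is a hypothetical helper introduced here for illustration, and the only inputs are the two values quoted in the abstract.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    # Guard against division by zero when both metrics are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (not recomputed from patient data).
reported_precision = 0.78
reported_recall = 0.65

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.71
```

The result agrees with the reported F1 of 0.71. It also shows that recall (0.65) is the weaker of the two metrics, so roughly one in three actual readmissions would be missed at the chosen decision threshold.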

This work is licensed under a Creative Commons Attribution 4.0 International License.