
Patient Readmission Prediction

Business Challenges:
Readmission of patients is a major concern in the US healthcare industry. The 30-day readmission rate for
Medicare beneficiaries is almost 20%, and by some estimates these occurrences cost the U.S.
healthcare system as much as $17-$25 billion annually. This not only puts financial stress on payers but also
unnecessarily stretches the resources of providers. Additionally, CMS imposes financial penalties tied to
readmission rates.
Business Benefits:
• Predicts the risk of inpatient readmission using algorithms developed on clinical variables
• Enables improved outcomes and quality metrics, and minimizes penalties
• Facilitates proactive identification of clinical factors and triggers for improved patient care
• Accurate prediction of patient readmission, and the subsequent steps taken to minimize it, are expected to
result in a substantial reduction in patient readmission costs.
Solution Overview:
• The solution uses patients’ demographic data and other hospital-measured clinical data to predict the
likelihood of their being readmitted
• It uses Advanced Analytics and Visualization techniques
• Three different algorithms were used to develop this model; the algorithm that provided the better
measurements and more correct predictions was selected.
• The solution has been developed using an open-source tool
• The data is imbalanced, so it needs to be balanced.
• Data imputation and feature selection have been performed to bring the data into shape.
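
The balancing step can be illustrated with plain random oversampling. This is only a minimal sketch: the source does not say which oversampling method or tool was used, and the function name `random_oversample`, the field name `readmitted` and the toy records are assumptions made for illustration. The toy data mirrors the 92:8 class ratio reported for this dataset.

```python
import random
from collections import Counter


def random_oversample(rows, label_of, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy records (hypothetical): 92 not readmitted, 8 readmitted.
toy = [{"id": i, "readmitted": 0} for i in range(92)]
toy += [{"id": 92 + i, "readmitted": 1} for i in range(8)]

balanced = random_oversample(toy, lambda row: row["readmitted"])
counts = Counter(row["readmitted"] for row in balanced)
print(counts[0], counts[1])
```

When a hold-out test set is used, oversampling is commonly applied to the training portion only, so that copies of the same record do not land on both sides of the split.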

Solution
• The dataset has around 101,766 records and 44 features.
• Around 30% of the available data was duplicated, so the duplicates were removed.
• The data was imbalanced, with a class ratio of 92:8.
• An oversampling technique was used to balance the data.
• Of the 44 features, those with missing data were removed.
• A few new features were derived.
• Much of the data was available in the form of IDs, so data imputation was needed.
• In total, 5 solutions were built using 3 algorithms.
• The selected solution has an error rate of 0.62%.
• Of the 5 models, 2 are interaction-term models.
• Logistic Regression (2 solutions), Decision Tree (1 solution) and Random Forest (2 solutions) algorithms
were used.
• Random forest classification on the interaction-term model was selected.
• Confusion matrix, out-of-bag (OOB) error and ROC curve techniques were used to validate and select the model.
• Varimp and Varimpplot techniques were used for feature selection.
• An 80:20 ratio was used to split the data for building and testing the models.
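
Two of the data preparation steps listed above, duplicate removal and the 80:20 split, can be sketched as follows. This is an illustrative outline and not the original implementation: the helper names `drop_duplicates` and `split_records`, the field names `age_band` and `num_visits`, and the toy records are all hypothetical.

```python
import random


def drop_duplicates(rows):
    """Keep the first occurrence of each record, comparing all fields."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def split_records(rows, train_percent=80, seed=0):
    """Shuffle the rows and split them into build (train) and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Toy records (hypothetical): 100 rows, of which 30 repeat earlier rows.
records = [{"age_band": i % 10, "num_visits": i % 7} for i in range(100)]

unique = drop_duplicates(records)
train, test = split_records(unique)
print(len(records), len(unique), len(train), len(test))
```

Removing duplicates before splitting keeps identical records from appearing in both the build and the test sets, which would otherwise make the test results look better than they are.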

Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 12717    27
         1    11 12899

   Accuracy : 0.9985
      Kappa : 0.997
Sensitivity : 0.9979
Specificity : 0.9991
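
The reported statistics can be recomputed from the four cell counts of the confusion matrix. The short check below assumes that rows are predictions and columns are reference labels, and that class 1 is treated as the positive class; the latter is inferred from the reported sensitivity and specificity values, not stated in the source.

```python
# Cell counts copied from the confusion matrix
# (rows = prediction, columns = reference).
tn, fn = 12717, 27   # predicted 0: reference 0, reference 1
fp, tp = 11, 12899   # predicted 1: reference 0, reference 1

total = tn + fn + fp + tp
accuracy = (tp + tn) / total
sensitivity = tp / (tp + fn)   # share of reference-1 cases predicted as 1
specificity = tn / (tn + fp)   # share of reference-0 cases predicted as 0

# Cohen's kappa: observed agreement corrected for chance agreement,
# where chance agreement comes from the row and column totals.
expected = ((tn + fn) * (tn + fp) + (fp + tp) * (fn + tp)) / total**2
kappa = (accuracy - expected) / (1 - expected)

print(f"Accuracy    : {accuracy:.4f}")
print(f"Kappa       : {kappa:.3f}")
print(f"Sensitivity : {sensitivity:.4f}")
print(f"Specificity : {specificity:.4f}")
```

The test set in this matrix is close to evenly split between the two classes (12,728 versus 12,926 reference cases), so chance agreement is about 0.5 and kappa lands close to the accuracy figure.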
