Feature selection and personalized modeling on medical adverse outcome prediction
Abstract
This thesis addresses medical adverse outcome prediction and comprises three parts: feature selection, time-to-event prediction, and personalized modeling. For feature selection, we propose a three-stage method that combines filter, embedded, and wrapper selection techniques. The stages are combined so that the selected feature set is both stable and predictive, while the computational burden is reduced. Datasets for two adverse outcome prediction problems, 30-day hip fracture readmission and diabetic retinopathy prognosis, are derived from electronic health records and used to demonstrate the effectiveness of the proposed method. With the selected features, we investigate the application of classical survival analysis models, namely accelerated failure time models, Cox proportional hazards regression models, and mixture cure models, to adverse outcome prediction. Unlike binary classifiers, survival analysis methods use both the event status and the time-to-event information, and they provide more flexibility when the occurrence of an adverse outcome in different time windows is of interest. Lastly, we introduce personalized modeling (PM), which predicts the adverse outcome of each query patient from the patients most similar to that patient. Unlike the commonly used global modeling approach, PM builds the prediction model on a smaller but more similar patient cohort, leading to a more individualized prediction and a customized risk factor profile. Both static and metric-learning distance measures are used to identify the similar patient cohort. We show that PM combined with feature selection, using only similar patients, achieves better prediction performance than a one-size-fits-all model trained on data from all available patients.
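The personalized modeling idea can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`k_most_similar`, `personalized_risk`), the toy data, the choice of Euclidean distance as the static measure, and the use of a simple neighbor average in place of a fitted local model are all assumptions made for illustration.

```python
import math

def k_most_similar(query, patients, k):
    """Return indices of the k patients closest to `query`.

    Euclidean distance is an assumed stand-in for a static distance measure.
    """
    dists = sorted((math.dist(query, p), i) for i, p in enumerate(patients))
    return [i for _, i in dists[:k]]

def personalized_risk(query, patients, outcomes, k):
    """Estimate risk for `query` from its k most similar patients only.

    A neighbor average is used here as a placeholder for the local model.
    """
    cohort = k_most_similar(query, patients, k)
    return sum(outcomes[i] for i in cohort) / len(cohort)

# Hypothetical toy data: two feature values per patient, binary outcome.
patients = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),   # one group of similar patients
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),   # a second, distant group
]
outcomes = [0, 0, 1, 1, 1, 1]

query = (5.0, 5.1)
local_risk = personalized_risk(query, patients, outcomes, k=3)
global_risk = sum(outcomes) / len(outcomes)  # one-size-fits-all base rate
print(local_risk, global_risk)
```

The local estimate depends only on the query patient's nearest cohort, whereas the global estimate is the same for every patient. A metric-learning variant would replace `math.dist` with a learned distance function.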
Collections
- OSU Dissertations [11222]