Thesis (Ph.D.)--University of Rochester. School of Medicine & Dentistry. Dept. of Biostatistics and Computational Biology, 2014.
Longitudinal data are sometimes collected with a large number of potential explanatory
variables. To obtain better statistical inference and more accurate prediction,
model selection has become an important procedure in longitudinal
studies. Nevertheless, inference based on a single selected model may ignore the uncertainty
introduced by the selection procedure and therefore underestimate the variability.
As an alternative, the model averaging approach combines estimates from different
candidate models through a weighted mean to reduce the effect of selection
instability. There is a large literature on model selection and averaging
for cross-sectional data, but more work is needed for longitudinal data.
This thesis focuses on model selection and model averaging procedures in the longitudinal
data context. We propose an AIC-type model selection criterion (AIC) incorporating
the generalized estimating equations approach. Specifically, we consider
the difference between the quasi-likelihood of a candidate model and that of a narrow model,
plus a penalty term, in order to avoid the complicated integration that the
quasi-likelihood itself requires. This criterion inherits the theoretical asymptotic
properties of the classical AIC.
In the second part, we develop a focused information criterion (QFIC) and a frequentist
model averaging (QFMA) procedure based on a quasi-score function incorporating
the generalized estimating equations approach. We establish the asymptotic
properties of these methods and conduct extensive simulation studies to examine
the numerical performance of the proposed methods.
The third part aims to apply the focused information criterion to personalized medicine.
Based on individual-level information from clinical observations, demographics,
and genetics, this criterion provides a personalized predictive model to support prognosis
and diagnosis for an individual subject. Accounting for heterogeneity among
individuals helps to reduce prediction uncertainty and improve prediction accuracy.
Several real case studies from biomedical research are presented as illustrations.