Randomized Quantile Residual for Assessing Generalized Linear Mixed Models with Application to Zero-Inflated Microbiome Data

Date

2018-07-27

Type

Thesis

Degree Level

Masters

Abstract

In microbiome research, it is often of interest to investigate the impact of clinical and environmental factors on microbial abundance, which is often quantified as the total number of unique operational taxonomic units (OTUs). OTU count data typically contain a large number of zeros, and the positive counts are skewed. A common strategy for handling excess zeros is to use zero-inflated models or zero-modified (hurdle) models. Moreover, subjects in microbiome data often have a clustering structure, for example humans from the same family or plants from the same plot; random effects should therefore be included to account for the clustering. Model diagnosis is an essential step in ensuring that a fitted model is adequate for the data. However, diagnosing zero-inflated count models remains a challenging research problem. Pearson and deviance residuals are often used in practice for diagnosing count models, despite wide recognition that these residuals are far from normally distributed when applied to count data. The randomized quantile residual (RQR) was proposed in the literature to circumvent these problems with traditional residuals. The key idea of the RQR is to randomize the lower tail probability into a uniform random number within the discontinuity gap of the cumulative distribution function (CDF); the randomized probability is then transformed by the standard normal quantile function. It can be shown that RQRs are normally distributed under the true model. To the best of our knowledge, RQRs have not been applied to diagnose zero-inflated or zero-modified mixed effects models. In this thesis project, we have developed generic R functions that compute RQRs for zero-inflated and zero-modified mixed effects models from the fitted output of glmmTMB. We have tested our functions using datasets generated from zero-modified Poisson (ZMP) and zero-modified negative binomial (ZMNB) models. Our simulation studies show that RQRs are normally distributed under the true model. In goodness-of-fit (GOF) tests, the type I error rates are close to the nominal level of 0.05, and the power to reject incorrect models is high. We have also applied RQRs to assess eight models for a real human microbiome OTU dataset and concluded that the ZMNB and zero-inflated negative binomial (ZINB) models provide adequate fits to the dataset.
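
The randomization step described above can be illustrated with a minimal sketch. It is not the thesis's R code and does not use glmmTMB: it is written in Python with the standard library only, it assumes a zero-modified Poisson model with known parameters and no random effects, and all function names and parameter values (lam, p0, the sample size) are hypothetical choices made for illustration. For an observed count y, the residual is the standard normal quantile of a uniform draw between F(y - 1) and F(y), where F is the model CDF.

```python
import math
import random
from statistics import NormalDist


def poisson_pmf(k, lam):
    # Poisson probability mass, computed on the log scale for stability.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(k, lam):
    if k < 0:
        return 0.0
    return min(1.0, sum(poisson_pmf(j, lam) for j in range(k + 1)))


def zmp_cdf(k, lam, p0):
    # CDF of a zero-modified (hurdle) Poisson: P(Y = 0) = p0, and the
    # positive counts follow a zero-truncated Poisson(lam).
    if k < 0:
        return 0.0
    if k == 0:
        return p0
    zero_mass = poisson_pmf(0, lam)
    truncated = (poisson_cdf(k, lam) - zero_mass) / (1.0 - zero_mass)
    return p0 + (1.0 - p0) * truncated


def rqr(y, lam, p0, rng):
    # Randomize uniformly within the CDF gap [F(y - 1), F(y)], then map
    # the result to the normal scale.
    lower = zmp_cdf(y - 1, lam, p0)
    upper = zmp_cdf(y, lam, p0)
    u = lower + rng.random() * (upper - lower)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep the quantile finite
    return NormalDist().inv_cdf(u)


def sample_poisson(lam, rng):
    # Knuth's multiplication method; adequate for small lam.
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_zmp(lam, p0, rng):
    # Zero with probability p0, otherwise a zero-truncated Poisson draw.
    if rng.random() < p0:
        return 0
    while True:
        y = sample_poisson(lam, rng)
        if y > 0:
            return y


rng = random.Random(2018)
lam, p0 = 3.0, 0.4  # illustrative values, not taken from the thesis
counts = [sample_zmp(lam, p0, rng) for _ in range(2000)]
residuals = [rqr(y, lam, p0, rng) for y in counts]
mean = sum(residuals) / len(residuals)
sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (len(residuals) - 1))
print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

When the residuals are computed under the same model that generated the counts, as here, their mean and standard deviation should be close to 0 and 1. Computing them under a misspecified model, for instance an ordinary Poisson CDF that ignores the zero modification, would be expected to produce visible departures from normality.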

Keywords

Operational taxonomic units (OTUs), randomized quantile residual (RQR), cumulative distribution function (CDF), zero-modified Poisson (ZMP), zero-modified negative binomial (ZMNB), zero-inflated negative binomial (ZINB).

Degree

Master of Science (M.Sc.)

Department

Mathematics and Statistics

Program

Mathematics
