Model Choice Considerations for Fitting Concentration-response Curves
Abstract
Many bioassays use inverse prediction, sometimes referred to as calibration, to estimate an unknown concentration from a machine response using a training set. This thesis evaluates the effect that model selection has on the inverse prediction, specifically with regard to the four- and five-parameter log-logistic models. The bias-variance trade-off is known to be at the center of this decision. We conducted two simulation studies and one cross-validation study to evaluate the effects of asymmetry, denoted by f, and of noise, denoted by σ, on the models' accuracy. Our criterion measures were a global MSE measure, referred to as S¹, and an absolute relative bias measure, referred to as S². Our initial simulations, which directly compared the two fitted curves with the true curve, found that the five-parameter model performed better on both scores (S¹ and S²) for asymmetric data, as anticipated. Our second set of simulations, which evaluated the fitted curves' accuracy in predicting future simulated samples, found an unexpected trend in the S² scores: the four-parameter model outperformed the five-parameter model with regard to absolute relative bias in a majority of the positively asymmetric (f > 1) cases. Our cross-validation study, using actual lab data, was conducted to investigate this finding further. It confirmed that at high estimated levels of asymmetry (f > 3.5), the four-parameter model provided the better prediction in a majority of cases. From these last two studies, we could conclude that the four-parameter model provides a better fit for future predictions than the five-parameter model at almost all levels of asymmetry (excluding f < 0.8). There are some possible explanations for these findings. In our relative bias measurements, we had to omit certain points because of extrapolation into unrealistic concentrations. We tried various exclusion criteria, but the pattern remained largely the same. Another possible explanation is that, at higher levels of asymmetry, the fitting algorithm passes the asymmetry on to the other parameters, allowing the four-parameter model to undercut the intended role of the f parameter at those levels. In the end, while the four-parameter model appears to perform better across most of the positively asymmetric region, further study of this phenomenon is needed to clarify exactly what leads to this result.
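As a rough illustration of the inverse prediction discussed above, the following Python sketch evaluates a five-parameter log-logistic curve and inverts it to recover a concentration from a response. The abstract does not state the parameterization used in the thesis, so the form below, y = d + (a - d) / (1 + (x/c)^b)^f, is an assumption (a commonly used one), and the names `a`, `b`, `c`, `d`, the function names, and the numeric values are all hypothetical. Under this form, setting f = 1 gives the four-parameter model. No curve fitting is performed here; the parameters are simply taken as given.

```python
import math


def five_pl(x, a, b, c, d, f):
    """Response at concentration x > 0 under an assumed 5PL form.

    a: response as x -> 0, d: response as x -> infinity,
    b: slope, c: scale (concentration units), f: asymmetry.
    Setting f = 1 reduces this to the four-parameter model.
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** f


def inverse_five_pl(y, a, b, c, d, f):
    """Inverse prediction: the concentration whose response is y.

    Raises ValueError when y is not strictly between the two
    asymptotes, since no finite positive concentration maps to it.
    """
    if y == d:
        raise ValueError("response equals the upper-concentration asymptote")
    ratio = (a - d) / (y - d)
    if ratio <= 1.0:
        raise ValueError("response lies outside the range of the curve")
    return c * (ratio ** (1.0 / f) - 1.0) ** (1.0 / b)


# Hypothetical parameter values, chosen only for illustration.
params = dict(a=0.05, b=1.2, c=10.0, d=2.0, f=2.0)

x_true = 5.0
y_obs = five_pl(x_true, **params)

# Inverting with the same asymmetry recovers the concentration.
x_hat_5pl = inverse_five_pl(y_obs, **params)

# Inverting with f forced to 1 (a symmetric curve with otherwise
# identical parameters) yields a different concentration estimate.
x_hat_sym = inverse_five_pl(y_obs, **{**params, "f": 1.0})

# Absolute relative bias of each estimate with respect to x_true.
arb_5pl = abs(x_hat_5pl - x_true) / x_true
arb_sym = abs(x_hat_sym - x_true) / x_true
```

The symmetric inversion here reuses the asymmetric curve's other parameters rather than refitting them, so it overstates the penalty a refitted four-parameter model would incur. In a real fit the remaining parameters can shift to absorb part of the asymmetry, which is one of the explanations the abstract raises.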