
Identifying Outlying and Influential Clusters in Multivariate Survival Data Models


Author(s) : Tsirizani M. Kaombe, Samuel O. M. Manda
Emerging Topics in Statistics and Biostatistics

Abstract


In regression analysis, diagnostic statistics serve to assess how well the model fits the data and to identify observations that the model does not represent well. Such observations may be outliers, in the sense that they deviate from the pattern of the other data points being modelled, or influential observations that, if removed from the dataset, could change the slope of the fitted regression model. Both types of unusual data points can cause serious problems in regression analysis. Statistics for identifying outlying and influential observations have been studied extensively in linear and linear mixed models and are available in most statistical packages. However, little work has been done on similar methods for the analysis of multivariate survival data. In this chapter, we use martingale-based residuals to derive outlier and influence statistics for multivariate survival data models. We evaluate the performance of the proposed statistics using simulation studies. When the proposed statistics were applied to child survival data from Malawi, in which children were studied in 56 subdistricts, the outlier statistic detected five subdistricts as outliers with respect to under-five mortality, while the influence statistic identified six subdistricts as influencing the estimated effect of being female on child survival, depending on the covariates used in the modelling process.
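For readers unfamiliar with martingale residuals, the following is a minimal sketch of the general idea, not the chapter's statistics. It assumes the simplest possible setting: no covariates and no frailty term, so the cumulative hazard is the Nelson-Aalen estimate, and each subject's residual is the event indicator minus the estimated cumulative hazard at that subject's observed time. Residuals are then summed within clusters. The function names, the toy data and the cluster labels are all hypothetical.

```python
from collections import defaultdict


def martingale_residuals(times, events):
    """Null-model martingale residuals: event indicator minus the
    Nelson-Aalen cumulative hazard at each subject's observed time."""
    # Nelson-Aalen increment (deaths / number at risk) at each distinct event time
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    increments = []
    for s in event_times:
        at_risk = sum(1 for t in times if t >= s)
        deaths = sum(1 for t, d in zip(times, events) if t == s and d == 1)
        increments.append((s, deaths / at_risk))

    residuals = []
    for t, d in zip(times, events):
        # Cumulative hazard accumulated up to and including time t
        cum_hazard = sum(h for s, h in increments if s <= t)
        residuals.append(d - cum_hazard)
    return residuals


def cluster_sums(residuals, clusters):
    """Sum the subject-level residuals within each cluster."""
    totals = defaultdict(float)
    for r, c in zip(residuals, clusters):
        totals[c] += r
    return dict(totals)


# Hypothetical toy data: observed time, event indicator (1 = death, 0 = censored),
# and a cluster label for each subject.
times = [2.0, 5.0, 3.0, 8.0, 1.0, 4.0]
events = [1, 0, 1, 1, 1, 0]
clusters = ["A", "A", "B", "B", "C", "C"]

residuals = martingale_residuals(times, events)
sums = cluster_sums(residuals, clusters)
for label in sorted(sums):
    print(label, round(sums[label], 4))
```

In this null-model setting the residuals sum to zero over the whole sample, so a cluster whose total lies far from zero has more (positive) or fewer (negative) events than the pooled hazard predicts. A statistic for fitted multivariate survival models would also have to account for covariates and within-cluster dependence, which this sketch does not attempt.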


Original language : English
Pages (from-to) : 377-410
Publication status : Published - 2022