Little's test of MCAR models the data as multivariate normal with mean \(\mu\) and covariance matrix \(\Sigma\). The test statistic is the sum, over missing data patterns, of the squared standardized differences between the sub-sample means and the expected population means, weighted by the variance-covariance matrix and the number of observations in each pattern. Under the null hypothesis that the data are MCAR, the test statistic follows a chi-square distribution with \(\sum k_j - k\) degrees of freedom, where \(k_j\) is the number of observed variables in missing data pattern \(j\) and \(k\) is the total number of variables. A statistically significant result provides evidence against MCAR.
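In symbols, the statistic described above can be sketched as follows. The notation is an assumption made for illustration and is not taken from the function's source: \(J\) is the number of missing data patterns, \(n_j\) the number of observations in pattern \(j\), \(\bar{y}_j\) the sample mean of the \(k_j\) variables observed in pattern \(j\), and \(\hat{\mu}_j\), \(\hat{\Sigma}_j\) the estimated mean vector and covariance matrix restricted to those same variables.

\[
d^2 = \sum_{j=1}^{J} n_j \, (\bar{y}_j - \hat{\mu}_j)^{\top} \, \hat{\Sigma}_j^{-1} \, (\bar{y}_j - \hat{\mu}_j)
\]

As a check on the degrees of freedom, take \(k = 3\) variables and seven patterns: one with all three variables observed, three with two observed, and three with one observed. Then \(\sum k_j = 3 + 3 \cdot 2 + 3 \cdot 1 = 12\), so the reference distribution has \(12 - 3 = 9\) degrees of freedom, which agrees with the value printed in the Examples section, assuming those are the seven patterns present there.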
When the normality assumption is not satisfied, the test still works for quantitative random variables but not for categorical ones. The test has further limitations. First, it does not indicate which variable(s) are not MCAR. Second, it does not detect collinearity among variables. Third, it can neither prove the MCAR assumption nor rule out the hypothesis of MNAR.
Value
a numeric matrix
- statistic
Chi-squared statistic for Little's test
- df
Degrees of freedom of the chi-square distribution used to compute the p-value
- p_value
P-value for the chi-square statistic
- pattern
Number of unique missing data patterns found
References
Little, R. J. A. (1988). A test of Missing Completely at Random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202. https://doi.org/10.2307/2290157
Code is adapted from Eric Stemmler (https://web.archive.org/web/20201120030409/https://stats-bayes.com/post/2020/08/14/r-function-for-little-s-test-for-data-missing-completely-at-random/) and naniar's mcar_test.
Examples
set.seed(123)
data <- data.frame(x1 = stats::rnorm(100), x2 = stats::rnorm(100), y = stats::rnorm(100))
data$x1[sample(1:100, 20)] <- NA
data$x2[sample(1:100, 15)] <- NA
data$y[sample(1:100, 10)] <- NA
check.mcar(data, digits = 3)
#> Little's test of MCAR
#>   statistic df p.value missing.patterns
#> 1     5.484  9    0.79                7