The Missing Link: An Equivalence Result For Likelihood-Based Methods In Missing Data Problems

Multiple imputation and maximum likelihood estimation (via the expectation-maximisation algorithm) are two well-known methods widely used for analysing data with missing values. The two methods are often regarded as distinct because of differences in how their estimators are constructed and in their theoretical properties. We show that there is a close relationship between them. Specifically, we show that a type of multiple imputation can be understood as a stochastic expectation-maximisation approximation to maximum likelihood. As a result, a range of likelihood-based tools can be applied in the multiple imputation context to improve its performance. In particular, we develop information criteria for selecting an imputation model from a set of competing models, and a flexible likelihood ratio test for models fitted by multiple imputation. We demonstrate our methods on real and simulated datasets.