Fast And Approximate Exhaustive Variable Selection For GLMs With APES

Obtaining maximum likelihood estimates for generalised linear models (GLMs) is computationally intensive, and this cost remains the major obstacle to performing all-subsets variable selection. Exhaustive exploration of the model space, even for a moderately large number of covariates, is a formidable challenge for modern computing capabilities. On the other hand, efficient algorithms for exhaustive searches do exist for linear models, most notably the leaps and bounds algorithm and, more recently, the mixed integer optimisation algorithm. In this talk, we present APES (APproximated Exhaustive Search), a new method that approximates all-subsets selection for a given GLM by reformulating the problem as a linear model. The method works by learning from the observational weights in a correct/saturated generalised linear regression model. APES can be used in partnership with any state-of-the-art linear model selection algorithm, thus enabling (approximate) exhaustive model exploration in dimensions much higher than previously feasible. We will demonstrate that APES model selection is competitive with genuine exhaustive search via simulation studies and applications to health data. Extensions to a robust setting are also possible.
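To illustrate the general idea of reducing GLM subset selection to a weighted linear-model search, the sketch below uses the standard IRLS working response and working weights from a full logistic regression fit, then runs a brute-force weighted least squares search over all covariate subsets scored by a BIC-style criterion. This is a minimal illustration of the reformulation strategy, not the APES algorithm itself: the simulated data, the IRLS helper, and the surrogate scoring rule are all assumptions made for the example.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: only the first 2 of 6 covariates are active.
n, p = 500, 6
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, -1.0, 0.0, 0.0, 0.0, 0.0])
prob = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
y = rng.binomial(1, prob)

def fit_logistic_irls(X, y, n_iter=25):
    """Fit a logistic GLM by iteratively reweighted least squares (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)            # GLM working weights
        z = eta + (y - mu) / w         # working (pseudo-)response
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (w * z))
    return beta

# Step 1: fit the full GLM once and form the linear-model surrogate
# from its working response and working weights.
beta_full = fit_logistic_irls(X, y)
eta = X @ beta_full
mu = 1.0 / (1.0 + np.exp(-eta))
w = mu * (1.0 - mu)
z = eta + (y - mu) / w

# Step 2: exhaustive weighted least squares over all non-empty subsets,
# scored by a BIC-style criterion on the surrogate linear model.
sw = np.sqrt(w)
Xw, zw = X * sw[:, None], z * sw
best_score, best_subset = np.inf, None
for k in range(1, p + 1):
    for subset in itertools.combinations(range(p), k):
        cols = list(subset)
        coef, *_ = np.linalg.lstsq(Xw[:, cols], zw, rcond=None)
        rss = np.sum((zw - Xw[:, cols] @ coef) ** 2)
        score = n * np.log(rss / n) + k * np.log(n)
        if score < best_score:
            best_score, best_subset = score, cols

print("selected covariates:", best_subset)
```

In practice, the brute-force loop in step 2 would be replaced by an efficient linear-model selection algorithm such as leaps and bounds or mixed integer optimisation, which is what makes the reformulation pay off in higher dimensions.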