Forecasting Hotspots Of Potentially Preventable Hospitalisations With Spatially Aggregated Longitudinal Health Data: All Subset Model Selection With A Novel Implementation Of Repeated k-Fold Cross-Validation

It is sometimes difficult to target individuals for health intervention because information on their behaviour and risk factors is limited. In such cases, place-based interventions targeting geographical ‘hotspots’ with higher-than-average rates of health service utilisation may be effective. Many studies examine predictors of hotspots, but they often overlook two facts: place-based interventions are typically costly and take time to develop and implement, and hotspots often regress to the mean in the short term. Long-term geographical forecasting of hotspots using validated statistical models is therefore essential for effectively prioritising place-based health interventions.

Existing methods for forecasting hotspots tend to prioritise positive predictive value (the proportion of predicted hotspots that turn out to be hotspots) at the expense of sensitivity (the proportion of actual hotspots that are predicted). This work introduces methods for developing models that optimise positive predictive value and sensitivity concurrently. These methods use spatially aggregated administrative health data, WA census population data, and ABS geographic boundaries, and combine all-subset model selection with a novel implementation of repeated k-fold cross-validation for longitudinal data. Results are presented from models forecasting 3-year hotspots for four potentially preventable hospitalisations: type II diabetes mellitus, heart failure, high-risk foot, and chronic obstructive pulmonary disease (COPD).
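
The building blocks named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, and the plain random shuffling of areas into folds are assumptions made for illustration, and the novel longitudinal variant of repeated cross-validation is not reproduced here. The sketch shows how positive predictive value and sensitivity are computed for a set of predicted hotspots, how every non-empty subset of candidate predictors can be enumerated, and how generic repeated k-fold splits can be generated.

```python
import itertools
import random


def ppv_and_sensitivity(observed, predicted):
    """Return (PPV, sensitivity) for two equal-length lists of booleans.

    observed[i]  -- True if area i was an actual hotspot
    predicted[i] -- True if area i was forecast to be a hotspot
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)        # correctly forecast hotspots
    fp = sum(1 for o, p in pairs if not o and p)    # forecast, but not a hotspot
    fn = sum(1 for o, p in pairs if o and not p)    # hotspot that was missed
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return ppv, sensitivity


def all_subsets(predictors):
    """Yield every non-empty subset of the candidate predictors as a tuple."""
    for size in range(1, len(predictors) + 1):
        yield from itertools.combinations(predictors, size)


def repeated_k_fold(n_areas, k, repeats, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for generic repeated k-fold CV.

    Areas are reshuffled once per repeat, so each repeat gives a different
    partition into k folds. Assumption: simple random assignment of areas.
    """
    rng = random.Random(seed)
    for repeat in range(repeats):
        order = list(range(n_areas))
        rng.shuffle(order)
        for fold in range(k):
            test_idx = sorted(order[fold::k])
            held_out = set(test_idx)
            train_idx = [i for i in range(n_areas) if i not in held_out]
            yield repeat, fold, train_idx, test_idx


# Toy data (hypothetical): five areas, observed versus forecast hotspot status.
observed = [True, True, False, False, True]
predicted = [True, False, True, False, True]
ppv, sensitivity = ppv_and_sensitivity(observed, predicted)

# Hypothetical predictor names, used only to show the enumeration.
candidate_models = list(all_subsets(["age", "income", "remoteness"]))
splits = list(repeated_k_fold(n_areas=10, k=5, repeats=3))
```

In a full workflow, each candidate subset would be fitted on the training areas of every split and scored on the held-out areas, with the two metrics averaged over folds and repeats. How the two metrics are combined into a single selection criterion is not specified in this abstract, so that step is left out. Note also that the number of subsets doubles with each added predictor, which limits all-subset selection to modest numbers of candidates.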