An Approach To Poisson Mixed Models For -omics Expression Data

We are interested in regression models for multivariate data from high-throughput biological assays (‘omics’ data). Such data show correlations between variables and often come from structured experiments, so a generalised linear mixed model is appropriate for fitting both the experimental variables and different types of omics data. However, the number of variables is often larger than the number of observations, so a structured covariance model is necessary, and inducing sparsity is biologically appropriate. In this presentation we describe an approach to Poisson mixed models, suitable for RNA-seq gene expression data, based on transcript-specific random effects with a sparse precision matrix. We show by simulation that the optimal sparseness penalty for regression modelling is not the same as in the usual graph estimation problem, and we compare several estimation strategies.
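
To make the model class concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a log link, a single binary experimental factor, and a tridiagonal precision matrix as a simple stand-in for a sparse dependence structure between transcripts. The function name `simulate_counts` and every parameter value are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_counts(n_samples=20, n_transcripts=50, rho=0.4, seed=0):
    """Simulate Poisson counts with transcript-specific random effects.

    Illustrative assumptions (not taken from the abstract): log link,
    one binary covariate, tridiagonal precision matrix.
    """
    rng = np.random.default_rng(seed)

    # Sparse precision matrix Q: each transcript is conditionally dependent
    # only on its two neighbours. Q is positive definite when |rho| < 0.5.
    Q = np.eye(n_transcripts)
    idx = np.arange(n_transcripts - 1)
    Q[idx, idx + 1] = -rho
    Q[idx + 1, idx] = -rho

    # The implied covariance matrix is dense even though Q is sparse.
    Sigma = np.linalg.inv(Q)

    # Binary experimental factor, alternating between samples.
    x = (np.arange(n_samples) % 2).astype(float)

    # Transcript-specific fixed effects: intercepts and treatment effects.
    beta0 = rng.normal(2.0, 0.5, size=n_transcripts)
    beta1 = rng.normal(0.0, 0.3, size=n_transcripts)

    # One vector of correlated random effects per sample, b_i ~ N(0, Q^{-1}).
    b = rng.multivariate_normal(np.zeros(n_transcripts), Sigma, size=n_samples)

    # Linear predictor on the log scale, then Poisson counts.
    eta = beta0 + np.outer(x, beta1) + b
    y = rng.poisson(np.exp(eta))
    return y, Q, x


if __name__ == "__main__":
    y, Q, x = simulate_counts()
    print(y.shape, np.count_nonzero(Q))
```

In this sketch the sparsity lives in the precision matrix rather than the covariance matrix, which is why a penalty on the entries of the precision matrix is the natural place to induce sparsity when there are more transcripts than samples.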