Seminar: Bayesian Variable Selection for Spatial Linear Mixed Models in Ultra-high Dimensional Settings

Jul 1, 2021 - 9:00 AM

Abstract: In agricultural field trials, observations are naturally spatially autocorrelated. Variable selection for such data is typically performed in two stages: first obtaining spatially adjusted effects, then applying a feature selection method developed for independent observations. To improve variable selection accuracy and prediction, we develop a Bayesian variable selection method, called SP-SVEN, based on a hierarchical Gaussian linear mixed model in which spatial variability and feature effects are taken into account jointly. The well-known spike-and-slab priors are placed on the regression coefficients to achieve sparsity, and a first-order Gaussian intrinsic auto-regression prior is assigned to the random spatial effects. Gaussian conjugacy yields an explicit form of the posterior distribution conditional on the two spatial parameters, and numerical integration is used to integrate out these two parameters to obtain the posterior probability of a given model. The priors on the spatial parameters of the intrinsic auto-regression can be determined in different ways. When some of the varieties are not sequenced, we propose using data from the unsequenced varieties to build a hierarchical mixture model and taking the resulting posterior distribution of the spatial parameters as their prior in the variable selection model. Embedded model-based screening and fast Cholesky updates allow us to develop a highly scalable computational algorithm that rapidly discovers the models with high posterior probabilities. We also develop methods for model weight-adjusted point and interval predictions. We investigate the performance of SP-SVEN through several simulation studies and two real data examples from genome-wide association studies in field trial experiments.
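
For readers unfamiliar with intrinsic auto-regression priors, the following is a minimal sketch of one common first-order form on a regular plot grid, where each plot is tied to its immediate row and column neighbors. It is an illustration only and is not taken from the talk: the function name `iar_precision`, the single shared neighbor weight, and the 3 x 4 grid are assumptions, and the parameterization used in SP-SVEN, with its two spatial parameters, may differ.

```python
import numpy as np

def iar_precision(n_rows, n_cols):
    """Structure matrix of a first-order intrinsic auto-regression on a grid.

    Hypothetical helper for illustration: returns Q = D - W, where W is the
    adjacency matrix of horizontally and vertically adjacent plots and D is
    the diagonal matrix of neighbor counts.
    """
    n = n_rows * n_cols
    Q = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Visit each neighbor pair once: the plot to the right and the plot below.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    j = rr * n_cols + cc
                    Q[i, i] += 1
                    Q[j, j] += 1
                    Q[i, j] -= 1
                    Q[j, i] -= 1
    return Q

# Assumed 3 x 4 field layout, purely for demonstration.
Q = iar_precision(3, 4)
row_sums = Q.sum(axis=1)
rank = np.linalg.matrix_rank(Q)
```

Every row of `Q` sums to zero, so the matrix is singular and the prior is improper: it constrains differences between neighboring plots but not the overall level, which is what "intrinsic" refers to.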