A Stacked Ensemble Learning Framework to Improve Predictive Performance in Genomic Selection for Quantitative Traits

Vol. 6, 2025 - 334597
Expanded abstract
Abstract

Genomic Selection (GS) uses genome-wide markers to estimate Genomic Estimated Breeding Values (GEBVs), improving germplasm selection for traits that are costly to measure or expressed late. Traditional models such as GBLUP assume an additive polygenic architecture, yet the genetic architecture of many traits also includes non-additive effects. Because no single model is universally optimal across diverse genetic architectures, stacking ensemble learning, which trains a meta-model on the predictions of diverse base models, is a promising way to capture this complexity. This study evaluated several stacking configurations for genomic prediction across 8 simulated traits covering additive, dominance, and epistatic architectures. Under a 5-fold cross-validation scheme, the stacking ensemble showed higher predictive ability than the individual models in all scenarios. Gains were large, reaching 83% over the best individual model (BayesA with dominance) in complex architectures (100 QTLs, h²=0.3) and 27.59% in oligogenic scenarios with epistasis (10 QTLs, h²=0.6). The success of the stacking strategy can be attributed to careful base learner selection and to the use of robust meta-learners (such as penalized regression) that deal effectively with multicollinearity among the base-model predictions.
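
The stacking idea described above can be illustrated with a minimal Python sketch using scikit-learn. It is not the authors' pipeline: the simulated genotypes, the choice of base learners (ridge regression on markers as a rough stand-in for GBLUP-style shrinkage, a random forest, and an RBF support vector regressor), all hyperparameters, and the definition of relative gain are assumptions made only for illustration. The meta-learner is a ridge regression, one example of the penalized regression mentioned in the abstract, fitted on out-of-fold predictions of the base models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Hypothetical simulated data: 300 individuals, 200 biallelic markers coded 0/1/2.
n_ind, n_markers, n_qtl, h2 = 300, 200, 10, 0.6
X = rng.integers(0, 3, size=(n_ind, n_markers)).astype(float)

# Additive effects at 10 QTLs plus one pairwise interaction (a toy form of epistasis).
qtl = rng.choice(n_markers, size=n_qtl, replace=False)
g = X[:, qtl] @ rng.normal(size=n_qtl) + 0.5 * X[:, qtl[0]] * X[:, qtl[1]]

# Scale the residual variance so that the genetic share of variance is about h2.
e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n_ind)
y = g + e

# Assumed base learners; the study's own set (e.g. GBLUP, BayesA) is not reproduced here.
base_learners = [
    ("ridge", Ridge(alpha=100.0)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("svr", SVR(kernel="rbf", C=10.0)),
]

# The meta-learner sees only out-of-fold base predictions (inner 5-fold CV),
# and its ridge penalty tempers the collinearity among those predictions.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=RidgeCV(alphas=np.logspace(-3, 3, 13)),
    cv=5,
)

outer = KFold(n_splits=5, shuffle=True, random_state=0)


def predictive_ability(model):
    """Pearson correlation between cross-validated predictions and phenotypes."""
    pred = cross_val_predict(model, X, y, cv=outer)
    return float(np.corrcoef(pred, y)[0, 1])


results = {name: predictive_ability(model) for name, model in base_learners}
results["stack"] = predictive_ability(stack)

# Assumed definition of relative gain: percent change over the best single model.
best_single = max(v for k, v in results.items() if k != "stack")
gain_pct = 100.0 * (results["stack"] - best_single) / best_single

for name, r in results.items():
    print(f"{name:>6}: r = {r:.3f}")
print(f"gain over best single model: {gain_pct:.1f}%")
```

On toy data like this the ensemble will not necessarily outperform every base learner; the sketch only shows the mechanics of nesting the stacking step inside an outer 5-fold cross-validation so that the reported predictive ability is not inflated by leakage.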


Institutions
  • 1 Universidade Federal de Viçosa
  • 2 Universidade Federal do Piauí
Track
  • 2. Biometrics, statistics, and quantitative genetics
Keywords
  • Dominance
  • Epistasis
  • GBLUP
  • Machine Learning