
Make Sense of Science: A comparison of classical and machine learning-based phenotype prediction methods

A comparison of classical and machine learning-based phenotype prediction methods on simulated data and three plant species

Maura John, Florian Haselbeck, Rupashree Dass, Christoph Malisi, Patrizia Ricca, Christian Dreischer, Sebastian J. Schultheiss, Dominik G. Grimm

https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2022.932512/full

John, M., Haselbeck, F., Dass, R., Malisi, C., Ricca, P., Dreischer, C., Schultheiss, S. J., & Grimm, D. G. (2022). A comparison of classical and machine learning-based phenotype prediction methods on simulated data and three plant species. Frontiers in Plant Science, 13, 932512.

Welcome back to Make Sense of Science, where we break down complex research into easier-to-understand insights. Today we are discussing a study comparing classical and machine learning-based phenotype prediction methods.

 

Challenge and Solution

In plant breeding, selecting the best-performing plants is essential for developing new varieties that grow better and faster. Today, breeders can make these decisions by predicting a plant’s performance from its DNA using genomic selection. The key question is which prediction method works best. This study compared twelve phenotype prediction methods, from traditional statistics to modern artificial intelligence, using both computer-simulated and real-world plant data. The main result is clear: simpler models such as Bayes B and Elastic Net often matched, or even outperformed, advanced deep learning models, especially on the smaller or typical datasets seen in breeding programs.


Which Methods Were Used?

The researchers evaluated three families of prediction models:

  • Classical models: RR-BLUP, Bayes A/B/C
  • Machine Learning models: LASSO, Elastic Net, Support Vector Regression (SVR), Random Forest (RF), and XGBoost (XGB)
  • Deep Learning models: Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), and Local Convolutional Neural Network (LCNN)

Classical models are widely used in genomic selection because they are simple, interpretable, and reliable with modest sample sizes. Machine learning methods can capture more complex relationships between markers and traits while retaining some interpretability. Deep learning methods represent modern AI designed to learn complex patterns automatically, but they typically need larger datasets and careful regularization.
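To make the comparison concrete, here is a minimal sketch of how two of these model families can be fitted to genotype data, assuming scikit-learn and NumPy. The marker counts, effect sizes, and hyperparameters below are invented for illustration and are not taken from the study; genotypes are coded 0/1/2 (copies of the minor allele), and the phenotype is an additive function of a few causal markers plus noise.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_snps, n_causal = 200, 500, 10

# Simulated 0/1/2-coded SNP matrix and a sparse additive phenotype
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
effects = np.zeros(n_snps)
effects[:n_causal] = rng.normal(0, 1, n_causal)   # only 10 markers matter
y = X @ effects + rng.normal(0, 1, n_samples)     # additive trait + noise

models = {
    "Elastic Net": ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10_000),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
results = {}
for name, model in models.items():
    results[name] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {results[name]:.2f}")
```

With a sparse, purely additive trait like this one, a regularized linear model is well matched to the data-generating process, which mirrors one reason the simpler methods fare well in the study.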

To ensure a fair comparison, the team combined simulated and real-world data. Simulated phenotypes were created on real Arabidopsis genotypes, so the causal markers and effect sizes were known. Real datasets came from Arabidopsis thaliana, soy, and corn breeding programs. Each model was fine-tuned with Bayesian optimization, and performance was estimated with nested cross-validation to prevent information leakage. Beyond accuracy, the study examined which SNPs drove predictions using feature importance from linear, Bayesian, and ensemble models, and then checked these selections against GWAS results to confirm biological relevance.
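The nested cross-validation idea can be sketched in a few lines. This is an illustrative scikit-learn example, not the study's pipeline: the paper tuned hyperparameters with Bayesian optimization, for which a plain grid search stands in here, and the data are simulated toy genotypes.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(150, 300)).astype(float)  # toy 0/1/2 genotypes
y = X[:, :5].sum(axis=1) + rng.normal(0, 1, 150)       # toy additive phenotype

inner = KFold(n_splits=3, shuffle=True, random_state=1)  # hyperparameter tuning
outer = KFold(n_splits=5, shuffle=True, random_state=1)  # unbiased evaluation

tuned = GridSearchCV(
    ElasticNet(max_iter=10_000),
    param_grid={"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.8]},
    cv=inner, scoring="r2",
)
# Tuning runs only inside each outer training fold, so the outer test folds
# never influence hyperparameter selection -- no information leakage.
nested_scores = cross_val_score(tuned, X, y, cv=outer, scoring="r2")
print(f"nested CV R^2: {nested_scores.mean():.2f} +/- {nested_scores.std():.2f}")
```

The key design point is the two independent splitting levels: the inner loop picks hyperparameters, the outer loop measures performance on data the tuning never saw.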

What Did the Study Find?

  • Simulated Data
    • Bayes B consistently had the best prediction performance.
    • Elastic Net, LASSO, and SVR were also strong.
    • Deep Learning models (MLP, CNN, LCNN) never outperformed simpler methods, even with more data.
    • Feature selection worked best with models using L1 regularization (LASSO, Elastic Net), which naturally ignore unimportant markers.
  • Real-World Data
    • No single model was best for all traits.
    • Elastic Net led in 3 out of 9 traits, followed closely by other classical ML models.
    • Neural networks improved with more data but were still outperformed by simpler models.
    • Models often picked markers that matched those found by GWAS, confirming their relevance.

Fig. Simpler models vs deep learning for genomic prediction: for typical breeding datasets, simpler models often win.


On simulated data, Bayes B consistently delivered the highest explained variance. Elastic Net, LASSO, and SVR also performed strongly and were often close to Bayes B. Deep learning models did not outperform the simpler approaches, even when more data were available. Models that use L1 regularization, such as LASSO and Elastic Net, were particularly effective at feature selection because they naturally down-weight or ignore uninformative markers.

On real-world data, no single model dominated across all traits. Elastic Net was the best in several cases, with Bayes B, Random Forest, and SVR frequently close behind. Neural networks improved as sample sizes grew but still did not surpass the simpler methods. Importantly, many of the markers highlighted by the best-performing models overlapped with GWAS hits, which supports the biological credibility of their predictions. Together, these results show that in current breeding data settings, well-tuned classical and machine learning models remain hard to beat, while deep learning has yet to show a consistent advantage.
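Why L1 regularization lends itself to marker selection can be seen directly: the penalty drives the coefficients of uninformative SNPs exactly to zero. The following hypothetical sketch (scikit-learn; the marker positions, effect sizes, and penalty strength are invented for illustration) shows LASSO recovering a handful of causal markers from a few hundred candidates.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
X = rng.integers(0, 3, size=(200, 400)).astype(float)  # 0/1/2-coded genotypes
causal = [3, 57, 210]                                  # three "true" markers
y = X[:, causal] @ np.array([1.5, -1.0, 0.8]) + rng.normal(0, 0.5, 200)

# The L1 penalty zeroes out coefficients of uninformative markers
model = Lasso(alpha=0.1, max_iter=10_000).fit(X, y)
selected = np.flatnonzero(model.coef_)
print(f"non-zero coefficients: {len(selected)} of {X.shape[1]}")
print("causal markers recovered:", set(causal) <= set(selected))
```

A ridge-style (L2) penalty would instead shrink all 400 coefficients toward zero without eliminating any, which is why the study found L1-based models better suited for checking selected markers against GWAS hits.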

 

Why Does This Matter? Guidance for Breeders and Agriculture

For breeders applying genomic selection, the takeaway is practical: you can achieve reasonable and useful prediction accuracy with a range of models. Simpler approaches like Bayes B or Elastic Net remain effective, especially for simpler traits, when datasets are moderate in size, and when interpretability and computational efficiency are priorities.

However, as breeding programs evolve to include more complex traits, diverse environments, and additional data layers such as weather, soil, or management information, more advanced machine learning and AI models become increasingly valuable. These approaches can capture nonlinear interactions and integrate heterogeneous data sources in ways that simpler models cannot, potentially providing improved prediction accuracy and deeper biological insight.

 
