Machine learning (ML) holds great promise for bridging the gap between basic research and breeding practice in plants by integrating biological knowledge and omics data, ultimately enabling precision-designed plant breeding.
Recent applications of ML in plant research and breeding include data dimensionality reduction, inference of gene-regulation networks, gene discovery and prioritization, plant phenomics analysis, and genomic prediction of plant phenotypes.
High-dimensional biology denotes the integration and analysis of biological data from the macroscale to the microscale, which raises the chance of identifying trait-causative genes that can be used in knowledge-driven molecular design breeding.
In the era of big data, ML can model the complex relationships among genotypic, phenotypic, and environmental data collected from breeding practice, enabling data-driven genomic design breeding.
Some of the biological knowledge obtained from fundamental research will be implemented in applied plant breeding. Machine learning (ML) holds great promise for bridging basic research and breeding practice by translating biological knowledge and omics data into precision-designed plant breeding. Here, we review ML for multi-omics analysis in plants, including data dimensionality reduction, inference of gene-regulation networks, and gene discovery and prioritization. These applications will help elucidate trait-regulation mechanisms and identify target genes potentially applicable to knowledge-driven molecular design breeding. We also highlight applications of deep learning in plant phenomics and of ML in genomic selection-assisted breeding, including ML algorithms that model the correlations among genotypes (genes), phenotypes (traits), and environments, to ultimately achieve data-driven genomic design breeding.