Qi Liu, Shi-min Zuo, Shasha Peng, Hao Zhang, Ye Peng, Wei Li, Yehui Xiong, Runmao Lin, Zhiming Feng, Huihui Li, Jun Yang, Guo-Liang Wang, Houxiang Kang. Development of Machine Learning Methods for Accurate Prediction of Plant Disease Resistance.
Engineering, 2024, https://doi.org/10.1016/j.eng.2024.03.014
Abstract
Traditional phenotypic screening of plants for disease resistance is both time-consuming and costly. Genomic selection offers a potential way to improve efficiency, but accurately predicting plant disease resistance remains a challenge. In this study, we evaluated eight machine learning (ML) methods for predicting plant disease resistance: random forest classification (RFC), support vector classifier (SVC), light gradient boosting machine (lightGBM), random forest classification plus kinship (RFC_K), support vector classification plus kinship (SVC_K), light gradient boosting machine plus kinship (lightGBM_K), deep neural network genomic prediction (DNNGP), and densely connected convolutional networks (DenseNet). Our results demonstrate that the three plus kinship (K) methods developed in this study achieved high prediction accuracy. Specifically, when trained and applied to rice diversity panel I (RDPI), these methods achieved accuracies of up to 95% for rice blast (RB), 85% for rice black-streaked dwarf virus (RBSDV), and 85% for rice sheath blight (RSB). The plus K models also performed well in predicting wheat blast (WB) and wheat stripe rust (WSR), with mean accuracies of up to 90% and 93%, respectively. To assess the generalizability of our models, we applied the trained plus K methods to predict RB resistance in an independent population, rice diversity panel II (RDPII), and in parallel evaluated the RB resistance of RDPII cultivars using spray inoculation. Comparing the predictions with the spray inoculation results, we found that the accuracy of the plus K methods reached 91%. These findings highlight the effectiveness of the plus K methods (RFC_K, SVC_K, and lightGBM_K) in accurately predicting plant disease resistance for RB, RBSDV, RSB, WB, and WSR.
The methods developed in this study not only provide valuable strategies for predicting disease resistance but also pave the way for using machine learning to streamline genome-based crop breeding.
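The abstract does not say how kinship is combined with the marker data in the plus K models. As a rough illustration only, the sketch below assumes one plausible reading: a genomic relationship (kinship) matrix is computed from the genotypes and its columns are appended to the marker features before a classifier such as RFC, SVC, or lightGBM is trained. The VanRaden-style kinship formula, the function names `vanraden_kinship` and `add_kinship_features`, and the toy genotype data are all assumptions made here for illustration and are not taken from the paper.

```python
import numpy as np


def vanraden_kinship(genotypes: np.ndarray) -> np.ndarray:
    """Kinship (genomic relationship) matrix from 0/1/2 allele counts.

    Assumed formula (VanRaden-style): K = Z Z^T / (2 * sum(p * (1 - p))),
    where Z is the genotype matrix with each marker column centred.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0               # allele frequency per marker
    z = g - 2.0 * p                         # centre each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))     # scaling constant
    if denom == 0:
        raise ValueError("all markers are monomorphic; kinship is undefined")
    return z @ z.T / denom


def add_kinship_features(genotypes: np.ndarray) -> np.ndarray:
    """Append each sample's kinship to every sample as extra feature columns.

    Hypothetical helper: one way a 'plus kinship' input could be built.
    """
    k = vanraden_kinship(genotypes)
    return np.hstack([np.asarray(genotypes, dtype=float), k])


# Toy data (hypothetical): 6 accessions x 20 SNP markers coded 0/1/2.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 20))

features = add_kinship_features(toy_genotypes)
print(features.shape)  # 20 marker columns plus 6 kinship columns per accession
```

The resulting `features` matrix could then be passed to any standard classifier in place of the raw markers. When predicting for a separate population, as in the RDPII validation, the kinship columns would presumably need to be computed relative to the training accessions so that training and test inputs share the same feature layout.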
Engineering, IF = 12.8
https://www.sciencedirect.com/science/article/pii/S2095809924002431