Ahn E, Baek I, Prom LK, Park S, Kim MS, Meinhardt LW, Magill C
Anthracnose, caused by the hemibiotrophic fungal pathogen Colletotrichum sublineola, is a significant constraint on sorghum production worldwide. Developing resistant cultivars is the most sustainable control strategy, but it requires a continuing supply of new sources of resistance genes. Here, we applied machine learning (ML) approaches, specifically Bootstrap Forest and Boosted Tree models, to identify single-nucleotide polymorphisms (SNPs) associated with anthracnose resistance in a panel of Senegalese sorghum accessions, using publicly available phenotypic data from the seedling and 8-leaf stages. The ML models identified five novel high-importance loci distinct from those found by linear model-based genome-wide association studies (GWAS), and they also supported three candidates detected by both methods, strengthening the evidence for the involvement of those genes. The top candidate genes identified by the ML algorithms encode leucine-rich repeat (LRR), F-box, aspartic peptidase, and jasmonate O-methyltransferase proteins. This study demonstrates the potential of ML to complement traditional GWAS in identifying candidate genes for complex traits, and it provides a resource for future functional studies and marker-assisted selection efforts to enhance anthracnose resistance in sorghum. Given the limited size of the available population, these results are best interpreted as an explanatory framework that highlights potential targets and guides future functional validation, rather than as a definitive predictive tool.
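The idea of ranking SNPs by tree-based importance can be illustrated with a minimal sketch. It is not the Bootstrap Forest or Boosted Tree implementation used in the study, whose software settings are not given here. Instead it bags one-split decision trees (stumps) over bootstrap resamples and random SNP subsets, and scores each SNP by its summed Gini impurity decrease. The function names (`gini`, `best_stump`, `snp_importance`, `make_toy_data`), the 0/1/2 genotype coding, the binary resistant/susceptible phenotype, and all data are assumptions made for illustration; the data are synthetic.

```python
import random
from collections import defaultdict


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, features):
    """Return (feature, impurity decrease) for the best single split among `features`."""
    n = len(y)
    parent = gini(y)
    best_feat, best_gain = None, 0.0
    for f in features:
        for threshold in (0, 1):  # genotypes coded 0/1/2: split at <=0 or <=1
            left = [y[i] for i in range(n) if X[i][f] <= threshold]
            right = [y[i] for i in range(n) if X[i][f] > threshold]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best_gain:
                best_feat, best_gain = f, gain
    return best_feat, best_gain


def snp_importance(X, y, n_trees=200, n_try=None, seed=0):
    """Bagged stumps: sum impurity decrease per SNP, normalised to sum to 1."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    n_try = n_try or max(1, int(p ** 0.5))  # SNPs tried per stump (assumed default)
    importance = defaultdict(float)
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap resample of accessions
        Xb = [X[i] for i in rows]
        yb = [y[i] for i in rows]
        feats = rng.sample(range(p), n_try)  # random SNP subset
        feat, gain = best_stump(Xb, yb, feats)
        if feat is not None:
            importance[feat] += gain
    total = sum(importance.values()) or 1.0
    return {f: importance[f] / total for f in range(p)}


def make_toy_data(n=80, p=10, causal=3, seed=1):
    """Synthetic panel: phenotype depends on one hypothetical causal SNP plus 10% label noise."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(p)] for _ in range(n)]
    y = [int(row[causal] >= 1) ^ int(rng.random() < 0.1) for row in X]
    return X, y


X, y = make_toy_data()
scores = snp_importance(X, y)
top = max(scores, key=scores.get)  # index of the highest-ranked SNP
```

In this toy setting the SNP that drives the phenotype should receive most of the importance. With real data, importance rankings from a small panel are better treated as hypotheses for follow-up than as validated associations, in line with the caveat in the abstract.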