A Global Sorghum Pangenome Reference Reveals Genetic Variation Underlying Adaptation and Crop Improvement

A global sorghum (Sorghum bicolor) pangenome constructed from diverse reference genomes and nearly 2,000 resequenced lines reveals extensive structural and gene-content variation underlying domestication, environmental adaptation, and key agronomic traits, providing a powerful genomic framework to accelerate trait discovery and breeding of locally adapted cultivars.

"This work establishes a global sorghum pangenome framework that captures major structural and allelic diversity beyond a single reference genome. By making that variation trackable across diverse germplasm, it strengthens trait discovery and supports genomics-enabled breeding of locally adapted varieties." – Shakoor

"Sorghum is ideal for pangenomics: the graph is tractable, its genetic diversity has been well characterized, and the germplasm is diverse. Yet, we still made new discoveries at every locus we explored, including uncovering completely unknown haplotypes at some of sorghum’s most well studied loci." – Lovell

Sorghum (Sorghum bicolor) is one of the most climate-resilient and phenotypically diverse major crops, adapted to a wide range of environments, agronomic practices, and end uses. However, this diversity also poses challenges for modern breeding, which often relies on relatively homogeneous elite germplasm pools. Temperate commercial sorghum hybrids are typically bred as single-purpose grain, forage, or bioenergy types with reduced photoperiod sensitivity, whereas many traditional varieties cultivated by smallholder farmers in tropical regions are multipurpose and highly photoperiod sensitive. These contrasting gene pools complicate the development of broadly adapted cultivars but highlight the need for decentralized breeding strategies that integrate local environmental conditions, farmer preferences, and globally shared genetic resources. To support this approach, researchers from Colorado State University, HudsonAlpha Institute for Biotechnology, Lawrence Berkeley National Laboratory, and collaborating institutions developed an improved reference genome for the widely used cultivar BTx623 using long-read sequencing technologies, substantially increasing assembly contiguity and correcting structural errors present in earlier versions. This updated genome provides a more accurate framework for mapping recombination, identifying candidate genes underlying key traits, and anchoring global analyses of sorghum genetic diversity.

Building on this foundation, the study assembled a 33-genome sorghum pangenome reference combined with high-coverage resequencing data from nearly 2,000 diverse genotypes representing global breeding lines and traditional landraces. The pangenome reference nearly doubled the sequence content represented by the single reference genome and revealed extensive structural variation, including large insertions, inversions, and gene presence–absence variation that contribute to important agronomic traits. Using k-mer–based genotyping and population genomic analyses, the authors identified complex allelic variation associated with domestication genes, drought adaptation, and metabolic pathways such as dhurrin biosynthesis, which is associated with pest defense and dehydration-related adaptation. Landscape genetic analyses further demonstrated that human-mediated gene flow and environmental pressures—particularly drought—shape allele sharing across African sorghum populations. Together, these results establish a comprehensive genomic framework linking genetic variation to phenotypic performance and environmental adaptation. The pangenome reference resource provides a critical platform for trait discovery, marker development, and the integration of locally adapted alleles into breeding programs, ultimately supporting the development of improved sorghum cultivars suited to diverse and changing agricultural environments.
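To make the idea of k-mer-based genotyping more concrete, here is a toy sketch in Python. It is not the authors' pipeline: the sequences, the function names (`kmers`, `genotype_by_kmers`), and the scoring rule are all hypothetical and chosen only for illustration. The sketch assumes that each candidate haplotype at a locus has some k-mers found in no other haplotype, and it scores a sample by the fraction of those diagnostic k-mers that appear in its reads.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genotype_by_kmers(haplotypes, reads, k=5):
    """Score each haplotype by the fraction of its diagnostic k-mers seen in reads.

    A k-mer is treated as diagnostic for a haplotype if it occurs in that
    haplotype and in no other haplotype supplied (an illustrative assumption).
    """
    hap_kmers = {name: kmers(seq, k) for name, seq in haplotypes.items()}

    # Pool every k-mer observed in the sample's reads.
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)

    scores = {}
    for name, own in hap_kmers.items():
        others = set().union(*(v for n, v in hap_kmers.items() if n != name))
        diagnostic = own - others
        if diagnostic:
            scores[name] = len(diagnostic & read_kmers) / len(diagnostic)
        else:
            scores[name] = 0.0
    return scores


# Hypothetical locus: "ins" carries a short insertion relative to "ref".
haplotypes = {
    "ref": "ACGTACGGTCA",
    "ins": "ACGTATTTTCGGTCA",
}
# Hypothetical short reads from a sample carrying the insertion haplotype.
reads = ["CGTATTTTC", "TTTCGGTCA"]

scores = genotype_by_kmers(haplotypes, reads, k=5)
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy case the reads cover every k-mer unique to the insertion haplotype and none unique to the reference haplotype, so the sample is assigned to the insertion haplotype. Real data would require much longer k-mers, handling of sequencing errors, read depth, and heterozygosity, none of which this sketch attempts.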

SorghumBase now hosts these pangenome assemblies and associated resources through its tenth release (SB10). SB10 integrates these genomes into a broader comparative framework, including gene family analysis across sorghum and outgroup species, projection of EVA-assigned variant identifiers (rsIDs) across assemblies, and harmonized germplasm identifiers to support cross-study comparisons. These updates enable users to explore structural and functional variation in a pangenome context, directly linking the results of this study to accessible tools for gene, variant, and germplasm analysis. (See also SorghumBase Release 10 notes)

SorghumBase Examples: 

Figure 1: SorghumBase homology view of Sh1 (SORBI_3001G152901), a key domestication gene controlling seed shattering. The comparative gene tree and protein alignment show that Sh1 is conserved across sorghum accessions and shares homology with rice OsSh1, another grass gene associated with shattering, as well as with more distant YABBY-family genes such as Arabidopsis YAB2. Conserved aligned domains support a shared developmental function, while sequence differences among homologs reflect evolutionary divergence across grasses and flowering plants.
Figure 2: Gene neighborhood view of the sorghum seed-shattering gene Sh1 (SORBI_3001G152901) in SorghumBase. The neighborhood view displays conservation of the genomic region surrounding Sh1 across related sorghum sequences, while also showing differences in aligned blocks and local gene structure among accessions. These structural differences are consistent with the manuscript’s finding that Sh1 carries multiple deeply diverged haplotypes, including a previously unresolved large insertion and duplicated sequence. Together, the neighborhood view and the pangenome analysis highlight the structural diversity underlying seed-shattering variation in sorghum.
Figure 3: Germplasm view of the sorghum seed-shattering gene Sh1 (SORBI_3001G152901) in SorghumBase. The Germplasm tab displays predicted loss-of-function alleles in Sh1, including frameshift and splice donor variants, across multiple sorghum diversity panels. Listed accessions and allele classes show that potentially disruptive variation at this domestication gene is distributed across diverse germplasm. This complements the manuscript’s finding that Sh1 carries multiple structural haplotypes, highlighting the broad genetic diversity underlying seed-shattering variation in sorghum.
Reference:

Morris GP, Harder AM, Healey AL, McLaughlin CM, Rifkin JL, Cruet-Burgos C, Jenkins JW, Shu S, Spiekerman JJ, VanGessel CJ, Agnew E, Audebert A, Barry K, Baxter I, Beurier G, Boston LB, Boyles RE, Brady SM, Bunting V, Chaparro JM, Courtney C, Dembele JSB, Deshpande S, Diatta C, Eck N, Eveland AL, Faye JM, Flowers D, Fonceka D, Gano B, de Gracia Coquerel M, Goodstein D, Grimwood J, Hudson ME, Kholova J, Johnson K, Johnson KK, Kawa D, Kouressy M, Kresovich S, Lee S, Lemaux PG, Lowery R, Luquet D, Maina F, Mamidi S, McKay JK, Michael TP, Mindaye TT, Mullet J, Ozersky P, Plott C, Prenni JE, Pressoir G, Rami JF, Rife TW, Saxton J, Sine B, Sreedasyam A, Talag J, Teme N, Tuinstra MR, Vadez V, Vogel JP, Walstead R, Wang J, Webber J, Williams M, Xu Y, Mockler TC, Lasky JR, Rice BR, Schmutz J, Shakoor N, Lovell JT. A sorghum pangenome reference improves global crop trait discovery. Nature. 2026 Mar 11. PMID: 41813899. doi: 10.1038/s41586-026-10229-9.

Related Project Websites: 

Shakoor Lab

Photo Credit: Donald Danforth Plant Science Center.
Sorghum diversity panel in Maricopa, Arizona. Photo Credit: Nadia Shakoor.
Sorghum panicle diversity in the pangenome. Photo Credit: Nadia Shakoor.
Sorghum pangenome lines in the field. Photo Credit: Nadia Shakoor.