Sorghum bicolor BTx623

Sorghum bicolor (L.) Moench subsp. bicolor is a widely grown cereal crop, particularly in Africa, ranking 5th in global cereal production (FAOSTAT 2008). It is a C4 grass also used for sugar production, brewing, feedstock, and as a biofuel crop. Its diploid genome (~730 Mbp) has a haploid chromosome number of 10. The inbred variety ‘BTx623’ provides the current reference genome for sorghum. It is a short-statured, early-maturing genotype used primarily to produce grain sorghum hybrids. The line is susceptible to sugarcane aphid and sensitive to low nitrogen, and is therefore often used in functional comparative studies.



Plant Introduction (PI) number for Sorghum bicolor (L.) Moench subsp. bicolor ‘BTx623’ in the U.S. National Plant Germplasm System (GRIN-Global): PI 564163.

This accession is part of the following population panels:



BTx623. Image source: The GRIN database.


Statistics (Source: NCBI, April 2021)

Sorghum line: BTx623

Assembly information
  Assembly name: Sorghum_bicolor_NCBIv3
  Assembly date: Jun 2017
  Assembly accession: GCA_000003195.3
  WGS accession: ABXC00000000
  Assembly provider:
  Sequencing description:
    Sequencing technologies: Sanger; Illumina
    Sequencing method:
    Genome coverage: 8x
  Assembly description:
    Assembly methods: ARACHNE_modified v. 200721016
    Construction of pseudomolecules:
  Finishing strategy:
  NCBI submission:
  Publication: Paterson et al (2009); McCormick et al (2018)

Assembly statistics
  Number of contigs: 2,688
  Total assembly length (Mb): 732
  Contig N50 (Mb): 1

Annotation statistics
  Total number of genes: 34,118
  Total number of transcripts: 47,121
  Average gene length: 3,714
  Exons per transcript: 5



The genome assembly of Sorghum bicolor (L.) Moench ‘BTx623’ was published in 2009 (Paterson et al, 2009). The present assembly corresponds to v3.1.1 at the US Department of Energy Joint Genome Institute (JGI), described in McCormick et al (2018), and is also known as the NCBIv3 assembly. Sequencing by the JGI’s Community Sequencing Program, in collaboration with the Plant Genome Mapping Laboratory at the University of Georgia, followed a whole-genome shotgun strategy that reached 8X coverage, with scaffolds assigned to the genetic map where possible. JGI then carried out two additional rounds of improvement. The most recent update of release v3.0 included ~351 Mb of finished sorghum sequence. A total of 349 clones were manually inspected, then finished and validated using a variety of technologies including Sanger, 454 and Illumina. The finished clones were integrated into chromosomes by aligning them to the v1.0 assembly. As a result, 4,426 gaps were closed, and a total of 4.96 Mb of sequence was added to the assembly. Overall contiguity (contig N50) increased 5.8-fold, from 204.5 Kb to 1.2 Mb. For more details, see Phytozome.
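Contig N50, the contiguity measure quoted above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the procedure JGI or NCBI used; the function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Toy lengths (hypothetical, not sorghum data): total = 290, half = 145.
# 80 + 70 = 150 reaches the halfway point, so the N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

Closing gaps merges neighbouring contigs into longer ones, which is why the gap closure described above raises the contig N50 even though relatively little sequence is added.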

NCBI accession: GCA_000003195.3.




Gene predictions resulted from combining homology-based and ab initio methods with expressed sequences from sorghum, maize and sugarcane, using the JGI annotation pipeline (Goodstein et al, 2012). The SorghumBase browser presents data from the current JGI v3.1.1 release, which comprises the v3.0.1 assembly and v3.1.1 gene set (Feb 2017). Read more at Phytozome.

This annotation draws on the resources used for the original v1.0 release (Sbi1 assembly and Sbi1.4 gene set) together with geneAtlas RNA-seq data. The main genome is organized into 10 chromosomes plus small unmapped pieces, some of which contain annotated genes. The NCBIv3 release (Phytozome v3.1.1) is essentially the same as Phytozome v3.1, except that 82 genes/loci were inactivated when 4 scaffolds that were already entirely present in the chromosome(s) were removed.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline (Aken et al, 2016), which uses six classes of repeats loaded from ENA.


Repeat feature | Frequency | Coverage (Mb) | % of the genome covered
Low complexity (Dust) features | 685,783 | 29 | 4
RepeatMasker (with RepBase library) | 455,749 | 451 | 62.1
RepeatMasker (with REdat library) | 392,778 | 409 | 56.2
Tandem repeats (TRF) features | 245,654 | 41 | 5.7


Nomenclature – Converting Gene IDs

To search for older sorghum gene IDs of the form SbXXgXXXXXX (MIPS/JGI Sbi1.4), you may want to convert them to Sobic.* gene IDs (JGI v2.1) using JGI’s conversion file (password protected). The file maps Sorghum bicolor locus and transcript names from MIPS/JGI Sbi1.4 to v2.1 and higher builds.

Example lines:

Sbi1.4 [Sb01g000200] ⇔ v2.1 [Sobic.001G000100]

#new-locusName    old-locusName

Sobic.001G000100    Sb01g000200

#new-transcriptName    old-transcriptName

Sobic.001G000100.1    Sb01g000200.1
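
The example lines suggest a simple two-column layout with `#`-prefixed header lines, new name first and old name second. A minimal sketch of loading such lines into a lookup table follows, assuming whitespace-separated columns as shown above; the function name `parse_id_map` is hypothetical, and the real JGI file may differ in delimiter or contain additional columns.

```python
def parse_id_map(lines):
    """Build an old-name -> new-name dictionary from conversion-file lines.

    Assumes two whitespace-separated columns (new name, then old name)
    and skips blank lines and '#' header lines.
    """
    old_to_new = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore rows that do not match the assumed layout
        new_name, old_name = fields
        old_to_new[old_name] = new_name
    return old_to_new


# The example lines from this page, used as stand-in input.
example_lines = [
    "#new-locusName    old-locusName",
    "Sobic.001G000100    Sb01g000200",
    "#new-transcriptName    old-transcriptName",
    "Sobic.001G000100.1    Sb01g000200.1",
]
id_map = parse_id_map(example_lines)
print(id_map["Sb01g000200"])
```

To read from a downloaded copy of the file, the same function can be given an open file handle in place of the list.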


To convert to the Ensembl nomenclature in use at SorghumBase, the following rule applies:

Sobic.* => SORBI_3*


For example:

Sobic.001G544600 = SORBI_3001G544600
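
The rule above is a plain prefix substitution, so it can be applied programmatically. The sketch below assumes only what the rule states for gene IDs; the function names `sobic_to_sorbi` and `sorbi_to_sobic` are hypothetical, and transcript-level identifiers are not covered by the rule as given here.

```python
SOBIC_PREFIX = "Sobic."
SORBI_PREFIX = "SORBI_3"


def sobic_to_sorbi(gene_id):
    """Convert a JGI-style gene ID (Sobic.*) to the SORBI_3* form."""
    if not gene_id.startswith(SOBIC_PREFIX):
        raise ValueError(f"not a Sobic.* gene ID: {gene_id!r}")
    return SORBI_PREFIX + gene_id[len(SOBIC_PREFIX):]


def sorbi_to_sobic(gene_id):
    """Convert a SORBI_3* gene ID back to the JGI-style Sobic.* form."""
    if not gene_id.startswith(SORBI_PREFIX):
        raise ValueError(f"not a SORBI_3* gene ID: {gene_id!r}")
    return SOBIC_PREFIX + gene_id[len(SORBI_PREFIX):]


print(sobic_to_sorbi("Sobic.001G544600"))   # SORBI_3001G544600
print(sorbi_to_sobic("SORBI_3001G544600"))  # Sobic.001G544600
```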


Literature References

Aken, Bronwen L., Sarah Ayling, Daniel Barrell, Laura Clarke, Valery Curwen, Susan Fairley, Julio Fernandez Banet, et al. 2016. “The Ensembl Gene Annotation System.” Database: The Journal of Biological Databases and Curation. PMID: 27337980.

Brenton, Zachary W., Elizabeth A. Cooper, Mathew T. Myers, Richard E. Boyles, Nadia Shakoor, Kelsey J. Zielinski, Bradley L. Rauh, William C. Bridges, Geoffrey P. Morris, and Stephen Kresovich. 2016. “A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy.” Genetics 204 (1): 21–33. PMID: 27356613.

Casa, Alexandra M., Gael Pressoir, Patrick J. Brown, Sharon E. Mitchell, William L. Rooney, Mitchell R. Tuinstra, Cleve D. Franks, and Stephen Kresovich. 2008. “Community Resources and Strategies for Association Mapping in Sorghum.” Crop Science 48 (1): 30–40.

Davidson, Rebecca M., Malali Gowda, Gaurav Moghe, Haining Lin, Brieanne Vaillancourt, Shin-Han Shiu, Ning Jiang, and C. Robin Buell. 2012. “Comparative Transcriptomics of Three Poaceae Species Reveals Patterns of Gene Expression Evolution.” The Plant Journal: For Cell and Molecular Biology 71 (3): 492–502. PMID: 22443345.

Emms, David M., Sarah Covshoff, Julian M. Hibberd, and Steven Kelly. 2016. “Independent and Parallel Evolution of New Genes by Gene Duplication in Two Origins of C4 Photosynthesis Provides New Insight into the Mechanism of Phloem Loading in C4 Species.” Molecular Biology and Evolution 33 (7): 1796–1806. PMID: 27016024.

Goodstein, David M., Shengqiang Shu, Russell Howson, Rochak Neupane, Richard D. Hayes, Joni Fazo, Therese Mitros, et al. 2012. “Phytozome: A Comparative Platform for Green Plant Genomics.” Nucleic Acids Research 40 (Database issue): D1178–86. PMID: 22110026.

Jiao, Yinping, John J. Burke, Ratan Chopra, Gloria Burow, Junping Chen, Bo Wang, Chad Hayes, Yves Emendack, Doreen Ware, and Zhanguo Xin. 2016. “A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses.” The Plant Cell. PMID: 27354556.

McCormick, Ryan F., Sandra K. Truong, Avinash Sreedasyam, Jerry Jenkins, Shengqiang Shu, David Sims, Megan Kennedy, et al. 2018. “The Sorghum Bicolor Reference Genome: Improved Assembly, Gene Annotations, a Transcriptome Atlas, and Signatures of Genome Organization.” The Plant Journal: For Cell and Molecular Biology 93 (2): 338–54. PMID: 29161754.

Mace, Emma S., Shuaishuai Tai, Edward K. Gilding, Yanhong Li, Peter J. Prentis, Lianle Bian, Bradley C. Campbell, et al. 2013. “Whole-Genome Sequencing Reveals Untapped Genetic Potential in Africa’s Indigenous Cereal Crop Sorghum.” Nature Communications 4: 2320. PMID: 23982223.

Makita, Yuko, Setsuko Shimada, Mika Kawashima, Tomoko Kondou-Kuriyama, Tetsuro Toyoda, and Minami Matsui. 2015. “MOROKOSHI: Transcriptome Database in Sorghum Bicolor.” Plant & Cell Physiology 56 (1): e6. PMID: 25505007.

Morris, Geoffrey P., Punna Ramu, Santosh P. Deshpande, C. Thomas Hash, Trushar Shah, Hari D. Upadhyaya, Oscar Riera-Lizarazu, et al. 2013. “Population Genomic and Genome-Wide Association Studies of Agroclimatic Traits in Sorghum.” Proceedings of the National Academy of Sciences of the United States of America 110 (2): 453–58. PMID: 23267105.

Olson, Andrew, Robert R. Klein, Diana V. Dugas, Zhenyuan Lu, Michael Regulski, Patricia E. Klein, and Doreen Ware. 2014. “Expanding and Vetting Sorghum Bicolor Gene Annotations through Transcriptome and Methylome Sequencing.” The Plant Genome 7 (2): plantgenome2013.08.0025.

Paterson, A. H., J. E. Bowers, R. Bruggmann, I. Dubchak, J. Grimwood, H. Gundlach, G. Haberer, et al. 2009. “The Sorghum Bicolor Genome and the Diversification of Grasses.” Nature 457 (7229): 551–56. PMID: 19189423.

Turco, Gina M., Kaisa Kajala, Govindarajan Kunde-Ramamoorthy, Chew-Yee Ngan, Andrew Olson, Shweta Deshpande, Denis Tolkunov, et al. 2017. “DNA Methylation and Gene Expression Regulation Associated with Vascularization in Sorghum Bicolor.” The New Phytologist 214 (3): 1213–29. PMID: 28186631.

Xin, Zhanguo, Ming Li Wang, Noelle A. Barkley, Gloria Burow, Cleve Franks, Gary Pederson, and John Burke. 2008. “Applying Genotyping (TILLING) and Phenotyping Analyses to Elucidate Gene Function in a Chemically Induced Sorghum Mutant Population.” BMC Plant Biology. PMID: 18854043.

Wang, Bo, Michael Regulski, Elizabeth Tseng, Andrew Olson, Sara Goodwin, W. Richard McCombie, and Doreen Ware. 2018. “A Comparative Transcriptional Landscape of Maize and Sorghum Obtained by Single-Molecule Sequencing.” Genome Research 28 (6): 921–32. PMID: 29712755.

Zheng, Lei-Ying, Xiao-Sen Guo, Bing He, Lian-Jun Sun, Yao Peng, Shan-Shan Dong, Teng-Fei Liu, et al. 2011. “Genome-Wide Patterns of Genetic Variation in Sweet and Grain Sorghum (Sorghum Bicolor).” Genome Biology 12 (11): R114. PMID: 22104744.