Sorghum bicolor Rio

Sorghum bicolor (L.) Moench subsp. bicolor, ‘RIO’ is an archetypal sweet sorghum line. The current reference genome comes from ‘BTx623’, a short-stature, early-maturing inbred genotype used primarily to produce grain sorghum hybrids. Sweet sorghums differ from it in maturity and grain production, and are most notably characterized by high biomass yield and a high concentration of soluble sugars in the stalk, which can be extracted for human consumption.

 

Germplasm

Plant Introduction (PI) number for S. bicolor (L.) Moench subsp. bicolor, ‘RIO’ (1) in the U.S. National Plant Germplasm System (GRIN-Global): PI 651496.

This accession is part of the following population panels:

 

Pedigree

PI 651496 was selected from the progeny of a cross between Rex (MN 23) and PI 152959 (MN 1048).

 

Image

There are no images for this accession in the GRIN database.

 

Statistics (Source: NCBI, April 2021)

Sorghum line: Rio

Assembly information
  Assembly name: SbicolorRio_v2
  Assembly date: n/a
  Assembly accession: n/a
  WGS accession: JADDXV000000000
  Assembly provider:
  Sequencing description:
    Sequencing technologies: Illumina HiSeq 2500
    Sequencing method:
    Genome coverage: 74.928x
  Assembly description:
    Assembly methods: FALCON v. 2.2
    Construction of pseudomolecules:
  Finishing strategy:
  NCBI submission: Submitted (26-OCT-2020)
  Publication: Cooper et al (2019)

Assembly statistics
  Number of contigs: 3,830
  Total assembly length (Mb): 729
  Contig N50 (Mb): 0

Annotation statistics
  Total number of genes: 35,490
  Total number of transcripts: 41,048
  Average gene length (bp): 3,322
  Exons per transcript: 5

 

Assembly

The Sorghum Rio genome assembly was constructed by Cooper et al (2019) using FALCON (Chin et al, 2016) and polished with Quiver (Chin et al, 2013).

The Sorghum Rio v2.1 assembly in SorghumBase corresponds to release v2.0 of Phytozome. A total of 35,627 unique, non-repetitive, non-overlapping 1 kb sequences were generated from the existing Sorghum bicolor v3.0 assembly and aligned to the polished Sorghum Rio assembly. Scaffolds were then oriented, ordered, and assembled into 10 chromosomes.
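The idea of cutting a reference into unique, non-repetitive, non-overlapping 1 kb sequences can be illustrated with a minimal sketch. It is not the pipeline used for the Rio assembly: the function names are hypothetical, repeats are assumed to be marked by soft-masking (lowercase bases), and uniqueness is approximated by exact string matching between windows, whereas a real workflow would typically assess uniqueness by alignment.

```python
from collections import Counter


def marker_windows(seq, size=1000):
    """Yield (start, window) for consecutive, non-overlapping windows of `size` bases."""
    for start in range(0, len(seq) - size + 1, size):
        yield start, seq[start:start + size]


def select_markers(chromosomes, size=1000):
    """Return (chromosome, start, sequence) for windows that look usable as markers.

    Assumptions (illustrative only):
    - lowercase bases mark soft-masked repeats, so any window containing them is dropped;
    - windows containing N (gap or ambiguous bases) are dropped;
    - a window is "unique" if no other retained window has the identical sequence.
    """
    candidates = []
    for name, seq in chromosomes.items():
        for start, window in marker_windows(seq, size):
            if window.isupper() and "N" not in window:
                candidates.append((name, start, window))
    counts = Counter(window for _, _, window in candidates)
    return [c for c in candidates if counts[c[2]] == 1]
```

Calling `select_markers` on a dictionary of chromosome name to soft-masked sequence returns candidate markers with their 0-based start coordinates; these would then be aligned to the target assembly with an aligner of choice.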

NCBI accession: GCA_015952705.1.

 

Annotation

Genome-guided transcript assemblies were made from close to 1 billion bp of 2x151 bp paired-end Illumina RNA-seq reads using PERTRAN (Shu, unpublished; cited in Cooper et al, 2019). PASA (Haas et al, 2003) alignment assemblies were constructed using the PERTRAN output from the Rio RNA-seq data along with sequences from known S. bicolor expressed sequence tags (ESTs) associated with the current reference genome.

As further described in Phytozome, loci were determined by transcript assembly alignments and/or EXONERATE alignments of proteins from Arabidopsis thaliana, soybean, maize, rice, foxtail millet, Sorghum bicolor BTx623, Brachypodium, grape, and Swiss-Prot proteomes to the Sorghum bicolor Rio genome, which was repeat-soft-masked with RepeatMasker (RepeatMasker Open-3.0 by AFA Smit, R Hubley & P Green, 1996-2011). Each locus was extended by up to 2 kb on both ends unless the extension ran into another locus on the same strand.

Gene models were predicted by the homology-based predictors FGENESH+ (Salamov and Solovyev 2000), FGENESH_EST (similar to FGENESH+, but using ESTs rather than protein/translated ORFs as splice site and intron input), and GenomeScan (Yeh et al, 2001), as well as from PASA assembly ORFs (found with an in-house homology-constrained ORF finder) and from AUGUSTUS via BRAKER1 (Hoff et al, 2016). The best-scoring predictions for each locus were selected using multiple positive factors, including EST and protein support, and one negative factor: overlap with repeats. The selected gene predictions were improved by PASA (Haas et al, 2003).

PASA-improved gene model proteins were subjected to protein homology analysis against the above-mentioned proteomes to obtain a Cscore and protein coverage. PASA-improved transcripts were then selected based on Cscore, protein coverage, EST coverage, and the overlap of their CDS with repeats. Selected gene models were subjected to Pfam analysis, and gene models whose protein was more than 30% in Pfam TE domains were removed. For additional details, see Sorghum bicolor Rio v2.1 (Sorghum Rio) in Phytozome v12.1.
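The final Pfam filter described above (removing gene models whose protein is more than 30% in Pfam TE domains) can be sketched as follows. This is a hedged illustration, not the Phytozome implementation: the function names are hypothetical, domain hits are assumed to be given as 1-based inclusive (start, end) residue coordinates, and overlapping hits are assumed to be counted only once.

```python
def te_fraction(protein_length, te_domains):
    """Fraction of a protein's residues covered by TE-related Pfam domain hits.

    `te_domains` is a list of (start, end) pairs, 1-based and inclusive
    (an assumed input format). Overlapping hits are merged so that no
    residue is counted twice.
    """
    covered = 0
    last_end = 0
    for start, end in sorted(te_domains):
        start = max(start, last_end + 1)  # skip residues already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / protein_length


def keep_gene_model(protein_length, te_domains, max_fraction=0.30):
    """Keep a gene model unless more than `max_fraction` of its protein is in TE domains."""
    return te_fraction(protein_length, te_domains) <= max_fraction
```

For example, a 100-residue protein with TE domain hits at (1, 10) and (5, 20) has 20 covered residues and would be kept, whereas a single hit spanning (1, 40) would cause the model to be removed.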

 

Literature References

Broadhead, D. M. 1972. “Registration of Rio Sweet Sorghum 1 (reg. No. 113).” Crop Science 12 (5): 716–716. https://doi.org/10.2135/cropsci1972.0011183X001200050068x.

Brenton, Zachary W., Elizabeth A. Cooper, Mathew T. Myers, Richard E. Boyles, Nadia Shakoor, Kelsey J. Zielinski, Bradley L. Rauh, William C. Bridges, Geoffrey P. Morris, and Stephen Kresovich. 2016. “A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy.” Genetics 204 (1): 21–33. PMID: 27356613. https://doi.org/10.1534/genetics.115.183947.

Casa, Alexandra M., Gael Pressoir, Patrick J. Brown, Sharon E. Mitchell, William L. Rooney, Mitchell R. Tuinstra, Cleve D. Franks, and Stephen Kresovich. 2008. “Community Resources and Strategies for Association Mapping in Sorghum.” Crop Science 48 (1): 30–40. https://doi.org/10.2135/cropsci2007.02.0080.

Chin, Chen-Shan, David H. Alexander, Patrick Marks, Aaron A. Klammer, James Drake, Cheryl Heiner, Alicia Clum, et al. 2013. “Nonhybrid, Finished Microbial Genome Assemblies from Long-Read SMRT Sequencing Data.” Nature Methods 10 (6): 563–69. PMID: 23644548. https://doi.org/10.1038/nmeth.2474.

Chin, Chen-Shan, Paul Peluso, Fritz J. Sedlazeck, Maria Nattestad, Gregory T. Concepcion, Alicia Clum, Christopher Dunn, et al. 2016. “Phased Diploid Genome Assembly with Single-Molecule Real-Time Sequencing.” Nature Methods 13 (12): 1050–54. PMID: 27749838. https://doi.org/10.1038/nmeth.4035.

Cooper, Elizabeth A., Zachary W. Brenton, Barry S. Flinn, Jerry Jenkins, Shengqiang Shu, Dave Flowers, Feng Luo, et al. 2019. “A New Reference Genome for Sorghum Bicolor Reveals High Levels of Sequence Similarity between Sweet and Grain Genotypes: Implications for the Genetics of Sugar Metabolism.” BMC Genomics 20 (1): 420. PMID: 31133004. https://doi.org/10.1186/s12864-019-5734-x.

Haas, Brian J., Arthur L. Delcher, Stephen M. Mount, Jennifer R. Wortman, Roger K. Smith Jr, Linda I. Hannick, Rama Maiti, et al. 2003. “Improving the Arabidopsis Genome Annotation Using Maximal Transcript Alignment Assemblies.” Nucleic Acids Research 31 (19): 5654–66. PMID: 14500829. https://doi.org/10.1093/nar/gkg770.

Hoff, Katharina J., Simone Lange, Alexandre Lomsadze, Mark Borodovsky, and Mario Stanke. 2016. “BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.” Bioinformatics. PMID: 26559507. https://doi.org/10.1093/bioinformatics/btv661.

Salamov, A. A., and V. V. Solovyev. 2000. “Ab Initio Gene Finding in Drosophila Genomic DNA.” Genome Research 10 (4): 516–22. PMID: 10779491. https://doi.org/10.1101/gr.10.4.516.

Yeh, R. F., L. P. Lim, and C. B. Burge. 2001. “Computational Inference of Homologous Gene Structures in the Human Genome.” Genome Research 11 (5): 803–16. PMID: 11337476. https://doi.org/10.1101/gr.175701.