Telomere-to-telomere genome assembly of sorghum.

Li M, Chen C, Wang H, Qin H, Hou S, Yang X, Jian J, Gao P, Liu M, Mu Z

Published: 2 August 2024 in Scientific Data
Pubmed ID: 39095379
DOI: 10.1038/s41597-024-03664-8

"Cuohu Bazi" (CHBZ) is an ancient sorghum variety collected from the fields of China, known for agronomic traits such as dwarf stature and early maturation. In this study, we present the first telomere-to-telomere (T2T), gap-free genome assembly of CHBZ, built from PacBio HiFi reads, Oxford Nanopore Technologies (ONT) reads, and Hi-C data. The assembled genome comprises 724.85 Mb and resolves all 3,913 gaps that were present in the previous sorghum BTx623 reference genome. Notably, the T2T assembly captures 10 centromeres and all 20 telomeres, providing strong support for their integrity. The assembly is of high quality in terms of contiguity (contig N50: 71.1 Mb), completeness (BUSCO score: 99.01%, k-mer completeness: 98.88%), and correctness (QV: 61.60). Repetitive sequences account for 70.41% of the genome, and a total of 32,855 protein-coding genes have been annotated. Furthermore, 161 CHBZ-specific presence/absence variant genes were identified in comparison with the BTx623 genome. This study provides valuable insights for future research on sorghum genetics, genomics, and evolutionary history.
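
The abstract reports contig N50 and a consensus quality value (QV) without saying how they were computed. The sketch below shows the standard definitions only, as an aid to reading those numbers: N50 is the length of the contig at which the cumulative length, taken from longest to shortest, first reaches half of the total, and QV is assumed here to be Phred-scaled (QV = -10 log10 of the per-base error probability). The function names and the toy contig lengths are hypothetical and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (any consistent unit)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


# Hypothetical contig lengths, for illustration only (not CHBZ data).
toy_contigs = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_contigs))

# QV reported in the abstract; the Phred interpretation is an assumption.
print("error rate at QV 61.60:", qv_to_error_rate(61.60))
```

Under the Phred assumption, a QV of 61.60 corresponds to a per-base error probability below one in a million.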

Agricultural Germplasm Resources in Shanxi Province - Project of Conservation and Utilization sxzyk202201
Shanxi Province - Basic Research Program 20210302124238