An improved reference of the grapevine genome reasserts the origin of the PN40024 highly homozygous genotype.

Velt A, Frommer B, Blanc S, Holtgräwe D, Duchêne É, Dumas V, Grimplet J, Hugueney P, Kim C, Lahaye M, Matus JT, Navarro-Payá D, Orduña L, Tello-Ruiz MK, Vitulo N, Ware D, Rustenholz C

Published: 27 March 2023 in G3 (Bethesda, Md.)
Keywords: Vitis vinifera, genotype PN40024, improved annotation, long reads, reference genome
Pubmed ID: 36966465
DOI: 10.1093/g3journal/jkad067

The genome sequence of the diploid and highly homozygous Vitis vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements to the PN40024 genome assembly, its current version PN12X.v2 is quite fragmented and only represents the haploid state of the genome with mixed haplotypes. Although nearly homozygous, this genome still contains several heterozygous regions that are yet to be resolved. Taking advantage of the improvements that long-read sequencing technologies offer for fully discriminating haplotype sequences, an improved version of the reference, called PN40024.v4, was generated. By incorporating long genomic sequencing reads into the assembly, the continuity of the 12X.v2 scaffolds was greatly increased, with the total number of scaffolds decreasing from 2,059 to 640 and the number of N bases reduced by 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved, and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation also helped improve the annotation, yielding the most reliable estimate to date of 35,230 genes. Finally, we demonstrated that PN40024 resulted from 9 selfings of cv. "Helfensteiner" (a cross of cv. "Pinot noir" and "Schiava grossa") rather than from "Pinot noir" alone. These advances will help maintain the PN40024 genome as a gold-standard reference and contribute toward the eventual elaboration of the grapevine pangenome.

COST Action - INTEGRAPE CA17111
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A532B
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A533A
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A533B
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A534A
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A535A
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A537A
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A537B
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A537C
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A537D
German Network for Bioinformatics Infrastructure (de.NBI) - BMBF-funded de.NBI Cloud 031A538A
INRAE - Biologie et Amélioration des Plantes (no grant ID listed)