New T2T assembly of Cryptosporidium parvum IOWA II annotated with Legacy-Compatible Gene identifiers

Cryptosporidium parvum is a significant pathogen that causes gastrointestinal infections in humans and animals and is spread through ingestion of contaminated food and water. Despite its global health significance, generating a C. parvum genome sequence has been difficult for many reasons, including cloning difficulties and complex subtelomeric regions. A new, gapless, hybrid, telomere-to-telomere genome assembly was created for C. parvum IOWA II, here termed CpBGF. It reveals 8 chromosomes and a genome size of 9,259,183 bp, and it resolves complex subtelomeric regions. For ease of use and consistency with the literature, the chromosomes have been oriented, and genes in this annotation have been given gene IDs similar to those used in the 2004 C. parvum IOWA II reference genome sequence. The new annotation used considerable RNA expression evidence, including single-molecule Iso-Seq data; thus, untranslated regions, long noncoding RNAs, and antisense RNAs are annotated. The CpBGF genome assembly serves as a valuable resource for understanding the biology, pathogenesis, and transmission of C. parvum, and it facilitates the development of diagnostics, drugs, and vaccines against cryptosporidiosis.
Rodrigo de Paula Baptista, Rui Xiao, Yiran Li, Travis C Glenn, Jessica C Kissinger. Sci Data. 2025 Jun 19;12(1):1039. doi: 10.1038/s41597-025-05364-3.