Dear all,

The next seminar will take place on Wednesday; please see the details below. We look forward to seeing you all!

Title:  A Telomere-to-Telomere (T2T) Mouse Genome Assembly
Speaker:  Bailey Francis (Wellcome Sanger Institute)

Date and time: Wednesday 20/03/2024 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatics.cuni.cz/seminar/

Best wishes,
Petr Danecek

-------------------------------

Bailey Francis
A Telomere-to-Telomere (T2T) Mouse Genome Assembly: Towards the first complete mouse genome


The generation and assembly of a reference genome for C57BL/6J revolutionized our ability to relate sequence to function, enabled genetic screens in mice to be performed on an unprecedented scale, and facilitated the task of creating a complete set of null alleles for all genes. Despite over twenty years of effort, the current mouse reference genome (GRCm39) has over 170 known gaps and unresolved issues. Many important loci, such as the major histocompatibility complex (MHC) on chromosome 17, regions on chromosome X such as the pseudoautosomal region (PAR), and Krüppel-associated box (KRAB) domain-containing zinc-finger protein (KZFP) loci on chromosomes 2 and 4, remain incomplete or inaccurate. By using a combination of novel high molecular weight DNA extraction methodologies and ultra-long sequencing technologies (PacBio HiFi and ultra-long Oxford Nanopore), we have generated the most complete mouse reference genome to date using mouse embryonic stem cells (mESCs) from a C57BL/6J x CAST/EiJ F1 animal. We employed a trio-based genome assembly approach to achieve complete separation of haplotypes, producing two telomere-to-telomere (T2T) mouse reference genomes in which the majority of chromosomes were assembled into a single contiguous sequence. This represents a major milestone on the journey towards a complete mouse reference genome. Key findings reveal that our C57BL/6J assembly fully closes over 91% of the autosomal gaps in GRCm39 with over 12 Mbp of novel sequence. Additionally, we have shown that our new T2T assemblies significantly improve the representation of previously hard-to-assemble regions (e.g. PAR, inversions, KZFPs) when compared to the current reference genomes. Not only do our assemblies unlock some of the most challenging loci in the mouse genome, but with near-complete genomic sequences for two mouse strains, our work also enables comparative analyses in these complex regions for the very first time.