Dear all,
the next seminar will take place on Wednesday *(tomorrow!)*, please see
the details below. We look forward to seeing you all!
Title: Semiempirical Quantum Mechanical Scoring in Structure-based
Drug Design
Speaker: Jan Řezáč (UOCHB AVCR)
Date and time: Wednesday 24.4.2024 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4 (3rd floor)
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: From human population variation to ligand binding sites via
SARS-CoV-2
Speaker: Geoff Barton (University of Dundee)
Date and time: Wednesday 17/04/2024 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Geoff Barton (University of Dundee)*
*/From human population variation to ligand binding sites via SARS-CoV-2 /*
In this talk I will present an analysis that compares publicly available
variation data for human with variation seen across all available
protein sequences regardless of species. The analysis confirms patterns
of variation in human are consistent with protein structural features
(e.g. alpha-helix and beta-strand) but highlights structurally and
functionally important sites in around 15,000 human protein domains that
are not found by conventional sequence analysis methods. The identified
sites are enriched in disease-associated variants, ligand binding
residues and protein-protein interaction sites.
I will explain the method and illustrate the new analysis with examples
including the Nuclear Receptor Ligand Binding Domains and G-protein
coupled receptors (GPCRs) which are important therapeutic targets.
The study makes heavy use of the popular Jalview (www.jalview.org)
sequence analysis program developed since 1996 in my group, so I will
also give a brief update on Jalview’s new features for exploring nsSNPs
on alignments and three-dimensional structures including predictions by
AlphaFold.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Computational methods for prosthetic vision: inferring
functional structure of the brain from its spontaneous activity
Speaker: Karolína Korvasová (Charles University, Faculty of Mathematics
and Physics)
Date and time: Wednesday 10/04/2024 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Karolína Korvasová (Charles University, Faculty of Mathematics and Physics)*
*/Computational methods for prosthetic vision: inferring functional
structure of the brain from its spontaneous activity/*
Being able to infer the functional structure of cortical neural networks
from their spontaneous activity would advance our understanding of
neural dynamics and have important applications in the field of visual
prosthetics, as functional properties of neurons in the visual cortex
cannot be measured directly in blind subjects. We designed a method that
estimates the structure of the orientation preference map in the primary
visual cortex. Using this method, we were able to show that functional,
as well as spatial properties of the sites stimulated with a cortical
visual prosthesis in blind humans determine the perceptual outcome. In
this talk I will first introduce some basic concepts of computational
neuroscience and discuss how biological neural networks can be modeled.
Next, I will briefly present a large-scale model of the primary visual
cortex developed in the group of Ján Antolík (MFF UK) that was used to
design the method that infers functional structure from spontaneous
activity. Finally, I will show the results of this method applied to
electrophysiological recordings from the visual cortex of sighted
non-human primates and blind human volunteers.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: A Telomere-to-Telomere (T2T) Mouse Genome Assembly
Speaker: Bailey Francis (Wellcome Sanger Institute)
Date and time: Wednesday 20/03/2024 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Bailey Francis
/A Telomere-to-Telomere (T2T) Mouse Genome Assembly: Towards the first
complete mouse genome /*
The generation and assembly of a reference genome for C57BL/6J
revolutionized our ability to relate sequence to function, enabled
genetic screens in mice to be performed on an unprecedented scale, and
facilitated the task of creating a complete set of null alleles for all
genes. Despite over twenty years of effort, the current mouse reference
genome (GRCm39) has over 170 known gaps and unresolved issues. Many
important loci such as the major histocompatibility complex (MHC) on
chromosome 17, regions on chromosome X such as the pseudoautosomal
region (PAR), and Krüppel-associated box (KRAB) domain-containing
zinc-finger protein (KZFP) loci on chromosomes 2 and 4, remain incomplete
or inaccurate. By using a combination of novel high molecular weight DNA
extraction methodologies and ultra-long sequencing technologies (PacBio
HiFi and ultra-long Oxford Nanopore), we have generated the most
complete mouse reference genome to date using mESCs from a C57BL/6J x
CAST/EiJ F1 animal. We employed a trio-based genome assembly approach to
achieve complete separation of haplotypes to produce two
telomere-to-telomere (T2T) mouse reference genomes where the majority of
chromosomes were assembled into a single contiguous sequence. This
represents a major milestone in the journey towards a fully complete
mouse reference genome. Key findings reveal that our C57BL/6J assembly
fully closes over 91% of the autosomal gaps in GRCm39 with over 12 Mbp
of novel sequence. Additionally, we have shown that our new T2T
assemblies significantly improve the representation of previously
hard-to-assemble regions when compared to the current reference genomes
(e.g. PAR, inversions, KZFPs). Not only do our assemblies unlock some of
the most challenging loci in the mouse genome, but with the
near-completed genomic sequences for two mouse strains, our work enables
comparative analyses in these complex regions for the very first time.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Efficient Search of Microbial Genomes via Phylogenetic Compression
Speaker: Karel Břinda (INRIA)
Date and time: Wednesday 20/12/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3 (3rd floor)
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Karel Břinda (INRIA)
/Efficient Search of Microbial Genomes via Phylogenetic Compression/*
Comprehensive collections approaching millions of sequenced genomes have
become central information sources in the life sciences. However, the
rapid growth of these collections makes it effectively impossible to
search these data using tools such as BLAST and its successors. In this
talk, we will present a new technique called phylogenetic compression,
which uses evolutionary history to guide compression and efficiently
search large collections of microbial genomes using existing algorithms
and data structures. We will show that, when applied to modern diverse
collections approaching millions of genomes, lossless phylogenetic
compression improves the compression ratios of assemblies, de Bruijn
graphs, and k-mer indexes by one to two orders of magnitude.
Additionally, we will present a pipeline for a BLAST-like search over
these phylogeny-compressed reference data, and demonstrate it can align
genes, plasmids, or entire sequencing experiments against all sequenced
bacteria until 2019 on ordinary desktop computers within a few hours.
Phylogenetic compression has broad applications in computational biology
and may provide a fundamental design principle for future genomics
infrastructure.
Dear all,
the next seminar will take place on *Tuesday*, please see the details
below. We look forward to seeing you all!
Title: Probing the dynamics of macromolecules and energetics of
molecular interactions with high-performance, fast and accessible
computational methods
Speaker: Rafael Najmanovich (University of Montreal)
Date and time: *Tuesday 12/12/2023 - 10:40am*
Location: *PřF UK, Viničná 7, room 311 (3rd floor)*
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Probing the dynamics of macromolecules and energetics of molecular
interactions with high-performance, fast and accessible computational
methods*
*/Rafael Najmanovich (University of Montreal)/*
In this talk I will discuss recent methods developed within our group
for ultra-massive virtual screening, protein engineering and
understanding macromolecular dynamics, all based on simple
biophysical principles leading to fast, accessible and high-performing
computational methods with examples of their application in the study of
GPCRs, SARS-CoV-2 and Ebola.
Dear all,
the next seminar will take place tomorrow, please see the details below.
We look forward to seeing you all!
Title: Augusta, a Python package for inferring Gene Regulatory and
Boolean Networks using RNA-Seq and data mining
Speaker: Jana Musilová (Brno University of Technology)
Date and time: Wednesday 22/11/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Jana Musilová (Brno University of Technology)*
*/Augusta, a Python package for inferring Gene Regulatory and Boolean
Networks using RNA-Seq and data mining/*
Augusta is an open-source Python package for Gene Regulatory Network
(GRN) and Boolean Network (BN) inference, based on a unique approach
combining the processing of laboratory-generated data with the
incorporation of additional knowledge. In detail, the first estimate of a GRN is
inferred by mutual information calculation from the high-throughput gene
expression data. Subsequently, the GRN is refined by predicting
transcription factor binding motifs within the promoters of regulated
genes and integrating verified interactions obtained from curated
databases. The refined GRN is finally transformed into a draft BN by
searching in the curated model database CellCollective and adding
logical rules to individual edges.
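
As an illustration of the first step described above (a GRN estimate from mutual information), here is a minimal Python sketch. It is not Augusta's actual code or API: the function name `mutual_information` and the assumption that expression values have already been discretized into levels are hypothetical choices made only for this example.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two discretized expression profiles.

    x and y are equal-length sequences of discrete expression levels
    (hypothetical input format, e.g. 0 = low, 1 = high per sample).
    """
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)             # marginal counts for gene X
    count_y = Counter(y)             # marginal counts for gene Y
    count_xy = Counter(zip(x, y))    # joint counts over samples
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        p_a = count_x[a] / n
        p_b = count_y[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Two genes with identical profiles share 1 bit; independent profiles share 0.
mi_identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In a sketch like this, gene pairs whose mutual information exceeds a chosen threshold would become candidate edges of the draft network; the threshold and the discretization scheme are further assumptions not given in the abstract.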
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Machine learning approaches for the identification of disease
signatures
Speaker: Carl Hermann (University of Heidelberg)
Date and time: Wednesday 10/5/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar
Best wishes,
Petr Danecek
---------------------------------
*Carl Hermann (University of Heidelberg)*
*/Machine learning approaches for the identification of disease
signatures/*
In this presentation, I will review some recent work in which we have
focused on extracting disease signatures from large-scale omics datasets
using various linear and non-linear machine-learning approaches. In
particular, I will present some recent work on building explainable AI
models and their applications to various disease traits.
Dear all,
the next seminar will take place tomorrow, please see the details below.
We look forward to seeing you all!
Title: MolMeDB - a database of interactions of small molecules with
membranes and membrane transport proteins
Speaker: Kateřina Storchmannova (Palacký University Olomouc)
Date and time: Wednesday 19/4/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Kateřina Storchmannova (Palacký University Olomouc)*
*/MolMeDB - a database of interactions of small molecules with membranes
and membrane transport proteins/*
Biological membranes are natural barriers of cells and of some cellular
organelles. As such, they play an important role in the pharmacokinetics
and pharmacodynamics of drugs and other xenobiotics. A range of
experimental and theoretical approaches is used to determine the
interactions between molecules and membranes, but no open source of
these data existed. For this reason, we created the MolMeDB database
(Molecule on Membrane Database), which collects interaction data for
more than 500,000 molecules with membranes or membrane transport
proteins. Information about the interactions was collected from
scientific articles published in peer-reviewed journals, from in-house
calculations, and from databases. MolMeDB can be a powerful tool that
helps both experimentalists and theoreticians better understand the
behavior of compounds on membranes.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Masked superstrings as a unified framework for textual k-mer set
representations
Speaker: Ondřej Sladký, Faculty of Mathematics and Physics, Charles
University
Date and time: Wednesday 27/3/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Ondřej Sladký (Faculty of Mathematics and Physics, Charles University)*
*/Masked superstrings as a unified framework for textual k-mer set
representations/*
The popularity of k-mer-based methods has recently led to the
development of compact k-mer-set representations, such as
simplitigs/Spectrum-Preserving String Sets (SPSS), matchtigs, and
eulertigs. These aim to represent k-mer sets via strings that contain
individual k-mers as substrings more efficiently than the traditional
unitigs. Here, we demonstrate that all such representations can be
viewed as superstrings of input k-mers, and as such can be generalized
into a unified framework that we call the masked superstring of k-mers.
We study the complexity of masked superstring computation and prove
NP-hardness for both k-mer superstrings and their masks. We then design
local and global greedy heuristics for efficient computation of masked
superstrings, implement them in a program called KmerCamel, and evaluate
their performance using selected genomes and pan-genomes. Overall,
masked superstrings unify the theory and practice of textual k-mer set
representations and provide a useful framework for optimizing
representations for specific bioinformatics applications.
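
To make the idea of a masked superstring concrete, here is a minimal Python sketch of decoding one back into its k-mer set. It assumes that the mask is a 0/1 string of the same length as the superstring and that a 1 at position i means the k-mer starting at i belongs to the set. The function name `decode_masked_superstring` is hypothetical and is not taken from KmerCamel.

```python
def decode_masked_superstring(superstring, mask, k):
    """Return the set of k-mers represented by a (superstring, mask) pair.

    Assumed convention for this sketch: mask[i] == "1" marks the k-mer
    superstring[i:i+k] as present; the last k-1 mask positions cannot
    start a full k-mer and are ignored.
    """
    if len(superstring) != len(mask):
        raise ValueError("superstring and mask must have the same length")
    kmers = set()
    for i in range(len(superstring) - k + 1):
        if mask[i] == "1":
            kmers.add(superstring[i:i + k])
    return kmers


# The superstring ACGTT contains the 3-mers ACG, CGT and GTT;
# the mask keeps only the first and the third.
kmers = decode_masked_superstring("ACGTT", "10100", 3)
```

Under this convention, a representation in which every k-mer occurrence belongs to the set corresponds to a mask with ones at all valid start positions, while a general masked superstring may also contain "ghost" k-mers that the mask switches off.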