Dear all,
the next seminar will take place tomorrow, please see the details below.
We look forward to seeing you all!
Title: Augusta, a Python package for inferring Gene Regulatory and
Boolean Networks using RNA-Seq and data mining
Speaker: Jana Musilová (Brno University of Technology)
Date and time: Wednesday 22/11/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Jana Musilová (Brno University of Technology)*
*/Augusta, a Python package for inferring Gene Regulatory and Boolean
Networks using RNA-Seq and data mining/*
Augusta is an open-source Python package for Gene Regulatory Network
(GRN) and Boolean Network (BN) inference, based on a unique approach
that combines the processing of laboratory-generated data with the
incorporation of additional knowledge. In detail, the first estimate of a GRN is
inferred by mutual information calculation from the high-throughput gene
expression data. Subsequently, the GRN is refined by predicting
transcription factor binding motifs within the promoters of regulated
genes and integrating verified interactions obtained from curated
databases. The refined GRN is finally transformed into a draft BN by
searching in the curated model database CellCollective and adding
logical rules to individual edges.
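As an illustration of the first step in the abstract above, here is a
minimal sketch of a mutual-information screen over gene expression
profiles. It is not Augusta's code or API: the function names, the gene
names, the assumption that expression values are already discretized,
and the threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equally long discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    # Sum p(a,b) * log2(p(a,b) / (p(a) * p(b))) over the observed pairs only.
    return sum(
        (c / n) * log2((c * n) / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def candidate_edges(expression, threshold):
    """Return (gene1, gene2, MI) for every gene pair whose MI reaches the threshold.

    `expression` maps a gene name to its discretized expression levels
    across samples (hypothetical input format).
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        mi = mutual_information(expression[g1], expression[g2])
        if mi >= threshold:
            edges.append((g1, g2, mi))
    return edges


# Toy data: geneA and geneB vary together, geneC is independent of both.
toy_expression = {
    "geneA": [0, 0, 1, 1],
    "geneB": [0, 0, 1, 1],
    "geneC": [0, 1, 0, 1],
}
toy_edges = candidate_edges(toy_expression, threshold=0.5)
```

In this toy example only the geneA/geneB pair passes the threshold. The
edges are undirected; the refinement with binding motifs and curated
interactions described in the abstract is not covered by this sketch.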
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Machine learning approaches for the identification of disease
signatures
Speaker: Carl Hermann (University of Heidelberg)
Date and time: Wednesday 10/5/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar
Best wishes,
Petr Danecek
---------------------------------
*Carl Hermann (University of Heidelberg)*
*/Machine learning approaches for the identification of disease
signatures/*
In this presentation, I will review some recent work in which we have
focused on extracting disease signatures from large-scale omics datasets
using various linear and non-linear machine-learning approaches. In
particular, I will present some recent work on building explainable AI
models and their applications to various disease traits.
Dear all,
the next seminar will take place tomorrow, please see the details below.
We look forward to seeing you all!
Title: MolMeDB - a database of interactions of small molecules with
membranes and membrane transport proteins
Speaker: Kateřina Storchmannova (Palacký University Olomouc)
Date and time: Wednesday 19/4/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Kateřina Storchmannova (Palacký University Olomouc)*
*/MolMeDB - a database of interactions of small molecules with membranes
and membrane transport proteins/*
Biological membranes are natural barriers of cells and of some cellular
organelles. As such, they play an important role in the pharmacokinetics
and pharmacodynamics of drugs and other xenobiotics. A range of
experimental and theoretical approaches is used to determine the
interactions between molecules and membranes, but no open source of
these data existed. For this reason we created the MolMeDB database
(Molecule on Membrane Database), which collects interaction data for
more than 500,000 molecules with membranes or membrane transport
proteins. The interaction data were collected from scientific articles
published in peer-reviewed journals, from in-house calculations, and
from databases. MolMeDB can be a powerful tool that helps both
experimentalists and theoreticians to better understand the behaviour of
compounds on membranes.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Masked superstrings as a unified framework for textual k-mer set
representations
Speaker: Ondřej Sladký, Faculty of Mathematics and Physics, Charles
University
Date and time: Wednesday 27/3/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S4
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Ondřej Sladký (Faculty of Mathematics and Physics, Charles University)*
*/Masked superstrings as a unified framework for textual k-mer set
representations/*
The popularity of k-mer-based methods has recently led to the
development of compact k-mer-set representations, such as
simplitigs/Spectrum-Preserving String Sets (SPSS), matchtigs, and
eulertigs. These aim to represent k-mer sets via strings that contain
individual k-mers as substrings more efficiently than the traditional
unitigs. Here, we demonstrate that all such representations can be
viewed as superstrings of input k-mers, and as such can be generalized
into a unified framework that we call the masked superstring of k-mers.
We study the complexity of masked superstring computation and prove
NP-hardness for both k-mer superstrings and their masks. We then design
local and global greedy heuristics for efficient computation of masked
superstrings, implement them in a program called KmerCamel, and evaluate
their performance using selected genomes and pan-genomes. Overall,
masked superstrings unify the theory and practice of textual k-mer set
representations and provide a useful framework for optimizing
representations for specific bioinformatics applications.
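To make the idea in the abstract above concrete, here is a minimal
sketch. It assumes that a masked superstring is a pair (superstring,
mask) of equal length in which a "1" at position i means that the k-mer
starting at position i belongs to the represented set. This convention,
the function names and the naive construction are assumptions made for
illustration only; they are not taken from KmerCamel.

```python
def decode_masked_superstring(superstring, mask, k):
    """Return the set of k-mers whose start positions are marked '1' in the mask."""
    if len(superstring) != len(mask):
        raise ValueError("superstring and mask must have the same length")
    return {
        superstring[i:i + k]
        for i in range(len(superstring) - k + 1)
        if mask[i] == "1"
    }


def naive_masked_superstring(kmers, k):
    """Concatenate the k-mers and mark only the start of each one.

    This is a valid but deliberately non-optimized representation; the
    heuristics mentioned in the abstract aim for much shorter superstrings.
    """
    ordered = sorted(set(kmers))
    if any(len(kmer) != k for kmer in ordered):
        raise ValueError("all k-mers must have length k")
    superstring = "".join(ordered)
    mask = "".join("1" + "0" * (k - 1) for _ in ordered)
    return superstring, mask


kmers = {"ACG", "CGT", "GTA"}

# A compact superstring: all three 3-mers overlap within "ACGTA".
compact = decode_masked_superstring("ACGTA", "11100", k=3)

# The same superstring with a different mask represents a smaller set.
subset = decode_masked_superstring("ACGTA", "10100", k=3)

# The naive construction yields a longer string for the same set.
naive_string, naive_mask = naive_masked_superstring(kmers, k=3)
roundtrip = decode_masked_superstring(naive_string, naive_mask, k=3)
```

Here "ACGTA" with mask "11100" and the nine-character naive string both
decode to the same three k-mers, which shows why the choice of
superstring and mask is an optimization problem.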
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Disease maps: building and analysing graphical models of
biomedical knowledge
Speaker: Marek Ostaszewski, Luxembourg Centre for Systems Biomedicine
Date and time: Wednesday 22/3/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Marek Ostaszewski*
*/Disease maps: building and analysing graphical models of biomedical
knowledge/*
Disease maps encode knowledge about molecular pathophysiology in both
visual and computational format, helping interdisciplinary exchange
between bench scientists, clinical researchers and bioinformaticians. In
this talk I’ll introduce the concept based on the example of Parkinson’s
disease map and demonstrate it as a tool for visual exploration,
analytics and investigating complex data. Then, I’ll describe the
evolution of the approach and how it was picked up by a broader research
community leading to a large-scale effort to build the COVID-19 Disease
Map.
Dear all,
the next seminar will take place on Wednesday, please see the details
below. We look forward to seeing you all!
Title: Building complex bioinformatics pipelines using Snakemake
Speaker: Joern Gerchen
Date and time: Wednesday 8/3/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
----------------
*Joern Gerchen: /Building complex bioinformatics pipelines using
Snakemake/*
Polyploidy, the presence of multiple genome copies as a result of
whole-genome-duplications, is often thought to cause immediate
reproductive isolation between polyploids and diploid relatives, due to
inviability and sterility of hybrid offspring. However, recent research
showed evidence of introgression (gene-flow via hybridization and
backcrossing) between diploid and polyploid lineages. In this seminar I
will introduce my PostDoc project, which uses population genomic
analyses to quantify the degree of introgression between multiple
natural plant lineages with variable ploidy and assess to what degree it
can contribute to adaptive evolution. In order to determine the degree
of inter-ploidy introgression, variant calling and subsequent population
genomic analyses have to be run for each non-model species in a
ploidy-aware manner. These analyses require complex custom
bioinformatics pipelines, which have to be run repeatedly for multiple
lineages on HPC clusters. Implementing and running these types
of workflows in an efficient and reproducible manner can be challenging.
As an approach to overcome these issues I will introduce Snakemake,
which allows automated, scalable and reproducible bioinformatics
workflows to be implemented.
Dear all,
the next seminar will take place next week, please see the details
below. We look forward to seeing you all!
Title: Functionalizing the cancer genome base-by-base with CRISPR
editing and computational tools
Speaker: Victoria Offord, WSI
Date and time: Wednesday 7/12/2022 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S6
http://bioinformatics.cuni.cz/seminar
Best wishes,
Petr Danecek
-------------------------------
*Victoria Offord*
*/Functionalizing the cancer genome base-by-base with CRISPR editing and
computational tools/*
Over the last decade, the reduction in costs for high throughput
technologies has enabled us to sequence more human genomes than ever
before. We can see this in the rapid expansion of the repositories used
to store and share genetic data amongst researchers. To date, over 1.5
million variants have been submitted to ClinVar, a public database of
clinically relevant variants. However, over 50% of these are variants of
uncertain significance (VUS) for which we have no clear interpretation
of their function. Multiplexed Assays of Variant Effect (MAVEs) can be
used to generate variant effect maps, which, over time, may be used to
aid clinical interpretations. Saturation genome editing (SGE) is a
CRISPR-based MAVE which allows us to test all possible single- and
multi-nucleotide or amino acid variations across a genomic region. We’ll
discuss SGE, the community, available resources, tools being developed
and how the data that we’ve been generating may go on to aid in the
interpretation of VUS.
Dear all,
the next seminar will take place next week, please see the details
below. We look forward to seeing you all!
Title: AlphaFold DB: Increasing the structure coverage of the sequence
space by a thousand times
Speaker: Mihaly Varadi, EMBL
Date and time: Wednesday 23/11/2022 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S6
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
*Mihaly Varadi*
*/AlphaFold DB: Increasing the structure coverage of the sequence space
by a thousand times/*
I will give a brief introduction to why (predicted) protein
structures can be useful, and then talk briefly about how the protein
structure prediction field grew over the years. I will also talk about
AlphaFold, its strengths and weaknesses, and explain the various types
of output coming from the AI, in particular about the various confidence
measures. Then I'll talk about the database (what types of data, what
technology, how to access the data) and then briefly about the place of
AlphaFold DB in the wider context of other predicted protein structure
providers and initiatives such as the 3D-Beacons.
Dear all,
the next seminar will take place next week, please see the details
below. We look forward to seeing you all!
Title: Measures of quality of clusters in hierarchical clustering of
flow cytometry data
Speaker: Tomáš Sieger
Date and time: Wednesday 19/10/2022 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S6
https://bioinformatika.mff.cuni.cz/seminar/
Best wishes,
Petr Danecek
----------------
*Tomáš Sieger: /Measures of quality of clusters in hierarchical
clustering of flow cytometry data/*
Hierarchical clustering enables unsupervised analysis of
multidimensional data, yielding a dendrogram, a hierarchical tree of
clusters of data samples. However, the dendrogram does not readily
specify the quality of individual clusters in it. In the lecture, we
will show measures of quality of clusters in hierarchical clustering,
developed in our group, that can be used to guide selection of
meaningful clusters in flow cytometry data. The new unsupervised method
enables automated processing of large amounts of data without the need for
costly and subjective manual intervention. In the future, we need to study
the general applicability of our measures of cluster quality and
validate them on other data sets.
Dear all,
the next seminar will take place this Wednesday, please see the details
below. We look forward to seeing you all!
Best wishes,
Petr Danecek
Date and time: Wednesday 20/04/2022 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S6
https://bioinformatika.mff.cuni.cz/seminar/
Title: Sequence space and protein functional classification
Speaker: Alessandra Carbone
Affiliation: Department of Computational and Quantitative Biology,
Sorbonne University
Abstract:
Functional classification of proteins from sequences alone has become a
critical bottleneck in understanding the myriad of protein sequences
that accumulate in our databases. The great diversity of homologous
sequences hides, in many cases, a variety of functional activities that
cannot be anticipated. Their identification appears critical for a
fundamental understanding of the evolution of living organisms and for
biotechnological applications.
In this talk, I will explain how rethinking the sequence space with
multiple profile models leads to the functional classification of
proteins. I will present ProfileView, a sequence-based computational
method designed to functionally classify sets of homologous sequences.
ProfileView relies on two main ideas: the use of multiple profile models
whose construction explores evolutionary information in available
databases, and a novel definition of a representation space in which to
analyse sequences with multiple profile models combined together.
ProfileView classifies protein families by enriching known functional
groups with new sequences and discovering new groups and subgroups.