Speaker
Description
Latent disease heterogeneity complicates the discovery of biomarkers and the elucidation of disease mechanisms, which makes unsupervised patient stratification based on omics data an important problem in biomedicine. The problem is traditionally approached with clustering methods, which may not work well on high-dimensional datasets containing multiple overlapping patterns of various sizes. A promising alternative for unsupervised patient stratification is biclustering, an approach that finds submatrices with a specific pattern in a two-dimensional sample-gene matrix.
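To make the notion of a differentially expressed bicluster concrete, the following minimal sketch scores a candidate submatrix on a toy matrix. The expression values, the gene names, and the helper `bicluster_contrast` are illustrative assumptions and are not taken from DESMOND; the matrix is stored gene by sample here purely for convenience.

```python
from statistics import mean

# Toy expression matrix (made-up values): one row per gene, one column per sample.
expr = {
    "gene_a": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],
    "gene_b": [4.8, 5.1, 5.0, 1.2, 0.8, 1.0],
    "gene_c": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8],
}


def bicluster_contrast(expr, genes, samples):
    """Hypothetical score: mean expression of `genes` inside `samples`
    minus their mean expression in all remaining samples."""
    n_samples = len(next(iter(expr.values())))
    inside = set(samples)
    outside = [j for j in range(n_samples) if j not in inside]
    in_vals = [expr[g][j] for g in genes for j in samples]
    out_vals = [expr[g][j] for g in genes for j in outside]
    return mean(in_vals) - mean(out_vals)


# Candidate bicluster: genes a and b are elevated in the first three samples.
contrast = bicluster_contrast(expr, ["gene_a", "gene_b"], [0, 1, 2])

# gene_c shows no such shift for the same samples.
background = bicluster_contrast(expr, ["gene_c"], [0, 1, 2])
```

A large contrast for a subset of genes and samples, alongside a near-zero contrast for other genes in the same samples, is the kind of local pattern that whole-matrix clustering can miss.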
Although dozens of biclustering methods have been published, only a minority of them aim specifically at finding differentially expressed biclusters. Our previous work showed that existing biclustering methods have a limited ability to robustly recover the known PAM50 breast cancer subtypes and that the outputs of different tools agree little with one another. This motivated us to develop DESMOND (https://github.com/ozolotareva/DESMOND), a novel method for identifying differentially expressed biclusters that uses interaction networks as constraints to improve the robustness of the biclustering results. We applied DESMOND to two independent breast cancer cohorts (TCGA-BRCA and METABRIC) and confirmed that it identified more robust biclusters than other methods. However, the biclusters it found recovered known subtypes poorly and contained few genes, possibly because the input network is incomplete.
We are currently developing DESMOND 2.0, an updated version of DESMOND with three major modifications. First, it does not rely on interaction networks and clusters individual genes instead of gene pairs. Second, it uses Gaussian mixture models to binarize gene expression. Third, it allows the user to choose between probabilistic and deterministic clustering based on weighted gene co-expression network analysis. These modifications greatly reduce the tool's runtime, help to find biclusters containing more genes, and recover known breast cancer subtypes more precisely.
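As a rough illustration of binarizing a gene's expression with a Gaussian mixture model, the sketch below fits a two-component one-dimensional mixture with plain expectation-maximization and marks a sample as 1 when it more likely belongs to the higher-mean component. The two-component choice, the 0.5 posterior threshold, the function names, and the toy values are all assumptions made for this example; it does not reproduce the DESMOND 2.0 implementation.

```python
import math


def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with plain EM (illustrative only)."""
    xs = sorted(values)
    half = len(xs) // 2
    # Initialise the means from the lower and upper halves of the sorted data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    overall = sum(xs) / len(xs)
    var0 = max(sum((x - overall) ** 2 for x in xs) / len(xs), 1e-6)
    var = [var0, var0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weight[k] = nk / len(values)
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / nk,
                1e-6,  # variance floor to avoid a collapsing component
            )
    return weight, mu, var, resp


def binarize(values):
    """Return 1 for samples assigned to the higher-mean component, else 0.
    The 0.5 posterior threshold is an assumption of this sketch."""
    _, mu, _, resp = fit_two_gaussians(values)
    high = 0 if mu[0] > mu[1] else 1
    return [1 if r[high] > 0.5 else 0 for r in resp]


# Made-up expression of one gene across eight samples.
expression = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.2, 4.9]
bits = binarize(expression)
```

Applied gene by gene, a step like this turns the expression matrix into a binary matrix in which genes sharing the same set of "on" samples can then be grouped.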