Learning Markov Networks for Mixed Big Data: Applications to Cancer Genomics
- Speaker: Genevera Allen, Rice University
- Date & Time: Thursday 05 March 2015, 16:00 - 17:00
- Venue: MR12, Centre for Mathematical Sciences, Wilberforce Road, Cambridge
Abstract
“Mixed data” comprising a large number of heterogeneous variables (count, binary, continuous, and skewed continuous, among other data types) is prevalent in varied areas such as imaging genetics, national security, social networking, Internet advertising, and our particular motivation, high-throughput integrative genomics. There have been limited efforts at statistically modeling such mixed data jointly, in part because of the lack of computationally amenable multivariate distributions that can capture direct dependencies between variables of different types. In this talk, we address this by introducing several new classes of Markov Random Fields (MRFs), or graphical models, that yield joint densities over mixed variables. To begin, we present a novel class of MRFs arising when all node-conditional distributions follow univariate exponential family distributions; this class yields, for instance, novel Poisson graphical models. Next, we introduce extensions of this class to Mixed MRF distributions. Unfortunately, these formulations can place severe and unrealistic restrictions on the parameter space. To remedy this, we introduce a class of mixed conditional random field distributions that are then chained according to a block-directed acyclic graph to form a new class of so-called Block Directed Markov Random Fields (BDMRFs). The Markov independence graph structure underlying our BDMRF then has both directed and undirected edges.
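As a reading aid, the pairwise form of such a model can be written out as below. This is a sketch in notation assumed here rather than taken from the abstract: $\theta_s$ and $\theta_{st}$ are node and edge parameters, $B$ and $C$ are the sufficient statistic and base measure of the chosen univariate exponential family, $N(s)$ is the neighbourhood of node $s$, $E$ is the edge set, and $A(\theta)$ is the log-normalizer.

```latex
% Node-conditional distribution of X_s given all other nodes
% (a univariate exponential family whose natural parameter is linear in B(X_t)):
P(X_s \mid X_{\setminus s}) \;\propto\;
  \exp\Big\{ B(X_s)\Big(\theta_s + \sum_{t \in N(s)} \theta_{st}\, B(X_t)\Big) + C(X_s) \Big\}

% Corresponding pairwise joint distribution:
P(X) \;=\;
  \exp\Big\{ \sum_{s} \theta_s B(X_s)
           + \sum_{(s,t) \in E} \theta_{st}\, B(X_s)\, B(X_t)
           + \sum_{s} C(X_s) - A(\theta) \Big\}

% Poisson case: B(x) = x and C(x) = -\log(x!), giving
P(X) \;=\;
  \exp\Big\{ \sum_{s} \theta_s X_s
           + \sum_{(s,t) \in E} \theta_{st}\, X_s X_t
           - \sum_{s} \log(X_s!) - A(\theta) \Big\}
```

In the Poisson case the joint distribution is normalizable only if every $\theta_{st} \le 0$, so the model can express negative dependencies only. This is one concrete instance of the kind of parameter-space restriction the abstract refers to.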
We will briefly review the theoretical properties of these models and introduce penalized conditional likelihood estimators with statistical guarantees for learning the underlying mixed network structure. Simulations as well as an application to integrative cancer genomics demonstrate the versatility of our methods. In our particular example, we learn integrative genomic networks from breast cancer next-generation sequencing expression data and mutation data, and these networks yield several interesting findings.
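To make the idea of a penalized conditional likelihood estimator concrete, the following is a minimal sketch of node-wise neighbourhood selection for the all-Poisson case only. It is not the speakers' implementation: the function names (`soft_threshold`, `fit_l1_poisson`, `learn_graph`), the proximal-gradient solver, the fixed step size, and the AND rule for combining neighbourhoods are all assumptions made for illustration, and the sketch enforces no sign constraint on the fitted coefficients.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of t * |z| (the L1 shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_l1_poisson(X, y, lam, step=0.01, n_iter=2000):
    """L1-penalized Poisson regression via proximal gradient descent.

    Minimizes (1/n) * sum_i [exp(eta_i) - y_i * eta_i] + lam * ||beta||_1,
    where eta_i = b0 + x_i . beta. The intercept b0 is not penalized.
    Step size and iteration count are illustrative, not tuned.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            eta = b0 + sum(b * x for b, x in zip(beta, xi))
            # Clamp eta to avoid overflow in this toy sketch.
            resid = math.exp(min(eta, 30.0)) - yi
            g0 += resid / n
            for j in range(p):
                g[j] += resid * xi[j] / n
        b0 -= step * g0
        beta = [soft_threshold(b - step * gj, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta


def learn_graph(data, lam):
    """Regress each node on all others; keep an edge only if both
    node-wise fits select it (AND rule, an assumption of this sketch)."""
    p = len(data[0])
    neighbors = []
    for s in range(p):
        others = [t for t in range(p) if t != s]
        y = [row[s] for row in data]
        X = [[row[t] for t in others] for row in data]
        _, beta = fit_l1_poisson(X, y, lam)
        neighbors.append({t for t, b in zip(others, beta) if b != 0.0})
    return {(s, t) for s in range(p) for t in neighbors[s]
            if s < t and s in neighbors[t]}
```

Here `data` is a list of rows of counts, one column per node, and `learn_graph(data, lam)` returns a set of index pairs. Larger values of `lam` give sparser graphs; in practice `lam` would need to be tuned, and nodes of other types (binary, continuous) would need their own conditional models.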
Joint work with Eunho Yang, Pradeep Ravikumar, Zhandong Liu, Yulia Baker, and Ying-Wooi Wan.
Series
This talk is part of the Statistics series.