Machine Learning Reveals the Genetic Code Controlling Splicing
- Speaker: Brendan Frey - Microsoft Research
- Date & Time: Tuesday 14 July 2009, 15:00 - 16:00
- Venue: Small public lecture room, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge
Abstract
Thirty years after the proposal of DNA, Roberts and Sharp discovered that DNA does not directly encode messenger RNA (mRNA), but that a process called splicing assembles each mRNA from carefully selected DNA subsequences. Because of this, a gene can encode many different mRNAs, and which mRNAs are generated can depend on tissue type, age and disease. There are 22,000 human genes, but over 1,000,000 different mRNAs are produced by splicing. One gene encodes 38,000 mRNAs that are involved in wiring together neurons. Another gene encodes two mRNAs that determine the organism's sexual preference. Roberts and Sharp received the Nobel Prize for their work in 1993, but the genetic information responsible for controlling splicing has mostly remained a mystery. In the past 3 years it has become possible to detect mRNAs with sufficient resolution that researchers can attempt, for the first time, to infer such a 'splicing code'. In this talk, I'll describe a machine learning technique that we used to infer a splicing code that is explanatory as well as predictive. Its interpretation is consistent with known mechanisms, but also suggests new ones. The code achieves 93% prediction accuracy and was verified using different genes, different species and different experimental assays. Mutation of the identified genetic information leads to corresponding changes in splicing. In addition to describing these results, I'll talk about how the objective was formulated as a machine learning problem, how the need for human interpretability shaped the approach and what was done to isolate causation from correlation.
Series
This talk is part of the Microsoft Research Cambridge, general interest public talks series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Centre for Health Leadership and Enterprise
- Chris Davis' list
- custom
- Featured lists
- Featured talks
- Guy Emerson's list
- Interested Talks
- Major Public Lectures in Cambridge
- Microsoft Research Cambridge, public talks
- ndk22's list
- Neurons, Fake News, DNA and your iPhone: The Mathematics of Information
- ob366-ai4er
- Optics for the Cloud
- personal list
- PMRFPS's
- rp587
- School of Technology
- Small public lecture room, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge
- Trust & Technology Initiative - interesting events
- yk449