Modeling Genetic Documents Written by DNA
- Speaker: Naruemon Pratanwanich (University of Cambridge)
- Date & Time: Monday 12 May 2014, 14:00 - 15:00
- Venue: LT2, Computer Laboratory, William Gates Building
Abstract
For decades, biologists have recorded the expression levels of tens of thousands of genes under many biological conditions of interest. However, such a huge list of genes demands considerable expertise to interpret. In this lecture, I will present how this information can be deciphered into a more readable and understandable format. In particular, each expression profile will be regarded as a genetic document written by the DNA activities occurring in a cell.
Starting from Latent Dirichlet Allocation (LDA), which was originally developed in text mining to model the relations among words in documents through abstract topics, I will give an overview of a method for inferring the model parameters and explain how to use the learned model on new, unseen documents. Topic representation offers a less complex but more efficient way to manage a huge collection of documents. Next, I will demonstrate that the underlying intuitions of this model can be transferred to biological data. Ending with some successful examples of LDA model extensions, I will show that it is straightforward to redesign a generative probabilistic model when new assumptions are given.
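As a rough illustration of the workflow described in the abstract (fit LDA, read off the topics, then infer topic proportions for an unseen document), here is a minimal sketch in Python. It is not the speaker's method: it assumes scikit-learn's `LatentDirichletAllocation`, which uses variational Bayes inference, and it runs on a hypothetical random count matrix in which each row stands for a sample ("document") and each column for a gene ("word"). The matrix sizes, the Poisson rate, and the choice of three topics are all made up for the example.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Hypothetical toy data: 20 samples ("documents") x 50 genes ("words").
# Expression levels are assumed to be discretized into non-negative counts.
n_samples, n_genes, n_topics = 20, 50, 3
counts = rng.poisson(lam=2.0, size=(n_samples, n_genes))

# Fit LDA; scikit-learn infers the parameters with variational Bayes.
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
doc_topic = lda.fit_transform(counts)  # per-sample topic proportions

# components_ stores unnormalized topic-gene weights;
# normalizing each row gives a distribution over genes for each topic.
topic_gene = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

# Use the learned model on a new, unseen sample.
new_sample = rng.poisson(lam=2.0, size=(1, n_genes))
new_doc_topic = lda.transform(new_sample)

print("topic proportions for the new sample:", new_doc_topic.round(3))
```

Each row of `doc_topic` and `new_doc_topic` summarizes a sample by a handful of topic proportions instead of one value per gene, which is the kind of compact representation the abstract refers to.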
Series
This talk is part of the Computer Laboratory Research Students' Lectures 2014 series.
Included in Lists
- Cambridge Infectious Diseases
- Computer Laboratory Research Students' Lectures 2014
- LT2, Computer Laboratory, William Gates Building