BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:ShrinkSeq: a flexible and powerful method for Bayesian analysis of
  RNAseq data - Mark van de Wiel (VU University Medical Center)
DTSTART:20120402T130000Z
DTEND:20120402T140000Z
UID:TALK34774@talks.cam.ac.uk
CONTACT:Florian Markowetz
DESCRIPTION:Next generation sequencing is quickly replacing microarrays as
  a technique to probe different molecular levels of the cell\, such as DNA
  or mRNA. The technology has the advantage of providing higher
  resolution\, while reducing biases\, in particular at the lower end of
  the spectrum. mRNA sequencing (RNAseq) data consist of counts of pieces
  of RNA called tags
 . This type of data imposes new challenges for statistical analysis. We pr
 esent a novel approach to model and analyze these data.\n\nMethodologies a
 nd software for differential expression analysis usually use some general
 ization of the Poisson or binomial distribution that accounts for overdisp
 ersion. A popular choice is the negative binomial (i.e. Poisson-Gamma) mod
 el. However\, there is no consensus on which model best fits RNAseq data
 \, and this may depend on the technology used. With RNAseq\, the number of
  features vastly exceeds the sample size. This implies that shrinkage of v
 ariance-related parameters may lead to more stable estimates and inference
 . Methods to do so are available\, but only for a single parameter and in 
 the context of restrictive study designs\, e.g. two-group comparisons or f
 ixed-effect designs.\n\nWe present a framework that allows for a) various 
 count models\, b) flexible designs\, c) random effects\, and d)
  multi-parameter shrinkage across tags by Empirical Bayes. Moreover\, it
  implements Bayesia
 n multiplicity correction\, thereby providing solid inference. In a data-b
 ased simulation\, we show that our method outperforms other methods (edgeR
 \, DESeq\, baySeq\, noiSeq). Moreover\, we illustrate our approach on two 
 data sets. The first is a CAGE data set containing 25 samples representing
  five regions of the human brain from seven individuals. The design is inc
 omplete and a batch effect is present. The second is a miRNA sequencing da
 ta set from seven pairs of tumors. The data motivate the use of the
  zero-infl
 ated negative binomial as a powerful alternative to the negative binomial\
 , because it leads to less bias of the overdispersion parameter and improv
 ed detection power for the low-count tags.\n\nThe framework is not restric
 ted to RNAseq data. It is currently being extended towards proteomics\, hi
 gh-throughput screening and integrative data\, in particular DNA copy numb
 er and mRNA/miRNA.\n
LOCATION:Cancer Research UK Cambridge Research Institute\, Lecture Theatre
END:VEVENT
END:VCALENDAR
