BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Waste Not\, Want Not: Why Rarefying Microbiome Data is not an opti
 mal normalization procedure - Holmes\, S (Stanford University)
DTSTART:20140328T114500Z
DTEND:20140328T123000Z
UID:TALK51678@talks.cam.ac.uk
CONTACT:Mustapha Amrani
DESCRIPTION:Co-author: Paul Joey McMurdie (Stanford University) \n\nThe in
 terpretation of metagenomic count data originating from the current genera
 tion of DNA sequencing platforms requires special attention. In particular
 \, the per-sample library sizes often vary by orders of magnitude from the
  same sequencing run\, and the counts are overdispersed relative to a simp
 le Poisson model. These challenges can be addressed using an appropriate m
 ixture model that simultaneously accounts for library size differences and
  biological variability. This approach is already well-characterized and i
 mplemented for RNA-Seq data in R packages such as edgeR and DESeq. We use 
 statistical theory\, extensive simulations\, and empirical data to show th
 at variance stabilizing normalization using a mixture model like the negat
 ive binomial is appropriate for microbiome count data. In simulations dete
 cting differential abundance\, normalization procedures based on a Gamma-P
 oisson mixture model provided systematic improvement in performance over c
 rude proportions or rarefied counts -- both of which led to a high rate of
  false positives. In simulations evaluating clustering accuracy\, we found
  that the rarefying procedure discarded samples that were nevertheless acc
 urately clustered by alternative methods\, and that the choice of minimum 
 library size threshold was critical in some settings\, but with an optimum
  that is unknown in practice. Techniques that use variance stabilizing tra
 nsformations by modeling microbiome count data with a mixture distribution
 \, such as those implemented in edgeR and DESeq\, substantially improved u
 pon techniques that attempt to normalize by rarefying or crude proportion
 s. Based on these results and well-established statistical theory\, we adv
 ocate that investigators avoid rarefying altogether. We have provided micr
 obiome-specific extensions to these tools in the R package phyloseq.\n\n
 Related Links:\n\nhttp://arxiv.org/abs/1310.0424 - arXiv Version of Paper.
  \n\nhttp://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.00
 61217 - Phyloseq Package Description and Philosophy \n
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
