Discovery and allele frequency estimation of somatic twilight zone insertions and deletions
- Speaker: Alexander Schönhuth (CWI Amsterdam)
- Date & Time: Monday 22 September 2014, 16:00 - 17:00
- Venue: CRUK CI Lecture Theatre
Abstract
Locating somatic mutations on the basis of next-generation sequencing (NGS) data of disease-control matched samples constitutes an essential step in cancer and other clinical research. The detection of these genetic variants remains a major challenge, not only due to the impurity and heterogeneity of the disease sample, but also because of the inherent uncertainty present in this type of data. The main sources of ‘noise’ in NGS data are commonly thought to be alignment and typing uncertainties, where the former refers to the fact that the origin of a read on the genome is unknown, and the latter reflects the often limited confidence one has in whether a read stems from an allele-affected chromosome or not. The case of calling somatic twilight zone (or mid-size) indels is considered exceptionally hard, due to the high typing uncertainties involved. We present a maximum likelihood approach that allows us to robustly estimate twilight zone indel allele frequencies while taking the individual read alignment and typing uncertainties into account. By means of likelihood factorization, we can estimate the allele frequencies in the disease and in the control sample simultaneously (while accounting for impurity as well). In addition, we define a likelihood-ratio-based significance test which allows one to test for the presence or absence of a somatic mutation. This statistical framework is not restricted to somatic twilight zone indel calling, but allows for other applications as well, such as robustly genotyping di- and polyploid cells or de novo twilight zone indel calling, all while accounting for alignment and typing uncertainties.
In summary, our results indicate that this is the first tool that can discover ‘somatic twilight zone insertions and deletions’ (indels of size 30-120 bp) at sufficient recall and precision. So far, hardly any such somatic indels have been discovered; as a consequence, their extent and their potential effects have hardly been explored.
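The idea of estimating an allele frequency from reads whose type is uncertain, and then testing it with a likelihood ratio, can be illustrated with a toy sketch. This is not the speaker's method: it assumes a single sample, ignores alignment uncertainty, impurity, and the joint disease/control estimation, and uses a plain two-component mixture fitted by EM. The function names and the `(p_alt, p_ref)` read representation are hypothetical, chosen only for illustration, and all per-read likelihoods are assumed to be strictly positive.

```python
import math


def log_likelihood(f, reads):
    """Log-likelihood of allele frequency f.

    reads: list of (p_alt, p_ref) pairs, the (assumed, strictly positive)
    likelihood of each read under the indel allele and the reference allele.
    """
    return sum(math.log(f * a + (1.0 - f) * r) for a, r in reads)


def estimate_allele_frequency(reads, iterations=200):
    """Maximum likelihood estimate of f via EM on a two-component mixture."""
    f = 0.5  # arbitrary starting point inside (0, 1)
    for _ in range(iterations):
        # E-step: posterior probability that each read stems from the indel allele.
        resp = [f * a / (f * a + (1.0 - f) * r) for a, r in reads]
        # M-step: the new frequency is the mean posterior.
        f = sum(resp) / len(resp)
    return f


def likelihood_ratio_statistic(reads, f_hat):
    """2 * (log L(f_hat) - log L(0)): evidence against 'no indel present'."""
    return 2.0 * (log_likelihood(f_hat, reads) - log_likelihood(0.0, reads))


# Toy data: 3 reads favouring the indel allele, 7 favouring the reference.
reads = [(0.9, 0.1)] * 3 + [(0.1, 0.9)] * 7
f_hat = estimate_allele_frequency(reads)
stat = likelihood_ratio_statistic(reads, f_hat)
print(f"estimated allele frequency: {f_hat:.4f}, LR statistic: {stat:.3f}")
```

Because the null value f = 0 lies on the boundary of the parameter space, turning the statistic into a p-value needs more care than a standard chi-square lookup; the sketch stops at the statistic itself.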
Series
This talk is part of the Seminars on Quantitative Biology @ CRUK Cambridge Institute series.