Very deep convolutional neural networks for speech recognition
- Speaker: Tom Sercu, IBM Watson, USA
- Date & Time: Friday 19 August 2016, 12:00 - 13:00
- Venue: Department of Engineering - LR5
Abstract
Convolutional neural networks (CNNs) are one of the main drivers of the recent deep learning explosion, starting with the “AlexNet” (2012) result on the ImageNet competition and continuing with subsequent models such as OverFeat (2013), VGG Net (2014), GoogLeNet (2014), and residual networks (2015). In the speech recognition domain, CNNs with 2 convolutional layers were introduced around 2012 and have not seen major updates since. We will present a number of recent architectural advances in CNNs for speech recognition. We introduce a very deep convolutional network architecture with up to 14 weight layers. There are multiple convolutional layers before each pooling layer, with small 3×3 kernels, inspired by the VGG ImageNet 2014 architecture. We will discuss the design choice of strided pooling and zero-padding along the time direction, which renders convolutional evaluation of sequences highly inefficient. This can be phrased in the computer vision terminology of classification vs. dense pixelwise prediction. We define the architectural constraints that make efficient evaluation of full utterances possible. This allows batch normalization to be adopted during full-utterance sequence training, resulting in faster training and improved performance. We show state-of-the-art results on the benchmark Switchboard 2000-hour dataset (Hub5 eval). We also adapted our architecture to the multilingual setting and obtained strong results on the Babel OP3 surprise language after multilingual training on 25 languages.
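The point about zero-padding along the time direction can be illustrated with a toy example. The sketch below is not the architecture from the talk: it assumes a hypothetical one-dimensional, single-channel network (`tiny_net`) with two length-3 convolutions and made-up weights and features. Without time padding, evaluating each context window separately gives the same outputs as one pass over the full utterance. With time padding and windows shorter than the receptive field, the two evaluations disagree, so the cheap full-utterance pass can no longer stand in for the window-by-window one.

```python
import numpy as np

def conv3_time(x, w, pad):
    """Length-3 cross-correlation along time on a 1-D signal.

    With pad=True one zero frame is added on each side, so the output
    has the same length as the input; otherwise it is 2 frames shorter.
    """
    if pad:
        x = np.pad(x, 1)  # zero-padding in time
    return np.array([x[t:t + 3] @ w for t in range(len(x) - 2)])

def tiny_net(x, weights, pad):
    """Hypothetical toy network: a stack of conv layers, each followed by tanh."""
    for w in weights:
        x = np.tanh(conv3_time(x, w, pad))
    return x

# Made-up values for illustration only (positive weights, positive features).
weights = [np.array([0.2, 0.3, 0.2]), np.array([0.3, 0.2, 0.3])]
utt = 0.5 + 0.4 * np.sin(np.arange(40))  # a 40-frame "utterance"

# Case 1: no time padding. Each output frame sees 2 * n_layers + 1 input frames.
ctx = 2 * len(weights) + 1
windowed_valid = np.array([tiny_net(utt[t:t + ctx], weights, pad=False)[0]
                           for t in range(len(utt) - ctx + 1)])
dense_valid = tiny_net(utt, weights, pad=False)  # one pass over the utterance
valid_match = np.allclose(windowed_valid, dense_valid)

# Case 2: time padding, with 3-frame windows (shorter than the receptive field).
# Inside a window the centre output sees zeros where the full-utterance pass
# sees real neighbouring frames.
win = 3
windowed_padded = np.array([tiny_net(utt[t:t + win], weights, pad=True)[win // 2]
                            for t in range(len(utt) - win + 1)])
dense_padded = tiny_net(utt, weights, pad=True)[win // 2: len(utt) - win // 2]
padded_match = np.allclose(windowed_padded, dense_padded)

print("no time padding, windowed == dense:", valid_match)
print("time padding,    windowed == dense:", padded_match)
```

In the unpadded case the dense pass reuses every intermediate activation across overlapping windows, which is where the efficiency of full-utterance evaluation comes from in this toy setting.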
Series
This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Department of Engineering - LR5
- Guy Emerson's list
- Information Engineering Division seminar list
- Interested Talks
- ndk22's list
- ob366-ai4er
- PhD related
- rp587
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.