Syllable based keyword search: transducing syllable lattices to word lattices
- Speaker: Jim Hieronymus, ICSI (US)
- Date & Time: Friday 21 November 2014, 13:30 - 14:30
- Venue: Department of Engineering - LT2
Abstract
This paper presents a weighted finite state transducer (WFST) based syllable decoding and transduction framework for keyword search (KWS). Context-dependent acoustic phone models are trained from word-level forced alignments. Syllable decoding is then performed, generating lattices with a syllable lexicon and a syllable language model (LM). To handle out-of-vocabulary (OOV) keywords, pronunciations are produced by a grapheme-to-syllable (G2S) system. Syllables not seen in the training set are approximated by the perceptually closest syllable in the recognized syllable set. A syllable-to-word lexical transducer containing both in-vocabulary (IV) and OOV keywords is then constructed and composed with a keyword-boosted LM transducer. The composed transducer is used to transduce syllable lattices to word lattices for the final KWS. An n-gram word-sequence LM with the keywords boosted provides the best performance. We show that our method can effectively perform KWS on both IV and OOV keywords, and that it yields an improvement of up to 0.03 in Actual Term-Weighted Value (ATWV) over searching for keywords directly in syllable lattices. Word Error Rates (WER) and KWS results are reported for five different languages, comparing whole-word, phonetic-confusion and syllable techniques. Combining the techniques provides further improvement.
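To make the transduction step concrete, the following is a minimal toy sketch, not the speaker's implementation. It replaces WFST composition with a plain dynamic program over a small hand-made syllable lattice, and it uses a unigram cost table in place of an n-gram LM. The function names (`transduce`, `boost`), the lattice, the lexicon and all costs are hypothetical values chosen for illustration only.

```python
from collections import defaultdict


def boost(lm_cost, keywords, amount):
    """Return a copy of the LM cost table with keyword costs lowered by `amount`."""
    return {w: c - amount if w in keywords else c for w, c in lm_cost.items()}


def transduce(arcs, lexicon, lm_cost, final_node):
    """Find the cheapest word sequence through a syllable lattice.

    arcs: list of (src, dst, syllable, cost) with src < dst (assumed
          topologically numbered, start node 0).
    lexicon: word -> non-empty tuple of syllables.
    lm_cost: word -> cost (lower is better), e.g. a negative log probability.
    Returns (total_cost, [words]) or None if no word path reaches final_node.
    """
    out = defaultdict(list)
    for src, dst, syl, cost in arcs:
        out[src].append((dst, syl, cost))

    def match(node, syllables):
        """Yield (end_node, cost) for each lattice path spelling `syllables`."""
        if not syllables:
            yield node, 0.0
            return
        for dst, syl, cost in out[node]:
            if syl == syllables[0]:
                for end, rest in match(dst, syllables[1:]):
                    yield end, cost + rest

    best = {0: (0.0, [])}  # node -> (cheapest cost so far, word sequence)
    for node in range(final_node):
        if node not in best:
            continue
        base_cost, base_words = best[node]
        for word, syllables in lexicon.items():
            for end, path_cost in match(node, syllables):
                total = base_cost + path_cost + lm_cost[word]
                if end not in best or total < best[end][0]:
                    best[end] = (total, base_words + [word])
    return best.get(final_node)


# Hypothetical toy data: two competing syllable hypotheses per position.
ARCS = [
    (0, 1, "key", 1.0), (0, 1, "kee", 0.8),
    (1, 2, "word", 1.0), (1, 2, "ward", 0.9),
    (2, 3, "search", 0.5),
]
LEXICON = {
    "keyword": ("key", "word"),
    "key": ("key",),
    "ward": ("ward",),
    "search": ("search",),
}
LM_COST = {"keyword": 3.0, "key": 1.0, "ward": 1.5, "search": 1.0}

plain = transduce(ARCS, LEXICON, LM_COST, final_node=3)
boosted = transduce(ARCS, LEXICON, boost(LM_COST, {"keyword"}, 1.0), final_node=3)
print(plain[1])
print(boosted[1])
```

In this toy setting, lowering the cost of the keyword is what lets the word path containing it win over a competing path of shorter words, which is the intuition behind boosting keywords in the LM before transducing to word lattices.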
Speaker
Jim Hieronymus is a senior scientist and principal investigator at the International Computer Science Institute (ICSI) in Berkeley, CA, USA. He is a collaborator with the Cambridge Speech Recognition Group in the Engineering Department. He has worked on putting a spoken dialogue system on the International Space Station for NASA, on the EU Trindi project on integrating prosody into a dialogue system, and at Bell Labs on spoken dialogue systems, speech recognition and spoken language identification. Before that, Jim was a professor at the Centre for Speech Technology Research and the Linguistics Department at Edinburgh University.
Sandwiches will be provided at 13:00, 30 minutes before the talk.
Series
This talk is part of the CUED Speech Group Seminars series.