Deep Structured Prediction for Handwriting Recognition
- Speaker: Juan Murillo Fuentes
- Date & Time: Thursday 02 November 2017, 13:30 - 15:00
- Venue: Engineering Department, CBL Seminar Room 4-38
Abstract
Work in collaboration with P. M. Olmos and J.C.A. Jaramillo
Structured prediction, or structured (output) learning, is an umbrella term for supervised machine learning techniques that predict structured objects rather than scalar discrete or real values. Application domains include bioinformatics, natural language processing, speech recognition, and computer vision. While convolutional networks are quite useful for this task, recurrent neural networks (RNNs) are preferred when the input contains long-term dependencies. However, vanishing and exploding gradients make simple RNNs hard to train, so long short-term memory (LSTM) networks are widely adopted instead. In this talk we focus on the handwritten text recognition problem. We review the LSTM as the state-of-the-art solution among deep learning architectures. Then, to cope with the problem of labelling the letters in an image, we explain connectionist temporal classification (CTC), a useful tool whose cost function can be incorporated into the network to allow for gradient computations. We discuss some results obtained in an attempt to put this theory to work with TensorFlow.
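As an illustration of the CTC idea mentioned in the abstract, here is a minimal pure-Python sketch. It is not the speaker's code and does not use TensorFlow. It assumes the standard CTC formulation: a frame-level path is mapped to a label sequence by merging repeated symbols and then removing blanks, and the probability of a label sequence is the sum over all paths that map to it. The function names, the choice of index 0 for the blank, and the toy probabilities are all assumptions made for this example.

```python
from itertools import groupby, product

BLANK = 0  # index reserved for the CTC blank symbol (an assumption of this sketch)


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level path to a label sequence: merge repeats, then drop blanks."""
    merged = [k for k, _ in groupby(path)]
    return [k for k in merged if k != blank]


def ctc_prob_bruteforce(probs, target, blank=BLANK):
    """Sum the probability of every path that collapses to `target` (tiny inputs only)."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in product(range(K), repeat=T):
        if ctc_collapse(path, blank) == list(target):
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total


def ctc_prob_forward(probs, target, blank=BLANK):
    """Same quantity, computed with the forward recursion over the blank-extended target."""
    ext = [blank]
    for k in target:
        ext += [k, blank]  # blank, l1, blank, l2, ..., blank
    S, T = len(ext), len(probs)
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s >= 1:
                a += alpha[s - 1]
            # Skipping over a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][ext[s]]
        alpha = new
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


# Toy per-frame distributions over {blank, label 1, label 2}; each row sums to 1.
toy_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.25, 0.25],
]

print(ctc_collapse([0, 1, 1, 0, 2, 2]))  # [1, 2]
print(ctc_prob_forward(toy_probs, [1, 2]))
```

The CTC cost is the negative logarithm of this probability. Because the forward recursion uses only sums and products of the network outputs, the cost is differentiable with respect to them, which is what allows it to be trained by gradient descent together with the rest of the network.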
Reading list:
No previous reading is really needed beyond general concepts of deep neural networks.
Useful references:
- I. Goodfellow, Y. Bengio, A. Courville, “Deep Learning”, MIT Press, 2016: chapters 6 to 9 for concepts on deep learning, and chapter 10 in particular for recurrent networks (but see below for LSTMs).
- M. Görner, “TensorFlow and Deep Learning without a PhD”, https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0, for a quick review of the main concepts and examples in deep learning.
- Z. C. Lipton, J. Berkowitz, C. Elkan, “A Critical Review of Recurrent Neural Networks for Sequence Learning”, 2015, https://arxiv.org/abs/1506.00019
- K. Cho, “Natural Language Understanding with Distributed Representation”, 2016, for a detailed explanation of LSTMs.
- C. Olah, “Understanding LSTM Networks”, 2015, http://colah.github.io/posts/2015-08-Understanding-LSTMs/, for a good explanation of LSTMs.
- A. Graves, “Supervised sequence labelling with recurrent neural networks”, Ph.D. thesis, 2008, for an explanation of CTC.
Series: This talk is part of the Machine Learning Reading Group @ CUED series.