Principles of AI-driven Neuroscience and Translational Biomedicine

If you have a question about this talk, please contact Pietro Lio.

Title: Efficiency and flexibility in an LSTM model of human spoken word recognition

Abstract: Recent advances in artificial neural networks have enabled the design of automatic speech recognition systems that identify spoken words with an accuracy approaching that of human listeners. By analysing the functional characteristics and internal representations of such systems, and comparing them to those of human listeners, we can gain novel insights into classic psycholinguistic findings and derive testable predictions for neuroimaging experiments exploring the neural computations that support human speech perception.

Here we build on a recently published end-to-end model of human speech recognition (‘EARSHOT’). This is a recurrent (LSTM) neural network trained to map from acoustic representations of spoken words to representations of word meaning (semantics). It exhibits a human-like time course of word identification, with parallel activation (due to phonological overlap) of onset-aligned ‘cohort’ neighbours (e.g. chain/change) and reduced or delayed competition effects between rhyme neighbours (e.g. chain/gain).
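
The distinction between cohort and rhyme neighbours can be illustrated with a minimal sketch. It is not part of EARSHOT or of the talk. The toy lexicon, the ARPAbet-style transcriptions, and the helper names are hypothetical. The definitions used (cohort: same first two phonemes; rhyme: same length, differing only in the first phoneme) are common operationalisations and are assumed here, not taken from the abstract.

```python
from typing import Dict, List, Tuple

Phones = Tuple[str, ...]

# Hypothetical toy lexicon with hand-written ARPAbet-style transcriptions.
LEXICON: Dict[str, Phones] = {
    "chain": ("CH", "EY", "N"),
    "change": ("CH", "EY", "N", "JH"),
    "gain": ("G", "EY", "N"),
    "dog": ("D", "AO", "G"),
}


def is_cohort(a: Phones, b: Phones, onset_len: int = 2) -> bool:
    """Assumed definition: distinct words sharing their first `onset_len` phonemes."""
    if a == b or len(a) < onset_len or len(b) < onset_len:
        return False
    return a[:onset_len] == b[:onset_len]


def is_rhyme(a: Phones, b: Phones) -> bool:
    """Assumed definition: same length, different first phoneme, identical remainder."""
    return len(a) == len(b) and len(a) > 1 and a[0] != b[0] and a[1:] == b[1:]


def neighbours(target: str, lexicon: Dict[str, Phones]) -> Dict[str, List[str]]:
    """Group the other words in the lexicon into cohort and rhyme neighbours of `target`."""
    target_phones = lexicon[target]
    result: Dict[str, List[str]] = {"cohort": [], "rhyme": []}
    for word, phones in lexicon.items():
        if word == target:
            continue
        if is_cohort(target_phones, phones):
            result["cohort"].append(word)
        elif is_rhyme(target_phones, phones):
            result["rhyme"].append(word)
    return result


if __name__ == "__main__":
    print(neighbours("chain", LEXICON))
```

Under these assumed definitions, "change" overlaps with "chain" from word onset, so it can compete as soon as the word begins, whereas "gain" mismatches at the first phoneme and overlaps only later, which is consistent with the reduced or delayed competition described above.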

We systematically characterised EARSHOT’s behaviour in recognising speech across different talkers, speaking rates, and levels of spectral detail. In addition, we analysed the model’s hidden-state dynamics to provide a mechanistic explanation for several of the observed behavioural patterns.

This talk is part of the Data Science and AI in Medicine series.

