Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
- Speaker: Erfan Loweimi, Cambridge University Engineering Department
- Date & Time: Monday 18 March 2024, 12:00 - 13:00
- Venue: Zoom only: https://cam-ac-uk.zoom.us/j/86177109545?pwd=TmE1YlgzNWJKdGJQa1NQdk1kNS9zQT09
Abstract
Effective speaker retrieval is an important problem with wide-ranging real-world applications, given the vastness of available media archives. In this talk, we investigate the speaker retrieval systems developed by CUED in the context of the EPSRC-funded MVSE (Multimodal Video Search by Example) project. While we focus on the BBC Rewind corpus (1948-1979), our framework addresses the broader issue of speaker retrieval on extensive and possibly aged archives.
We explore various challenges encountered in developing a speaker retrieval system in the wild, addressing two primary issues: the dataset’s unsuitability for direct training and performance evaluation due to noisy and unreliable metadata, and the unconstrained acoustic conditions encountered in the archive, ranging from quiet studios to adverse noisy real-world environments.
We examine various aspects of system development, including the challenges, potential solutions and how they function, and report systematic experiments conducted both in clean setups and under various distortions to evaluate performance. Additionally, we touch on the utility of multimodal audio-visual speaker retrieval and analyse the synergy and consistency between these two modalities.
Series
This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Information Engineering Division seminar list
- PhD related
- Zoom only: https://cam-ac-uk.zoom.us/j/86177109545?pwd=TmE1YlgzNWJKdGJQa1NQdk1kNS9zQT09