Generative Speech Separation based on Pitch Information
- Speaker: Dr Xiang Li, Cambridge University Engineering Department
- Date & Time: Monday 11 October 2021, 12:00 - 13:00
- Venue: Zoom: https://us06web.zoom.us/j/87426783837?pwd=akx4ZWZMYVZML2ZoOWRlYzdRaTd6dz09
Abstract
Abstract: Monaural speech separation aims to separate concurrent speakers from a single-microphone mixture recording. Inspired by auditory scene analysis mechanisms, this talk presents a generative speech separation framework based on pitch information. The main advantage of this framework is that the permutation problem and the unknown-speaker-number problem found in general models can both be solved by using pitch contours to indicate the target speaker to be separated. In addition, a generative approach is applied instead of the traditional time-frequency mask-based approach, to improve the perceptual quality of the separated speech. Specifically, the proposed framework is divided into two phases: pitch extraction and speech separation. The first phase aims to accurately extract pitch contour candidates for each speaker from the mixture, using a two-stage approach. Any pitch contour can then be selected as the condition for the second phase, in which a conditional generative adversarial network (CGAN) separates the speaker corresponding to the given pitch condition. The proposed framework is evaluated in terms of both pitch extraction and speech separation.
Bio: Xiang Li is a Research Associate in the Speech Group of the Machine Intelligence Laboratory, Cambridge University Engineering Department, working with Prof. Mark Gales. She recently received her PhD from Peking University, where she was supervised by Prof. Xihong Wu. This talk is about her PhD thesis. Her research interests include speech enhancement and separation, perception, and natural language processing.
Series: This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Information Engineering Division seminar list
- PhD related