Itakura-Saito nonnegative factorizations of the power spectrogram for music signal decomposition
- 👤 Speaker: Dr Cedric Fevotte, CNRS - TELECOM ParisTech
- 📅 Date & Time: Thursday 18 March 2010, 14:15 - 15:15
- 📍 Venue: LR5, Engineering, Department of
Abstract
Nonnegative matrix factorization (NMF) is a popular linear regression technique in the fields of machine learning and signal/image processing. Much research on this topic has been driven by applications in audio. For example, NMF has been applied successfully to automatic music transcription and audio source separation, where the data is usually taken to be the magnitude spectrogram of the sound signal, and the Euclidean distance or the Kullback-Leibler divergence is used as the measure of fit between the original spectrogram and its approximate factorization.
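For readers who want this setting in symbols, the following is a sketch in notation that is assumed here rather than taken from the announcement: V is the F × N nonnegative spectrogram, W an F × K dictionary of spectral patterns, H the K × N activation matrix, and d a scalar measure of fit.

```latex
V \approx WH, \qquad
\min_{W \ge 0,\; H \ge 0} \; D(V \mid WH)
  = \sum_{f=1}^{F} \sum_{n=1}^{N} d\big(v_{fn} \mid [WH]_{fn}\big)

d_{\mathrm{EUC}}(x \mid y) = \tfrac{1}{2}\,(x - y)^2, \qquad
d_{\mathrm{KL}}(x \mid y) = x \log\frac{x}{y} - x + y
```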
After a brief overview of NMF, this presentation will show evidence for the relevance of factorizing the power spectrogram with the Itakura-Saito (IS) divergence. IS-NMF is shown to be connected to maximum likelihood inference of variance parameters in a well-defined statistical model of superimposed Gaussian components, and this model is in turn shown to be well suited to audio. Furthermore, the statistical setting opens the door to Bayesian approaches and to a variety of computational inference techniques. We discuss in particular model order selection strategies and Markov regularization of the activation matrix, which accounts for time persistence in audio.
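As an illustration of factorizing a power spectrogram under the IS divergence, d_IS(x | y) = x/y - log(x/y) - 1, here is a minimal NumPy sketch using the widely used multiplicative updates. The function names `is_divergence` and `is_nmf`, their parameters, the `eps` floor, and the random initialization are assumptions made for this example, not the speaker's implementation. The sketch covers only the basic factorization and leaves out the Bayesian extensions, model order selection, and Markov regularization mentioned above.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence, summed over all entries (V, V_hat > 0)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, K, n_iter=200, eps=1e-12, seed=0):
    """Approximate V (F x N, strictly positive) by W @ H under the IS divergence.

    Uses heuristic multiplicative updates; W is F x K and H is K x N.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Random nonnegative initialization (an assumption of this sketch).
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Update dictionary with the activations held fixed.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
    return W, H


if __name__ == "__main__":
    # Synthetic stand-in for a power spectrogram; real data would be |STFT|**2.
    rng = np.random.default_rng(1)
    V = (rng.random((64, 4)) + 0.1) @ (rng.random((4, 100)) + 0.1)
    W, H = is_nmf(V, K=4)
    print("IS divergence after fitting:", is_divergence(V, W @ H))
```

The IS divergence is scale-invariant, d_IS(λx | λy) = d_IS(x | y), so low-power and high-power time-frequency bins are weighted alike in the fit; this property is not shared by the Euclidean distance or the Kullback-Leibler divergence.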
This presentation will also address extensions of NMF to the multichannel case, for both instantaneous and convolutive recordings, possibly underdetermined, leading to nonnegative tensor factorizations under novel structures. In particular, we will present audio source separation results on real-world stereo musical excerpts.
References:
C. Févotte, N. Bertin and J.-L. Durrieu. “Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis,” Neural Computation, vol. 21, no. 3, Mar. 2009. http://www.tsi.enst.fr/fevotte/Journals/neco09_is-nmf.pdf
A. Ozerov and C. Févotte. “Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation,” IEEE Trans. Audio, Speech and Language Processing, 2010 (to appear). http://www.tsi.enst.fr/fevotte/TechRep/techrep09_multinmf.pdf
Series
This talk is part of the Probabilistic Systems, Information, and Inference Group Seminars series.