Tackling Label Corruptions: Univariate Polynomial Regression and Generalized Linear Models

If you have a question about this talk, please contact Fernando Ruiz Mazo.

Label corruptions pose a significant challenge in various machine learning tasks, affecting the accuracy and reliability of models. In this talk, we will address two distinct problems involving label corruptions, and present approaches to handle them effectively.

The first problem we consider is robust univariate polynomial regression. Here the goal is to recover a polynomial that is pointwise close to a target polynomial from samples in which each label is clean (it satisfies the model) with probability $\alpha$ and corrupted (completely arbitrary) with probability $1-\alpha$. We propose an approach that tolerates any constant corruption fraction below 1/2, which is the information-theoretic limit for unique recovery in this problem.
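To make the sampling model concrete, here is a minimal Python sketch of how such a corrupted data set could be generated. It is not the approach presented in the talk. The function name `sample_corrupted`, the uniform distribution of the inputs on [-1, 1], the noiseless clean labels, and the uniform stand-in for "arbitrary" corrupted labels are all assumptions made for illustration.

```python
import random

def sample_corrupted(poly_coeffs, n, alpha, seed=0):
    """Draw n samples (x, y, is_clean) from the corruption model.

    poly_coeffs[k] is the coefficient of x**k in the target polynomial.
    Each label is clean with probability alpha, otherwise it is replaced.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Assumption: inputs are uniform on [-1, 1].
        x = rng.uniform(-1.0, 1.0)
        target = sum(c * x**k for k, c in enumerate(poly_coeffs))
        is_clean = rng.random() < alpha
        if is_clean:
            # Assumption: clean labels follow the polynomial exactly.
            y = target
        else:
            # Stand-in for an arbitrary label; a real adversary may pick anything.
            y = rng.uniform(-100.0, 100.0)
        samples.append((x, y, is_clean))
    return samples

# Target p(x) = 1 - 2x^2 with 60% clean labels, i.e. a corruption fraction of 0.4.
data = sample_corrupted([1.0, 0.0, -2.0], n=10, alpha=0.6)
```

With `alpha=0.6` the corruption fraction is 0.4, which lies below the 1/2 threshold mentioned above. A learner would only see the `(x, y)` pairs; the `is_clean` flag is returned here purely for inspection.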

In the second problem, we examine the challenge of learning a linear function composed with a generalized linear model (GLM). We focus on the oblivious noise setting, where up to any constant fraction of the labels are corrupted by arbitrary additive noise that is independent of the inputs. We show that in this setting it is always possible to recover a polynomial-sized list of candidates, one of which is arbitrarily close to the true answer. Furthermore, under mild distributional assumptions, we show that this recovery is unique.
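The oblivious noise setting can be illustrated with a short Python sketch of the data-generating process. It is a sketch under stated assumptions, not the method from the talk: the name `sample_oblivious`, the sigmoid link, the Gaussian inputs, and the particular noise distribution are hypothetical choices. The one property it is meant to show is that the noise is drawn without looking at the inputs.

```python
import math
import random

def sigmoid(t):
    """Example link function (an assumption; any GLM link could be used)."""
    return 1.0 / (1.0 + math.exp(-t))

def sample_oblivious(w, n, alpha, link=sigmoid, seed=0):
    """Draw n samples with labels y = link(<w, x>) + xi.

    The noise xi is zero with probability alpha and arbitrary otherwise.
    It is generated before the inputs, so it cannot depend on them.
    """
    rng = random.Random(seed)
    # Oblivious additive noise: fixed without reference to any x.
    noise = [0.0 if rng.random() < alpha else rng.gauss(5.0, 10.0)
             for _ in range(n)]
    # Assumption: inputs are standard Gaussian vectors.
    xs = [[rng.gauss(0.0, 1.0) for _ in w] for _ in range(n)]
    ys = [link(sum(wi * xi for wi, xi in zip(w, x))) + e
          for x, e in zip(xs, noise)]
    return xs, ys, noise

# Only 30% of the labels are clean here, so most labels are corrupted.
xs, ys, noise = sample_oblivious([0.5, -1.0], n=10, alpha=0.3)
```

Unlike the first problem, the clean fraction `alpha` may be well below 1/2 here, which is why a list of candidates, rather than a single answer, is the natural output in general.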

This talk is co-hosted by the Computer Laboratory AI Research Group.

This talk is part of the Machine learning theory series.

