BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Statistical theory for deep neural networks with ReLU activation f
 unction - Johannes Schmidt-Hieber (Universiteit Leiden)
DTSTART:20180321T113000Z
DTEND:20180321T123000Z
UID:TALK102736@talks.cam.ac.uk
CONTACT:INI IT
DESCRIPTION:The universal approximation theorem states that neural network
 s are capable of approximating any continuous function up to a small error
  that depends on the size of the network. The expressive power of a networ
 k does not\, however\, guarantee that deep networks perform well on data. 
 For that\, control of the statistical estimation risk is needed. In the ta
 lk\, we derive statistical theory for fitting deep neural networks to data
  generated from the multivariate nonparametric regression model. It is sho
 wn that estimators based on sparsely connected deep neural networks with R
 eLU activation function and properly chosen network architecture achieve t
 he minimax rates of convergence (up to logarithmic factors) under a genera
 l composition assumption on the regression function. The framework include
 s many well-studied structural constraints such as (generalized) additive 
 models. While there is a lot of flexibility in the network architecture\, 
 the tuning parameter is the sparsity of the network. Specifically\, we
 consider large networks with the number of potential parameters being
 much bigger than the sample size. Interestingly\, the depth (number of
 layers) of the neural network architectures plays an important role and
 our theory suggests that scaling the network depth with the logarithm
 of the sample size is natural.\n\nRelated Links:\n
 https://arxiv.org/abs/1708.06633 - Article
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
