Achieving Verified Robustness to Adversarial NLP Inputs

Johannes Welbl (UCL)
Friday 12 June 2020, 12:30-13:30
https://meet.google.com/tgv-vods-pdk

If you have a question about this talk, please contact Guy Aglionby.

Note later start

Neural networks are part of many contemporary NLP systems, yet their empirical success comes at the price of vulnerability to adversarial attacks, e.g. synonym replacement or adversarial text deletion. While much previous work uses adversarial training or data augmentation to partially mitigate such brittleness, these methods are unlikely to actually find worst-case inputs due to the complexity of the search space arising from discrete text perturbations. In this talk, I will introduce an approach that tackles the problem of adversarial robustness from the opposite direction: we formally verify a system’s robustness against pre-defined classes of adversarial attacks. To this end, we adopt Interval Bound Propagation to bound the effect that input changes can have on model predictions, thus establishing bounds on worst-case adversarial attacks. We furthermore modify the conventional log-likelihood training objective to train models that can be verified efficiently in constant time, whereas verification would otherwise come with exponential search complexity. The resulting models have much improved verified accuracy and come with an efficiently computable formal guarantee on worst-case adversarial attacks.
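To make the idea of Interval Bound Propagation concrete, here is a minimal sketch in plain Python. It is not the speaker's implementation. It assumes a toy two-layer ReLU network with made-up weights, and it models the perturbation set as an axis-aligned box around a hypothetical input embedding, which is a simplification of the discrete text perturbations the abstract refers to. The function names (`affine_bounds`, `relu_bounds`, `is_verified`) are illustrative only.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine_bounds(lower: Vector, upper: Vector,
                  weights: Matrix, bias: Vector) -> Tuple[Vector, Vector]:
    """Propagate the box [lower, upper] through y = W x + b."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A non-negative weight maps the lower input to the lower output;
            # a negative weight swaps the roles of the two endpoints.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each endpoint separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def is_verified(logit_lower: Vector, logit_upper: Vector, true_class: int) -> bool:
    """Sufficient (not necessary) check: the true logit's lower bound
    exceeds the upper bound of every other logit."""
    return all(
        logit_lower[true_class] > hi
        for k, hi in enumerate(logit_upper)
        if k != true_class
    )


# Toy 2-2-2 network; all weights are made up for illustration.
W1 = [[1.0, -1.0], [0.5, 2.0]]
b1 = [0.0, -1.0]
W2 = [[1.0, 0.0], [-1.0, 1.0]]
b2 = [0.5, 0.0]

# Assumed perturbation set: +/- 0.1 per dimension around the input (1.0, 0.0).
x_lower = [0.9, -0.1]
x_upper = [1.1, 0.1]

h_lower, h_upper = relu_bounds(*affine_bounds(x_lower, x_upper, W1, b1))
z_lower, z_upper = affine_bounds(h_lower, h_upper, W2, b2)
verified = is_verified(z_lower, z_upper, true_class=0)
print(z_lower, z_upper, verified)
```

One forward pass over interval endpoints bounds the logits for every input in the box at once, so no search over individual perturbations is needed. The check is conservative: a `False` result means the bounds were too loose to certify the prediction, not that an adversarial input necessarily exists.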

This talk is part of the NLIP Seminar Series.
