NAACL practice talks

Simon Baker (LTL) & Marek Rei (NLIP), University of Cambridge
Friday 25 May 2018, 12:00-13:00
FW26, Computer Laboratory.

If you have a question about this talk, please contact Andrew Caines.

Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens

Marek Rei & Anders Søgaard

Can attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems, using networks trained only on sentence-level labels? We construct a neural network architecture based on soft attention, train it as a binary sentence classifier and evaluate against token-level annotation on four different datasets. Inferring token labels from a network provides a method for quantitatively evaluating what the model is learning, along with generating useful feedback in assistance systems. Our results indicate that attention-based methods are able to predict token-level labels more accurately, compared to gradient-based methods, sometimes even rivaling the supervised oracle network.

Variable Typing: Assigning Meaning to Variables in Mathematical Text

Yiannos A. Stathopoulos, Simon Baker, Marek Rei & Simone Teufel

Information about the meaning of mathematical variables in text is useful in NLP/IR tasks such as symbol disambiguation, topic modeling and mathematical information retrieval (MIR). We introduce variable typing, the task of assigning one mathematical type (multi-word technical terms referring to mathematical concepts) to each variable in a sentence of mathematical text. As part of this work, we also introduce a new annotated data set composed of 33,524 data points extracted from scientific documents published on arXiv. Our intrinsic evaluation demonstrates that our data set is sufficient to successfully train and evaluate current classifiers from three different model architectures. The best performing model is evaluated on an extrinsic task: MIR, by producing a typed formula index. Our results show that the best performing MIR models make use of our typed index, compared to a formula index only containing raw symbols, thereby demonstrating the usefulness of variable typing.

This talk is part of the NLIP Seminar Series.

