BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Multilingual Autoregressive Entity Linking - Nicola De Cao (Univer
 sity of Amsterdam\, Huggingface)
DTSTART:20220318T120000Z
DTEND:20220318T130000Z
UID:TALK171803@talks.cam.ac.uk
CONTACT:Michael Schlichtkrull
DESCRIPTION:Entities are at the center of how we represent and aggregate k
 nowledge. For instance\, encyclopedias such as Wikipedia are structured by
  entities (e.g.\, one per Wikipedia article). The ability to retrieve such
  entities given a query is fundamental for knowledge-intensive tasks such 
 as entity linking and open-domain question answering. Current approaches c
 an be understood as classifiers among atomic labels\, one for each entity.
  Their weight vectors are dense entity representations produced by encodin
 g entity meta information such as their descriptions. This approach has se
 veral shortcomings: (i) context and entity affinity is mainly captured thr
 ough a vector dot product\, potentially missing fine-grained interactions\
 ; (ii) a large memory footprint is needed to store dense representations w
 hen considering large entity sets\; (iii) an appropriately hard set of neg
 ative data has to be subsampled at training time. In this work\, we propos
 e mGENRE\, the first system that retrieves entities by generating their un
 ique names\, left to right\, token-by-token in an autoregressive fashion. 
 This mitigates the aforementioned technical issues since: (i) the autoregr
 essive formulation directly captures relations between context and entity 
 name\, effectively cross encoding both\; (ii) the memory footprint is grea
 tly reduced because the parameters of our encoder-decoder architecture sca
 le with vocabulary size\, not entity count\; (iii) the softmax loss is com
 puted without subsampling negative data. We experiment with more than 100
  languages on more than 25 datasets on entity disambiguation
 \, end-to-end entity linking and document retrieval tasks\, achieving new 
 state-of-the-art or very competitive results while using a tiny fraction o
 f the memory footprint of competing systems.\n\nTopic: NLIP Seminar\nTime:
  Mar 18\, 2022 12:00 PM London\n\nJoin Zoom Meeting\nhttps://cl-cam-ac-uk.
 zoom.us/j/93197062657?pwd=eEVuT0h4MGRJOEhCaEF4MDJZQm9zdz09\n\nMeeting ID: 
 931 9706 2657\nPasscode: 501991\n
LOCATION:Virtual (Zoom)
END:VEVENT
END:VCALENDAR
