Interactive and decomposed approaches for NLP: the case of multi-text summarization
- Speaker: Ido Dagan (Bar-Ilan University)
- Date & Time: Friday 29 April 2022, 12:00 - 13:00
- Venue: Virtual (Zoom)
Abstract
Current approaches for NLP tasks often conform to two design principles. First, they address "static" tasks, where a single input instance is addressed at a time, independently of other inputs. Second, outputs are computed via an end-to-end model, trained directly over input-output pairs for the task. In this talk, I will propose two directions in which NLP research may be systematically extended beyond the static end-to-end approach and demonstrate them for the use case of multi-text summarization. In the first part of the talk, I suggest that in many realistic use cases multi-text (or long-text) summarization should support an interactive setting, where users interactively direct summary generation to best fit their information exploration needs. To promote principled research in this direction, we propose a systematic evaluation framework for interactive summarization. This framework extends summarization evaluation standards to consider the information accumulated along a user session, and includes an effective procedure for collecting user sessions. We then present a deep reinforcement learning model for interactive summarization, showing (using our evaluation framework) that it significantly improves information exposure over prior baselines while preserving positive user experience. In the second part of the talk, I suggest that summarization modeling may be beneficially decomposed into inherent subtasks, each addressed by a targeted model, rather than employing a single end-to-end model. Such decomposition is enabled through a clever generation of targeted training datasets for specific subtasks, all derived from the original "end-to-end" training data. As an additional contribution related to this context, I will describe our Cross-Document Language Model (CDLM), which is pre-trained specifically to model cross-text relationships, supporting diverse cross-document tasks.
Bio:
Ido Dagan is a Professor at the Department of Computer Science at Bar-Ilan University, Israel, the founder of the Natural Language Processing (NLP) Lab at Bar-Ilan, the founding Director of the nationally funded Bar-Ilan University Data Science Institute, and a Fellow of the Association for Computational Linguistics (ACL). His interests are in applied semantic processing, focusing on textual inference, natural open semantic representations, consolidation and summarization of multi-text information, and interactive text summarization and exploration. Dagan and colleagues initiated and promoted textual entailment recognition (RTE, later aka NLI) as a generic empirical task. He was the President of the ACL in 2010 and served on its Executive Committee during 2008-2011. In that capacity, he led the establishment of the journal Transactions of the Association for Computational Linguistics, which became one of two premier journals in NLP. Dagan received his B.A. summa cum laude and his Ph.D. (1992) in Computer Science from the Technion. He was a research fellow at the IBM Haifa Scientific Center (1991) and a Member of Technical Staff at AT&T Bell Laboratories (1992-1994). During 1998-2003 he was co-founder and CTO of FocusEngine and VP of Technology of LingoMotors, and has been regularly consulting in the industry. His academic research has involved extensive industrial collaboration, including funds from IBM, Google, Thomson-Reuters, Bloomberg, Intel and Facebook, as well as collaboration with local companies under funded projects of the Israel Innovation Authority.
Topic: NLIP Seminar
Time: Apr 29, 2022 12:00 PM London
Join Zoom Meeting https://cl-cam-ac-uk.zoom.us/j/96419914999?pwd=RHN4TE9KMmdhY3loaE55bHRNTVFodz09
Meeting ID: 964 1991 4999
Passcode: 485878
Series
This talk is part of the NLIP Seminar Series.