BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Narrative Summarization From Multiple Views - Pinelopi (Nelly) Pap
 alampidi\, DeepMind
DTSTART:20230202T110000Z
DTEND:20230202T120000Z
UID:TALK196804@talks.cam.ac.uk
CONTACT:Panagiotis Fytas
DESCRIPTION:Although summarizing movies and TV shows comes naturally to hu
 mans\, it is very challenging for machines. They have to combine different
 input sources (i.e.\, video\, audio\, subtitles)\, process long videos of
 1-2 hours along with their transcripts\, and learn from a handful of exam
 ples\, since collecting and processing such videos is hard. Given the cha
 llenge
 s of multimodal summarization\, most prior work does not consider all face
 ts of the computational problem at once but instead focuses on either proc
 essing multiple but short input sources or long text-only narratives.\n\nI
 n contrast\, we aim at summarizing full-length movies and TV episodes whil
 e considering all input sources for creating video trailers and textual su
 mmaries. For trailer creation\, we propose an algorithm for selecting trai
 ler moments in movies based on interpretable criteria such as the narrativ
 e importance and sentiment intensity of events. We further demonstrate how
  we can convert our algorithm into an interactive tool for trailer creatio
 n with a human in the loop. Next\, for producing textual summaries from fu
 ll-length TV episodes\, we move to a video-to-text setting and hypothesize
  that multimodal information from the full-length video and audio can dire
 ctly facilitate abstractive dialogue summarization. We propose a parameter
 -efficient way for incorporating such information into a pre-trained textu
 al summarizer and demonstrate improvements in the generated summaries.
LOCATION:GR04\, English Faculty Building\, 9 West Road\, Sidgwick Site
END:VEVENT
END:VCALENDAR
