BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Efficient Few-Shot Continual Learning in Vision-Language Models
  - Aristeidis Panos
DTSTART:20250521T150000Z
DTEND:20250521T153000Z
UID:TALK231025@talks.cam.ac.uk
CONTACT:Cat Spencer
DESCRIPTION:Vision-language models (VLMs) excel in tasks such as visual
  question answering and image captioning. However\, VLMs are often li
 mited by their use of pretrained image encoders\, like CLIP\, leading 
 to image understanding errors that hinder overall performance. Moreov
 er\, real-world applications often require the model to be continuall
 y adapted as new\, and often limited\, data arrive. To address this\,
  we propose LoRSU (Low-Rank Adaptation with Structured Updates)\, a r
 obust and computationally efficient method for selectively updating i
 mage encoders within VLMs. LoRSU introduces structured\, localized pa
 rameter updates\, effectively correcting performance on previously er
 ror-prone data while preserving the model's general robustness. Our a
 pproach leverages theoretical insights to identify and update only th
 e most critical parameters\, achieving significant resource efficienc
 y. Specifically\, we demonstrate that LoRSU reduces computational ove
 rhead by over 25x compared to full VLM updates without sacrificing pe
 rformance. Experimental results on VQA tasks in the few-shot continua
 l learning setting validate LoRSU's scalability\, efficiency\, and ef
 fectiveness\, making it a compelling solution for image encoder adapt
 ation in resource-constrained environments.
LOCATION:Cambridge University Engineering Department\, CBL Seminar roo
 m BE4-38. For directions see http://learning.eng.cam.ac.uk/Public/Dir
 ections
END:VEVENT
END:VCALENDAR
