Taming foundation models for visual concept learning and 4D modeling
- Speaker: Kai Han, University of Hong Kong
- Date & Time: Thursday 11 September 2025, 11:00 - 12:00
- Venue: Cambridge University Engineering Department, JDB Teaching Room
Abstract
In this talk, I will present our recent work on leveraging foundation models for open-world visual concept learning and 4D modeling. First, I will discuss how we repurpose vision foundation models for continual category discovery by learning a flexible Gaussian mixture prompt pool. Next, I will introduce our approach to automatically extracting visual concepts, at both the object and intrinsic levels, using Stable Diffusion models. Finally, I will share our work on high-quality 4D generation by effectively harnessing video diffusion models, enabling temporally and spatially consistent content creation with 4D Gaussian splatting.
Series
This talk is part of the Computer Vision Seminars series.