Taming foundation models for visual concept learning and 4D modeling

If you have a question about this talk, please contact Elliott Wu.

In this talk, I will present our recent work on leveraging foundation models for open-world visual concept learning and 4D modeling. First, I will discuss how we repurpose vision foundation models for continual category discovery by learning a flexible Gaussian mixture prompt pool. Next, I will introduce our approach to automatically extracting visual concepts, at both the object and intrinsic levels, using Stable Diffusion models. Finally, I will share our work on high-quality 4D generation, which harnesses video diffusion models to enable temporally and spatially consistent content creation with 4D Gaussian splatting.

This talk is part of the Computer Vision Seminars series.

© 2006-2025 Talks.cam, University of Cambridge.