Semantic Image Segmentation and Web-Supervised Visual Learning
- 👤 Speaker: Florian Schroff, University of Oxford
- 📅 Date & Time: Monday 08 June 2009, 11:30 - 12:00
- 📍 Venue: Small Lecture Room, Microsoft Research, Roger Needham Building, 7 J J Thomson Avenue, Cambridge CB3 0FB
Abstract: In object recognition, the goal is to recognise objects of certain categories, usually known and trained in advance, despite intra-class appearance variations and small inter-class differences. The appearance of objects is influenced by lighting, scale, different poses, viewpoints, articulation of objects, clutter and occlusion. Two different aspects of object recognition are investigated in this thesis. The first part develops models for semantic object segmentation of natural images and relies on groundtruth labelling for training. The second part uses the implicit supervision that is available on the Internet to learn visual object-class models automatically. It can then provide a groundtruth labelling for object detection or segmentation algorithms.
The goal in the first part is to label connected regions in an image as belonging to specific object classes, such as grass or cow. We introduce a compact model for the bag-of-visual-words approach, in which each class is modelled by a single histogram of visual words. This contrasts with common nearest-neighbour approaches, which model each class by many histograms. After introducing segmentation algorithms based on these histogram models, we extend the Random Forest classifier and evaluate its feature selection properties as well as the suitability of certain low-level features for the semantic object segmentation task.
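The single-histogram idea can be illustrated with a minimal sketch. It assumes that visual words have already been assigned to pixels, and it uses KL divergence to compare a region's histogram with each class histogram; the abstract does not name the distance measure, so that choice, like all function names and the toy data below, is an assumption for illustration only.

```python
import numpy as np

def class_histograms(word_ids, labels, n_words, n_classes, eps=1e-6):
    """Pool all training visual words of a class into ONE normalised histogram."""
    hists = np.full((n_classes, n_words), eps)      # eps avoids log(0) later
    np.add.at(hists, (labels, word_ids), 1.0)       # count word occurrences per class
    return hists / hists.sum(axis=1, keepdims=True)

def region_histogram(word_ids, n_words, eps=1e-6):
    """Normalised visual-word histogram of a single image region."""
    h = np.bincount(word_ids, minlength=n_words).astype(float) + eps
    return h / h.sum()

def classify_region(region_hist, class_hists):
    """Return the class whose histogram is closest in KL divergence (assumed measure)."""
    kl = np.sum(region_hist * (np.log(region_hist) - np.log(class_hists)), axis=1)
    return int(np.argmin(kl))

# Toy data: 4 visual words, class 0 ("grass") uses words 0-1, class 1 ("cow") uses words 2-3.
train_words = np.array([0, 1, 0, 1, 2, 3, 2, 3])
train_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
models = class_histograms(train_words, train_labels, n_words=4, n_classes=2)

# A region dominated by "cow" words, with one stray "grass" word.
region = region_histogram(np.array([2, 3, 3, 2, 0]), n_words=4)
predicted = classify_region(region, models)
```

Each class costs one histogram of length `n_words` at test time, whereas a nearest-neighbour model would have to store and compare against every training histogram.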
Most object recognition methods rely on labelled training images. For each object category to be recognised, the system is trained on a set of images containing instances of that category. The last part of this thesis focuses on the automatic creation of sets of images that contain a certain object class. The idea is to download an initial set of images from the Internet based on a search query (e.g. penguin). Given these images, a text-based ranking is performed that exploits the information on the web pages. This ranking is then used to automatically learn visual models for 18 object categories. We compare the performance of our system to previous work and show that it performs equally well without the need for explicit manual supervision.
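The two-stage pipeline described above (text-based ranking first, visual model second) might be sketched as follows. This is not the thesis implementation: the metadata fields, the text score, and the class-mean linear discriminant standing in for the visual model are all hypothetical choices made to keep the example short.

```python
import numpy as np

def text_score(metadata, query):
    """Hypothetical text score: fraction of web-page fields mentioning the query."""
    fields = [metadata.get(k, "") for k in ("filename", "alt_text", "page_title", "context")]
    return sum(query.lower() in f.lower() for f in fields) / len(fields)

def rerank(features, metadata, query, top_k):
    """Rank by text, then learn a visual model from the top-ranked images and re-rank."""
    scores = np.array([text_score(m, query) for m in metadata])
    order = np.argsort(-scores, kind="stable")
    pos = features[order[:top_k]]      # noisy positives: best text matches
    neg = features[order[top_k:]]      # noisy negatives: everything else
    # Stand-in visual model: linear discriminant between the two class means.
    w = pos.mean(axis=0) - neg.mean(axis=0)
    b = -0.5 * w @ (pos.mean(axis=0) + neg.mean(axis=0))
    return np.argsort(-(features @ w + b), kind="stable")

# Toy data: images 0-2 look like penguins, 3-5 do not (2-D made-up visual features).
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
                     [0.1, 0.9], [0.0, 1.0], [0.05, 0.95]])
metadata = [
    {"filename": "penguin.jpg", "alt_text": "a penguin",
     "page_title": "Penguin facts", "context": "penguin colony"},
    {"filename": "penguin2.jpg", "alt_text": "penguin", "page_title": "Penguins"},
    {"filename": "img_0042.jpg"},                      # penguin image, no useful text
    {"filename": "book.jpg", "page_title": "Penguin Books"},  # misleading text
    {"filename": "car.jpg"},
    {"filename": "tree.jpg"},
]
ranking = rerank(features, metadata, "penguin", top_k=2)
```

In this toy run the visual stage recovers image 2, which has no text evidence, and demotes image 3, whose page title only matches the query by coincidence.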
Biography: Florian Schroff is currently a fourth-year DPhil student in the Department of Engineering Science at the University of Oxford, funded by Microsoft Research through the European PhD Scholarship Programme. He is jointly supervised by Professor Andrew Zisserman and Antonio Criminisi at Microsoft Research Cambridge. Before joining the Visual Geometry Group (VGG) in Oxford, he worked as a researcher at the German Research Center for Artificial Intelligence in Kaiserslautern. He received his degree (Diploma) in computer science from the University of Karlsruhe at the end of 2004, where he worked with Professor H.-H. Nagel on camera calibration and focused on artificial intelligence, cryptography and algebra. In 2003 he received a Master of Science in computer science from the University of Massachusetts – Amherst, where he had started his studies under the Baden-Württemberg exchange scholarship in 2002.
Series: This talk is part of the Dr Fabien Petitcolas's list series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- Interested Talks
- Microsoft Research Cambridge, public talks
- Microsoft Research PhD Scholars
- ndk22's list
- ob366-ai4er
- Optics for the Cloud
- personal list
- PMRFPS's
- rp587
- School of Technology
- Small Lecture Room, Microsoft Research, Roger Needham Building, 7 J J Thomson Avenue, Cambridge CB3 0FB
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.