BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Efficient Data Structures for Nonlinear Video Processing - Jiawen 
 Chen\, MIT's Computer Science and Artificial Intelligence Laboratory
DTSTART:20110411T130000Z
DTEND:20110411T140000Z
UID:TALK30741@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:Nonlinear techniques are used extensively in image and video p
 rocessing with applications ranging from low-level kernels such as denoisi
 ng and detail enhancement to higher-level operations such as object manipu
 lation and special effects. In this talk\, we will describe two computatio
 nally efficient data structures which dramatically simplify and accelerate
  a variety of algorithms for video processing.\n\nOur first data structure
  is the bilateral grid\, an image representation that explicitly accounts 
 for intensity edges.  By interpreting brightness differences as Euclidean 
 distances\, the bilateral grid naturally encodes the notion of edge-awaren
 ess into filters defined on it.  Smooth functions defined on the bilateral
  grid are piecewise-smooth in image space.  Within this framework\, we der
 ive efficient reinterpretations of a number of nonlinear filters commonly 
 used in computational photography as operations on the bilateral grid\, in
 cluding the bilateral filter\, edge-aware scattered data interpolation\, a
 nd local histogram equalization.  We also show how these techniques can be
  easily parallelized onto modern graphics hardware for real-time processin
 g of high-definition video.\n\nThe second data structure we describe is th
 e video mesh\, designed as a flexible central data structure for general-p
 urpose nonlinear video editing workflows. It represents objects in a vide
 o sequence as 2.5D "paper cutouts" and allows interactive editing of movin
 g objects and modeling of depth\, which enables 3D effects and post-exposu
 re camera control.  In our representation\, motion and depth are sparsely 
 encoded by a set of points tracked over time.  The video mesh is a triangu
 lation over this point set and per-pixel information is obtained by interp
 olation.  To handle occlusions and detailed object boundaries\, we rely on
  the user to rotoscope the scene at a sparse set of frames using spline cu
 rves. We introduce an algorithm to robustly and automatically cut the mes
 h into local layers with proper occlusion topology\, and propagate the spl
 ines to the remaining frames.  Object boundaries are refined with per-pixe
 l alpha mattes.\n\nAt its core\, the video mesh is a collection of texture
 -mapped triangles\, which we can edit and render interactively using graph
 ics hardware.  We demonstrate the effectiveness of our representation with
  special effects such as 3D viewpoint changes\, object insertion\, depth-o
 f-field manipulation\, and 2D-to-3D video conversion.\n
LOCATION:Small lecture theatre\, Microsoft Research Ltd\, 7 J J Thomson Av
 enue (Off Madingley Road)\, Cambridge
END:VEVENT
END:VCALENDAR
