Unstructured Video-Based Rendering:
Interactive Exploration of Casually Captured Videos


Luca Ballan (ETH Zurich), Gabriel J. Brostow (University College London), Jens Puwein (ETH Zurich), Marc Pollefeys (ETH Zurich)

ACM Transactions on Graphics (Proceedings of SIGGRAPH 2010)



Video: [Xvid] [MP4] [YouTube] [Vimeo]




Abstract:

We present an algorithm for navigating around a performance that was filmed as a "casual" multi-view video collection: real-world footage captured with hand-held cameras by a few audience members. The objective is to navigate easily in 3D, generating a video-based rendering (VBR) of a performance filmed with widely separated cameras. Casually filmed events are especially challenging because they yield footage with complicated backgrounds and camera motion. Such conditions preclude most algorithms that depend on correlation-based stereo or 3D shape-from-silhouettes.

Our algorithm builds on the concepts developed for the exploration of photo collections of empty scenes. Interactive, performer-specific view interpolation is made possible by innovations in interactive rendering and offline matting, namely: i) modeling the foreground subject as video sprites on billboards, ii) modeling the background geometry with adaptive view-dependent textures, and iii) view interpolation that follows a performer. The billboards are embedded in a simple but realistic reconstruction of the environment, which provides very effective visual cues for spatial navigation as the user transitions between viewpoints. The prototype is tested on footage from several challenging events, and demonstrates the editorial utility of the whole system as well as the particular value of our new billboard-to-billboard optimization.
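To make i) and iii) concrete, the sketch below (Python/NumPy) is a minimal illustration, not the paper's implementation: it orients a billboard quad to face a virtual camera, and cross-fades two widely separated source views by angular proximity. The inverse-angle weighting and all function names are assumptions made for illustration.

    import numpy as np

    def billboard_corners(center, up, cam_pos, width, height):
        # Quad anchored at the performer's position, rotated about the
        # scene's vertical axis so it always faces the virtual camera.
        to_cam = cam_pos - center
        to_cam -= up * np.dot(to_cam, up)        # keep only the horizontal part
        to_cam /= np.linalg.norm(to_cam)
        right = np.cross(up, to_cam)             # horizontal axis of the quad
        hw = 0.5 * width
        return [center - right * hw,
                center + right * hw,
                center + right * hw + up * height,
                center - right * hw + up * height]

    def blend_weights(virtual_cam, source_cams, subject):
        # Cross-fade weights for the source views: the camera whose viewing
        # direction is angularly closest to the virtual one dominates.
        def direction(p):
            d = p - subject
            return d / np.linalg.norm(d)
        v = direction(virtual_cam)
        angles = np.array([np.arccos(np.clip(np.dot(v, direction(c)), -1.0, 1.0))
                           for c in source_cams])
        w = 1.0 / np.maximum(angles, 1e-6)       # inverse-angle weighting (assumed)
        return w / w.sum()

    # Example: a virtual camera halfway between two widely separated cameras
    # receives roughly equal weights from both.
    subject = np.array([0.0, 0.0, 0.0])
    cams = [np.array([3.0, 1.5, 0.0]), np.array([0.0, 1.5, 3.0])]
    print(blend_weights(np.array([2.0, 1.5, 2.0]), cams, subject))  # ~[0.5 0.5]

In the actual system, the blended billboards carry the matted video sprites and a billboard-to-billboard optimization aligns them during transitions; the weighting above is only the simplest plausible choice.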




Paper: [PDF] [ACM version] [bibtex]




Binaries:

   Demo Rothman [ZIP] [ZIP laptop version]
   Demo Juggler [ZIP] [ZIP laptop version]
   Demo Magician [ZIP] [ZIP laptop version]


Note: The code is optimized for quad-core machines with a GPU. Recommended configuration: Intel Core i7, NVIDIA GTX 280, 7200 rpm hard drive, Windows Vista 32-bit or 64-bit. The laptop version runs at half video resolution and should play smoothly even on a decent dual-core laptop.







User interface:

Screenshots: regular mode, orbit mode, help menu.






Supplementary materials:

   Rothman (with audio) [AVI] [YouTube]
   User interaction (camera captured) [AVI]
   User interaction (screen captured) [AVI]
   Segmentation results [AVI]
   User interaction: Juggler [AVI]
   User interaction: Magician [AVI]





Original datasets:

Calibrated and synchronized video sequences evaluated in the paper (codec: Lagarith). A minimal sketch for working with the calibration data follows the list:

   Juggler [01, 02, 03, 04, 05, 06] [Calibration] [Segmentation] [3D Model]
   Magician [01, 02, 03, 04, 05, 06] [Calibration] [Segmentation] [3D Model]
   Rothman [01, 02, 03] [Calibration] [Segmentation] [3D Model]
   Climber (Hasler et al. 2009, data at tnt Hannover)
   Dancer (data at INRIA Grenoble Rhone-Alpes Perception Group)
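For working with the [Calibration] files, here is a minimal sketch (Python/NumPy) of projecting a point of the reconstructed [3D Model] into one of the views. It assumes each camera is given as a 3x4 projection matrix P = K[R|t]; whether the files store P directly or K, R, t separately should be checked against the actual data.

    import numpy as np

    def project(P, X):
        # Project 3D point X into the image of a camera with 3x4 matrix P.
        x = P @ np.append(X, 1.0)   # homogeneous image coordinates
        return x[:2] / x[2]         # perspective divide -> pixel (u, v)

    # Hypothetical example: identity intrinsics, camera at the world origin.
    P = np.hstack([np.eye(3), np.zeros((3, 1))])
    print(project(P, np.array([0.5, 0.2, 2.0])))   # -> [0.25 0.1]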






BibTeX reference:

@ARTICLE{UnstructuredVBR10,
   author = "Luca Ballan and Gabriel J. Brostow and Jens Puwein and Marc Pollefeys",
   title = "Unstructured Video-Based Rendering: Interactive Exploration of Casually
            Captured Videos",
   journal = "ACM Transactions on Graphics (Proceedings of SIGGRAPH 2010)",
   volume = "29",
   number = "4",
   month = "July",
   year = "2010",
   pages = {1--11},
   isbn = {978-1-4503-0210-4}
}






Related work:

   CrowdCam: Instantaneous Navigation of Crowd Images using Angled Graph
A. Arpa, L. Ballan, R. Sukthankar, G. Taubin, M. Pollefeys, R. Raskar [link] [youtube]
   Modeling Dynamic Scenes Recorded with Freely Moving Cameras
A. Taneja, L. Ballan and M. Pollefeys [link]
   Acquiring Shape and Motion of Interacting People from Videos
L. Ballan and G. M. Cortelazzo [link]





Acknowledgments:

We thank Ralph Wiedemeier, Davide Scaramuzza, and Mark Rothman, whose performances constitute the Juggler, Magician, and Rothman datasets; Nils Hasler and Juergen Gall for the Climber videos; and Christopher Zach, David Gallup, Oisin Mac Aodha, Mike Terry, and the anonymous reviewers for their help and valuable suggestions. The research leading to these results received funding from the ERC under the EC’s Seventh Framework Programme (FP7/2007-2013) / ERC grant #210806, and from the Packard Foundation.








Contact: ballanlu@gmail.com
Computer Vision and Geometry Group (CVG)
ETH Zurich