3D Shape Reconstruction from Photographs: A Multi-View Stereo Approach

Carlos Hernández, George Vogiatzis and Yasutaka Furukawa

Half-day tutorial at CVPR 2010

Monday, June 14, 2010

Tutorial slides (58 MB)

Calibration matlab code


The state of the art in 3D reconstruction from photographs has undergone a revolution in the last few years. Coupled with rapid developments in the digital photography industry, state-of-the-art multi-view stereo (MVS) algorithms now rival laser range scanners in accuracy. This tutorial will provide an introduction to this exciting field of research. Our first aim is to give a step-by-step, practical guide to implementing and deploying a multi-view stereo system. To that end, the tutorial will cover the basics of the multi-view stereo pipeline: camera calibration, image segmentation, photo-consistency estimation from images, and surface extraction from photo-consistency. The focus will be on existing state-of-the-art techniques, presented as case studies that illustrate the key principles involved.

Our second aim is to introduce potential new researchers to the field. The tutorial will therefore identify the current research frontier and discuss some of the important open questions. Finally, we will describe some possible avenues for future work, including interactivity in MVS, the incorporation of geometric priors in 3D reconstruction, and Internet-scale MVS.

This tutorial is based on a class offered at the ICVSS08 summer school, but has been redesigned for computer vision researchers and engineers and includes more technical detail.

Target Audience

The course is intended for graduate students, academic researchers, and industrial engineers who want to understand state-of-the-art multi-view stereo algorithms, whether to solve new vision problems in the 3D photography domain or to build their own MVS system from scratch. Some knowledge of the camera projection model and basic linear algebra helps, but is not essential.


  1. [15 min] Introduction
    1. How do you capture 3D: computer vision algorithms or 3D sensors?
    2. Why should you use computer vision (multi-view stereo algorithms)?

  2. [15 min] Data acquisition
    1. Tips on taking good pictures (coverage, large f-stop value, tripod etc.)
    2. Camera calibration: Bundler and OpenCV/Caltech Matlab toolbox

  3. [30 min] Computing correspondence: 2D photo-consistency metrics for MVS
    1. NCC, SSD, standard deviation, square window vs. homography warp
    2. Failure modes
      1. Occlusion
      2. Repeated texture or lack of texture
      3. Lighting variation
    3. Seminal work identifying the occlusion problem: space carving
    4. Some solutions:
      1. Iterated surface proxy
      2. Post-filtering
      3. Explicit modeling of occlusion

  4. [90 min] Most successful general purpose approaches
    1. [45 min] Depth-map based pipeline
      1. Depth-map computation
      2. 3D photo-consistency from depth-maps
      3. Surface extraction from 3d photo-consistency
    2. [45 min] Patch growing pipeline
      1. Reconstructing oriented points
      2. Surface extraction from point-sets
      3. Continuous optimization of photo-consistency metrics

  5. [30 min] Are we there yet? Future trends in MVS
    1. Interactive MVS
    2. Geometric priors
    3. Internet-scale MVS
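
As a rough illustration of two of the photo-consistency metrics named in part 3 of the outline (SSD and NCC), here is a minimal Python sketch. It is not taken from the tutorial material. The function names `ssd` and `ncc`, the flat-list patch representation, and the choice to return 0 for a textureless patch are all assumptions made for this example.

```python
import math
from typing import Sequence


def ssd(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences; 0 means the patches are identical."""
    if len(a) != len(b):
        raise ValueError("patches must contain the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross-correlation, in [-1, 1].

    Each patch is mean-centred and scaled by its own energy, so the
    score is unchanged by a gain and offset applied to either patch.
    """
    if len(a) != len(b):
        raise ValueError("patches must contain the same number of pixels")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # Assumption for this sketch: a constant (textureless) patch
        # carries no matching evidence, so report a neutral score.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


# Hypothetical 2x2 patches, flattened row by row.
patch = [10.0, 20.0, 30.0, 40.0]
brighter = [2.0 * p + 5.0 for p in patch]  # same texture, different exposure
flat = [5.0, 5.0, 5.0, 5.0]                # no texture at all

print(ssd(patch, brighter))  # large, although the underlying texture matches
print(ncc(patch, brighter))  # close to 1: invariant to gain and offset
print(ncc(flat, patch))      # 0.0 under the textureless-patch assumption
```

The example patches relate to two of the failure modes listed in the outline: SSD penalizes a pure brightness change that NCC ignores, and neither metric can say anything useful about a patch with no texture.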

Relevant publications


Carlos Hernández obtained an MS in Applied Mathematics from the Ecole Normale Supérieure de Cachan in 2000 and received his PhD in 2004 from the Ecole Nationale Supérieure des Télécommunications de Paris. He then moved to the University of Cambridge as a research associate until 2006, when he became a permanent researcher in the computer vision lab at Toshiba Research Europe, Cambridge. His research interests are mainly in shape-from-X algorithms and their applications in computer graphics.

George Vogiatzis obtained an MS in Mathematics and Computer Science from Imperial College in 2002 and a PhD in Computer Vision from the University of Cambridge in 2006. He then spent three years at Toshiba Research, after which he joined Aston University as a Lecturer. His research interests are in 3D computer vision and its applications in computer graphics and animation.

Yasutaka Furukawa received a BS degree in computer science from the University of Tokyo in 2001 and a PhD degree in computer science from the University of Illinois at Urbana-Champaign in 2008. He then joined the Graphics and Imaging Laboratory at the University of Washington as a post-doctoral research associate. He has been working at Google since January 2010. His research interests include computer vision and computer graphics, with a focus on 3D photography.