- Author
- Carlo Tomasi
- Title
- Shape and Motion from Image Streams: a Factorization Method
- Date
- Sep 1991
- Abstract
- We propose a method for estimating the three-dimensional shape
of objects and the motion of the camera from a stream of images.
The goal is to give a robot the ability to localize itself with
respect to the environment, draw a map of its own surroundings,
and perceive the shape of objects in order to recognize or
grasp them.
Solutions proposed in the past were so sensitive to noise as to
be of little use in practical applications.
This sensitivity is closely related to the viewer-centered
representation of scene geometry known as a depth map, and to
the use of stereo triangulation to infer depth from the images.
In fact, when objects are more than a few focal lengths away
from the camera, parallax effects become subtle, and even a
small amount of noise in the images produces large errors in
the final shape and motion results.
In our formulation, we represent shape in object-centered
coordinates, and model image formation by orthographic, rather
than perspective projection.
In this way, depth, the distance between viewer and scene, plays
no role, and the problem's sensitivity to noise is critically
reduced.
We collect the image coordinates of P feature points tracked
through F frames into a 2F × P measurement matrix.
If these coordinates are measured with respect to their
centroid, we show that the measurement matrix can be
written as the product of two matrices that represent the
camera rotation and the positions of the feature points in
space.
The bilinear nature of this model, and its matrix formulation,
lead to a factorization method for the computation of shape
and motion, based on the Singular Value Decomposition.
Previous solutions assumed motion to be smooth, in one form or
another, in an attempt to constrain the solution and achieve
reliable convergence.
The factorization method, on the other hand, makes no
assumption about the camera motion, and can deal with the large jumps
from frame to frame found, for instance, in sequences taken
with a hand-held camera.
To make the factorization method into a working system, we
solve several corollary problems: how to select image features,
how to track them from frame to frame, how to deal with
occlusions, and how to cope with the noise and artifacts that
corrupt images recorded with ordinary equipment.
We test the entire system with a series of experiments on real
images taken both in the lab, for an accurate performance
evaluation, and outdoors, to demonstrate the applicability of
the method in real-life situations.
- Category
- CMUTR
Category: CMUTR
Institution: Department of Computer Science, Carnegie
Mellon University
Number: CMU-CS-91-172
Bibtype: TechReport
Month: Sep
Author: Carlo Tomasi
Title: Shape and Motion from Image Streams: a Factorization Method
Year: 1991
Address: Pittsburgh, PA
Super: @CMUTR
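
The abstract's central claim can be checked numerically: under orthographic projection, a 2F × P measurement matrix whose rows are measured relative to their centroid has rank at most 3, so a truncated SVD splits it into a 2F × 3 motion factor and a 3 × P shape factor. The following is a minimal sketch on synthetic data, not the report's implementation. The function names, the row ordering (all x rows stacked above all y rows), and the even split of the singular values between the two factors are assumptions made for illustration. The factors are recovered only up to an invertible 3 × 3 matrix; resolving that ambiguity is outside this sketch.

```python
import numpy as np


def random_rotation(rng):
    """Return a random 3x3 rotation matrix (orthonormal, determinant +1)."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def synthesize_measurements(num_frames, num_points, rng):
    """Project random 3-D points orthographically into each frame.

    Returns a (2 * num_frames, num_points) matrix: x rows first, then y rows.
    Each frame gets an unrelated random rotation and image-plane offset,
    so no smoothness of the camera motion is assumed.
    """
    points = rng.standard_normal((3, num_points))
    rows_x, rows_y = [], []
    for _ in range(num_frames):
        rot = random_rotation(rng)
        offset = rng.standard_normal(2)
        rows_x.append(rot[0] @ points + offset[0])
        rows_y.append(rot[1] @ points + offset[1])
    return np.vstack(rows_x + rows_y)


def factorize(measurements):
    """Center each row on its centroid, then take a rank-3 SVD factorization."""
    registered = measurements - measurements.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(registered, full_matrices=False)
    root = np.sqrt(s[:3])
    motion = u[:, :3] * root          # 2F x 3
    shape = root[:, None] * vt[:3]    # 3 x P
    return registered, s, motion, shape


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = synthesize_measurements(num_frames=5, num_points=20, rng=rng)
    registered, s, motion, shape = factorize(w)
    print("singular values:", s)
    print("max reconstruction error:", np.abs(motion @ shape - registered).max())
```

On noise-free input such as this, the fourth and later singular values should be negligible next to the first three. With real tracked features they would be small but nonzero, and keeping only the three largest acts as a least-squares rank-3 fit.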