dc.description.abstract |
A variety of dynamic objects, such as faces, bodies, and cloth, are represented in computer
vision and computer graphics as a collection of moving spatial landmarks. A number of
tasks are performed on this type of data such as character animation, motion editing, and
nonrigid structure from motion. In theory, many of these tasks are highly under-constrained
and the estimation algorithms exploit the natural regularity that exists as a cloud of points
moves over time. In this thesis, we present compact and generalizable models of nonrigid
objects by exploiting spatial and temporal regularities of time-varying point data. We
demonstrate that several theoretically ill-posed tasks can be made well-posed with the help
of these models.
Our first contribution is to propose and demonstrate the effectiveness of the linear trajectory
model for representing time-varying point clouds. Traditionally, a linear shape model has
been used to represent time-varying point data; the 3D shape of a nonrigid object is modeled
as a linear combination of a small number of basis shapes. In contrast, we represent point
trajectories as a linear combination of basis trajectories. We show that the linear trajectory
and the linear shape models are dual to each other and have equal representation power.
In contrast to the shape basis, however, we demonstrate that the trajectory basis can be
predefined by exploiting the inherent smoothness of trajectories. In fact, we show that the
Discrete Cosine Transform (DCT) is a good choice for a predefined basis and empirically
demonstrate its compactness by showing that it approaches Principal Component Analysis
(PCA) for natural motions.
This linear trajectory model is applied to the problem of nonrigid structure from motion.
Analogous to the formulation under the shape model, the estimation of nonrigid structure
from motion under the trajectory model results in an optimization problem based on
orthonormality constraints. Prior work asserted that structure recovery through
orthonormality constraints alone is inherently ambiguous and cannot result in a unique solution.
This assertion was accepted as conventional wisdom and was the justification for several
remedial heuristics in the literature. In contrast, we prove that orthonormality constraints are,
in fact, sufficient to recover the 3D structure in both the linear trajectory and the shape
models. Moreover, we show that the primary advantage of the trajectory model over the
shape model in nonrigid structure from motion is the possibility of predefining the basis.
This results in a significant reduction in unknowns and corresponding stability in estimation.
We demonstrate significant improvement in reconstruction results over the state of the
art.
After demonstrating the effectiveness of the linear trajectory model over the linear shape model
in nonrigid structure from motion, we also show how both models can be synergistically
combined. We present the bilinear spatiotemporal basis as a model to simultaneously
exploit spatial and temporal regularities, while maintaining the ability to generalize well
to new sequences. The model can be interpreted as representing the data as a linear combination
of spatiotemporal sequences consisting of shape modes oscillating over time at
key frequencies. We apply the model to natural spatiotemporal phenomena, including face,
body, and cloth motion data, and demonstrate its effectiveness in terms of compaction,
generalization ability, predictive precision, and efficiency relative to existing models. We
demonstrate the application of the model in motion capture clean-up. We present an
expectation-maximization algorithm for motion capture labeling, gap-filling, and denoising. The
solution provides a drastic reduction in clean-up time compared with current industry
standards. |
en_US |