CSC 2530H: Visual Modeling Spring 2004

Project Proposal: Video Motion Summarization


 

Problem Description

Given a video sequence of a single moving subject taken with a stationary camera, I want to produce an image (or a short sequence of images) that summarizes the motion of the subject in a way that is intuitive for non-technical viewers. This would be useful for producing instructional materials for physical activities, such as dance or martial arts, and for preparing motion studies for animators and artists.

 

Related Work

Not much work has been done in this area in the vision community. Freeman and Zhang present a method for describing the change of an object's shape over time by taking stereo depth measurements for every pixel in each frame of a video sequence, and displaying only the closest pixels in the final image. The resulting images are somewhat confusing, however, and do not immediately evoke the idea of motion in a casual observer (see example below). Another area that has been explored before is that of Salient Stills, which summarize the content of general video sequences. Salient Stills capture the general content of sequences, but do not give detailed information on the motion of the subjects in the video.

Example Shape Time Image
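As a rough illustration of the closest-pixel idea described above, the sketch below builds a composite by keeping, at each pixel, the value from whichever frame has the smallest depth there. It is not Freeman and Zhang's implementation: the function name shape_time_composite, the nested-list image layout, and the assumption that a per-pixel depth map is already available for every frame are all hypothetical choices made for illustration.

```python
def shape_time_composite(frames, depths):
    """Per pixel, keep the value from the frame whose surface is closest.

    frames: list of images, each a list of rows of pixel values
    depths: matching list of depth maps (smaller value = closer to camera)
    Both layouts are assumptions made for this sketch.
    """
    if not frames or len(frames) != len(depths):
        raise ValueError("need at least one frame and one depth map per frame")
    height, width = len(frames[0]), len(frames[0][0])
    composite = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Index of the frame whose depth at (y, x) is smallest.
            nearest = min(range(len(frames)), key=lambda k: depths[k][y][x])
            composite[y][x] = frames[nearest][y][x]
    return composite
```

Because each output pixel is chosen independently, neighbouring pixels can come from very different moments in time, which may be one reason the resulting images are hard to read as motion.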

While the vision literature in this area is somewhat sparse, artists often use subtle (or not so subtle) distortions to evoke the idea of motion in the viewer. This is an interesting technique, but its application often requires a semantic understanding of the scene that precludes automation. Fortunately, there are several very promising examples in photography that can be applied without explicit understanding of the scene. Etienne-Jules Marey's late-19th-century work in chronophotography is directly applicable. Most of Marey's effects are achieved through multiple exposure. While this would be easy to simulate with a video clip, the question remains as to how to choose the frames to include in the multiple exposure. Also, while this technique works well for translational movements, it does not produce clear images of rotations.

Marey Image of a Pole-Vaulter
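A minimal sketch of how a multiple exposure might be simulated from video frames follows. It sidesteps the frame-selection question by taking every stride-th frame, which is only a placeholder policy, and it overlays the chosen frames with a per-pixel maximum, which assumes a bright subject against a dark background. The function name multiple_exposure, the stride parameter, and the nested-list grayscale layout are assumptions made for illustration.

```python
def multiple_exposure(frames, stride):
    """Overlay every `stride`-th frame by taking the per-pixel maximum.

    frames: list of grayscale images, each a list of rows of intensities.
    The fixed stride is a placeholder for a real frame-selection rule.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    selected = frames[::stride]
    if not selected:
        raise ValueError("no frames to combine")
    height, width = len(selected[0]), len(selected[0][0])
    return [
        [max(frame[y][x] for frame in selected) for x in range(width)]
        for y in range(height)
    ]
```

Replacing the maximum with an average would dim each pose as more frames are added, and a dark subject on a light background would call for a per-pixel minimum instead.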

The photographic work of Anton Giulio Bragaglia presents a very impressionistic view of motion. His technique, called Photodynamism, was accomplished by using very long exposure times. Again, this technique would be easy to simulate in video, but the question of how to control the effect remains.

Bragaglia's "Portrait of Arturo Bragaglia"

Instructional materials for dance and athletics provide another technique for capturing motion. A good example is the following diagram taken from an Aikido book. The figures themselves are abstract, and presented with no background. The moves that they are performing are segmented into major poses, and "action lines" are added at their extremities to give an impression of how they transition from pose to pose.

From Page 172 of Aikido and the Dynamic Sphere

Proposed Solution

Prof. Kutulakos has suggested that I begin by seeing what kind of effects can be reproduced using straightforward image-processing techniques. With these techniques implemented, I can then look into how they can be combined and automated to produce more instructional images.

For example, I have written a Matlab script that implements the following operation:

given images I_1, I_2, I_3, ..., I_n
  create a sequence J_1, J_2, ..., J_n
     where J_k = a*I_k + a^2*I_(k-1) + ... + a^(m+1)*I_(k-m)
     with 0 < a < 1 and m some small integer (the final pixel values are normalized by the sum of the weights, so that each output is a convex combination of the input frames.)
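The operation above can be sketched in Python as follows. This is not the Matlab script itself, only an illustration under stated assumptions: frames are nested lists of grayscale intensities, the function name decayed_blend is hypothetical, and the first few outputs (where I_(k-m) does not exist) simply use however many earlier frames are available.

```python
def decayed_blend(frames, a, m):
    """Return one blended frame per input frame.

    Each output is a weighted average of the current frame and up to m
    earlier frames, with weight a**(j + 1) for the frame j steps back.
    Dividing by the sum of the weights makes it a convex combination.
    """
    if not 0 < a < 1:
        raise ValueError("a must satisfy 0 < a < 1")
    if m < 0:
        raise ValueError("m must be non-negative")
    if not frames:
        return []
    height, width = len(frames[0]), len(frames[0][0])
    output = []
    for k in range(len(frames)):
        # Assumption: near the start, use only the frames that exist.
        terms = min(m, k) + 1
        weights = [a ** (j + 1) for j in range(terms)]
        total = sum(weights)
        blended = [
            [
                sum(w * frames[k - j][y][x] for j, w in enumerate(weights)) / total
                for x in range(width)
            ]
            for y in range(height)
        ]
        output.append(blended)
    return output
```

Swapping the geometric weights for another fall-off function only requires changing how the weights list is built; the normalization step stays the same.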

With a value of a around 0.9, the algorithm produces videos like this. Obviously, this is less than instructive, but by modifying the parameters and the fall-off function, I may be able to produce individual frames that look like Bragaglia's photographs.

A series of videos produced with different parameter values is here (divx 5.1) and here (mpeg 2).

I think that an ideal solution would be a chronophotographic image where the individual poses are annotated using Photodynamism and/or "action lines". In order to do this, I will need to find image-processing algorithms to approximate each of the techniques, and then develop heuristics to guide their application.

 

References

 

Betancourt, Michael. Motion Perception in Movies and Painting: Towards a New Kinetic Art. www.ctheory.net, 2002.

Braun, Marta. Picturing Time, the Work of Etienne-Jules Marey (1830-1903). University of Chicago Press, 1992.

Freeman, W. & Zhang, H. Shape-Time Photography. MIT AI Memo 2002-002.

Massey, M. & Bender, W. Salient Stills: Process and Practice. IBM Systems Journal v35 n3&4, 1996.

Westbrook, A. & Ratti, O. Aikido and the Dynamic Sphere: an Illustrated Introduction. Tuttle Publishing, 2001.

 

 


Kevin Forbes 4/16/04