The aim of motion tracking is to detect and track moving objects through a sequence of images. Motion tracking is not only useful for monitoring activity in public places, but is becoming a key ingredient for further analysis of video imagery. For instance, information about the location and identity of objects at different points in time (as summarised by the image on the right) is the basis of detecting unusual object movements (e.g. someone being mugged at an ATM) or coordinated activities (e.g. strategic plays in a football game).
On this page you will find information on a motion tracking project carried out by Fabian Wauthier as a fourth year honours project at the University of Edinburgh and supervised by Prof. Chris Williams. You can view sample video clips of tracked objects, read how it works and download the source code.
This project is based on the paper Learning Patterns of Activity Using Real-time Tracking by C. Stauffer and W. E. Grimson and proceeds in two main parts: object detection, and object tracking.
Given an image sequence from a static camera, the first step is to discriminate moving foreground objects from the background. The approach used in this project constructs, for each pixel of an image, a distribution of the typical values it takes (i.e. a background model). The central idea is that the most frequent values a pixel takes are likely to correspond to background imagery. By modelling pixels through parametrised distributions, both sensor noise (for example at night) and repetitive background motion (for example waving tree branches) can be captured in a compact description. Whether a pixel from a new image belongs to the background or the foreground is then decided solely from the accumulated background model: pixels with values sufficiently different from the background distribution are labelled as foreground. The first step of the algorithm is thus to construct a suitable representation of the background (here a Mixture of Gaussians is used for each individual pixel) and to detect outliers.

Additionally, by continually updating the background model of each pixel, the algorithm can adapt to gradual changes in lighting (for example due to changing cloud cover or sunrise and sunset) and to changes in scene geometry (such as parking cars). The method used for this purpose resembles an on-line K-means algorithm.

An example input image and the corresponding pixel labellings computed from a background model can be seen on the left, where foreground pixels are labelled white. By lumping large connected foreground regions into blobs, foreground objects, such as the person seen crossing the street, can be identified. However, in this instance a foreground object was also erroneously detected in the top right corner, caused by violent motion of the trees in the background. Such occasional mis-detections must be accounted for at later stages of the tracking algorithm.
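The per-pixel mixture model can be sketched as follows. This is an illustrative Python/NumPy version rather than the project's Matlab code: the class name, parameter defaults, grayscale-only input, and the omission of some details of the full Stauffer-Grimson method (such as replacing the least probable component when no component matches) are all simplifying assumptions.

```python
import numpy as np

class BackgroundModel:
    """Simplified per-pixel Mixture-of-Gaussians background model for
    grayscale frames (a sketch after Stauffer & Grimson; parameter
    names and defaults here are illustrative, not the paper's)."""

    def __init__(self, first_frame, k=3, alpha=0.05, init_var=100.0,
                 bg_thresh=0.7, match_sigma=2.5):
        h, w = first_frame.shape
        self.alpha = alpha              # learning rate of the on-line update
        self.bg_thresh = bg_thresh      # weight mass treated as background
        self.match_sigma = match_sigma  # match if within this many std devs
        self.weight = np.full((h, w, k), 1.0 / k)
        self.mean = np.repeat(first_frame[..., None].astype(float), k, axis=2)
        self.mean += np.random.default_rng(0).normal(0, 1, self.mean.shape)
        self.var = np.full((h, w, k), init_var)

    def apply(self, frame):
        """Update the model with one frame; return a boolean foreground mask."""
        x = frame[..., None].astype(float)
        diff = x - self.mean
        matched = diff**2 <= (self.match_sigma**2) * self.var
        # pick at most one matching component per pixel (highest weight)
        score = np.where(matched, self.weight, -1.0)
        best = np.argmax(score, axis=2)
        hit = np.take_along_axis(matched, best[..., None], 2)[..., 0]
        m = np.zeros_like(self.weight, dtype=bool)
        np.put_along_axis(m, best[..., None], hit[..., None], 2)
        # on-line update resembling K-means: only the winning component moves
        self.weight = (1 - self.alpha) * self.weight + self.alpha * m
        self.weight /= self.weight.sum(2, keepdims=True)
        self.mean = np.where(m, self.mean + self.alpha * diff, self.mean)
        self.var = np.where(m, self.var + self.alpha * (diff**2 - self.var),
                            self.var)
        # background = the heaviest components covering bg_thresh of the
        # weight; a pixel whose best component is unmatched, or matched a
        # low-weight component, is foreground
        order = np.argsort(-self.weight, axis=2)
        w_sorted = np.take_along_axis(self.weight, order, 2)
        is_bg_sorted = np.cumsum(w_sorted, axis=2) - w_sorted < self.bg_thresh
        is_bg = np.empty_like(is_bg_sorted)
        np.put_along_axis(is_bg, order, is_bg_sorted, 2)
        return ~hit | ~np.take_along_axis(is_bg, best[..., None], 2)[..., 0]
```

After the model has seen a stretch of background-only frames, a bright object entering the scene is flagged as foreground while stable pixels are not; connected foreground regions would then be lumped into blobs.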
The second task is to link a sequence of object blobs together across image frames in order to determine the identity and location of an object at different points in an image sequence. To achieve this, it is useful to describe each object blob by a set of attributes, for example its position, velocity and size, collectively called its state. The object's true position is tracked by tracking its state with a Kalman filter, which combines information from the current blob and the previous object state to estimate the object's new state. Combining previously estimated object dynamics with current measurements helps eliminate noise from blob measurements that would otherwise lead to erratic object tracking. For example, knowing the previous position and velocity of a car allows a rough estimate of its current position; combining this estimate with additional measurements of its state improves overall tracking accuracy.
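The predict/update cycle can be illustrated with a minimal constant-velocity Kalman filter over a blob state [x, y, vx, vy]. This is a Python sketch, not the project's Matlab implementation; the state layout, noise levels and initial covariance are illustrative choices.

```python
import numpy as np

class BlobKalman:
    """Constant-velocity Kalman filter over a blob state [x, y, vx, vy].
    Matrices and noise magnitudes are illustrative, not the project's
    exact tuning."""

    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])   # state estimate
        self.P = np.eye(4) * 100.0              # state covariance (uncertain)
        self.F = np.eye(4)                      # constant-velocity dynamics
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                   # we only measure position
        self.Q = np.eye(4) * q                  # process noise
        self.R = np.eye(2) * r                  # measurement noise

    def predict(self):
        """Propagate the previous state through the dynamics model."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Fuse the prediction with a new blob centroid z = (x, y)."""
        y = np.asarray(z) - self.H @ self.x            # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

Fed a sequence of centroids from an object moving at constant velocity, the filter's velocity estimate converges after a few frames and its position estimate locks onto the true trajectory; a noisy or missing centroid then perturbs the smoothed estimate far less than it would perturb the raw measurement.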
In circumstances where several objects are being tracked simultaneously, it is critical that confusion of distinct objects is avoided, since updating a Kalman filter with inappropriate information corrupts its state estimate and tracking fails. A schematic visualisation of the problem faced can be seen in the sequence below. Each circle represents a blob which is tracked through frames from left to right. At the new input frame t+1, the identities of blobs are initially unknown and must be extrapolated from the observed object dynamics of previous frames. Additional complications arise if objects can temporarily disappear (e.g. behind walls or cars) or if non-existent foreground objects are observed (as in the previous example). Hence, the central challenge of multiple hypothesis tracking is how to determine an appropriate association between blobs and Kalman filters while being robust against temporary occlusions and noise.

The association task was formulated as a linear assignment problem: the desired association is the one that minimises the summed assignment cost of pairings between blob observations and Kalman filters. An algorithm that solves the linear assignment problem, and for which a C++ implementation exists, was published by R. Jonker and A. Volgenant. The cost function was chosen so that the overall assignment is the most likely one while respecting matching constraints. Once the matching has been determined, multiple-object tracking is a matter of updating each Kalman filter with its associated observation.
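The association step can be sketched as follows. Here SciPy's linear_sum_assignment, a modified Jonker-Volgenant algorithm, stands in for the MagicLogic C++ solver used in the project; Euclidean distance as the cost and the gating threshold are illustrative assumptions rather than the project's exact cost function.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(predicted, blobs, gate=50.0):
    """Match Kalman-filter predictions to blob centroids by solving a
    linear assignment problem on pairwise Euclidean distance.
    Returns (matched pairs, unmatched track ids, unmatched blob ids)."""
    P = np.asarray(predicted, float)   # (n_tracks, 2) predicted positions
    B = np.asarray(blobs, float)       # (n_blobs, 2) observed centroids
    cost = np.linalg.norm(P[:, None, :] - B[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # minimise summed cost
    # drop pairings whose distance exceeds the gate: such tracks are
    # treated as temporarily occluded and such blobs as new or spurious
    pairs = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= gate]
    matched_tracks = {r for r, _ in pairs}
    matched_blobs = {c for _, c in pairs}
    unmatched_tracks = [r for r in range(len(P)) if r not in matched_tracks]
    unmatched_blobs = [c for c in range(len(B)) if c not in matched_blobs]
    return pairs, unmatched_tracks, unmatched_blobs
```

With two tracks predicted near (0, 0) and (100, 100) and three observed blobs, the minimum-cost matching pairs each track with its nearby blob and leaves the distant, spurious blob unmatched; each Kalman filter is then updated with the observation assigned to it.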
By following the links below you can view a number of image sequences and the output produced by the algorithm.
The original sequence was kindly provided by the University of Reading and can be downloaded from their PETS2000 repository.
The original sequence was also kindly provided by the University of Reading and can be downloaded from their PETS2001 repository.
Our own image sequence, which can be downloaded as a gzip'ed tarball.
All code was developed in Matlab 7.2 (R2006a), with the exception of the C++ code for the Linear Assignment Problem, which was made available by MagicLogic. Please review their copyright policy if you wish to use their code commercially. All tracker code can be downloaded as a gzip'ed tarball.
- C. Stauffer and W. E. L. Grimson. Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):747-757, 2000. A draft paper is available online at Stauffer's website.
- R. Jonker and A. Volgenant. A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing, 38(4):325-340, 1987. A C++ implementation of this algorithm was made available by MagicLogic.