Abstract:
An adaptive edge-enhanced correlation-based robust, real-time visual tracking
framework, and two machine vision systems built on that framework, are proposed.
The visual tracking algorithm can track any object of interest in video acquired from
a stationary or moving camera. It can handle real-world problems such as noise,
clutter, occlusion, uneven illumination, and the varying appearance, orientation, scale,
and velocity of a maneuvering object, as well as object fading and obscuration in
low-contrast video at various zoom levels. The proposed machine vision systems are
an active camera tracking system and a vision-based system that enables a UGV
(unmanned ground vehicle) to handle a road intersection.
The core of the proposed visual tracking framework is an Edge-Enhanced
Back-propagation-neural-network-Controlled Fast Normalized Correlation
(EE-BCFNC), which makes the object localization stage efficient and robust to noise,
object fading, obscuration, and uneven illumination.
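To make the correlation core concrete, the following is a minimal sketch of
normalized cross-correlation on edge-enhanced images, assuming OpenCV; the
back-propagation control network of EE-BCFNC is omitted, and the function and
parameter names are illustrative, not the system's actual code.

```python
import cv2
import numpy as np

def edge_enhanced_ncc(search_win, template):
    """Locate a template inside a search window by normalized
    cross-correlation computed on edge-enhanced images.

    Sketch only: the full EE-BCFNC also uses a back-propagation neural
    network to control the correlation stage, which is not shown here.
    """
    def edge_map(img):
        # Simple edge enhancement: gradient magnitude from Sobel operators.
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
        return cv2.magnitude(gx, gy)

    surface = cv2.matchTemplate(edge_map(search_win), edge_map(template),
                                cv2.TM_CCOEFF_NORMED)
    _, peak, _, peak_loc = cv2.minMaxLoc(surface)  # highest correlation peak
    return peak_loc, peak
```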
The incorrect template initialization and template-drift problems of the traditional
correlation tracker are handled by a best-match rectangle adjustment algorithm.
The varying appearance of the object and short-term neighboring clutter are
addressed by a robust template-updating scheme.
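The updating rule itself is not given in the abstract; one minimal, commonly used
scheme consistent with it is gated exponential blending of the template with the latest
best match (the threshold and blending factor below are assumptions):

```python
import numpy as np

def update_template(template, best_match, peak, peak_thresh=0.8, alpha=0.1):
    """Blend the latest best-match region into the template, but only when
    the correlation peak is high enough to trust the match; this keeps up
    with appearance changes while resisting short-term clutter.

    Illustrative stand-in for the robust template-updating scheme;
    `peak_thresh` and `alpha` are assumed parameters.
    """
    if peak < peak_thresh:
        return template  # low-confidence match: keep the old template
    return ((1.0 - alpha) * template.astype(np.float32)
            + alpha * best_match.astype(np.float32))
```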
Background clutter and the varying velocity of the object are handled by searching
for the object only in a dynamically resizable search window in which the likelihood
of the object's presence is high. The search window is created using the prediction
and the prediction error of a Kalman filter.
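A minimal sketch of such a window, assuming a constant-velocity Kalman filter over
the object centroid (OpenCV's cv2.KalmanFilter) and taking the positional
uncertainty from the predicted error covariance; the noise values and sizing factor
are illustrative:

```python
import numpy as np
import cv2

# Constant-velocity Kalman filter over the object centroid: state (x, y, vx, vy).
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
kf.processNoiseCov = 1e-3 * np.eye(4, dtype=np.float32)
kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)

def dynamic_search_window(template_shape, k=3.0):
    """Centre the search window on the Kalman prediction and grow it with
    the positional prediction error, so a maneuvering object still falls
    inside it. `k` is an illustrative sizing factor."""
    pred = kf.predict()
    cx, cy = float(pred[0, 0]), float(pred[1, 0])
    sx = float(np.sqrt(kf.errorCovPre[0, 0]))  # predicted x-position error
    sy = float(np.sqrt(kf.errorCovPre[1, 1]))  # predicted y-position error
    h, w = template_shape
    win_w = w + 2 * int(k * sx)
    win_h = h + 2 * int(k * sy)
    return (cx, cy), (win_w, win_h)
```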
The effect of long-term neighboring clutter is reduced by weighting the template
pixels using a 2D Gaussian weighting window with adaptive standard deviation
parameters.
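A minimal sketch of the Gaussian weighting; the adaptation rule for the standard
deviations is not stated in the abstract, so they appear as plain inputs here:

```python
import numpy as np

def gaussian_weighted(template, sigma_x, sigma_y):
    """Weight template pixels with a 2D Gaussian centred on the template,
    so peripheral (clutter-prone) pixels contribute less to the
    correlation. The tracker would adapt sigma_x and sigma_y; here they
    are simply parameters."""
    h, w = template.shape[:2]
    y, x = np.mgrid[0:h, 0:w]
    g = np.exp(-((x - (w - 1) / 2.0) ** 2 / (2.0 * sigma_x ** 2)
                 + (y - (h - 1) / 2.0) ** 2 / (2.0 * sigma_y ** 2)))
    return template * g
```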
Occlusion is addressed by a data association technique.
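The abstract does not detail the association step; one common realization, sketched
below under that assumption, gates the correlation peak and coasts on the Kalman
prediction while the object is occluded (`gate` is an assumed threshold, and `kf` is
the filter from the sketch above):

```python
import numpy as np

def associate(peak, peak_loc, kf, gate=0.6):
    """Accept the correlation peak as the object only when it clears the
    association gate; otherwise declare occlusion and skip the Kalman
    update so the filter coasts on its prediction."""
    if peak >= gate:
        kf.correct(np.array([[peak_loc[0]], [peak_loc[1]]], np.float32))
        return peak_loc, False  # measurement accepted
    return None, True           # occluded this frame
```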
The varying scale of the object is handled by correlating the search window with
three scales of the template and accepting the best-match region that produces the
highest peak across the three correlation surfaces.
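A minimal sketch of the three-scale search, reusing OpenCV template matching; the
scale factors are illustrative:

```python
import cv2

def best_match_over_scales(search_win, template, scales=(0.9, 1.0, 1.1)):
    """Correlate the search window with three scales of the template and
    keep the scale whose correlation surface has the highest peak."""
    best_peak, best_loc, best_scale = -1.0, None, None
    for s in scales:
        t = cv2.resize(template, None, fx=s, fy=s)
        surface = cv2.matchTemplate(search_win, t, cv2.TM_CCOEFF_NORMED)
        _, peak, _, loc = cv2.minMaxLoc(surface)
        if peak > best_peak:
            best_peak, best_loc, best_scale = peak, loc, s
    return best_loc, best_scale, best_peak
```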
The proposed visual tracking algorithm is compared with the traditional correlation
tracker and, in some cases, with the mean-shift and condensation trackers on
real-world imagery. It outperforms them in robustness and runs at 25 to 75 frames
per second, depending on the current sizes of the adaptive template and the dynamic
search window.
The proposed active camera tracking system keeps the target centered in the video
frame regardless of the target's motion in the scene. It feeds the target coordinates
estimated by the visual tracking framework into a predictive open-loop car-following
control (POL-CFC) algorithm, which in turn generates precise control signals for the
pan-tilt motion of the camera. Performance analysis of the system shows that its
percent overshoot, rise time, and maximum steady-state error are 0%, 1.7 seconds,
and ±1 pixel, respectively.
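The POL-CFC algorithm itself is not described in the abstract; the sketch below only
illustrates the interface from tracker output to pan-tilt commands, using a plain
proportional law as a stand-in for the predictive open-loop controller, with
illustrative gains:

```python
def pan_tilt_command(target_xy, frame_size, kp_pan=0.05, kp_tilt=0.05):
    """Map the tracked target's pixel offset from the frame centre to
    pan/tilt rate commands so the camera turns to re-centre the target.
    A simple proportional law, shown only as a stand-in for POL-CFC."""
    cx, cy = frame_size[0] / 2.0, frame_size[1] / 2.0
    err_x = target_xy[0] - cx  # positive: target right of centre -> pan right
    err_y = target_xy[1] - cy  # positive: target below centre -> tilt down
    return kp_pan * err_x, kp_tilt * err_y
```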
The hardware of the proposed vision-based system, which enables a UGV to handle
a road intersection, consists of three on-board computers and three cameras
(mounted on top of the UGV) looking toward the other three roads merging at the
intersection. The software on each computer consists of a vehicle detector, the
proposed tracker, and a finite state machine (FSM) model of the traffic. The
information from the three FSMs is combined to make an autonomous decision
about whether it is safe for the UGV to cross the intersection. Results of actual
UGV experiments are provided to validate the robustness of the proposed system.
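As an illustration of the decision fusion, the sketch below combines three per-road
FSM states into a go/no-go decision; the state names and the all-clear rule are
assumptions, since the abstract does not enumerate the FSM states:

```python
from enum import Enum

class RoadState(Enum):
    """Hypothetical per-road traffic states; the actual FSM states are
    defined in the full system, not in the abstract."""
    CLEAR = 0
    VEHICLE_APPROACHING = 1
    VEHICLE_AT_INTERSECTION = 2

def safe_to_cross(road_states):
    """Cross only when all three road FSMs report CLEAR."""
    return all(s is RoadState.CLEAR for s in road_states)

# Example: one camera still reports an approaching vehicle -> wait.
print(safe_to_cross([RoadState.CLEAR,
                     RoadState.VEHICLE_APPROACHING,
                     RoadState.CLEAR]))  # False
```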
Index terms – visual tracking, adaptive edge-enhanced correlation, active camera,
unmanned ground vehicle.