to multi-object tracking. For each object, we use an individual tracker to estimate the
position. Two different pre-trained networks are used as feature extractors. Both the peak and
the oscillation of the response are considered to validate the tracking result. When the object
is lost, the discriminative appearance model learned by the DCF is used as part of the
feature representation between the object and the detections for data association. In order to …