"Track only red color car"
We conduct extensive experiments to empirically evaluate the proposed Z-GMOT on the GMOT problem, covering both detection with Open-CSOD and association with MAC-SORT. Our strategy helps bridge the gap between human intention and computer understanding, providing the flexibility to track objects whose distinctive characteristics are specified by the input text.

Trackers | Detectors | #-Shot | HOTA↑ | MOTA↑ | IDF1↑ |
---|---|---|---|---|---|
SORT [Bewley et al., 2016] | OS-OD | one-shot | 30.05 | 20.83 | 33.90 |
SORT [Bewley et al., 2016] | iGLIP (Ours) | zero-shot | 54.21 | 62.90 | 64.34 |
DeepSORT [Wojke et al., 2017] | OS-OD | one-shot | 27.82 | 17.96 | 30.37 |
DeepSORT [Wojke et al., 2017] | iGLIP (Ours) | zero-shot | 50.45 | 58.99 | 57.55 |
ByteTrack [Zhang et al., 2022c] | OS-OD | one-shot | 29.89 | 20.30 | 34.70 |
ByteTrack [Zhang et al., 2022c] | iGLIP (Ours) | zero-shot | 53.69 | 61.49 | 66.21 |
OC-SORT [Cao et al., 2023] | OS-OD | one-shot | 30.35 | 20.60 | 34.37 |
OC-SORT [Cao et al., 2023] | iGLIP (Ours) | zero-shot | 56.51 | 62.76 | 67.40 |
Deep-OCSORT [Maggiolino et al., 2023] | OS-OD | one-shot | 30.37 | 21.10 | 35.12 |
Deep-OCSORT [Maggiolino et al., 2023] | iGLIP (Ours) | zero-shot | 55.89 | 64.02 | 66.52 |
MOTRv2 [Zhang et al., 2023] | OS-OD | one-shot | 23.75 | 13.87 | 25.17 |
MOTRv2 [Zhang et al., 2023] | iGLIP (Ours) | zero-shot | 31.32 | 18.54 | 31.28 |
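
To make the detector comparison concrete, the following minimal sketch computes the per-tracker HOTA gain of iGLIP (zero-shot) over OS-OD (one-shot) from the values reported in the table above. The dictionary layout and the function name `hota_gain` are illustrative choices made here, not part of the original work.

```python
# HOTA values copied from the detector comparison table,
# stored as (OS-OD one-shot, iGLIP zero-shot) per tracker.
HOTA = {
    "SORT": (30.05, 54.21),
    "DeepSORT": (27.82, 50.45),
    "ByteTrack": (29.89, 53.69),
    "OC-SORT": (30.35, 56.51),
    "Deep-OCSORT": (30.37, 55.89),
    "MOTRv2": (23.75, 31.32),
}


def hota_gain(scores):
    """Return the absolute HOTA gain (iGLIP minus OS-OD) for each tracker."""
    return {name: round(iglip - osod, 2) for name, (osod, iglip) in scores.items()}


if __name__ == "__main__":
    gains = hota_gain(HOTA)
    for name, gain in gains.items():
        print(f"{name:12s} +{gain:.2f} HOTA")
    # Tracker that benefits most from swapping the detector.
    print("largest gain:", max(gains, key=gains.get))
```

On these numbers the gain is smallest for MOTRv2 (7.57 HOTA) and exceeds 22 HOTA for each of the other five trackers.
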

Trackers | HOTA↑ | MOTA↑ | IDF1↑ |
---|---|---|---|
SORT [bewley2016simple] | 54.21 | 62.90 | 64.34 |
DeepSORT [wojke2017simple] | 50.45 | 58.99 | 57.55 |
ByteTrack [zhang2021bytetrack] | 53.69 | 61.49 | 66.21 |
OC-SORT [cao2023observation] | 56.51 | 62.76 | 67.40 |
Deep-OCSORT [maggiolino2023deep] | 55.89 | 64.02 | 66.52 |
MOTRv2 [zhang2023motrv2] | 31.32 | 18.54 | 31.28 |
MA-SORT (Ours) | 56.75 | 64.62 | 68.17 |

Tracker | Detector | Train | HOTA | MOTA | IDF1 |
---|---|---|---|---|---|
SORT | FRCNN [ren2015faster] | ✔ | 42.80 | 55.60 | 49.20 |
DeepSORT | FRCNN [ren2015faster] | ✔ | 32.80 | 41.40 | 35.20 |
ByteTrack | YOLOX [yolox2021] | ✔ | 40.10 | 38.50 | 51.20 |
TransTrack | YOLOX [yolox2021] | ✔ | 45.40 | 48.30 | 53.40 |
QDTrack | YOLOX [yolox2021] | ✔ | 47.00 | 55.70 | 56.30 |
MA-SORT (Ours) | YOLOX [yolox2021] | ✔ | 57.86 | 68.32 | 63.01 |
MA-SORT (Ours) | iGLIP (Z-GMOT) (Ours) | ✖ | 53.28 | 57.64 | 58.43 |

Trackers | Detectors | Train | HOTA↑ | MOTA↑ | IDF1↑ |
---|---|---|---|---|---|
SORT [bewley2016simple] | YOLOX [yolox2021] | ✔ | 47.80 | 88.20 | 48.30 |
DeepSORT [wojke2017simple] | YOLOX [yolox2021] | ✔ | 45.80 | 87.10 | 46.80 |
MOTDT [Chen2018RealTimeMP] | YOLOX [yolox2021] | ✔ | 39.20 | 84.30 | 39.60 |
ByteTrack [zhang2021bytetrack] | YOLOX [yolox2021] | ✔ | 47.10 | 88.20 | 51.90 |
OC-SORT [cao2023observation] | YOLOX [yolox2021] | ✔ | 52.10 | 87.30 | 51.60 |
MA-SORT (Ours) | YOLOX [yolox2021] | ✔ | 53.44 | 87.31 | 53.78 |
MA-SORT (Ours) | iGLIP Z-GMOT (Ours) | ✖ | 47.57 | 83.11 | 46.58 |

Trackers | HOTA↑ | MOTA↑ | IDF1↑ |
---|---|---|---|
MeMOT (Cai et al., 2022a) | 54.1 | 63.7 | 66.1 |
FairMOT (Zhang et al., 2021) | 54.6 | 61.8 | 67.3 |
TransTrack (Sun et al., 2020a) | 48.9 | 65.0 | 59.4 |
TrackFormer (Meinhardt et al., 2022b) | 54.7 | 68.6 | 65.7 |
ReMOT (Fan Yang and Nakamura, 2021) | 61.2 | 77.4 | 73.1 |
GSDT (Wang et al., 2020) | 53.6 | 67.1 | 67.5 |
CSTrack (Chao Liang and Zou, 2022) | 54.0 | 66.6 | 68.6 |
TransMOT (Peng Chu and Liu, 2023) | - | 77.4 | 75.2 |
ByteTrack (Zhang et al., 2022c) | 61.3 | 77.8 | 75.2 |
OC-SORT (Cao et al., 2023) | 62.4 | 75.7 | 76.3 |
ByteTrack (Zhang et al., 2022c)† | 60.4 | 74.2 | 74.5 |
OC-SORT (Cao et al., 2023)† | 60.5 | 73.1 | 74.4 |
MA-SORT (Ours) | 61.4 | 77.6 | 75.5 |
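
For readers less familiar with the reported metrics, the sketch below computes MOTA and IDF1 from error counts using their standard definitions (MOTA from the CLEAR MOT metrics, IDF1 from identity-level matching). HOTA is omitted because it requires averaging over localization thresholds. The function names and the example counts are hypothetical and are not taken from the experiments above; the tables report these quantities as percentages.

```python
def mota(fn, fp, idsw, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fn + fp + idsw) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        raise ValueError("no identity matches or errors to score")
    return 2 * idtp / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the formulas.
    print(f"MOTA: {100 * mota(fn=200, fp=100, idsw=20, num_gt=1000):.2f}")
    print(f"IDF1: {100 * idf1(idtp=700, idfp=150, idfn=300):.2f}")
```

Note that MOTA can become negative when the total number of errors exceeds the number of ground-truth objects, whereas IDF1 always lies between 0 and 1.
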