gaining popularity within the construction domain. However, their relatively low per-frame
performance necessitates additional post-processing to link the detection results across frames
and construct the corresponding action tubes. To address this problem, this study proposes
DIGER, which stands for knowledge DIstillation of temporal Gradient data for Excavator
activity Recognition. DIGER is built upon the You Only Watch Once activity recognition …