Recent work has shown that explicitly modeling the co-occurrence relationship between
classes is critical for achieving good performance on this task. State-of-the-art approaches
model this relationship using graph convolutional networks, which are complex and computationally
expensive. We propose a novel, efficient association module as an alternative and couple it
with a transformer-based feature-extraction backbone. The proposed model was …