Spatial-temporal knowledge-embedded transformer for video scene graph generation

T Pu, T Chen, H Wu, Y Lu, L Lin - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer
their relationships for a given video. It requires not only a comprehensive understanding of …

QDETRv: Query-Guided DETR for One-Shot Object Localization in Videos

Y Kumar, S Mallick, A Mishra, S Rasipuram… - Proceedings of the …, 2024 - ojs.aaai.org
In this work, we study the one-shot video object localization problem, which aims to localize
instances of unseen objects in the target video using a single query image of the object …

PDSE-Lite: lightweight framework for plant disease severity estimation based on Convolutional Autoencoder and Few-Shot Learning

P Bedi, P Gole, S Marwaha - Frontiers in Plant Science, 2024 - frontiersin.org
Plant disease diagnosis with estimation of disease severity at early stages still remains a
significant research challenge in agriculture. It is helpful in diagnosing plant diseases at the …

CHAPVIDMR: Chapter-based Video Moment Retrieval using Natural Language Queries

U Agarwal, Y Kumar, A Shahid, P Gatti… - Proceedings of the …, 2024 - dl.acm.org
Video Moment Retrieval (VMR) is the task of linking a query with a relevant moment from a
video. Although, recently, there has been work on the VMR task where a query is linked to a …

Supplementary Material: Few-Shot Referring Relationships in Videos

Y Kumar, A Mishra - vl2g.github.io
The goal of this module is to propose relationship pair proposals. Given a video v, the first
tracklet set Te is extracted using a pre-trained detector. An attention score is calculated for …