A simple transformer-based model for ego4d natural language queries challenge

Authors

Sicheng Mo, Fangzhou Mu, Yin Li

Publication date

2022/11/16

Journal

arXiv preprint arXiv:2211.08704

Description

This report describes Badgers@UW-Madison, our submission to the Ego4D Natural Language Queries (NLQ) Challenge. Our solution inherits the point-based event representation from our prior work on temporal action localization, and develops a Transformer-based model for video grounding. Further, our solution integrates several strong video features including SlowFast, Omnivore and EgoVLP. Without bells and whistles, our submission based on a single model achieves 12.64% Mean R@1 and is ranked 2nd on the public leaderboard. Meanwhile, our method garners 28.45% (18.03%) R@5 at tIoU=0.3 (0.5), surpassing the top-ranked solution by up to 5.5 absolute percentage points.

Total citations

Cited by 8

Citations per year: 2023: 6; 2024: 2

Scholar articles

A simple transformer-based model for ego4d natural language queries challenge

S Mo, F Mu, Y Li - arXiv preprint arXiv:2211.08704, 2022
