TORQUE: A reading comprehension dataset of temporal ordering questions

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org

Alongside huge volumes of research on deep learning models in NLP in the recent years,
there has been much work on benchmark datasets needed to track modeling progress …

Cited by 220 · Related articles · All 6 versions

[PDF] arxiv.org

Benchmarks for automated commonsense reasoning: A survey

E Davis - ACM Computing Surveys, 2023 - dl.acm.org

More than one hundred benchmarks have been developed to test the commonsense
knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …

Cited by 53 · Related articles · All 4 versions

[PDF] mit.edu

Time-aware language models as temporal knowledge bases

B Dhingra, JR Cole, JM Eisenschlos… - Transactions of the …, 2022 - direct.mit.edu

Many facts come with an expiration date, from the name of the President to the basketball
team Lebron James plays for. However, most language models (LMs) are trained on …

Cited by 247 · Related articles · All 10 versions

[PDF] arxiv.org

Towards benchmarking and improving the temporal reasoning capability of large language models

Q Tan, HT Ng, L Bing - arXiv preprint arXiv:2306.08952, 2023 - arxiv.org

Reasoning about time is of fundamental importance. Many facts are time-dependent. For
example, athletes change teams from time to time, and different government officials are …

Cited by 57 · Related articles · All 4 versions

[PDF] neurips.cc

Language models can improve event prediction by few-shot abductive reasoning

X Shi, S Xue, K Wang, F Zhou… - Advances in …, 2024 - proceedings.neurips.cc

Large language models have shown astonishing performance on a wide range of reasoning
tasks. In this paper, we investigate whether they could reason about real-world events and …

Cited by 37 · Related articles · All 6 versions

[PDF] thecvf.com

Test of time: Instilling video-language models with a sense of time

P Bagad, M Tapaswi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Modelling and understanding time remains a challenge in contemporary video
understanding models. With language emerging as a key driver towards powerful …

Cited by 37 · Related articles · All 9 versions

[PDF] mlr.press

StreamingQA: A benchmark for adaptation to new knowledge over time in question answering models

A Liska, T Kocisky, E Gribovskaya… - International …, 2022 - proceedings.mlr.press

Knowledge and language understanding of models evaluated through question
answering (QA) has been usually studied on static snapshots of knowledge, like Wikipedia …

Cited by 56 · Related articles · All 3 versions

[PDF] arxiv.org

A dataset for answering time-sensitive questions

W Chen, X Wang, WY Wang - arXiv preprint arXiv:2108.06314, 2021 - arxiv.org

Time is an important dimension in our physical world. Lots of facts can evolve with respect to
time. For example, the US President might change every four years. Therefore, it is important …

Cited by 85 · Related articles · All 5 versions

[PDF] hal.science

[CITATION] Reasoning with transformer-based models: Deep learning, but shallow reasoning

C Helwe, C Clavel, F Suchanek - International Conference on …, 2021 - imt.hal.science

Recent years have seen impressive performance of transformer-based models on different
natural language processing tasks. However, it is not clear to what degree the transformers …

Cited by 65 · Related articles · All 13 versions

[PDF] arxiv.org

TIMEDIAL: Temporal commonsense reasoning in dialog

L Qin, A Gupta, S Upadhyay, L He, Y Choi… - arXiv preprint arXiv …, 2021 - arxiv.org

Everyday conversations require understanding everyday events, which in turn, requires
understanding temporal commonsense concepts interwoven with those events. Despite …

Cited by 64 · Related articles · All 7 versions
