E Davis - ACM Computing Surveys, 2023 - dl.acm.org
More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture abstract knowledge about world's physics to solve decision-making problems. Yet, the …
Recent efforts have incorporated large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …
BY Lin, Y Fu, K Yang, F Brahman… - Advances in …, 2024 - proceedings.neurips.cc
We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition, designed to excel in action planning for complex interactive reasoning …
Open large language models (LLMs) with great performance in various tasks have significantly advanced the development of LLMs. However, they are far inferior to …
Recent advancements in Large Language Models (LLMs) showcase advanced reasoning, yet NLP evaluations often depend on static benchmarks. Evaluating this necessitates …
X Huang, W Liu, X Chen, X Wang, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) have shown significant intelligence, the progress to leverage LLMs as planning modules of autonomous agents has attracted more attention …
Language models are capable of commonsense reasoning: while domain-specific models can learn from explicit knowledge (e.g. commonsense graphs [6], ethical norms [25]) …
Large language models (LLMs) have achieved remarkable performance on various NLP tasks and are augmented by tools for broader applications. Yet, how to evaluate and …