ReConcile: Round-table conference improves reasoning via consensus among diverse LLMs

JCY Chen, S Saha, M Bansal - arXiv preprint arXiv:2309.13007, 2023 - arxiv.org
Large Language Models (LLMs) still struggle with complex reasoning tasks. Motivated by
the society of minds (Minsky, 1988), we propose ReConcile, a multi-model multi-agent …

Improving factuality and reasoning in language models through multiagent debate

Y Du, S Li, A Torralba, JB Tenenbaum… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities in language
generation, understanding, and few-shot learning in recent years. An extensive body of work …

Encouraging divergent thinking in large language models through multi-agent debate

T Liang, Z He, W Jiao, X Wang, Y Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Modern large language models (LLMs) like ChatGPT have shown remarkable performance
on general language tasks but still struggle on complex reasoning tasks, which drives the …

Corex: Pushing the boundaries of complex reasoning through multi-model collaboration

Q Sun, Z Yin, X Li, Z Wu, X Qiu, L Kong - arXiv preprint arXiv:2310.00280, 2023 - arxiv.org
Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited
considerable capability in the realm of natural language processing (NLP) with world …

Language models with rationality

N Kassner, O Tafjord, A Sabharwal… - arXiv preprint arXiv …, 2023 - arxiv.org
While large language models (LLMs) are proficient at question-answering (QA), it is not
always clear how (or even if) an answer follows from their latent "beliefs". This lack of …

Exchange-of-thought: Enhancing large language model capabilities through cross-model communication

Z Yin, Q Sun, C Chang, Q Guo, J Dai… - Proceedings of the …, 2023 - aclanthology.org
Large Language Models (LLMs) have recently made significant strides in complex
reasoning tasks through the Chain-of-Thought technique. Despite this progress, their …

Can ChatGPT Defend the Truth? Automatic Dialectical Evaluation Elicits LLMs' Deficiencies in Reasoning

B Wang, X Yue, H Sun - arXiv preprint arXiv:2305.13160, 2023 - arxiv.org
We explore testing the reasoning ability of large language models (LLMs), such as
ChatGPT, by engaging with them in a debate-like conversation that probes deeper into their …

Can large language models explore in-context?

A Krishnamurthy, K Harris, DJ Foster, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate the extent to which contemporary Large Language Models (LLMs) can
engage in exploration, a core capability in reinforcement learning and decision making. We …

Adapting LLM agents through communication

K Wang, Y Lu, M Santacroce, Y Gong, C Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in large language models (LLMs) have shown potential for human-
like agents. To help these agents adapt to new tasks without extensive human supervision …

Is multi-hop reasoning really explainable? Towards benchmarking reasoning interpretability

X Lv, Y Cao, L Hou, J Li, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
Multi-hop reasoning has been widely studied in recent years to obtain more interpretable
link prediction. However, we find in experiments that many paths given by these models are …