Text2SQL is Not Enough: Unifying AI and Databases with TAG

A Biswal, L Patel, S Jha, A Kamsetty, S Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
AI systems that serve natural language questions over databases promise to unlock
tremendous value. Such systems would allow users to leverage the powerful reasoning and …

Databases Unbound: Querying All of the World's Bytes with AI

S Madden, M Cafarella, M Franklin… - Proceedings of the VLDB …, 2024 - dl.acm.org
Over the past five decades, the relational database model has proven to be a scalable and
adaptable model for querying a variety of structured data, with use cases in analytics …

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

S Shankar, AG Parameswaran, E Wu - arXiv preprint arXiv:2410.12189, 2024 - arxiv.org
Analyzing unstructured data, such as complex documents, has been a persistent challenge
in data processing. Large Language Models (LLMs) have shown promise in this regard …

Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

Y Xia, J Wu, S Kim, T Yu, RA Rossi, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have been used to generate query expansions augmenting
original queries for improving information search. Recent studies also explore providing …

The Design of an LLM-powered Unstructured Analytics System

E Anderson, J Fritz, A Lee, B Li, M Lindblad… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs demonstrate an uncanny ability to process unstructured data, and as such, have the
potential to go beyond search and run complex, semantic analyses at scale. We describe …

Variable Extraction for Model Recovery in Scientific Literature

C Liu, E Noriega-Atala, A Pyarelal, CT Morrison… - arXiv preprint arXiv …, 2024 - arxiv.org
The global output of academic publications exceeds 5 million articles per year, making it
difficult for humans to keep up with even a tiny fraction of scientific output. We need methods …

Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches

A Mumuni, F Mumuni - arXiv preprint arXiv:2501.03151, 2025 - arxiv.org
Generative artificial intelligence (AI) systems based on large-scale pretrained foundation
models (PFMs) such as vision-language models, large language models (LLMs), diffusion …

Code-Survey: An LLM-Driven Methodology for Analyzing Large-Scale Codebases

Y Zheng, Y Yang, H Tu, Y Huang - arXiv preprint arXiv:2410.01837, 2024 - arxiv.org
Modern software systems like the Linux kernel are among the world's largest and most
intricate codebases, continually evolving with new features and increasing complexity …

TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension

Z Qiu, Y Peng, G He, B Yuan, C Wang - arXiv preprint arXiv:2411.19504, 2024 - arxiv.org
The advent of large language models (LLMs) has unlocked great opportunities in complex
data management tasks, particularly in question answering (QA) over complicated multi …

CHASE: A Native Relational Database for Hybrid Queries on Structured and Unstructured Data

R Ma, K Zhang, Z He, Y Jing, XS Wang… - arXiv preprint arXiv …, 2025 - arxiv.org
Querying both structured and unstructured data has become a new paradigm in data
analytics and recommendation. With unstructured data, such as text and videos, are …