DS-1000: A natural and reliable benchmark for data science code generation

Y Lai, C Li, Y Wang, T Zhang, R Zhong… - International …, 2023 - proceedings.mlr.press
We introduce DS-1000, a code generation benchmark with a thousand data science
problems spanning seven Python libraries, such as NumPy and Pandas. Compared to prior …

Cypher: An evolving query language for property graphs

N Francis, A Green, P Guagliardo, L Libkin… - Proceedings of the …, 2018 - dl.acm.org
The Cypher property graph query language is an evolving language, originally designed
and implemented as part of the Neo4j graph database, and it is currently used by several …

Synthesizing highly expressive SQL queries from input-output examples

C Wang, A Cheung, R Bodik - Proceedings of the 38th ACM SIGPLAN …, 2017 - dl.acm.org
SQL is the de facto language for manipulating relational data. Though powerful, many users
find it difficult to write SQL queries due to highly expressive constructs. While using the …

Natural language to SQL: Where are we today?

H Kim, BH So, WS Han, H Lee - Proceedings of the VLDB Endowment, 2020 - dl.acm.org
Translating natural language to SQL (NL2SQL) has received extensive attention lately,
especially with the recent success of deep learning technologies. However, despite the …

EVA: A symbolic approach to accelerating exploratory video analytics with materialized views

Z Xu, GT Kakkar, J Arulraj… - Proceedings of the 2022 …, 2022 - dl.acm.org
Advances in deep learning have led to a resurgence of interest in video analytics. In an
exploratory video analytics pipeline, a data scientist often starts by searching for a global …

Debugging database queries: A survey of tools, techniques, and users

S Gathani, P Lim, L Battle - Proceedings of the 2020 CHI Conference on …, 2020 - dl.acm.org
Database management systems (or DBMSs) have been around for decades, and yet are still
difficult to use, particularly when trying to identify and fix errors in user programs (or queries) …

Automatic view generation with deep learning and reinforcement learning

H Yuan, G Li, L Feng, J Sun… - 2020 IEEE 36th …, 2020 - ieeexplore.ieee.org
Materializing views is an important method to reduce redundant computations in DBMS,
especially for processing large scale analytical queries. However, many existing methods …

Quantifying TPC-H choke points and their optimizations

M Dreseler, M Boissier, T Rabl, M Uflacker - Proceedings of the VLDB …, 2020 - dl.acm.org
TPC-H continues to be the most widely used benchmark for relational OLAP systems. It
poses a number of challenges, also known as "choke points", which database systems have …

WeTune: Automatic discovery and verification of query rewrite rules

Z Wang, Z Zhou, Y Yang, H Ding, G Hu, D Ding… - Proceedings of the …, 2022 - dl.acm.org
Query rewriting transforms a relational database query into an equivalent but more efficient
one, which is crucial for the performance of database-backed applications. Such rewriting …

Genie: A generator of natural language semantic parsers for virtual assistant commands

G Campagna, S Xu, M Moradshahi, R Socher… - Proceedings of the 40th …, 2019 - dl.acm.org
To understand diverse natural language commands, virtual assistants today are trained with
numerous labor-intensive, manually annotated sentences. This paper presents a …