Codefusion: A pre-trained diffusion model for code generation

M Singh, J Cambronero, S Gulwani, V Le… - arXiv preprint arXiv …, 2023 - arxiv.org
Imagine a developer who can only change their last line of code, how often would they have
to start writing a function from scratch before it is correct? Auto-regressive models for code …

DataVinci: Learning Syntactic and Semantic String Repairs

M Singh, J Cambronero, S Gulwani, V Le… - arXiv preprint arXiv …, 2023 - arxiv.org
String data is common in real-world datasets: 67.6% of values in a sample of 1.8 million real
Excel spreadsheets from the web were represented as text. Systems that successfully clean …

Semantically Aligned Question and Code Generation for Automated Insight Generation

A Singha, B Chopra, A Khatry, S Gulwani… - arXiv preprint arXiv …, 2024 - arxiv.org
Automated insight generation is a common tactic for helping knowledge workers, such as
data scientists, to quickly understand the potential value of new and unfamiliar data …

4.8 Scalability Estimates of Graph Certificates in a Theorem Prover Using SAT Encodings

A Bauer, K Berčič, F Rabe… - … proofs, algorithms and …, 2024 - drops.dagstuhl.de
Recent advances in artificial intelligence (AI) have ushered in a transformative era in the
cybersecurity landscape. The integration of AI technologies introduces a novel dimension to …

4.4 Finding Suitable Benchmark Problems for Inductive Programming

G Verbruggen - Approaches and Applications of Inductive … - drops.dagstuhl.de
To advance progress as well as visibility of IP, a collection of suitable benchmarks,
convincing use cases, and joint formats to represent problems, as well as starting an IP …