Big code != big vocabulary: Open-vocabulary models for source code

RM Karampatsis, H Babii, R Robbes, C Sutton… - Proceedings of the …, 2020 - dl.acm.org
Statistical language modeling techniques have successfully been applied to large source
code corpora, yielding a variety of new software development tools, such as tools for code …

Learning features by watching objects move

D Pathak, R Girshick, P Dollár… - Proceedings of the …, 2017 - openaccess.thecvf.com
This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired
by the human visual system, we explore whether low-level motion-based grouping cues can …

SymLM: Predicting function names in stripped binaries via context-sensitive execution-aware code embeddings

X Jin, K Pei, JY Won, Z Lin - Proceedings of the 2022 ACM SIGSAC …, 2022 - dl.acm.org
Predicting function names in stripped binaries is an extremely useful but challenging task, as
it requires summarizing the execution behavior and semantics of the function in human …

Improving bug localization using structured information retrieval

RK Saha, M Lease, S Khurshid… - 2013 28th IEEE/ACM …, 2013 - ieeexplore.ieee.org
Locating bugs is important, difficult, and expensive, particularly for large-scale systems. To
address this, natural language information retrieval techniques are increasingly being used …

Learning to rank relevant files for bug reports using domain knowledge

X Ye, R Bunescu, C Liu - Proceedings of the 22nd ACM SIGSOFT …, 2014 - dl.acm.org
When a new bug report is received, developers usually need to reproduce the bug and
perform code reviews to find the cause, a process that can be tedious and time consuming …

Towards automatically generating summary comments for Java methods

G Sridhara, E Hill, D Muppaneni, L Pollock… - Proceedings of the 25th …, 2010 - dl.acm.org
Studies have shown that good comments can help programmers quickly understand what a
method does, aiding program comprehension and software maintenance. Unfortunately, few …

On the localness of software

Z Tu, Z Su, P Devanbu - Proceedings of the 22nd ACM SIGSOFT …, 2014 - dl.acm.org
The n-gram language model, which has its roots in statistical natural language processing,
has been shown to successfully capture the repetitive and predictable regularities …

Retrieval from software libraries for bug localization: a comparative study of generic and composite text models

S Rao, A Kak - Proceedings of the 8th Working Conference on Mining …, 2011 - dl.acm.org
From the standpoint of retrieval from large software libraries for the purpose of bug
localization, we compare five generic text models and certain composite variations thereof …

Improving IR-based bug localization with context-aware query reformulation

MM Rahman, CK Roy - Proceedings of the 2018 26th ACM joint meeting …, 2018 - dl.acm.org
Recent findings suggest that Information Retrieval (IR)-based bug localization techniques do
not perform well if the bug report lacks rich structured information (e.g., relevant program …

Automatically assessing code understandability: How far are we?

S Scalabrino, G Bavota, C Vendome… - 2017 32nd IEEE …, 2017 - ieeexplore.ieee.org
Program understanding plays a pivotal role in software maintenance and evolution: a deep
understanding of code is the stepping stone for most software-related activities, such as bug …