World of code: an infrastructure for mining the universe of open source VCS data

Y Ma, C Bogart, S Amreen, R Zaretzki… - 2019 IEEE/ACM 16th …, 2019 - ieeexplore.ieee.org
Open source software (OSS) is essential for modern society and, while substantial research
has been done on individual (typically central) projects, only a limited understanding of the …

World of code: enabling a research workflow for mining and analyzing the universe of open source VCS data

Y Ma, T Dey, C Bogart, S Amreen, M Valiev… - Empirical Software …, 2021 - Springer
Open source software (OSS) is essential for modern society and, while substantial research
has been done on individual (typically central) projects, only a limited understanding of the …

Collective program analysis

G Upadhyaya, H Rajan - … of the 40th international conference on …, 2018 - dl.acm.org
Popularity of data-driven software engineering has led to an increasing demand on the
infrastructures to support efficient execution of tasks that require deeper source code …

Semantics and anomaly preserving sampling strategy for large-scale time series data

S Ahmed, MJ Islam, H Rajan - ACM/IMS Transactions on Data Science …, 2022 - dl.acm.org
We propose PASS, an O(n) algorithm for data reduction that is specifically aimed at
preserving the semantics of time series data visualization in the form of line chart …

BCFA: bespoke control flow analysis for CFA at scale

R Ramu, GB Upadhyaya, HA Nguyen… - Proceedings of the ACM …, 2020 - dl.acm.org
Many data-driven software engineering tasks such as discovering programming patterns,
mining API specifications, etc., perform source code analysis over control flow graphs …

Software Supply Chain Development and Application

Y Ma - 2020 - trace.tennessee.edu
Motivation: Free Libre Open Source Software (FLOSS) has become a critical
component in numerous devices and applications. Despite its importance, it is not clear why …

BoaT: A domain specific language and shared data science infrastructure for large scale transportation data analysis

J Islam - 2019 - search.proquest.com
Big data-driven transportation engineering has the potential to improve utilization of road
infrastructure, decrease traffic fatalities, improve fuel consumption, and decrease …

Domain-specific language and infrastructure for genomics

H Bagheri - 2019 - search.proquest.com
Creating a scalable computational infrastructure to analyze the wealth of information
contained in data repositories is difficult due to significant barriers in organizing, extracting …

A hybrid approach for selecting and optimizing graph traversal strategy for analyzing big code

R Ramu - 2017 - search.proquest.com
Our newfound ability to analyze source code in massive software repositories such as
GitHub has led to an uptick in data-driven solutions to software engineering problems …