PIMan: A comprehensive approach for establishing plausible influence among software repositories

MOF Rokon, R Islam, MR Masud… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
2022 IEEE/ACM International Conference on Advances in Social …, 2022
How can we quantify the influence among repositories in online archives like GitHub? Determining repository influence is an essential building block for understanding the dynamics of GitHub-like software archives. The key challenge is to define an appropriate representation model of influence that captures the nuances of the concept and considers its diverse manifestations. We propose PIMan, a systematic approach to quantify the influence among the repositories in a software archive by focusing on social-level interactions. As our key novelty, we introduce the concept of Plausible Influence, which considers three types of information: (a) repository-level interactions, (b) author-level interactions, and (c) temporal considerations. We evaluate and apply our method using 2089 malware repositories from GitHub spanning approximately 12 years. First, we show how our approach provides a powerful and flexible way to generate a plausible influence graph whose density is determined by the Plausible Influence Threshold (PIT), which is modifiable to meet the needs of a study. Second, we find that there is significant collaboration and influence among the repositories in our dataset. We identify 28 connected components in the plausible influence graph (PIT = 0.25), with 7% of the components containing at least 15 repositories. Furthermore, we find 19 repositories that influenced at least 10 other repositories directly and spawned at least two “families” of repositories. In addition, the results show that our influence metrics capture the manifold aspects of the interactions that are not captured by typical repository popularity metrics (e.g., the number of stars). Overall, our work is a fundamental building block for identifying the influence and lineage of the repositories in online software platforms.
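The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pairwise scores, the repository names, the function names, and the choice of `>=` for the threshold comparison are all assumptions made for illustration, and the abstract does not say how the scores themselves are computed. The sketch only shows how a threshold such as PIT could control graph density, and how connected components and direct-influence counts could then be read off the resulting graph.

```python
from collections import defaultdict


def influence_graph(scores, pit):
    """Keep directed edges (source, target) whose score meets the threshold.

    Assumption: an edge is kept when score >= pit; the abstract does not
    specify whether the comparison is strict.
    """
    return {edge for edge, score in scores.items() if score >= pit}


def connected_components(edges):
    """Group repositories into weakly connected components (union-find).

    Only repositories that appear in at least one kept edge are included.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in edges:
        parent[find(src)] = find(dst)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def direct_influence_counts(edges):
    """Count how many repositories each source influences directly."""
    counts = defaultdict(int)
    for src, _dst in edges:
        counts[src] += 1
    return dict(counts)


# Hypothetical pairwise influence scores (source repo -> target repo).
scores = {
    ("A", "B"): 0.60,
    ("A", "C"): 0.30,
    ("B", "C"): 0.20,
    ("D", "E"): 0.25,
    ("E", "F"): 0.10,
}

edges = influence_graph(scores, pit=0.25)
components = connected_components(edges)
counts = direct_influence_counts(edges)
```

Raising `pit` drops more edges and yields a sparser graph, while lowering it admits weaker relationships, which mirrors the abstract's point that the threshold can be tuned to the needs of a study.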