SecurityEval dataset: mining vulnerability examples to evaluate machine learning-based code generation techniques

ML Siddiq, JCS Santos - Proceedings of the 1st International Workshop …, 2022 - dl.acm.org
Automated source code generation is currently a popular machine-learning-based task. It
can be helpful for software developers to write functionally correct code from a given context …

Detecting code vulnerabilities by learning from large-scale open source repositories

R Xu, Z Tang, G Ye, H Wang, X Ke, D Fang… - Journal of Information …, 2022 - Elsevier
Machine learning methods are widely used to identify common, repeatedly
occurring bugs and code vulnerabilities. The performance of a machine-learned model is …

Vulnerability prediction from source code using machine learning

Z Bilgin, MA Ersoy, EU Soykan, E Tomur… - IEEE …, 2020 - ieeexplore.ieee.org
As the role of information and communication technologies gradually increases in our lives,
software security becomes a major issue to provide protection against malicious attempts …

AI-powered vulnerability detection for secure source code development

S Rajapaksha, J Senanayake, H Kalutarage… - International Conference …, 2022 - Springer
Vulnerable source code in software applications is causing paramount reliability and
security issues. Software security principles should be integrated to reduce these issues at …

Machine learning for source code vulnerability detection: What works and what isn't there yet

T Marjanov, I Pashchenko… - IEEE Security & Privacy, 2022 - ieeexplore.ieee.org
We review machine learning approaches for detecting (and correcting) vulnerabilities in
source code, finding that the biggest challenges ahead involve agreeing to a benchmark …

VUDENC: vulnerability detection with deep learning on a natural codebase for Python

L Wartschinski, Y Noller, T Vogel, T Kehrer… - Information and …, 2022 - Elsevier
Context: Identifying potential vulnerable code is important to improve the security of our
software systems. However, the manual detection of software vulnerabilities requires expert …

Security weaknesses of Copilot generated code in GitHub

Y Fu, P Liang, A Tahir, Z Li, M Shahin, J Yu - arXiv preprint arXiv …, 2023 - arxiv.org
Modern code generation tools use AI models, particularly Large Language Models (LLMs),
to generate functional and complete code. While such tools are becoming popular and …

Is GitHub's Copilot as bad as humans at introducing vulnerabilities in code?

O Asare, M Nagappan, N Asokan - Empirical Software Engineering, 2023 - Springer
Several advances in deep learning have been successfully applied to the software
development process. Of recent interest is the use of neural language models to build tools …

LLMSecEval: A dataset of natural language prompts for security evaluations

C Tony, M Mutas, NED Ferreyra… - 2023 IEEE/ACM 20th …, 2023 - ieeexplore.ieee.org
Large Language Models (LLMs) like Codex are powerful tools for performing code
completion and code generation tasks as they are trained on billions of lines of code from …

A lightweight framework for high-quality code generation

ML Siddiq, B Casey, J Santos - arXiv preprint arXiv:2307.08220, 2023 - arxiv.org
In recent years, the use of automated source code generation utilizing transformer-based
generative models has expanded, and these models can generate functional code …