Authors
Guanjun Lin, Jun Zhang, Wei Luo, Lei Pan, Yang Xiang, Olivier De Vel, Paul Montague
Publication date
2018/4/2
Journal
IEEE Transactions on Industrial Informatics
Volume
14
Issue
7
Pages
3289-3297
Publisher
IEEE
Description
Machine learning is now widely used to detect security vulnerabilities in software, even before the software is released. But its potential is often severely compromised at the early stage of a software project, when we face a shortage of high-quality training data and have to rely on overly generic hand-crafted features. This paper addresses this cold-start problem of machine learning by learning rich features that generalize across similar projects. To reach an optimal balance between feature richness and generalizability, we devise a data-driven method that includes the following innovative ideas. First, the code semantics are revealed through serialized abstract syntax trees (ASTs), with tokens encoded by Continuous Bag-of-Words neural embeddings. Next, the serialized ASTs are fed to a sequential deep learning classifier (Bi-LSTM) to obtain a representation indicative of software vulnerability. Finally, the neural …
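The first step named in the description, serializing an AST into a token sequence, can be sketched as follows. This is only an illustration under stated assumptions: the description does not say which parser, target language, or traversal order the authors use, so Python's standard `ast` module and a depth-first pre-order walk stand in here, and the helper names `serialize_ast` and `encode` are hypothetical. The integer ids it produces are the kind of input an embedding layer followed by a Bi-LSTM would typically consume.

```python
import ast


def serialize_ast(source: str) -> list:
    """Flatten a Python AST into node-type names by depth-first pre-order walk.

    Assumption: Python's own `ast` parser is a stand-in; the description
    does not name the parser or traversal the authors actually use.
    """
    tokens = []

    def visit(node: ast.AST) -> None:
        tokens.append(type(node).__name__)      # emit the parent first
        for child in ast.iter_child_nodes(node):
            visit(child)                        # then each child, in order

    visit(ast.parse(source))
    return tokens


def encode(tokens: list, vocab: dict) -> list:
    """Map tokens to integer ids, growing the vocabulary as new tokens appear."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


if __name__ == "__main__":
    vocab = {}
    seq = serialize_ast("def f(buf, n):\n    return buf[n]\n")
    print(seq)
    print(encode(seq, vocab))
```

Each function in a code base would yield one such sequence; sequences from different projects share a vocabulary, which is what makes a learned representation transferable in principle.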
Total citations
[Bar chart: citations per year, 2017 to 2024]
Scholar articles
G Lin, J Zhang, W Luo, L Pan, Y Xiang, O De Vel… - IEEE Transactions on Industrial Informatics, 2018