Duplication detection for software bug reports based on topic model

J Zou, L Xu, M Yang, M Yan, D Yang, X Zhang
2016 9th International Conference on Service Science (ICSS), 2016. ieeexplore.ieee.org
Traditional approaches to detecting duplicate bug reports are usually based on the vector space model. However, their experimental results are rarely satisfactory, because this model cannot capture semantic correlation among bug reports, which are written in natural language. A topic model, as a method for modeling the underlying topics of texts, can address this weakness of the document similarity calculations used in information retrieval. It can find the semantic topics among texts from massive training data and obtain the semantic relatedness among documents. Therefore, this paper proposes a novel duplication detection method based on a topic model. By selecting bug reports with execution information and combining them with the classification information of bugs, the new method not only overcomes the problems of high dimensionality, sparse data, and heavy noise, but also avoids the problems of synonymy and ambiguity in natural language. Compared with the traditional SVM method, the recall and precision of the proposed approach are clearly higher, which indicates the effectiveness of the new method.
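The contrast the abstract draws between term matching and topic-level similarity can be illustrated with a minimal sketch. This is not the paper's method: the two topics, the `WORD_TOPICS` table (a hand-written stand-in for what a trained topic model would learn), the averaging used in `topic_mixture`, and the three sample reports are all assumptions made up for illustration.

```python
import math
from collections import Counter

# ASSUMPTION: hypothetical p(topic | word) values standing in for a trained
# topic model. Index 0 = "startup failure" topic, index 1 = "rendering" topic.
WORD_TOPICS = {
    "crashes": (0.9, 0.1),
    "freezes": (0.9, 0.1),
    "startup": (0.8, 0.2),
    "launch": (0.8, 0.2),
    "font": (0.1, 0.9),
    "blurry": (0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def term_vectors(doc_a, doc_b):
    """Raw term-count vectors over the joint vocabulary (vector space model)."""
    counts_a, counts_b = Counter(doc_a.split()), Counter(doc_b.split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[w] for w in vocab], [counts_b[w] for w in vocab]


def topic_mixture(doc):
    """Crude topic mixture: normalized sum of the table rows for known words."""
    rows = [WORD_TOPICS[w] for w in doc.split() if w in WORD_TOPICS]
    if not rows:
        return (0.0, 0.0)
    totals = [sum(column) for column in zip(*rows)]
    scale = sum(totals)
    return tuple(t / scale for t in totals)


# Made-up reports: A and B describe the same failure with different words.
report_a = "editor crashes on startup"
report_b = "program freezes at launch"
report_c = "font looks blurry"

vsm_ab = cosine(*term_vectors(report_a, report_b))
topic_ab = cosine(topic_mixture(report_a), topic_mixture(report_b))
topic_ac = cosine(topic_mixture(report_a), topic_mixture(report_c))

print(f"term-vector similarity A/B:   {vsm_ab:.2f}")
print(f"topic-mixture similarity A/B: {topic_ab:.2f}")
print(f"topic-mixture similarity A/C: {topic_ac:.2f}")
```

Reports A and B share no terms, so their term-vector similarity is zero, while their topic mixtures coincide under the assumed table; report C falls mostly in the other topic. In practice the table would be replaced by topic distributions inferred from a large training corpus.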