
Style Analysis for Source Code Plagiarism Detection

Authors

Olfat Mirza, Mike Joy

Publication date

2015

Conference

Plagiarism across Europe and Beyond

Pages

53–61

Description

The enormous growth in available online code resources has created new challenges for detecting plagiarism in program source code. Several software applications can detect source code similarity using different detection methods. However, few current detection tools detect every kind of plagiarism attack. The aim of this thesis is, therefore, to enhance methods for plagiarism detection in source code using a style analysis approach that has been used to detect authorship. There are very few large source-code datasets which are suitable for research purposes; two such datasets are the BlackBox dataset and the SOCO (Detection of SOurce COde) dataset. SOCO is a benchmark dataset that contains groups of similar source-code files that can be considered plagiarised and has been used in authorship and plagiarism detection competitions. In the first part of the thesis, the suitability of BlackBox as a source of datasets for testing plagiarism detection is explored. The files in BlackBox were analysed and visualised in order to evaluate its suitability as a dataset for this research. The analysis aimed to identify similar source code files, and therefore to detect groups of Java files within BlackBox that can be used for evaluating the performance of source-code plagiarism detection methods. In the second part of the thesis, a plagiarism detection framework ("the Metric-File Matrix Framework (MFM)") is proposed. The MFM framework is designed to overcome some of the limitations of existing plagiarism detection methods by 1) proposing a new set of metrics which consider structural and stylistic similarities; and 2 …

Total citations

Cited by 4


Scholar articles

Style analysis for source code plagiarism detection

OM Mirza - 2018
