Detecting plagiarism in text documents through grammar-analysis of authors

M Tschuggnall, G Specht - 2013 - dl.gi.de
Abstract
The task of intrinsic plagiarism detection is to find plagiarized sections within text documents without using a reference corpus. This paper presents the intrinsic detection approach Plag-Inn, which is based on the assumption that authors use a recognizable and distinguishable grammar to construct sentences. The main idea is to analyze the grammar of text documents and to find irregularities in the syntax of sentences, regardless of the concrete words used. Suspicious sentences are identified by computing the pq-gram distance of grammar trees and by utilizing a Gaussian normal distribution; the algorithm then tries to select and combine those sentences into potentially plagiarized sections. The parameters and thresholds needed by the algorithm are optimized using genetic algorithms. Finally, the approach is evaluated against a large test corpus of English documents, showing promising results.
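The detection step described in the abstract can be illustrated with a minimal sketch. It is not the Plag-Inn implementation; it is one plausible reading of the abstract, under several assumptions: grammar trees are given as hand-written `(label, children)` tuples rather than produced by a parser, the pq-gram profile follows the commonly described construction with `*` as the dummy label, each sentence is scored by its mean distance to all other sentences, and a sentence is flagged when that score exceeds the mean by `k` standard deviations. The names `pq_profile`, `pq_distance`, `suspicious_sentences` and the defaults `p=2`, `q=3`, `k=1.0` are hypothetical; the paper tunes its parameters with genetic algorithms, and the step that combines flagged sentences into sections is omitted here.

```python
from collections import Counter
from statistics import mean, pstdev


def pq_profile(tree, p=2, q=3):
    """Return the bag (Counter) of pq-gram label tuples of a tree.

    A tree is a (label, children) pair; children is a list of trees.
    "*" is the dummy label used to pad stems and bases.
    """
    grams = Counter()

    def visit(node, stem):
        label, children = node
        stem = stem[1:] + (label,)          # shift this node into the stem
        base = ("*",) * q
        if not children:                    # leaf: one gram with an all-dummy base
            grams[stem + base] += 1
            return
        for child in children:              # slide a window of q over the children
            base = base[1:] + (child[0],)
            grams[stem + base] += 1
            visit(child, stem)
        for _ in range(q - 1):              # trailing dummy children
            base = base[1:] + ("*",)
            grams[stem + base] += 1

    visit(tree, ("*",) * p)
    return grams


def pq_distance(t1, t2, p=2, q=3):
    """pq-gram distance: 0.0 for identical profiles, 1.0 for disjoint ones."""
    a, b = pq_profile(t1, p, q), pq_profile(t2, p, q)
    shared = sum((a & b).values())          # bag intersection
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total


def suspicious_sentences(trees, k=1.0):
    """Indices of sentences whose mean distance to the others is unusually high.

    Assumption: "unusually high" means above mu + k * sigma of the per-sentence
    mean distances. Requires at least two trees.
    """
    n = len(trees)
    avg = [
        mean(pq_distance(trees[i], trees[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    mu, sigma = mean(avg), pstdev(avg)
    return [i for i, d in enumerate(avg) if d > mu + k * sigma]


# Toy input: four sentences with the same structure and one that differs.
plain = ("S", [("NP", [("DT", []), ("NN", [])]), ("VP", [("VBD", [])])])
odd = ("S", [("SBAR", [("IN", []), ("S", [("VP", [("VBG", [])])])]),
             ("NP", [("PRP", [])]),
             ("VP", [("MD", []), ("VB", [])])])
flagged = suspicious_sentences([plain, plain, plain, plain, odd])
```

In this toy input only the structurally different sentence ends up in `flagged`. With real documents the trees would come from a syntactic parser, and the choice of `k` strongly affects how many sentences are reported.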