Plag-inn: Intrinsic plagiarism detection using grammar trees

M Tschuggnall, G Specht - … on Applications of Natural Language to …, 2012 - Springer
Natural Language Processing and Information Systems: 17th International …, 2012 - Springer
Abstract
Intrinsic plagiarism detection is the task of finding plagiarized sections of a text document without using a reference corpus. This paper describes a novel approach to this task that processes and analyzes the grammar of a suspicious document. The main idea is to split the text into single sentences and to compute a grammar tree for each sentence. To find suspicious sentences, these grammar trees are compared pairwise using the pq-gram distance, an alternative to the tree edit distance, and the results are collected in a distance matrix. Finally, sentences whose grammar differs significantly from the rest, as judged against a Gaussian normal distribution, are marked as suspicious.
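The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: grammar trees are given as nested `(label, children)` tuples (no parser is invoked), the pq-gram profile uses the commonly described stem/base construction with `p=2` and `q=3`, the distance is the normalized bag-based form, and the Gaussian criterion is simplified to "mean distance more than `k` standard deviations above the average". The names `pq_profile`, `pq_gram_distance`, and `suspicious_sentences` are hypothetical.

```python
from collections import Counter
import statistics


def pq_profile(tree, p=2, q=3):
    """Bag of pq-grams of a tree given as (label, [children]); '*' pads gaps."""
    profile = Counter()

    def visit(node, stem):
        label, children = node
        stem = stem[1:] + (label,)          # shift the current label into the stem
        if not children:
            profile[stem + ("*",) * q] += 1  # a leaf yields one all-padding base
            return
        base = ("*",) * q
        for child in children:
            base = base[1:] + (child[0],)    # slide a window of q sibling labels
            profile[stem + base] += 1
            visit(child, stem)
        for _ in range(q - 1):               # trailing windows padded on the right
            base = base[1:] + ("*",)
            profile[stem + base] += 1

    visit(tree, ("*",) * p)
    return profile


def pq_gram_distance(t1, t2, p=2, q=3):
    """Normalized pq-gram distance in [0, 1]; 0 means identical profiles."""
    a, b = pq_profile(t1, p, q), pq_profile(t2, p, q)
    shared = sum((a & b).values())           # bag intersection
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total


def suspicious_sentences(trees, k=2.0, p=2, q=3):
    """Indices of sentences whose mean distance to all others is an outlier.

    Assumption: the threshold is mean + k * standard deviation, a simplified
    stand-in for the Gaussian criterion mentioned in the abstract.
    Requires at least two trees.
    """
    n = len(trees)
    mean_dist = [
        statistics.fmean(
            pq_gram_distance(trees[i], trees[j], p, q) for j in range(n) if j != i
        )
        for i in range(n)
    ]
    mu = statistics.fmean(mean_dist)
    sigma = statistics.pstdev(mean_dist)
    if sigma == 0:
        return []
    return [i for i, d in enumerate(mean_dist) if d > mu + k * sigma]
```

With `k=2.0`, a single deviating sentence can only be flagged when the document has at least six sentences, because the largest possible z-score among n values is (n - 1) / sqrt(n); shorter texts would need a smaller `k`.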