Enhancing authorship attribution by utilizing syntax tree profiles

M Tschuggnall, G Specht - Proceedings of the 14th Conference of …, 2014 - aclanthology.org
Abstract
The aim of modern authorship attribution approaches is to analyze known authors and to assign authorship of previously unseen, unlabeled text documents based on various features. In this paper we present a novel feature that enhances current attribution methods by analyzing the grammar of authors. To extract the feature, a syntax tree is computed for each sentence of a document and then split into length-independent patterns using pq-grams. The most frequently used pq-grams then compose sample profiles of authors, which are compared with the profile of the unlabeled document using various distance metrics and similarity scores. An evaluation on three different, independent data sets reveals promising results and indicates that the grammar of authors is a significant feature for enhancing modern authorship attribution methods.
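The pipeline described in the abstract (syntax trees, pq-grams, frequency profiles, profile comparison) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: trees are hand-written `(label, children)` tuples rather than parser output, the function names `pq_grams`, `profile`, `distance`, and `attribute` are hypothetical, `*` is used as the dummy label for missing ancestors and children, and the distance is a simple sum of absolute frequency differences standing in for the "various distance metrics" the abstract mentions.

```python
from collections import Counter


def pq_grams(tree, p=2, q=3):
    """List the pq-grams of a (label, children) tree, with p >= 1 and q >= 1.

    Each pq-gram is a tuple of p + q labels: the node and its p - 1
    ancestors (the stem) followed by q consecutive children (the base).
    Missing ancestors or children are padded with the dummy label "*".
    """
    grams = []

    def visit(node, ancestors):
        label, children = node
        stem = (ancestors + (label,))[-p:]
        base = ("*",) * q
        if not children:
            grams.append(stem + base)
            return
        for child in children:
            base = base[1:] + (child[0],)   # slide the window over the children
            grams.append(stem + base)
            visit(child, stem)
        for _ in range(q - 1):              # let the window run off the right edge
            base = base[1:] + ("*",)
            grams.append(stem + base)

    visit(tree, ("*",) * p)
    return grams


def profile(trees, p=2, q=3, top=None):
    """Relative frequencies of the `top` most common pq-grams (all if None)."""
    counts = Counter(g for t in trees for g in pq_grams(t, p, q))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.most_common(top)}


def distance(a, b):
    """Assumed metric: sum of absolute frequency differences."""
    return sum(abs(a.get(g, 0.0) - b.get(g, 0.0)) for g in set(a) | set(b))


def attribute(doc_trees, author_trees, p=2, q=3, top=None):
    """Return the author whose profile is closest to the document's profile."""
    doc = profile(doc_trees, p, q, top)
    return min(
        author_trees,
        key=lambda name: distance(doc, profile(author_trees[name], p, q, top)),
    )


# Toy data: one sentence tree per "author", written by hand.
tree_a = ("S", [("NP", []), ("VP", [])])
tree_b = ("S", [("VP", []), ("NP", [])])
authors = {"A": [tree_a], "B": [tree_b]}
print(attribute([tree_a], authors, p=2, q=2))
```

In practice the trees would come from a syntactic parser, and the values of p and q, the profile size, and the comparison measure are tuning choices; the sketch leaves all of them as parameters.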