Words are not enough: sentence level natural language watermarking

M Topkara, U Topkara, MJ Atallah - Proceedings of the 4th ACM international workshop on Contents protection and …, 2006 - dl.acm.org
Compared to other media, natural language text presents unique challenges for information hiding. These challenges require the design of a robust algorithm that can work under the following constraints: (i) low embedding bandwidth, i.e., the number of sentences is comparable with the message length; (ii) not all transformations can be applied to a given sentence; and (iii) the number of alternative forms for a sentence is relatively small, a limitation governed by the grammar and vocabulary of the natural language, as well as by the requirement to preserve the style and fluency of the document. The adversary can carry out all the transformations used for embedding in order to remove the embedded message. In addition, the adversary can permute the sentences, select and use a subset of sentences, and insert new sentences. We give a scheme that overcomes these challenges, together with a partial implementation and its evaluation for the English language. The present application of this scheme works at the sentence level while also using a word-level watermarking technique that was recently designed and built into a fully automatic system ("Equimark"). Unlike Equimark, whose resilience relied on the introduction of ambiguities, the present paper's sentence-level technique is better suited to situations where very little change to the text is allowable (i.e., when style is important). Secondarily, this paper shows how to use lower-level (in this case word-level) marking to improve the resilience and embedding properties of higher-level (in this case sentence-level) schemes. We achieve this by using the word-based methods as a separate channel from the sentence-based methods, thereby improving on the results of either one alone. The sentence-level watermarking technique we introduce is novel and powerful, as it relies on multiple features of each sentence and exploits the notion of orthogonality between features.
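The idea of relying on several orthogonal features per sentence can be illustrated with a toy sketch. This is not the paper's algorithm: the two features below (word count and comma parity), the function names, and the HMAC-based selection rule are all assumptions chosen only to show how one feature could decide whether a sentence carries a mark while a second, independently adjustable feature carries the bit. Because each sentence is read on its own, the set of recovered bits does not depend on sentence order.

```python
import hashlib
import hmac
from typing import List, Tuple


def features(sentence: str) -> Tuple[int, int]:
    """Return two toy features (both hypothetical, for illustration only).

    selector: number of whitespace-separated words
    carrier:  parity of the number of commas
    """
    selector = len(sentence.split())
    carrier = sentence.count(",") % 2
    return selector, carrier


def keyed_bit(key: bytes, value: int) -> int:
    """Derive one pseudo-random bit from a secret key and a feature value."""
    digest = hmac.new(key, str(value).encode(), hashlib.sha256).digest()
    return digest[0] & 1


def is_marked(key: bytes, sentence: str) -> bool:
    """A sentence carries a mark when the keyed bit of its selector is 1."""
    selector, _ = features(sentence)
    return keyed_bit(key, selector) == 1


def read_bits(key: bytes, sentences: List[str]) -> List[int]:
    """Read the carrier feature of every mark-carrying sentence."""
    return [features(s)[1] for s in sentences if is_marked(key, s)]


if __name__ == "__main__":
    key = b"secret"
    plain = "However the scheme works"
    edited = "However, the scheme works"  # comma added: carrier flips
    print(features(plain), features(edited))
    print(is_marked(key, plain) == is_marked(key, edited))
```

In this sketch, attaching a comma to a word flips the carrier feature without changing the word count, so the selection decision is unaffected. That independence is the sense in which the two toy features are "orthogonal" here; real sentence-level features would come from syntactic analysis rather than character counts.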