Detecting the Theft of Natural Language Text Using Birthmark

Author

Yang, Jianlong ; Wang, Jianmin ; Li, Deyi

Author_Institution

Tsinghua University, China

fYear

2006

fDate

Dec. 2006

Firstpage

699

Lastpage

702

Abstract

To detect the theft of natural language text effectively, we present a novel scheme for deriving a birthmark from the text. Since a birthmark is a unique and native characteristic of every text, a text that shares its birthmark with another can readily be suspected of being a copy. Ideally, a birthmark should satisfy two properties: (a) credibility: independent texts must be distinguished by completely different birthmarks, and (b) resilience: the birthmark should be tolerant of meaning-preserving attacks. To evaluate the effectiveness of the proposed birthmark, we conduct two experiments. The first shows that the birthmark successfully distinguishes non-copied files. The second shows that the birthmark tolerates meaning-preserving attacks well.

Keywords

Data mining; Natural languages; Q measurement; Resilience; Signal processing; Software protection; Watermarking

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Information Hiding and Multimedia Signal Processing, 2006. IIH-MSP '06. International Conference on

Conference_Location

Pasadena, CA, USA

Print_ISBN

0-7695-2745-0

Type

conf

DOI

10.1109/IIH-MSP.2006.265097

Filename

4041817

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=2977125