• DocumentCode
    3722612
  • Title

    Document Copy Detection Using the Improved Fuzzy Hashing

  • Author

    Guohua Wu;Ershuai Fu;Liuyang Wang;Mengmeng Zhao

  • Author_Institution
    Sch. of Comput. Sci. &
  • fYear
    2015
  • Firstpage
    55
  • Lastpage
    60
  • Abstract
    Document copy detection is an effective method for protecting intellectual property rights and improving the efficiency of information retrieval. To our knowledge, a common approach is to use the fingerprints of a document in the detection process, so selecting appropriate document fingerprints plays a key role. This paper first describes several mature methods of selecting document fingerprints and analyzes their merits and demerits. We then review the principle of Fuzzy Hashing, which suffers from unstable and inefficient fragmenting. To resolve these problems, we propose a novel algorithm based on Fuzzy Hashing. Compared to the original method, the proposed document copy detection algorithm not only ensures a proper fragment size but also speeds up fragmenting. The algorithm also achieves high performance in terms of efficiency and accuracy.
  • Keywords
    "Fingerprint recognition","Encoding","Algorithm design and analysis","Computer science","Interference","Intellectual property","Information retrieval"
  • Publisher
    ieee
  • Conference_Titel
    Computer Science and Mechanical Automation (CSMA), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/CSMA.2015.18
  • Filename
    7371622