• DocumentCode
    600218
  • Title
    Quality Assurance for Segmentation and Tagging of Chinese Novels in the Ming and Qing Dynasties
  • Author
    Dan Xiong ; Qin Lu ; Fengju Lo ; Dingxu Shi ; Tin-Shing Chiu
  • Author_Institution
    Dept. of Comput., Hong Kong Polytech. Univ., Hong Kong, China
  • fYear
    2012
  • fDate
    13-15 Nov. 2012
  • Firstpage
    77
  • Lastpage
    80
  • Abstract
    This paper presents a word segmentation and named entity tagging project that annotates Chinese novels of the Ming and Qing dynasties. Computer-aided tools are used to assist the annotation. The paper focuses on the quality assurance measures used to ensure precision and consistency. The specification for word segmentation and named entity tagging is formulated based on the standards for modern Chinese segmentation commonly used in Mainland China and in Taiwan, as well as on an analysis of the differences between classical and modern Chinese. The specification is established through iterative refinement. This refinement process can offer valuable insights into the quality control of computer-aided processing of Chinese literary works from the Ming and Qing dynasties, and it can be applied to works from even earlier periods. The finalized corpus, built using a computer-aided, manually reviewed method in accordance with the specification, can be used for research in literature, linguistics, information technology, and the teaching of Chinese.
  • Keywords
    computer aided analysis; iterative methods; linguistics; literature; natural language processing; quality assurance; quality control; text analysis; word processing; Chinese literature; Chinese novels; Mainland China; Ming dynasty; Qing dynasty; Taiwan; classic Chinese; computer aided processing tool; computer-aided manually-reviewed method; information technology; iterative refinement process; linguistics; modern Chinese; modern Chinese segmentation; named entity tagging project; quality assurance; quality control; word segmentation; Buildings; Educational institutions; Manuals; Pragmatics; Process control; Semantics; Tagging; Quality assurance; named entities; novels in the Ming and Qing dynasties; tagging; word segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Asian Language Processing (IALP), 2012 International Conference on
  • Conference_Location
    Hanoi
  • Print_ISBN
    978-1-4673-6113-2
  • Electronic_ISBN
    978-0-7695-4886-9
  • Type
    conf
  • DOI
    10.1109/IALP.2012.60
  • Filename
    6473700