• DocumentCode
    2623754
  • Title

    Resume Parser: Semi-structured Chinese Document Analysis

  • Author

    Chuang, Zhang ; Ming, Wu ; Li Chun Guang ; Xiao Bo ; Zhi-qing, Lin

  • Author_Institution
    Pattern Recognition & Intell. Syst. Lab., Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    5
  • fYear
    2009
  • fDate
    March 31 2009-April 2 2009
  • Firstpage
    12
  • Lastpage
    16
  • Abstract
    Semi-structured Chinese document analysis is an especially difficult task because of the complex document structure and Chinese semantics. Based on the generic characteristics of semi-structured documents and the specific characteristics of resume documents, this paper studies resume document block analysis based on pattern matching, and also proposes multi-level information identification and feedback control algorithms. Based on this research, a resume parser system was implemented for ChinaHR, which is the biggest recruitment website. It can read, analyze, retrieve, and store the information automatically. According to the experimental results, the accuracy and efficiency of the system generally satisfy practical requirements. As research on the processing of semi-structured documents, this work can serve not only as a guide for further research on resume analysis but also as a reference for other forms of semi-structured documents.
  • Keywords
    feedback; grammars; information retrieval; pattern matching; text analysis; Chinese semantics; feedback control algorithm; information retrieval; multilevel information identification; pattern matching; recruitment Website; resume parser; semistructured Chinese document block analysis; text categorisation; Resumes; Text analysis; document analysis; pattern matching; resume parsing; semi-structured;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Engineering, 2009 WRI World Congress on
  • Conference_Location
    Los Angeles, CA
  • Print_ISBN
    978-0-7695-3507-4
  • Type

    conf

  • DOI
    10.1109/CSIE.2009.562
  • Filename
    5170487