• DocumentCode
    495491
  • Title

    Application of Symbol Feature-Based HMM in Web Information Extraction

  • Author

    Yongjin, Ma; Bingyao, Jin

  • Author_Institution
    Zhejiang Normal Univ., Jinhua, China
  • Volume
    4
  • fYear
    2009
  • fDate
    March 31 2009-April 2 2009
  • Firstpage
    173
  • Lastpage
    177
  • Abstract
    This paper proposes a symbol feature-based hidden Markov model (HMM). Each state in the model is expressed by a set of symbol features and described by feature lists drawn from regular expressions and text inference. On this basis, the Viterbi algorithm is used to extract information from scientific researchers' homepages. The method works well even though the pages contain a great deal of redundant information.
  • Keywords
    Internet; hidden Markov models; inference mechanisms; information retrieval; scientific information systems; statistical distributions; Viterbi algorithm; Web information extraction; probability distribution; regular expression; scientific researcher homepage; symbol feature-based hidden Markov model; text inference; Application software; Computer science; Data mining; Dictionaries; Feature extraction; Hidden Markov models; Inference algorithms; Internet; Statistics; XML; Information Extraction; Hidden Markov Model (HMM); Symbol feature
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Engineering, 2009 WRI World Congress on
  • Conference_Location
    Los Angeles, CA
  • Print_ISBN
    978-0-7695-3507-4
  • Type

    conf

  • DOI
    10.1109/CSIE.2009.98
  • Filename
    5170982