Title :
Application of Symbol Feature-Based HMM in Web Information Extraction
Author :
Yongjin, Ma ; Bingyao, Jin
Author_Institution :
Zhejiang Normal Univ., Jinhua, China
fDate :
March 31 2009-April 2 2009
Abstract :
This paper proposes a symbol feature-based hidden Markov model (HMM). Each state in the model is represented by a set of symbol features and is described by feature lists derived from regular expressions and text inference. On this basis, the Viterbi algorithm is used to extract information from scientific researchers' homepages. The method works well even though the pages contain a great deal of redundant information.
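The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: tokens are mapped to coarse symbols with regular expressions, and a standard Viterbi decoder labels the symbol sequence. The state names, regex patterns, probabilities, and the helper names `to_symbol` and `viterbi` are all hypothetical and are not taken from the paper.

```python
import math
import re

# Hypothetical symbol features: each token is reduced to a coarse symbol.
SYMBOL_PATTERNS = [
    ("EMAIL", re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")),
    ("PHONE", re.compile(r"^\+?[\d\-() ]{7,}$")),
    ("CAPWORD", re.compile(r"^[A-Z][a-z]+$")),
]


def to_symbol(token):
    """Return the name of the first matching pattern, or "OTHER"."""
    for name, pattern in SYMBOL_PATTERNS:
        if pattern.match(token):
            return name
    return "OTHER"


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Most likely state path for the observations (log-space Viterbi)."""
    if not observations:
        return []

    def emit(state, obs):
        # Unseen symbols get a small floor probability instead of zero.
        return math.log(emit_p[state].get(obs, floor))

    scores = {s: math.log(start_p[s]) + emit(s, observations[0]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = scores[best_prev] + math.log(trans_p[best_prev][s]) + emit(s, obs)
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model with made-up probabilities (assumptions for illustration only).
STATES = ["NAME", "EMAIL", "PHONE", "NOISE"]
START_P = {"NAME": 0.4, "EMAIL": 0.1, "PHONE": 0.1, "NOISE": 0.4}
TRANS_P = {
    "NAME": {"NAME": 0.4, "EMAIL": 0.1, "PHONE": 0.1, "NOISE": 0.4},
    "EMAIL": {"NAME": 0.1, "EMAIL": 0.1, "PHONE": 0.4, "NOISE": 0.4},
    "PHONE": {"NAME": 0.1, "EMAIL": 0.3, "PHONE": 0.1, "NOISE": 0.5},
    "NOISE": {"NAME": 0.2, "EMAIL": 0.2, "PHONE": 0.2, "NOISE": 0.4},
}
EMIT_P = {
    "NAME": {"CAPWORD": 0.9, "OTHER": 0.1},
    "EMAIL": {"EMAIL": 0.95, "OTHER": 0.05},
    "PHONE": {"PHONE": 0.95, "OTHER": 0.05},
    "NOISE": {"OTHER": 0.7, "CAPWORD": 0.2, "EMAIL": 0.05, "PHONE": 0.05},
}

tokens = ["Yongjin", "Ma", "welcome", "ma@example.edu", "555-123-4567"]
symbols = [to_symbol(t) for t in tokens]
labels = viterbi(symbols, STATES, START_P, TRANS_P, EMIT_P)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

With these made-up numbers, the redundant token "welcome" is absorbed by the NOISE state while the name, e-mail, and phone tokens receive their own labels. In a real system the probabilities would be estimated from labeled pages rather than set by hand.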
Keywords :
Internet; hidden Markov models; inference mechanisms; information retrieval; scientific information systems; statistical distributions; Viterbi algorithm; Web information extraction; probability distribution; regular expression; scientific researcher homepage; symbol feature-based hidden Markov model; text inference; Application software; Computer science; Data mining; Dictionaries; Feature extraction; Hidden Markov models; Inference algorithms; Internet; Statistics; XML; Information Extraction; Hidden Markov Model (HMM); Symbol feature
Conference_Title :
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3507-4
DOI :
10.1109/CSIE.2009.98