DocumentCode
495491
Title
Application of Symbol Feature-Based HMM in Web Information Extraction
Author
Yongjin, Ma ; Bingyao, Jin
Author_Institution
Zhejiang Normal Univ., Jinhua, China
Volume
4
fYear
2009
fDate
March 31 2009-April 2 2009
Firstpage
173
Lastpage
177
Abstract
This paper proposes a symbol feature-based hidden Markov model (HMM). Each state in the model is represented by a set of symbol features and described by feature lists derived from regular expressions and text inference. Based on this model, we use the Viterbi algorithm to extract information from scientific researchers' homepages. The approach works well even though the pages contain highly redundant information.
Keywords
Internet; hidden Markov models; inference mechanisms; information retrieval; scientific information systems; statistical distributions; Viterbi algorithm; Web information extraction; probability distribution; regular expression; scientific researcher homepage; symbol feature-based hidden Markov model; text inference; Application software; Computer science; Data mining; Dictionaries; Feature extraction; Hidden Markov models; Inference algorithms; Internet; Statistics; XML; Information Extraction; Hidden Markov Model (HMM); Symbol feature
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location
Los Angeles, CA
Print_ISBN
978-0-7695-3507-4
Type
conf
DOI
10.1109/CSIE.2009.98
Filename
5170982