DocumentCode :
482226
Title :
P2DHMM: A Novel Web Object Information Extraction Model
Author :
Wang, Jing ; Liu, Zhijing
Author_Institution :
Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an
Volume :
1
fYear :
2009
fDate :
22-24 Jan. 2009
Firstpage :
531
Lastpage :
535
Abstract :
Because Web pages differ from plain-text documents, this paper introduces the concept of a Web object. Building on the pseudo two-dimension hidden Markov model (P2D-HMM), the assumed state-transition and emission-symbol conditions are improved, and a novel Web object information extraction method is proposed. Finally, an example shows that the proposed method achieves very high precision for Web object information extraction.
Keywords :
Internet; hidden Markov models; information retrieval; Web object information extraction model; Web page; emission symbol condition; plain text document; pseudo two dimension hidden Markov model; state transition; Computer science; Data mining; Dictionaries; Electronic mail; HTML; Hidden Markov models; Information analysis; Internet; Spatial databases; Web pages; Information Extraction (IE); Pseudo two-dimension Hidden Markov Model (P2D-HMM); Web Object;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Engineering and Technology, 2009. ICCET '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-3334-6
Type :
conf
DOI :
10.1109/ICCET.2009.147
Filename :
4769523