DocumentCode :
1175660
Title :
Olera: semisupervised Web-data extraction with visual support
Author :
Chang, Chia-Hui ; Kuo, Shih-Chien
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Central Univ., Chung-li, Taiwan
Volume :
19
Issue :
6
fYear :
2004
Firstpage :
56
Lastpage :
64
Abstract :
Olera is a semisupervised information-extraction system that produces extraction rules from semistructured Web documents without requiring detailed annotation of the training documents. It performs well for program-generated Web pages with few training pages and limited user intervention.
Keywords :
Internet; Web sites; document handling; information retrieval; information retrieval systems; string matching; Olera semisupervised information-extraction system; program-generated Web pages; semistructured Web document; visual tool; Data mining; Databases; Explosives; Humans; Induction generators; Labeling; Machine learning; Software systems; Web pages; Web data extraction; multiple string alignment; rule generalization; semistructured data
fLanguage :
English
Journal_Title :
IEEE Intelligent Systems
Publisher :
IEEE
ISSN :
1541-1672
Type :
jour
DOI :
10.1109/MIS.2004.71
Filename :
1363735