DocumentCode :
3115524
Title :
A basic study on attribute name extraction from the web
Author :
Nakane, Fumitaka ; Otsubo, Masanori ; Hijikata, Yoshinori ; Nishida, Shogo
Author_Institution :
Grad. Sch. of Eng. Sci., Osaka Univ., Toyonaka
fYear :
2008
fDate :
12-15 Oct. 2008
Firstpage :
2161
Lastpage :
2166
Abstract :
A large number of semistructured documents exist on the Web. We can find pages that contain keywords by using a search engine. But when we want to obtain information about an object such as a notebook computer with 1 GB memory, a method is needed that automatically extracts the attribute name (in this example, "memory") and the attribute value (in this example, "1 GB"). In the past, many researchers have examined extracting the attribute values corresponding to each attribute name. This paper describes a method that extracts schemas (sets of attribute names) using a bootstrapping algorithm.
Keywords :
information retrieval; search engines; text analysis; attribute name extraction; bootstrapping algorithm; information extraction; search engine; semistructured Web document; text substring; Data mining; Dictionaries; Hard disks; Personal communication networks; Relational databases; Search engines; Web pages; Information extraction; attribute name; bootstrapping;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location :
Singapore
ISSN :
1062-922X
Print_ISBN :
978-1-4244-2383-5
Electronic_ISBN :
1062-922X
Type :
conf
DOI :
10.1109/ICSMC.2008.4811612
Filename :
4811612