Regional Information Center for Science and Technology - A basic study on attribute name extraction from the web

DocumentCode :

3115524

Title :

A basic study on attribute name extraction from the web

Author :

Nakane, Fumitaka ; Otsubo, Masanori ; Hijikata, Yoshinori ; Nishida, Shogo

Author_Institution :

Grad. Sch. of Eng. Sci., Osaka Univ., Toyonaka

fYear :

2008

fDate :

12-15 Oct. 2008

Firstpage :

2161

Lastpage :

2166

Abstract :

A large number of semistructured documents exist on the Web. We can find pages that contain keywords by using a search engine. But when we want to obtain information about an object such as a notebook computer with 1 GB of memory, a method is needed that automatically extracts the attribute name (in this example, "memory") and the attribute value (in this example, "1 GB"). In the past, many researchers have examined extracting the attribute values corresponding to each attribute name. This paper describes a method that extracts schemas (sets of attribute names) using a bootstrapping algorithm.

Keywords :

information retrieval; search engines; text analysis; attribute name extraction; bootstrapping algorithm; information extraction; search engine; semistructured Web document; text substring; Data mining; Dictionaries; Hard disks; Personal communication networks; Relational databases; Search engines; Web pages; Information extraction; attribute name; bootstrapping;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on

Conference_Location :

Singapore

ISSN :

1062-922X

Print_ISBN :

978-1-4244-2383-5

Electronic_ISBN :

1062-922X

Type :

conf

DOI :

10.1109/ICSMC.2008.4811612

Filename :

4811612

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3115524