Title :
A Methodology for Extracting Head Contents from Meaningful Tables in Web Pages
Author :
Chavan, Madhuri M. ; Shirgave, S.K.
Author_Institution :
D.Y. Patil Coll. of Eng. & Technol., Kolhapur, India
Abstract :
Tables are an important means of presenting information and are widely used on the web. They show relational data in a simple, compact, and precise manner. A typical web page consists of many blocks or areas, e.g. main content areas, advertisements, and images. Tables contain meaningful information, and almost all data is arranged in tabular format. Hence there is a need to identify the tables that contain meaningful structured data. In this paper, a method is introduced for determining the meaningfulness of a table and extracting the head from a meaningful table.
Keywords :
Web sites; text analysis; Web pages; head content extraction; meaningful tables; relational data; Data mining; Decision trees; Filtering; HTML; Head; Magnetic heads; DOM Tree; Table mining; Text mining; Web table; information extraction
Conference_Title :
Communication Systems and Network Technologies (CSNT), 2011 International Conference on
Conference_Location :
Katra, Jammu
Print_ISBN :
978-1-4577-0543-4
Electronic_ISBN :
978-0-7695-4437-3
DOI :
10.1109/CSNT.2011.66