DocumentCode
1808474
Title
A Web Page Segmentation Algorithm Based on Iterated Dividing and Shrinking
Author
Jiuxin, Cao ; Bo, Mao ; Junzhou, Luo
Author_Institution
Southeast Univ., Nanjing
fYear
2007
fDate
18-21 Sept. 2007
Firstpage
701
Lastpage
705
Abstract
Based on image processing technology and the special characteristics of web pages, a new web page segmentation algorithm, the Iterated Dividing and Shrinking Algorithm, is proposed. Image dividing conditions are introduced, and the concept of a dividing zone is given. On this basis, the web page is first transformed into an image, and then, by shrinking and splitting repeatedly, the image is divided into sub-images that are visually consistent. Experiments show that the algorithm is suitable for web page segmentation and does well in expansibility and performance.
Keywords
Internet; document image processing; image segmentation; Web page segmentation algorithm; image dividing; iterated dividing and shrinking algorithm; Algorithm design and analysis; Computer networks; Computer science; HTML; Image processing; Image segmentation; Information security; Laboratories; Parallel processing; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Network and Parallel Computing Workshops, 2007. NPC Workshops. IFIP International Conference on
Conference_Location
Liaoning
Print_ISBN
978-0-7695-2943-1
Type
conf
DOI
10.1109/NPC.2007.63
Filename
4351566