DocumentCode :
2467181
Title :
A frame work for search forms classification
Author :
Klassen, Myungsook
Author_Institution :
Comput. Sci. Dept., California Lutheran Univ., Thousand Oaks, CA, USA
fYear :
2012
fDate :
14-17 Oct. 2012
Firstpage :
1029
Lastpage :
1034
Abstract :
Deep web pages provide highly relevant, high-quality content and represent a large share of online information sources, yet general-purpose search engines do not find them because they are difficult to identify. In this paper, we present a framework to distinguish search forms from non-search forms using a small number of HTML input elements extracted from user-input HTML forms and a few keywords. It applies a pre-query technique and a post-query technique in a hierarchical manner. Decision trees and multilayer artificial neural networks achieved classification rates over 91% in separating search forms from non-search forms. The new framework additionally proposes using a post-query technique at the last step to distinguish site search forms from deep search forms.
Keywords :
Web sites; classification; decision trees; multilayer perceptrons; search engines; HTML; deep Web pages; multilayer artificial neural networks; post query technique; search forms classification; Biological neural networks; Databases; Vegetation; Web pages; deep web; framework; hidden web; mining; multi layer neural networks; random forests; search form
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
Type :
conf
DOI :
10.1109/ICSMC.2012.6377864
Filename :
6377864