DocumentCode :
1578236
Title :
Trinity tree construction for unattended web extraction
Author :
Gayathri, M.S. ; Selvi, S. Tamil ; Vijayaraj, A. ; Ilavarasan, S.
Author_Institution :
Dept. of Comput. Sci. & Eng., Sams Coll. of Eng. & Technol., India
fYear :
2015
Firstpage :
1
Lastpage :
4
Abstract :
This paper presents a framework that automatically extracts data from web-based applications and processes it in a linear tree fashion. Most end users are looking for an effective system that provides an optimized comparative solution without very large expenditure. We propose a technique that works on one or more web documents generated by the same server-side template, learns a regular expression that models the template, and can later be used to extract data from similar documents. In our project, we aim to use an intelligent “Dominant Super String Algorithm” to extract the relevant data from web pages without major computational impact on the system. We evaluated our technique and compared it with others in the literature on a very large collection of web documents; the results indicate that our proposal performs better than the others, that input errors do not have a negative impact on its effectiveness, and that it provides a cost comparison analysis.
Keywords :
Internet; data handling; information retrieval; tree data structures; Trinity tree construction; Web documents; data extraction; intelligent dominant super string algorithm; unattended Web extraction; Algorithm design and analysis; Conferences; Data mining; Databases; Technological innovation; Web pages; Crawling mechanism; Dominant Superstring Algorithm; Linear tree fashion;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Innovations in Information, Embedded and Communication Systems (ICIIECS), 2015 International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4799-6817-6
Type :
conf
DOI :
10.1109/ICIIECS.2015.7193060
Filename :
7193060