DocumentCode
3213295
Title
Automated Data Mining from Web Servers Using Perl Script
Author
Neeli, Sandeep ; Govindasamy, Kannan ; Wilamowski, Bogdan M. ; Malinowski, Aleksander
Author_Institution
Dept. of Electr. & Comput. Eng., Auburn Univ., Auburn, AL
fYear
2008
fDate
25-29 Feb. 2008
Firstpage
191
Lastpage
196
Abstract
Data mining from the Web is the process of extracting essential data from a web server. In this paper, we present a method called Ethernet Robot that extracts information from a web server using the Perl scripting language and processes the data using regular expressions. The procedure involves fetching, filtering, processing, and presenting the required data. The resulting HTML file, which contains the required data, is displayed to the user. Future enhancements to the Ethernet Robot include optimization to improve performance and customization for use as a sophisticated client-specific search agent.
Keywords
Internet; Perl; data acquisition; data mining; hypermedia markup languages; local area networks; Ethernet robot; HTML; Web servers; automated data mining; client-specific search agent; data extraction; information extraction; perl scripting language; regular expressions; Data analysis; Data mining; Ethernet networks; Filtering; HTML; Machine learning; Pattern analysis; Web pages; Web server; Web sites; Data Extraction; Data Mining; Perl; Regular Expressions; wget;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Engineering Systems, 2008. INES 2008. International Conference on
Conference_Location
Miami, FL
Print_ISBN
978-1-4244-2082-7
Electronic_ISBN
978-1-4244-2083-4
Type
conf
DOI
10.1109/INES.2008.4481293
Filename
4481293