• DocumentCode
    3213295
  • Title
    Automated Data Mining from Web Servers Using Perl Script
  • Author
    Neeli, Sandeep ; Govindasamy, Kannan ; Wilamowski, Bogdan M. ; Malinowski, Aleksander
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Auburn Univ., Auburn, AL
  • fYear
    2008
  • fDate
    25-29 Feb. 2008
  • Firstpage
    191
  • Lastpage
    196
  • Abstract
    Data mining from the Web is the process of extracting essential data from a web server. In this paper, we present a method called Ethernet Robot that extracts data from a web server using the Perl scripting language and processes it with regular expressions. The procedure involves fetching, filtering, processing, and presenting the required data. The resulting HTML file, containing the required data, is displayed to the user. Future enhancements to the Ethernet Robot include optimization to improve performance and customization for use as a sophisticated client-specific search agent.
  • Keywords
    Internet; Perl; data acquisition; data mining; hypermedia markup languages; local area networks; Ethernet robot; HTML; Web servers; automated data mining; client-specific search agent; data extraction; information extraction; perl scripting language; regular expressions; Data analysis; Data mining; Ethernet networks; Filtering; HTML; Machine learning; Pattern analysis; Web pages; Web server; Web sites; Data Extraction; Data Mining; Perl; Regular Expressions; wget;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Engineering Systems, 2008. INES 2008. International Conference on
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4244-2082-7
  • Electronic_ISBN
    978-1-4244-2083-4
  • Type
    conf
  • DOI
    10.1109/INES.2008.4481293
  • Filename
    4481293
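
The abstract names four stages: fetching, filtering, processing, and presentation. The sketch below illustrates that general pipeline shape only; it is not the authors' Ethernet Robot, which the record says was written in Perl and whose code is not reproduced here. Python's standard library is swapped in for Perl, and every name in it (`fetch`, `filter_links`, `process`, `present`, `LINK_RE`, `SAMPLE`) is a hypothetical choice, as is the decision to extract hyperlinks as the "required data".

```python
import html
import re
from urllib.request import urlopen

# Hypothetical pattern: capture the href and the inner markup of each anchor tag.
LINK_RE = re.compile(
    r'<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL
)
TAG_RE = re.compile(r"<[^>]+>")


def fetch(url, timeout=10):
    """Fetch: download a page and decode it as text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def filter_links(page):
    """Filter: keep only the (href, raw label) pairs matched by the regex."""
    return LINK_RE.findall(page)


def process(links):
    """Process: strip nested tags, collapse whitespace, drop empty labels."""
    cleaned = []
    for href, label in links:
        text = " ".join(TAG_RE.sub("", label).split())
        if text:
            cleaned.append((href, text))
    return cleaned


def present(links):
    """Present: render the cleaned pairs as a small HTML list."""
    items = "\n".join(
        f'  <li><a href="{html.escape(href, quote=True)}">{html.escape(text)}</a></li>'
        for href, text in links
    )
    return f"<ul>\n{items}\n</ul>"


# Offline sample so the sketch runs without a web server.
SAMPLE = '<p><a href="/a.html"><b>First</b> page</a> and <a class="x" href="/b.html"> </a></p>'
report = present(process(filter_links(SAMPLE)))
print(report)
```

Against a live server, `fetch(url)` would supply the page text in place of `SAMPLE`. Regular expressions are used here because the abstract names them; they suit narrow, predictable markup, and a real HTML parser is generally a safer choice for arbitrary pages.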