• DocumentCode
    3740088
  • Title
    Implementing Web Classification for TLDs
  • Author
    Luca Deri;Maurizio Martinelli;Daniele Sartiano;Michela Serrecchia;Loredana Sideri;Sonia Prignoli
  • Author_Institution
    IIT, Pisa, Italy
  • Volume
    1
  • fYear
    2015
  • Firstpage
    85
  • Lastpage
    88
  • Abstract
    There are many commercial web classification services on the market and a few publicly available web directory services. Unfortunately, they mostly focus on English-language web sites, which makes them unsuitable for other languages in terms of classification reliability and coverage. This paper covers the design and implementation of a web-based classification tool for TLDs (Top-Level Domains). Each domain is classified by analysing its main web site and is organised into categories according to its content. The tool has been successfully validated by classifying all registered .it Internet domains; the results are presented in this paper.
  • Keywords
    "Web pages","Feature extraction","Crawlers","Companies","Internet","HTML"
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM International Conference on
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2015.112
  • Filename
    7396784