• DocumentCode
    679971
  • Title
    Machine learning for understanding the contextual semantics of tabular web sources

  • Author
    Weerasinghe, Jagath ; Weerasinghe, Saranga ; Panditha, Akila ; Weerasinghe, Vathsala

  • fYear
    2013
  • fDate
    17-20 Dec. 2013
  • Firstpage
    577
  • Lastpage
    582
  • Abstract
    Tables are frequently used in web sources to present relational data in a human-friendly manner. Because they are intended for humans, using machines to extract such information is difficult. There are approaches such as wrappers that attempt to solve this problem, but they lack adaptability and require high maintenance. Identifying and extracting information from web tables is not a trivial task, and understanding the semantics of a web table proves to be even harder. In this paper, we introduce a machine-learning-based approach to understanding the semantics of the data residing in tabular web sources. We suggest features that reflect the characteristics of the content in the tables and analyze their impact on the accuracy of the classification process.
  • Keywords
    information resources; learning (artificial intelligence); semantic Web; Web tables; classification process; contextual semantics; machine learning based approach; tabular Web sources; Accuracy; Data mining; Feature extraction; HTML; Pricing; Semantics; Web pages; Artificial Intelligence; Machine Learning; Semantics; Web Tables
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial and Information Systems (ICIIS), 2013 8th IEEE International Conference on
  • Conference_Location
    Peradeniya
  • Print_ISBN
    978-1-4799-0908-7
  • Type
    conf
  • DOI
    10.1109/ICIInfS.2013.6732048
  • Filename
    6732048