• DocumentCode
    507828
  • Title
    Exploiting tags for concept extraction and information integration
  • Author
    Escobar-Molano, Martha L. ; Badia, Antonio ; Alonso, Rafael
  • Author_Institution
    SET Corp., Arlington, VA, USA
  • fYear
    2009
  • fDate
    11-14 Nov. 2009
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    The use of tags to annotate content creates an opportunity to explore alternative ways of automating the extraction of semantics from data sources. Semantic information is needed for many complex tasks, such as concept extraction and information integration. To establish the value of user-generated annotation, this paper presents two experiments in which only user tags are used as input. At the core of semantic extraction is the identification of the concepts and relationships present in the data. We show, through an experimental study on tagged photographs, how to extract the concepts associated with photographs and the relationships among those concepts. Our experiments demonstrate that supervised machine learning techniques can extract a concept associated with a photograph with an overall precision score of 80%. Our experiments also show that a variation of the Jaccard similarity coefficient on sets of tags can be used to determine equivalence relationships between the concepts associated with these sets.
  • Keywords
    digital photography; information filtering; learning (artificial intelligence); Jaccard similarity coefficient; data sources; information integration; semantic information extraction; supervised machine learning techniques; tagged photographs; user-generated annotation; Centralized control; Collaboration; Computer science; Data analysis; Data engineering; Data mining; Machine learning; Ontologies; Tagging; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Collaborative Computing: Networking, Applications and Worksharing, 2009. CollaborateCom 2009. 5th International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-963-9799-76-9
  • Electronic_ISBN
    978-963-9799-76-9
  • Type
    conf
  • DOI
    10.4108/ICST.COLLABORATECOM2009.8330
  • Filename
    5363321
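
The abstract's second result rests on a Jaccard-style similarity between sets of tags. The record does not say what variation the authors used, so the sketch below shows only the standard coefficient, |A ∩ B| / |A ∪ B|, as a point of reference. The function names, the sample tags, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
def jaccard(tags_a, tags_b):
    """Standard Jaccard coefficient: size of intersection over size of union."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        # Convention chosen for this sketch (an assumption): two empty
        # tag sets are treated as having no measurable similarity.
        return 0.0
    return len(a & b) / len(a | b)


def likely_equivalent(tags_a, tags_b, threshold=0.5):
    """Flag two tag sets as candidates for describing equivalent concepts.

    The threshold is an illustrative value, not one reported in the paper.
    """
    return jaccard(tags_a, tags_b) >= threshold


# Hypothetical tag sets attached to two groups of photographs.
beach = {"beach", "sand", "sea", "summer"}
seaside = {"beach", "sea", "coast", "summer"}

print(jaccard(beach, seaside))            # 3 shared tags out of 5 distinct tags
print(likely_equivalent(beach, seaside))
```

A set-based measure fits this setting because tags are unordered and carry no weights, so overlap relative to the combined vocabulary is the natural signal. A variation such as the one the abstract mentions might reweight or normalize that overlap differently.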