• DocumentCode
    3104150
  • Title
    Integrating Full-Text Search and Linguistic Analyses on Disperse Data for Humanities and Social Sciences Research Projects
  • Author
    Villegas, Marta; Parra, Carla
  • Author_Institution
    Inst. Univ. de Linguistica Aplic., Univ. Pompeu Fabra, Barcelona, Spain
  • fYear
    2009
  • fDate
    9-11 Dec. 2009
  • Firstpage
    28
  • Lastpage
    32
  • Abstract
    The research reported in this paper is part of the activities carried out within the CLARIN (Common Language Resources and Technology Infrastructure) project, a large-scale pan-European project to create, coordinate, and make language resources and technologies (LRT) available and readily usable. CLARIN is devoted to the creation of a persistent and stable infrastructure serving the needs of the European humanities and social sciences (HSS) research community. HSS researchers will be able to efficiently access distributed resources and apply analysis and exploitation tools relevant to their research. Here we present a real use case addressed as a CLARIN scenario and the implementation of a demonstrator that enables us to foresee potential problems and contributes to the planning of the implementation phase. The use case deals with how to support researchers interested in harvesting and analyzing data from historical press archives. To that end, we address the integration and interoperability of distributed and heterogeneous research data and analysis tools.
  • Keywords
    data analysis; linguistics; open systems; query formulation; social sciences; CLARIN; common language resources and technology infrastructure; disperse data; full-text search; humanities; interoperability; linguistic analyses; social sciences; Computer aided software engineering; Data analysis; Large scale integration; Large-scale systems; Light rail systems; Natural language processing; Proposals; Service oriented architecture; Text analysis; Wheels; Humanities & Social Sciences; Linguistic analysis tools; integration and interoperability; textual data harvesting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    e-Science, 2009. e-Science '09. Fifth IEEE International Conference on
  • Conference_Location
    Oxford
  • Print_ISBN
    978-0-7695-3877-8
  • Type
    conf
  • DOI
    10.1109/e-Science.2009.12
  • Filename
    5380889