DocumentCode :
660853
Title :
Natural Language Processing and Big Data - An Ontology-Based Approach for Cross-Lingual Information Retrieval
Author :
Monti, Johanna ; Monteleone, M. ; di Buono, M.P. ; Marano, Francesco
Author_Institution :
Dept. of Social & Human Sci., Univ. of Sassari, Sassari, Italy
fYear :
2013
fDate :
8-14 Sept. 2013
Firstpage :
725
Lastpage :
731
Abstract :
Extracting relevant information in a multilingual context from massive amounts of unstructured, structured and semi-structured data is a challenging task. Various theories have been developed and applied to ease access to multicultural and multilingual resources. This paper describes a methodology for the development of an ontology-based Cross-Language Information Retrieval (CLIR) application and shows how Natural Language (NL) queries in any language can be translated by means of a knowledge-driven approach that semi-automatically maps natural language to formal language, thereby simplifying and improving human-computer interaction and communication. The outlined research activities are based on Lexicon-Grammar (LG), a method devised for natural language formalization, automatic textual analysis and parsing. Thanks to its main characteristics, LG is independent of factors that are critical for other approaches, i.e. interaction type (voice or keyboard-based), length of sentences and propositions, type of vocabulary used and restrictions due to users' idiolects. The feasibility of our knowledge-based methodological framework, which allows mapping both data and metadata, will be tested for CLIR by implementing a domain-specific early prototype system.
Keywords :
Big Data; formal languages; grammars; human computer interaction; meta data; natural language processing; ontologies (artificial intelligence); query formulation; vocabulary; CLIR application; LG method; NL queries translation; automatic textual analysis; data mapping; formal language; human-computer communication; human-computer interaction; keyboard-based interaction; knowledge-based methodological framework; knowledge-driven approach; lexicon-grammar; metadata; multicultural resources; multilingual context; multilingual resources; natural language formalization; natural language queries translation; ontology-based cross-language information retrieval; parsing; propositions; relevant information extraction; semistructured data; sentences; unstructured data; voice-based interaction; Compounds; Context; Dictionaries; Grammar; Ontologies; Pragmatics; Semantics; Cross-Language Information Retrieval (CLIR); Lexicon-grammar; Ontology
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Social Computing (SocialCom), 2013 International Conference on
Conference_Location :
Alexandria, VA
Type :
conf
DOI :
10.1109/SocialCom.2013.108
Filename :
6693405