DocumentCode :
1957784
Title :
Challenges to Automated Allegory Resolution in Open Source Intelligence
Author :
Watters, Paul A.
Author_Institution :
Internet Commerce Security Lab. (ICSL), Univ. of Ballarat, Ballarat, VIC, Australia
fYear :
2012
fDate :
29-30 Oct. 2012
Firstpage :
14
Lastpage :
18
Abstract :
The resolution of lexical ambiguity in machine translation systems often involves the automated, on-line selection of the correct sense of polysemous target words in the context of a clause, phrase or sentence. However, machine translation systems have not yet emulated this aspect of human language processing well enough for the resolution of entities and terms in natural language to be automated for open source intelligence analysis. Whilst some of these systems confine themselves to processing domain-specific knowledge (e.g., medical terminology), with some success, this study investigates the popular general-purpose direct translation systems now freely available on the World Wide Web (WWW) for characteristic semantic processing errors. A ubiquitous sentence ("The quick brown fox jumps over the lazy dog"), an equative metaphor, and a simile are translated into four Romance languages and one Germanic language, and each translation is then translated back into English using the same translation system. It is found that, in addition to expected differences in mapping shades of meaning (e.g., "quick" is mapped to "fast"), some spatial meanings are incorrectly transformed, especially for verbs (e.g., "jumps over" becomes "branches over" or "jumps on"). The most serious error is the addition of extra semantic features to individual words, particularly features associated with nouns (e.g., the gender-neutral "fox" becomes the female "vixen"). The implications of these types of errors for the automatic translation of human language, with respect to semantic representation in open source intelligence, are discussed.
Keywords :
grammars; language translation; natural language processing; public domain software; semantic Web; ubiquitous computing; Germanic language; WWW; World Wide Web; automated allegory resolution; automatic human language translation; characteristic semantic processing errors; domain-specific knowledge; entities resolution; equative metaphor; human language processing; lexical ambiguity; machine translation systems; natural language; on-line selection; open source intelligence; open source intelligence analysis; polysemous target words sense; popular general-purpose direct translation systems; semantic features; semantic representation; ubiquitous sentence; Open source intelligence; metaphor; polysemy;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cybercrime and Trustworthy Computing Workshop (CTC), 2012 Third
Conference_Location :
Ballarat, VIC
Print_ISBN :
978-1-4673-6460-7
Type :
conf
DOI :
10.1109/CTC.2012.8
Filename :
6498423