DocumentCode :
2410805
Title :
Using Term Extraction Patterns to Discover Coherent Relationships from Open Source Intelligence
Author :
Sousan, William L. ; Zhu, Qiuming ; Gandhi, Robin ; Mahoney, William ; Sharma, Anup
Author_Institution :
Univ. of Nebraska, Omaha, NE, USA
fYear :
2010
fDate :
20-22 Aug. 2010
Firstpage :
967
Lastpage :
972
Abstract :
Unstructured open source information, especially the social, political, economic and cultural events described in web-based text and news articles, often contains possible motives for cyber security and trust issues. Automated processing of numerous open source intelligence sources requires the discovery of key domain terms, their conceptual hierarchies and the coherent relationships among them. A syntactic analysis of the word sequences in unstructured text documents allows for the extraction of subject-predicate-object triples, which form the basis for Term Extraction Patterns (TEPs). In our research, we use TEPs to discover domain-specific multi-word entities, which in turn can be arranged in a taxonomy based on their semiotic inter-relationships. We explore the use of this method within the cyber security domain and analyze a collection of related news articles gathered from various public web sources. This paper describes our initial results of term extraction and the semantic coherence derived from the TEP analyses. Our work extends beyond current methods, and our contribution is a novel methodology for extracting semantics from unstructured text in domain-specific open source information, together with its application to predicting cyber attack outbreaks.
Keywords :
Internet; information retrieval; security of data; text analysis; Web-based news articles; Web-based text; cyber attack; cyber security; open source intelligence; public Web sources; term extraction patterns; trust issues; unstructured text documents; word sequences; Computer crime; Data mining; Ontologies; Pragmatics; Semantics; Taxonomy; Conceptualization; Open Source Intelligence; Semantic Relevance; Term Extraction; Term Extraction Patterns;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Social Computing (SocialCom), 2010 IEEE Second International Conference on
Conference_Location :
Minneapolis, MN
Print_ISBN :
978-1-4244-8439-3
Electronic_ISBN :
978-0-7695-4211-9
Type :
conf
DOI :
10.1109/SocialCom.2010.143
Filename :
5591400