Title :
Exploiting RFC2828 as a Domain Vocabulary for Identifying IT Security Literature
Author :
Wang, Lidong ; Qian, Liping
Author_Institution :
Dept. of R&D, Coordination Center of China, Beijing, China
Abstract :
The volume of published scientific literature available on the Internet has been increasing exponentially, and some of it reflects the latest achievements of specific research domains. In recent years, many projects have been funded to mine online scientific literature, especially in biomedical research. Scientific literature covers most of the hot topics in a research field and has a very large domain-specific vocabulary. Exploiting domain knowledge and specialized vocabulary can dramatically improve the results of literature text processing. The purpose of this paper is to study the automatic identification and classification of IT security literature so that IT security-related papers can be retrieved from the Internet with high accuracy. RFC 2828 provides explanations and recommendations for the use of IT security terminology. In this paper, we evaluate IT security literature identification using RFC 2828 glossary-based feature selection combined with a TF/IDF scheme. Our experimental results show that this approach performs better than the common TF/IDF method.
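The glossary-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their weighting formula or preprocessing, so the sketch assumes a plain term-frequency times log(N/df) weight, single-word terms only, and a tiny hypothetical `GLOSSARY` set standing in for the RFC 2828 glossary. The names `tokenize` and `glossary_tfidf` are likewise illustrative.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for single-word terms taken from the RFC 2828 glossary.
GLOSSARY = {"authentication", "encryption", "firewall", "integrity"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def glossary_tfidf(docs, glossary=GLOSSARY):
    """Return one {term: weight} dict per document, keeping glossary terms only."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency, counted only for terms that appear in the glossary.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens) & glossary)

    vectors = []
    for tokens in tokenized:
        counts = Counter(t for t in tokens if t in glossary)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    "Encryption protects integrity of data",
    "Proteins fold in cells",
    "Firewall and encryption rules",
]
vectors = glossary_tfidf(docs)
```

Under these assumptions, a document with no glossary terms (the second one here) gets an empty feature vector, which is what lets the glossary act as a domain filter before any classifier is applied. Multi-word glossary entries would need phrase matching, which this sketch leaves out.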
Keywords :
Internet; pattern classification; text analysis; vocabulary; IT security terminology; RFC2828; TF-IDF scheme; domain-specific vocabulary; literature text processing; online scientific literature mining; Computer security; Databases; Information security; National security; Proteins; Terminology; Text categorization; Text processing; IT security; document classification; literature
Conference_Title :
Information Assurance and Security, 2009. IAS '09. Fifth International Conference on
Conference_Location :
Xian
Print_ISBN :
978-0-7695-3744-3
DOI :
10.1109/IAS.2009.75