Title :
Extracting and organizing acronyms based on ranking
Author :
Ni, Weijian ; Huang, Yalou
Author_Institution :
Coll. of Inf. Tech. Sci., Nankai Univ., Tianjin
Abstract :
The paper addresses the problem of automatically extracting and organizing acronyms and their expansions (e.g., 'ROM' and 'Read Only Memory') in text. To deal with the problem, we propose a two-step approach based on ranking. In the first step, for each occurrence of an acronym in text, we rank the expansion candidates around the acronym and extract the top-ranked ones. In the second step, the expansion candidates collected in the first step are organized before being presented to end users. 'Organize' here means grouping expansions and then ranking them according to their correctness and popularity. In this way, the number of expansions in the results that users need to examine is drastically reduced. Experimental results on a real-world dataset show that our approach can always rank correct and popular expansions to the top. Experimental results also show that the trained ranking models are generic and perform well on different domains.
Keywords :
classification; text analysis; acronym extraction; ranking; Cities and towns; Educational institutions; Machine learning; Natural languages; Organizing; Pattern recognition; Read only memory; Support vector machines; Tagging; Text recognition; Support Vector Machine;
Conference_Title :
Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-2113-8
Electronic_ISBN :
978-1-4244-2114-5
DOI :
10.1109/WCICA.2008.4594528