Title :
SBA-term: Sparse Bilingual Association for Terms
Author :
Dai, Xinyu ; Jia, Jinzhu ; El Ghaoui, Laurent ; Yu, Bin
Author_Institution :
Nat. Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
Abstract :
Bilingual semantic term association is useful in cross-language information retrieval, statistical machine translation, and many other natural language processing applications. In this paper, we present a method, named SBA-term, that applies sparse linear regression (Lasso, i.e., least squares with an l1 penalty) together with L2 rescaling of the design matrix to the task of bilingual term association. The approach hinges on formulating the task as a feature selection problem within a classification framework. Our experimental results indicate that the proposed method is more efficient than co-occurrence at extracting relevant bilingual semantic term associations. In addition, our approach connects the vibrant area of sparse machine learning to an important problem in natural language processing.
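The idea stated in the abstract (Lasso on an L2-rescaled design matrix, read as feature selection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy aligned corpus, the target-language term names, the 0/1 coding of the response (whether a source-language term occurs in a document), the penalty value `lam`, and the use of plain coordinate descent as the solver.

```python
import math

def l2_rescale(X):
    """Scale each column of X (a list of rows) to unit L2 norm; all-zero columns stay zero."""
    p = len(X[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in X)) for j in range(p)]
    return [[row[j] / norms[j] if norms[j] > 0 else 0.0 for j in range(p)] for row in X]

def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - Xw||^2 + lam * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, with w starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual that excludes w[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * w[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

# Hypothetical toy data: six aligned document pairs.
# Columns are counts of target-language terms in each document.
target_terms = ["chien", "chat", "maison", "voiture"]
X = [
    [1, 0, 1, 0],
    [2, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
# Response: 1 if the source-language term (say "dog") occurs in the paired document.
y = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0]

lam = 0.6  # hypothetical penalty; larger values give sparser associations
weights = lasso_cd(l2_rescale(X), y, lam)
selected = [t for t, w in zip(target_terms, weights) if w != 0.0]
print(selected)
```

Target-language terms with nonzero coefficients are read as the associated terms for the source term. Rescaling the columns first keeps frequent terms from being favored merely because their raw counts are larger.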
Keywords :
information retrieval; language translation; learning (artificial intelligence); natural language processing; regression analysis; SBA-term; bilingual semantic term association; cross-language information retrieval; feature selection problem; machine learning; sparse bilingual terms association; sparse linear regression; statistical machine translation; Dictionaries; Educational institutions; Humans; Linear regression; Prediction algorithms; Semantics
Conference_Title :
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
Conference_Location :
Palo Alto, CA
Print_ISBN :
978-1-4577-1648-5
Electronic_ISBN :
978-0-7695-4492-2
DOI :
10.1109/ICSC.2011.25