DocumentCode
1638446
Title
Automatic identification of Assamese and Bodo multiword expressions
Author
Barman, A.K. ; Sarmah, J. ; Sarma, S.K.
Author_Institution
Dept. of Inf. Technol., Gauhati Univ., Guwahati, India
fYear
2013
Firstpage
26
Lastpage
30
Abstract
Multiword Expressions (MWEs) are sequences of words, separated by spaces or delimiters, that together convey a single meaning distinct from the individual meanings of the component words. Our work concentrates on the automatic identification of MWEs for Assamese and Bodo, two languages with little computational support that are spoken in the north-eastern part of India. Statistical measures and language-specific knowledge help us extract MWEs from a raw corpus. Natural Language Processing work on Assamese and Bodo has begun only in recent years, and this is the first organised approach to exploiting MWEs in both languages. Linguistic aspects were considered in analysing the results, which we found quite satisfactory.
Keywords
linguistics; natural language processing; pattern recognition; statistical analysis; text analysis; Assamese language; Bodo language; MWE; computationally aware languages; language specific knowledge; multiword expressions automatic identification; north eastern India; statistical measure; Bismuth; Informatics; Assamese; Bodo; MWEs; NLP;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on
Conference_Location
Mysore
Print_ISBN
978-1-4799-2432-5
Type
conf
DOI
10.1109/ICACCI.2013.6637141
Filename
6637141