DocumentCode :
3574526
Title :
Named entity recognition for Tamil biomedical documents
Author :
Betina Antony, J. ; Mahalakshmi, G.S.
Author_Institution :
Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
fYear :
2014
Firstpage :
1571
Lastpage :
1577
Abstract :
Valuable information about Tamil traditional medicines is available in various forms such as books, magazines and websites. These instructions are, however, very large and unstructured. Our system focuses on constructing a named entity recognition (NER) module that uses an SVM classifier to identify named entities and classify them into their corresponding categories. The two main categories considered are names of disorders and names of ingredients used. The system uses features such as unigrams/bigrams, case markers, substring clues and tf-idf scores to classify the entities into their classes. The identified named entities are stored in the NE Dictionary according to their categories.
Keywords :
document handling; natural language processing; pattern classification; support vector machines; NE Dictionary; NER identification module; SVM classifier; Tamil biomedical documents; Tamil traditional medicines; named entity identification; named entity recognition; Computers; Dictionaries; Feature extraction; Hidden Markov models; Natural language processing; Support vector machines; Biomedical NER; SVM classification; Siddha documents; Tamil Biomedical Documents;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuit, Power and Computing Technologies (ICCPCT), 2014 International Conference on
Print_ISBN :
978-1-4799-2395-3
Type :
conf
DOI :
10.1109/ICCPCT.2014.7055016
Filename :
7055016