Title :
A method of Chinese organization named entities recognition based on statistical word frequency, part of speech and length
Author_Institution :
Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
We propose a statistical method for recognizing Chinese organization names, based on an analysis of their grammatical and semantic characteristics. The method uses three features: word frequency, part of speech, and word length. Using a mature collection as training data, we separately calculate the contribution of a candidate organization name's word frequency, part of speech, and word length. Finally, we obtain the candidate's overall contribution and compare it with a given threshold to decide whether the candidate is a Chinese organization name.
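The abstract does not say how the three contributions are computed or combined, so the following Python sketch is only one plausible reading and not the authors' implementation. It assumes that training names are already segmented into (word, part-of-speech) pairs, that each contribution is a relative frequency estimated from the training names, and that the overall contribution is a weighted sum. The function names, the weights, and the threshold value are all hypothetical.

```python
from collections import Counter


def train(org_names):
    """Collect counts from training names.

    org_names: list of names, each a list of (word, pos_tag) pairs.
    The data layout is an assumption; the abstract does not specify it.
    """
    word_counts = Counter(w for name in org_names for w, _ in name)
    pos_counts = Counter(p for name in org_names for _, p in name)
    len_counts = Counter(len(name) for name in org_names)
    return {
        "word": word_counts,
        "word_total": sum(word_counts.values()) or 1,  # avoid division by zero
        "pos": pos_counts,
        "pos_total": sum(pos_counts.values()) or 1,
        "len": len_counts,
        "len_total": len(org_names) or 1,
    }


def contribution(candidate, model, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three relative-frequency scores (assumed combination)."""
    if not candidate:
        return 0.0
    n = len(candidate)
    # Mean relative frequency of the candidate's words in training names.
    word_score = sum(model["word"][w] / model["word_total"] for w, _ in candidate) / n
    # Mean relative frequency of the candidate's part-of-speech tags.
    pos_score = sum(model["pos"][p] / model["pos_total"] for _, p in candidate) / n
    # Share of training names that have the same number of words.
    len_score = model["len"][n] / model["len_total"]
    w_word, w_pos, w_len = weights
    return w_word * word_score + w_pos * pos_score + w_len * len_score


def is_org_name(candidate, model, threshold=0.5):
    """Accept the candidate when its contribution reaches the threshold."""
    return contribution(candidate, model) >= threshold


# Toy training data; the names and tags are illustrative only.
training = [
    [("北京", "ns"), ("大学", "n")],
    [("清华", "nz"), ("大学", "n")],
]
model = train(training)
candidate = [("北京", "ns"), ("大学", "n")]
print(contribution(candidate, model), is_org_name(candidate, model))
```

In practice the weights and the threshold would have to be tuned on held-out data, and the scores could just as well be combined as a product or in log space; the abstract leaves that choice open.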
Keywords :
grammars; speech recognition; statistics; Chinese organization name; entities recognition; grammatical characteristics; mature collection; part of speech; semantic characteristics; speech length; statistical word frequency; training data; word length; Accuracy; Character recognition; Hidden Markov models; Organizations; Speech; Training; Training data; Contribution; Statistics; the recognition of Chinese organization name;
Conference_Title :
Broadband Network and Multimedia Technology (IC-BNMT), 2011 4th IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-61284-158-8
DOI :
10.1109/ICBNMT.2011.6156013