DocumentCode :
3190391
Title :
High-Speed Identification of Language and Script
Author :
Ratner, Alan; Loui, Ron
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
563
Lastpage :
568
Abstract :
Humans communicate with text in thousands of languages, in dozens of scripts, and a wide variety of binary codes. There is a need to identify the language, script and code of this text to enable follow-on processing such as transcoding, translation, transliteration, routing and prioritization. This paper deals with the implementation of real-time language and script identification on high-speed hardware (principally a ternary content addressable memory) capable of processing network data streams at several gigabits per second.
Keywords :
Associative memory; Background noise; Conferences; Data mining; Field programmable gate arrays; Hardware; Humans; Java; Natural languages; Pattern matching
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
Print_ISBN :
978-0-7695-3019-2
Electronic_ISBN :
978-0-7695-3033-8
Type :
conf
DOI :
10.1109/ICDMW.2007.117
Filename :
4476723