Title :
The research of performance of Lucene's Chinese tokenizer
Author :
Cao, Liang ; Wu, Weiming ; Gu, Yonghao
Author_Institution :
Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Almost all websites and content management systems provide full-text search. Developers and customers alike focus on the full-text search module when developing and maintaining a website or content management system. To meet customer demand for full-text search capability, we must study the existing Chinese tokenizers. This article examines four Chinese tokenizers and compares their performance.
Keywords :
Web sites; content management; search problems; text analysis; Lucene Chinese tokenizer; Web site; content management systems; customer demand; full-text search module; Bills of materials; Content management; Educational institutions; Java; Search engines; Telecommunications; Chinese Tokenizer; Full-text Search; Lucene
Conference_Title :
Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC), 2011 2nd International Conference on
Conference_Location :
Deng Leng
Print_ISBN :
978-1-4577-0535-9
DOI :
10.1109/AIMSEC.2011.6011478