DocumentCode :
2929608
Title :
The study and design of the full-text search engine in electrical power industry based on Nutch
Author :
Wu, Kehe ; Ye, Shichao ; He, Hui
Author_Institution :
Eng. Res. Center of Power Inf., North China Electr. Power Univ., Beijing, China
Volume :
2
fYear :
2010
fDate :
1-2 Aug. 2010
Firstpage :
217
Lastpage :
220
Abstract :
At present, general-purpose search engines not only cover a small percentage of any particular field or subject, but also cannot ensure the safety of the data they index. Therefore, this paper develops a professional search engine tailored to the electrical power industry, based on the open-source search engine framework Nutch. The system includes a dictionary of electrical power industry terms and uses an improved VSM algorithm to calculate the relevance of the content captured by the crawler and then filter for the relevant parts. The indexed data is ranked by the PageRank algorithm. The system also has an access control module, which verifies the user's authority and classifies the information. The system can improve the specialization of information retrieval in specific fields and enhance the security of the search engine.
Keywords :
authorisation; electricity supply industry; information retrieval; search engines; Nutch; PageRank algorithm; access control; electrical power industry; full-text search engine; improved VSM algorithm; information classification; open source search engine; professional search engine; user authority; World Wide Web
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuits, Communications and System (PACCS), 2010 Second Pacific-Asia Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7969-6
Type :
conf
DOI :
10.1109/PACCS.2010.5626666
Filename :
5626666