DocumentCode :
3309450
Title :
An improved Naive Bayesian algorithm for Web page text classification
Author :
He Youquan ; Xie Jianfang ; Xu Cheng
Author_Institution :
Inf. Sci. & Eng. Dept., Chongqing Jiaotong Univ., Chongqing, China
Volume :
3
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1765
Lastpage :
1768
Abstract :
This paper studies the process and methods of text classification. Building on the Naive Bayesian algorithm and the semi-structured nature of Web page information, it proposes an improved algorithm for Web page text classification that uses HTML tag information. Experiments show that the algorithm is feasible and effective and can be applied to information extraction in topic search engines, where it improves the topical relevance of search results and, in turn, search efficiency.
Keywords :
Bayes methods; Web sites; information retrieval; pattern classification; search engines; text analysis; HTML tag information; Naive Bayesian algorithm; Web page text Information classification; information extraction; search engine; semistructured feature; Accuracy; Algorithm design and analysis; Bayesian methods; Classification algorithms; Text categorization; Web pages; Naive Bayesian; Text classification; Web page;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019801
Filename :
6019801