Title :
Classifying web hierarchically using multi label tree classifier
Author :
Daya Gupta;Harsh Tripathi;Mayukh Maitra
Author_Institution :
Department of Computer Science and Software Engineering, Delhi Technological University, New Delhi, India
Abstract :
Classification and extraction of web content find applications in the semantic web, search, and information extraction. The first part of the paper deals with the problem of classifying web pages according to their content. It then presents a methodology for classifying web pages hierarchically, in order to achieve topic-wise modeling of websites, using a multi-label tree classifier, a variant of classification in which instances may belong to multiple classes at the same time. Data from an implementation of the multi-label tree classifier show marked improvements in processing multi-class classification compared with conventional hierarchical classification techniques.
Keywords :
"Web pages","Training","Support vector machines","Feature extraction","Dictionaries","Multimedia communication","Classification algorithms"
Conference_Title :
India Conference (INDICON), 2015 Annual IEEE
Electronic_ISBN :
2325-9418
DOI :
10.1109/INDICON.2015.7443337