DocumentCode
20096
Title
Arbitrary Category Classification of Websites Based on Image Content
Author
Akusok, Anton ; Miche, Yoan ; Karhunen, Juha ; Bjork, Kaj-Mikael ; Rui Nian ; Lendasse, Amaury
Author_Institution
Dept. of Mech. & Ind. Eng., Univ. of Iowa, Iowa City, IA, USA
Volume
10
Issue
2
fYear
2015
fDate
May-15
Firstpage
30
Lastpage
41
Abstract
This paper presents a comprehensive methodology for general large-scale image-based classification tasks. It addresses the Big Data challenge in arbitrary image classification and, more specifically, the filtering of millions of websites with abstract target classes and high levels of label noise. Our approach uses local image features and their color descriptors to build image representations with the help of a modified k-NN algorithm. Image representations are refined into image and website class predictions by a two-stage classifier method suitable for a very large-scale real dataset. A modification of an Extreme Learning Machine is found to be a suitable classifier technique. The methodology is robust to noise and can learn abstract target categories; website classification accuracy surpasses 97% for the most important categories considered in this study.
Keywords
Big Data; Web sites; image classification; image colour analysis; image representation; learning (artificial intelligence); Web sites class predictions; arbitrary category classification; color descriptors; extreme learning machine; general large-scale image-based classification tasks; image content; image representations; label noise; local image features; modified k-NN algorithm; two-stage classifier method; very large-scale real dataset; Big data; Classification; Image classification; Image color analysis; Image representation; Large-scale systems; Noise measurement
fLanguage
English
Journal_Title
Computational Intelligence Magazine, IEEE
Publisher
ieee
ISSN
1556-603X
Type
jour
DOI
10.1109/MCI.2015.2405317
Filename
7083681