DocumentCode :
561157
Title :
Improving Classifier Performance by Autonomously Collecting Background Knowledge from the Web
Author :
Minton, Steven N. ; Michelson, Matthew ; See, Kane ; Macskassy, Sofus ; Gazen, Bora C. ; Getoor, Lise
Author_Institution :
InferLink Corp, El Segundo, CA, USA
Volume :
1
fYear :
2011
fDate :
18-21 Dec. 2011
Firstpage :
1
Lastpage :
6
Abstract :
Many websites allow users to tag data items to make them easier to find. In this paper we consider the problem of classifying tagged data according to user-specified interests. We present an approach for aggregating background knowledge from the Web to improve the performance of a classifier. In previous work, researchers have developed technology for extracting knowledge, in the form of relational tables, from semi-structured websites. In this paper we integrate this extraction technology with generic machine learning algorithms, showing that knowledge extracted from the Web can significantly benefit the learning process. Specifically, the knowledge can lead to better generalizations, reduce the number of samples required for supervised learning, and eliminate the need to retrain the system when the environment changes. We validate the approach with an application that classifies tagged Flickr data.
Keywords :
Web sites; information retrieval; learning (artificial intelligence); pattern classification; Website; autonomous background knowledge collection; background knowledge aggregation; classifier performance improvement; generic machine learning algorithm; knowledge extraction technology; supervised learning; tagged Flickr data; tagged data classification; user-specified interest; Cities and towns; Data mining; Fires; Knowledge engineering; Monitoring; Portals; Training; Background Knowledge; Classifiers; Information Extraction; Ontologies; Web Harvesting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
Type :
conf
DOI :
10.1109/ICMLA.2011.76
Filename :
6146932