Image collector II: a system for gathering more than one thousand images from the Web for one keyword

Author

Yanai, Keiji

Author_Institution

Dept. of Comput. Sci., Electro-Commun. Univ., Japan

Volume

1

fYear

2003

fDate

6-9 July 2003

Abstract

We propose a system, Image Collector II, that gathers more than one thousand images from the World Wide Web for a single keyword. Our previously proposed Image Collector could gather only several hundred images. We made the following two improvements to extend the previous system in terms of both the number of gathered images and their precision: (1) We extracted words that appear with high frequency in all HTML files embedding the output images of an initial image gathering, and then ran a second image gathering using those words as keywords. Through this, we obtained more than one thousand images for one keyword. (2) The more images we gathered, the more the precision of the gathered images decreased. To raise the precision, we introduced word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
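
The two text-side steps in the abstract (finding high-frequency words in the HTML files that embed the gathered images, and representing each HTML file as a word vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the tokenization, the function names, and the use of cosine similarity to compare word vectors are all assumptions made for illustration, and the image feature vectors mentioned in the abstract are left out.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; the abstract does not say which words are excluded.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_words(html):
    """Return lowercase alphabetic words from an HTML string, minus stop words."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    return [w for w in re.findall(r"[a-z]+", text) if w not in STOP_WORDS]


def frequent_words(html_pages, query, top_n=5):
    """Most frequent words across pages, excluding the original query keyword."""
    counts = Counter()
    for page in html_pages:
        counts.update(html_to_words(page))
    counts.pop(query.lower(), None)
    return [word for word, _ in counts.most_common(top_n)]


def word_vector(html, vocabulary):
    """Term-count vector of one HTML page over a fixed vocabulary."""
    counts = Counter(html_to_words(html))
    return [counts[w] for w in vocabulary]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed comparison measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

In this sketch, the words returned by `frequent_words` would serve as keywords for a second gathering pass, and `word_vector` would supply the text-side evidence that is weighed alongside image features when selecting images.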

Keywords

Internet; Web sites; feature extraction; hypermedia markup languages; image processing; HTML files; World Wide Web; image collector II; image feature vectors; image gathering; image selecting process; keywords; Computer science; Content based retrieval; Explosions; Frequency; HTML; Image analysis; Image databases; Image retrieval; Search engines

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on

Print_ISBN

0-7803-7965-9

Type

conf

DOI

10.1109/ICME.2003.1221035

Filename

1221035