Title :
Association Pattern Mining for Product Specification Integration
Author :
Tsay, Jyh-Jong ; Tsay, Chin-Wen ; Chen, Ping-Hong
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
Abstract :
As more online stores and shopping sites become available on the Web, integrating the product and shopping information provided by different sources has become increasingly important and has attracted the attention of recent research in information integration. One of the fundamental problems is to integrate specifications for products of the same type from different vendors so that they are described in a homogeneous and uniform way. Specifications for products of the same type from different vendors can look quite different, and integrating them manually is a tedious and error-prone task. In this paper, we formulate product specification integration as a text categorization problem, and propose an association pattern mining approach that automatically generates pattern rules for each attribute. Association patterns are mined from n-grams generated from product specifications. However, mining association patterns from n-grams can be very time-inefficient, because every substring of a frequent string is also frequent. We propose substring pruning strategies that are specific to text data to improve the running time. Experiments show that our approach is very time-efficient, and achieves classification accuracy higher than 0.95 on data sets collected for digital cameras, notebook PCs, and LCDs.
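The redundancy the abstract describes (every substring of a frequent string is itself frequent) can be illustrated with a minimal sketch. The record does not give the paper's actual pruning strategies, so everything here is an assumption made for illustration: word-level n-grams, support counted as the number of specifications containing an n-gram, and a simple rule that drops a frequent n-gram when a longer frequent n-gram contains it with the same support. The function names `word_ngrams`, `frequent_ngrams`, and `prune_substrings` are hypothetical.

```python
from collections import Counter


def word_ngrams(text, max_n):
    """Return the set of word n-grams (as tuples) of length 1..max_n in text."""
    tokens = text.lower().split()
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams


def frequent_ngrams(specs, min_support, max_n=3):
    """Count, for each n-gram, how many specs contain it; keep the frequent ones."""
    counts = Counter()
    for spec in specs:
        # A set per spec, so each spec contributes at most 1 to an n-gram's support.
        counts.update(word_ngrams(spec, max_n))
    return {g: c for g, c in counts.items() if c >= min_support}


def is_contiguous_sub(short, long):
    """True if the tuple `short` occurs as a contiguous run inside `long`."""
    k = len(short)
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def prune_substrings(frequent):
    """Assumed pruning rule (not necessarily the paper's): drop an n-gram when
    a longer frequent n-gram contains it and has exactly the same support."""
    kept = {}
    for g, c in frequent.items():
        redundant = any(
            len(h) > len(g) and frequent[h] == c and is_contiguous_sub(g, h)
            for h in frequent
        )
        if not redundant:
            kept[g] = c
    return kept
```

With three made-up specification strings such as "Optical Zoom 3x", "Optical Zoom 5x", and "Digital Zoom 4x" and a minimum support of 2, the unigram "optical" is dropped because the bigram "optical zoom" covers it with the same support, while "zoom" is kept because it occurs in more specifications than any longer n-gram containing it.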
Keywords :
Internet; data mining; retailing; Web site; association pattern mining approach; digital cameras; product specification integration; shopping information integration; substring pruning strategies; text categorization; Computer science; Explosions; Fuzzy systems; Knowledge engineering; Pattern matching; Personal communication networks; Sorting; association mining; associative classifier; data integration
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
DOI :
10.1109/FSKD.2009.620