DocumentCode :
1844217
Title :
Combining Similarity and Distribution Features to Match Attributes
Author :
Wang, Yu ; Fang, Binxing ; Guo, Yan
Volume :
3
fYear :
2009
fDate :
15-18 Sept. 2009
Firstpage :
299
Lastpage :
302
Abstract :
The Web contains much useful semistructured information that can be organized into web objects, many of which are commercially valuable. The inner structures of these web objects are so heterogeneous that web objects from different web sites cover different subsets of useful attributes. The complete set of attributes can be mined from web pages through attribute extraction algorithms. However, to construct a high-quality web object schema, some mined attributes should be merged, since they are synonyms for the same concept. Our empirical study shows that features used by traditional schema matching and deep web integration methods are usually domain-specific, so they are not applicable to matching attributes extracted from the Web. To overcome this problem, this paper proposes new features that describe attribute distribution characteristics and uses machine learning techniques to combine these distribution characteristics with attribute similarity features. We empirically compare the proposed method with existing methods that use other features, and the results show the effectiveness of our method.
Keywords :
Business; Computers; Conferences; Data mining; Intelligent agent; Machine learning; Research and development; Skeleton; Spatial databases; Web pages; attribute distribution; attribute extraction; deep web integration; schema matching;
fLanguage :
English
Publisher :
iet
Conference_Titel :
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Milan, Italy
Print_ISBN :
978-0-7695-3801-3
Electronic_ISBN :
978-1-4244-5331-3
Type :
conf
DOI :
10.1109/WI-IAT.2009.287
Filename :
5285054