Title :
Extracting Representative Information to Enhance Flexible Data Queries
Author :
Jin Zhang ; Guoqing Chen ; Xiaohui Tang
Author_Institution :
Sch. of Econ. & Manage., Tsinghua Univ., Beijing, China
fDate :
6/1/2012
Abstract :
Extracting representative information is of great interest in data queries and web applications, where approximate matching between attribute values or records is an important issue in the extraction process. This paper proposes an approach to extracting representative tuples from data classes under an extended possibility-based data model, and introduces a measure, namely relation compactness, based on information entropy to reflect the degree to which a relation is compact in light of information redundancy. Theoretical analysis and data experiments show that the approach has the following desirable properties: 1) the set of representative tuples has high degrees of compactness (less redundancy) and coverage (rich content); 2) it provides a way to obtain data query outcomes of different sizes in a flexible manner according to user preference; and 3) it is meaningful and applicable to web search applications.
Keywords :
Internet; approximation theory; data handling; query processing; Web search applications; approximate match; data classes; extended possibility based data model; extraction process; flexible data queries enhancement; information entropy; information redundancy; representative information extraction; representative tuples; Data mining; Databases; Diamond-like carbon; Redundancy; Silver; Uncertainty; Web search; Flexible data queries; information equivalence; relation compactness; representativeness; web search
Journal_Title :
Neural Networks and Learning Systems, IEEE Transactions on
DOI :
10.1109/TNNLS.2012.2193415