DocumentCode
49157
Title
Data Acquisition for Probabilistic Nearest-Neighbor Query
Author
Yu-Chieh Lin ; De-Nian Yang ; Hong-Han Shuai ; Ming-Syan Chen
Author_Institution
Res. Center of Inf. Technol. Innovation, Taipei, Taiwan
Volume
27
Issue
2
fYear
2015
fDate
Feb. 2015
Firstpage
410
Lastpage
427
Abstract
Management of uncertain data in spatial queries has drawn extensive research interests to consider the granularity of devices and noises in the collection and the delivery of data. Most previous works usually model and handle uncertain data to find the required results directly. However, it is more difficult for users to obtain useful insights when data uncertainty dramatically increases. In this case, users are usually willing to invest more resources to improve the result by reducing the data uncertainty in order to obtain more interesting observations with the existing schemes. In light of this important need, this paper formulates a new problem of selecting a given number of uncertain data objects for acquiring their attribute values to improve the result of the Probabilistic k-Nearest-Neighbor (k-PNN) query. We prove that better query results are guaranteed to be returned with data acquisition, and we devise several algorithms to maximize the expected improvement. We first explore the optimal single-object acquisition for 1-PNN to examine the fundamental problem structure and then propose an efficient algorithm that discovers crucial properties to simplify the probability derivation in varied situations. We extend the proposed algorithm to achieve the optimal multi-object acquisition for 1-PNN by deriving an upper bound to facilitate efficient pruning of unnecessary sets of objects. Moreover, for data acquisition of k-PNN, we extract the k-PNN answers with sufficiently large probabilities to trim the search space and properly exploit the result of single-object acquisition for estimating the gain from multi-object acquisition. The experimental results demonstrate that the probability of k-PNN can be significantly improved even with only a small number of objects for data acquisition.
Keywords
data acquisition; probability; query processing; search problems; 1-PNN; attribute value acquition; data acquisition; data collection; data delivery; data uncertainty reduction; device granularity; gain estimation; k-PNN answer extraction; k-PNN probability improvement; k-PNN query; multiobject acquisition; noise granularity; optimal multiobject acquisition; optimal single-object acquisition; probabilistic k-nearest-neighbor query; probabilistic nearest-neighbor query; probability derivation; search space; single-object acquisition; spatial queries; uncertain data management; uncertain data object selection; unnecessary object set pruning; upper bound; Algorithm design and analysis; Data acquisition; Data models; Databases; Nearest neighbor searches; Probabilistic logic; Uncertainty; Uncertainty; algorithm design and analysis; nearest neighbor searches; query processing;
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
ieee
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2013.2297916
Filename
6702497
Link To Document