• DocumentCode
    2853395
  • Title

    Probabilistic Similarity Search in Uncertain Time-Series Database

  • Author

    Qian, Aling ; Ding, Xiaofeng ; Lu, Yansheng

  • fYear
    2009
  • fDate
    19-20 Dec. 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    The nearest neighbor search over time-series databases has been a hot research topic for a long time period, which is widely used in many applications, including information retrieval, genetic data matching, data mining, and so on. However, due to high dimensionality (i.e. length) and uncertainty of the time series, the similarity search over directly indexed precise time series usually encounters serious problems, such as the "dimensionality curse" and "trust ability curse". Conventionally, many dimensionality reduction techniques and uncertainty processing strategies are proposed separately to break such drawbacks by reducing the dimensionality of time series and simulating the data uncertainty. However, among all the proposed methods, there does not have indexing mechanisms to support similarity queries, which supports efficiently search over very large uncertain time-series databases. In this paper, we re-investigate PLA for approximating and indexing uncertain time series. In particular, we propose a novel distance function in the reduced PLA-space, and this function leads to a lower bound of the Euclidean distance between the original uncertain time series, which can lead to no false negatives during the similarity search. In the following step, based on three lemmas, we develop an effective approach to index these lower bounds to improve the nearest neighbor query efficiency. Finally, extensive experiments over synthetic data sets have demonstrated the efficiency and effectiveness of PLA together with the newly proposed lower bound lemmas, in terms of both pruning power and wall clock time, compared with the baseline algorithm.
  • Keywords
    indexing; query formulation; time series; uncertainty handling; very large databases; Euclidean distance; PLA-space reduction; dimensionality curse; dimensionality reduction technique; indexing mechanism; nearest neighbor search; probabilistic similarity search; similarity query support; trust ability curse; uncertain time series database; uncertainty processing strategy; Clocks; Data mining; Databases; Euclidean distance; Genetics; Indexing; Information retrieval; Nearest neighbor searches; Programmable logic arrays; Uncertainty;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-4994-1
  • Type

    conf

  • DOI
    10.1109/ICIECS.2009.5365539
  • Filename
    5365539