• DocumentCode
    3321783
  • Title
    A Framework for Clustering Uncertain Data Streams
  • Author
    Aggarwal, Charu C. ; Yu, Philip S.
  • Author_Institution
    T.J. Watson Research Center, IBM, Hawthorne, NY
  • fYear
    2008
  • fDate
    7-12 April 2008
  • Firstpage
    150
  • Lastpage
    159
  • Abstract
    In recent years, uncertain data management applications have grown in importance because of the large number of hardware applications that measure data approximately. For example, sensor readings are typically expected to contain considerable noise because of inaccuracies in data retrieval and transmission, and because of power failures. In many cases, the estimated error of the underlying data stream is available. This information is very useful for the mining process, since it can be used to improve the quality of the underlying results. In this paper, we propose a method for clustering uncertain data streams. We use a very general model of uncertainty in which we assume that only a few statistical measures of the uncertainty are available. We show that using even modest uncertainty information during the mining process is sufficient to greatly improve the quality of the results, and that our approach is more effective than a purely deterministic method such as the CluStream approach. We test the approach on a variety of real and synthetic data sets and illustrate its advantages in terms of effectiveness and efficiency.
  • Keywords
    data mining; statistical analysis; CluStream approach; mining process; uncertain data management applications; uncertain data streams clustering; uncertainty statistical measures; Cleaning; Data mining; Data privacy; Hardware; Information retrieval; Measurement uncertainty; Probability density function; Probability distribution; Statistical analysis; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
  • Conference_Location
    Cancun
  • Print_ISBN
    978-1-4244-1836-7
  • Electronic_ISBN
    978-1-4244-1837-4
  • Type
    conf
  • DOI
    10.1109/ICDE.2008.4497423
  • Filename
    4497423