• DocumentCode
    1553404
  • Title

    Aggregation of imprecise and uncertain information in databases

  • Author

    McClean, Sally ; Scotney, Bryan ; Shapcott, Mary

  • Author_Institution
    Fac. of Informatics, Ulster Univ., Coleraine, UK
  • Volume
    13
  • Issue
    6
  • fYear
    2001
  • Firstpage
    902
  • Lastpage
    912
  • Abstract
    Information stored in a database is often subject to uncertainty and imprecision. Probability theory provides a well-known and well-understood way of representing uncertainty and may thus be used to provide a mechanism for storing uncertain information in a database. We consider the problem of aggregation using an imprecise probability data model that allows us to represent imprecision by partial probabilities and uncertainty using probability distributions. Most work to date has concentrated on providing functionality for extending the relational algebra with a view to executing traditional queries on uncertain or imprecise data. However, for imprecise and uncertain data, we often require aggregation operators that provide information on patterns in the data. Thus, while traditional query processing is tuple-driven, processing of uncertain data is often attribute-driven where we use aggregation operators to discover attribute properties. The aggregation operator that we define uses the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple values to provide a probability distribution for the domain values of an attribute or group of attributes. The provision of such aggregation operators is a central requirement in furnishing a database with the capability to perform the operations necessary for knowledge discovery in databases.
  • Keywords
    data mining; data models; database theory; probability; query processing; relational databases; uncertainty handling; Kullback-Leibler information divergence; aggregation operators; attribute properties; databases; imprecise information aggregation; imprecise probability data model; individual tuple values; knowledge discovery; partial probabilities; probability distributions; probability theory; relational algebra; uncertain information aggregation; uncertain information storage; Algebra; Data models; Database systems; Deductive databases; Information retrieval; Probability distribution; Query processing; Relational databases; Stochastic processes; Uncertainty
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.971186
  • Filename
    971186