• DocumentCode
    3085642
  • Title

    MAMCost: Global and Local Estimates leading to Robust Cost Estimation of Similarity Queries

  • Author

    Baioco, Gisele Busichia ; Traina, Agma J M ; Traina, Caetano

  • Author_Institution
    Univ. of Sao Paulo at S. Carlos, Sao Carlos
  • fYear
    2007
  • fDate
    9-11 July 2007
  • Firstpage
    6
  • Lastpage
    6
  • Abstract
    This paper presents an effective cost model to estimate the number of disk accesses (I/O cost) and the number of distance calculations (CPU cost) to process similarity queries over data indexed by metric access methods. Two types of similarity queries were taken into consideration: range and k-nearest neighbor queries. The main point of the cost model is considering not only global parameters of the data set but also the local data distribution. The model takes advantage of the intrinsic dimension of the data set, estimated by its correlation fractal dimension. Experiments were performed on real and synthetic data sets, with different sizes and dimensions, in order to validate the proposed model. They confirmed that the estimations are accurate, within the range achieved by real queries.
  • Keywords
    data handling; query processing; MAMCost; correlation fractal dimension; data set; disk accesses number; distance calculations number; k-nearest neighbor queries; local data distribution; local estimates; robust cost estimation; similarity queries; Bioinformatics; Computational efficiency; Computer science; Cost function; Data structures; Extraterrestrial measurements; Fractals; Genomics; Information retrieval; Robustness
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on
  • Conference_Location
    Banff, Alta.
  • ISSN
    1551-6393
  • Print_ISBN
    0-7695-2868-6
  • Electronic_ISBN
    1551-6393
  • Type

    conf

  • DOI
    10.1109/SSDBM.2007.17
  • Filename
    4274951