  • DocumentCode
    2070989
  • Title
    Approximate matchings in scientific databases
  • Author
    Chatterjee, Abhirup; Segev, Arie
  • Author_Institution
    Walter A. Haas Sch. of Bus., California Univ., Berkeley, CA, USA
  • Volume
    3
  • fYear
    1994
  • fDate
    4-7 Jan. 1994
  • Firstpage
    448
  • Lastpage
    457
  • Abstract
    Organizations often need access to scientific data stored in independently managed databases. In this paper, we analyze the data heterogeneity problem, which occurs when data conveying the same or similar information is represented differently in different databases. We introduce the matching join to process queries in scientific databases and discuss the three steps needed to evaluate it. First, we transform the query using the functional dependencies in the database to incorporate additional knowledge. Second, we use rules and weights to compare the attributes. Matching joins can also be used to obtain approximate answers. In the third step, we propose a numeric measure, called the comparison value, c, to estimate the quality of matching and suggest deterministic and probabilistic ways of deriving it. Finally, we analyze the problem of estimating the cutoff value for c that would minimize the cost of errors during the join computation.
  • Keywords
    distributed databases; natural sciences computing; query processing; approximate answers; comparison value; cost of errors; data heterogeneity problem; functional dependencies; independently managed databases; join computation; matching join; queries; scientific data; scientific databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on
  • Conference_Location
    Wailea, HI, USA
  • Print_ISBN
    0-8186-5090-7
  • Type
    conf
  • DOI
    10.1109/HICSS.1994.323327
  • Filename
    323327
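
The second and third steps described in the abstract (comparing attributes with rules and weights, then keeping pairs whose comparison value c reaches a cutoff) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the rule functions `exact` and `token_overlap`, the weighted-average form of `comparison_value`, the nested-loop `matching_join`, and the sample records, weights, and cutoff are all hypothetical. It shows only the deterministic flavor of the idea; the probabilistic derivation of c and the cost-based choice of cutoff are not modeled.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
# A rule scores how well two attribute values match, in the range [0, 1].
Rule = Callable[[str, str], float]


def exact(a: str, b: str) -> float:
    """Hypothetical rule: 1.0 if the values match ignoring case and padding."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def token_overlap(a: str, b: str) -> float:
    """Hypothetical rule: Jaccard overlap of whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def comparison_value(r: Record, s: Record,
                     rules: Dict[str, Rule],
                     weights: Dict[str, float]) -> float:
    """Assumed form of c: weighted average of per-attribute rule scores."""
    total = sum(weights[attr] for attr in rules)
    score = sum(weights[attr] * rules[attr](r[attr], s[attr]) for attr in rules)
    return score / total


def matching_join(left: List[Record], right: List[Record],
                  rules: Dict[str, Rule], weights: Dict[str, float],
                  cutoff: float) -> List[Tuple[Record, Record, float]]:
    """Nested-loop join that keeps pairs whose comparison value >= cutoff."""
    matches = []
    for r in left:
        for s in right:
            c = comparison_value(r, s, rules, weights)
            if c >= cutoff:
                matches.append((r, s, c))
    return matches


# Illustrative data only; not drawn from the paper.
left = [{"name": "Escherichia coli K-12", "site": "Berkeley"}]
right = [
    {"name": "escherichia coli", "site": "berkeley"},
    {"name": "Bacillus subtilis", "site": "Davis"},
]
rules = {"name": token_overlap, "site": exact}
weights = {"name": 2.0, "site": 1.0}

result = matching_join(left, right, rules, weights, cutoff=0.7)
for r, s, c in result:
    print(r["name"], "~", s["name"], round(c, 3))
```

Raising the cutoff discards more true matches while lowering it admits more false ones, which is the trade-off the abstract refers to as minimizing the cost of errors.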