• DocumentCode
    3269698
  • Title
    LexEQUAL: supporting multilexical queries in SQL
  • Author
    Kumaran, A. ; Haritsa, Jayant R.
  • Author_Institution
    Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore, India
  • fYear
    2004
  • fDate
    30 March-2 April 2004
  • Firstpage
    845
  • Abstract
    Current database systems offer support for storing multilingual data, but are not capable of querying across languages, an important consideration in today's global economy. We therefore propose a new multilexical operator called LexEQUAL that extends the standard lexicographic matching in database systems to matching of text data across languages, specifically for names, which form close to twenty percent of text corpora. The implementation of the LexEQUAL operator is based on transforming matches in language space into parameterized approximate matches in the equivalent phoneme space. A detailed evaluation of our approach on a real data set shows that there exist settings of the algorithm parameters with which it is possible to achieve both good recall and precision.
  • Keywords
    SQL; query processing; string matching; database querying; database systems; lexicographic matching; multilexical operator; multilingual data; Automation; Books; Computer errors; Computer science; Cost function; Database systems; Indexes; Matched filters; Natural languages; Runtime
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2004. Proceedings. 20th International Conference on
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-2065-0
  • Type
    conf
  • DOI
    10.1109/ICDE.2004.1320075
  • Filename
    1320075
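
The abstract describes matching names across languages by moving from language space to phoneme space and applying a parameterized approximate match there. The Python sketch below illustrates that general idea only; it is not the paper's implementation. The phoneme table `PHONEMES`, the function names `edit_distance` and `lex_equal_sketch`, the use of normalized edit distance, and the default `threshold` of 0.25 are all assumptions made for illustration. A real system would derive phonemes from per-language text-to-phoneme converters rather than a hand-written table.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical, hand-written phoneme table (illustration only).
# Keys are names in different scripts; values are rough phoneme sequences.
PHONEMES: Dict[str, Tuple[str, ...]] = {
    "Nehru": ("n", "e", "h", "r", "u"),
    "नेहरू": ("n", "e", "h", "r", "u"),
    "Neru": ("n", "e", "r", "u"),
    "Gandhi": ("g", "a", "n", "dh", "i"),
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at cost 0
            ))
        prev = cur
    return prev[-1]


def lex_equal_sketch(s: str, t: str, threshold: float = 0.25) -> bool:
    """Return True if s and t are close enough in phoneme space.

    `threshold` is the assumed tunable parameter: the largest allowed
    edit distance, as a fraction of the longer phoneme sequence.
    """
    p, q = PHONEMES[s], PHONEMES[t]
    longest = max(len(p), len(q))
    if longest == 0:
        return True
    return edit_distance(p, q) / longest <= threshold


if __name__ == "__main__":
    for pair in [("Nehru", "नेहरू"), ("Nehru", "Neru"), ("Nehru", "Gandhi")]:
        print(pair, lex_equal_sketch(*pair))
```

In this sketch, raising `threshold` admits more phonetic variation (favoring recall), while lowering it demands closer matches (favoring precision), which mirrors the recall and precision trade-off the abstract attributes to the algorithm parameters.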