• DocumentCode
    641011
  • Title

    Introducing UWS - A fuzzy based word similarity function with good discrimination capability: Preliminary results

  • Author

    Carvalho, Jose P. ; Coheur, Luisa

  • Author_Institution
    Inst. Super. Tecnico, Tech. Univ. of Lisbon, Lisbon, Portugal
  • fYear
    2013
  • fDate
    7-10 July 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    This paper introduces a novel word similarity function, the Uke Similarity Function (UWS), that fuses the most interesting characteristics of the two main philosophies in word and string matching: the edit distance and the n-gram similarity approach. It also uses fuzzy sets to integrate expert knowledge about typographical errors and to easily include phonetic and token-related errors. The UWS was developed with the goal of automatic detection and correction of typographical and other word errors in unedited corpus data when creating word lists.
  • Keywords
    fuzzy set theory; natural language processing; string matching; UWS; Uke similarity function; discrimination capability; edit distance; fuzzy based word similarity function; fuzzy sets; n-gram similarity approach; phonetic errors; token related errors; typographical errors; word matching; Communities; Context; Dictionaries; Keyboards; Measurement; Typographical Error Detection and Correction; Unedited Corpus Data; Word Similarity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems (FUZZ), 2013 IEEE International Conference on
  • Conference_Location
    Hyderabad
  • ISSN
    1098-7584
  • Print_ISBN
    978-1-4799-0020-6
  • Type

    conf

  • DOI
    10.1109/FUZZ-IEEE.2013.6622494
  • Filename
    6622494