• DocumentCode
    1587884
  • Title

    Arabic Script Documents Language Identifications Using Fuzzy ART

  • Author

    Selamat, Ali ; Ching, Ng Choon

  • Author_Institution
    Fac. of Comput. Sci. & Inf. Syst., Univ. Teknologi Malaysia, Kuching
  • fYear
    2008
  • Firstpage
    528
  • Lastpage
    533
  • Abstract
    The volume of information available on the Internet, intranets, digital libraries, and newsgroups has increased dramatically in recent years. Consequently, there is growing interest in helping users better find, filter, and manage these resources. Language identification, that is, determining the language in which a text document is written, is the first step in understanding the document. It is usually a module within a multilingual application. In this paper, we introduce language identification of Arabic script documents by letter frequency. The technique used for identification is fuzzy adaptive resonance theory (ART), which belongs to the family of neural network architectures that perform incremental unsupervised learning. Documents in Arabic script languages such as Arabic, Persian, and Urdu were used for performing language identification. From the experiments, we found that fuzzy ART is particularly promising in terms of language identification accuracy.
  • Keywords
    ART neural nets; fuzzy neural nets; natural language processing; text analysis; Arabic script documents language identifications; Internet; digital libraries; fuzzy adaptive resonance theory; incremental unsupervised learning; intranet; language identification; neural network architectures; newsgroup; text documents; Frequency; Fuzzy neural networks; Information filtering; Information filters; Internet; Neural networks; Resonance; Resource management; Software libraries; Subspace constraints; Arabic script documents; Fuzzy ART; adaptive resonance theory (ART); language identification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Modeling & Simulation, 2008. AICMS 08. Second Asia International Conference on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-0-7695-3136-6
  • Electronic_ISBN
    978-0-7695-3136-6
  • Type

    conf

  • DOI
    10.1109/AMS.2008.47
  • Filename
    4530531