• DocumentCode
    3102277
  • Title

    Identification of Closely Related Indigenous Languages: An Orthographic Approach

  • Author

    Ng, Ee-Lee ; Yeo, Alvin W. ; Ranaivo-Malançon, Bali

  • Author_Institution
    Fac. of Comput. Sci. & Inf. Technol., Univ. Malaysia Sarawak, Kota Samarahan, Malaysia
  • fYear
    2009
  • fDate
    7-9 Dec. 2009
  • Firstpage
    230
  • Lastpage
    235
  • Abstract
    The main focus of this study is to identify the closely related languages among the indigenous languages of Sarawak and major languages such as Bahasa Melayu and English. The indigenous languages involved in this study are Iban (standard), Bidayuh (Bau-Jagoi), Kelabit (Bario), Melanau (Matu-Daro), Sa'ban (Long Banga) and Penan (East Baram). The relationship between the languages is established via the proportion of cognates in the Swadesh lists of each language pair. The orthographic approach, which primarily examines the spelling of the vocabulary words, is used. The outcome of this study reveals that some indigenous languages are more closely related to Bahasa Melayu than others. The findings from this research serve as an initial step toward addressing greater challenges in computational linguistics, such as the use of closely related languages as pivot solutions in problems related to under-resourced languages.
  • Keywords
    computational linguistics; natural languages; Bahasa Melayu; Swadesh list; closely related languages; indigenous languages identification; language pairs; vocabulary; orthography; under-resource languages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asian Language Processing, 2009. IALP '09. International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-0-7695-3904-1
  • Type

    conf

  • DOI
    10.1109/IALP.2009.55
  • Filename
    5380771
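
The abstract describes relating languages by the proportion of cognates in their Swadesh lists, judged from spelling. The record does not state the paper's actual similarity measure or cut-off, so the following is only a minimal sketch under assumptions: normalized edit distance stands in for the orthographic comparison, the 0.6 threshold is arbitrary, and the function names (`edit_distance`, `spelling_similarity`, `cognate_proportion`) are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two spellings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def spelling_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1.0 for identical spellings, case-insensitive."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cognate_proportion(list_a, list_b, threshold=0.6):
    """Share of aligned Swadesh entries whose spellings are similar enough.

    Assumption: entry i of list_a and entry i of list_b express the same
    concept, and `threshold` is an illustrative cut-off, not a value taken
    from the paper.
    """
    if len(list_a) != len(list_b):
        raise ValueError("Swadesh lists must be aligned and of equal length")
    if not list_a:
        raise ValueError("Swadesh lists must not be empty")
    hits = sum(
        spelling_similarity(a, b) >= threshold
        for a, b in zip(list_a, list_b)
    )
    return hits / len(list_a)
```

Calling `cognate_proportion` on two aligned word lists returns a value between 0 and 1. Under this sketch, a higher value for one language pair than for another would suggest a closer relationship, which is the kind of comparison the abstract reports.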