• DocumentCode
    2364447
  • Title

    Name extraction for unstructured Malay text

  • Author

    Sharum, Mohd Yunus ; Abdullah, Muhamad Taufik ; Sulaiman, Md Nasir ; Murad, Masrah Azrifah Azmi ; Hamzah, Zaitul Azma Zainon

  • Author_Institution
    Fac. of Comp. Sci. & Info. Technol., UPM, Serdang, Malaysia
  • fYear
    2011
  • fDate
    20-23 March 2011
  • Firstpage
    787
  • Lastpage
    791
  • Abstract
    Names are categorized as proper nouns. Identifying these nouns in unstructured text is very challenging since their number is almost unlimited. Name recognition can be used for part-of-speech (POS) tagging or automatic term acquisition in natural language processing (NLP). In this paper we propose a general approach to recognizing names in Malay text. Using the proposed approach, we implement a free-indexing application that indexes names from a collection of Malay texts. Our evaluation shows that the application reaches 92% precision, 54% recall, and an F-score of 68% in indexing names from news articles.
  • Keywords
    feature extraction; natural language processing; text analysis; automatic term acquisition; indexing; name extraction; name recognition; part of speech tagging; unstructured Malay text; Bridges; Communities; Roads; Text recognition; Malay text processing; Pattern recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computers & Informatics (ISCI), 2011 IEEE Symposium on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-1-61284-689-7
  • Type

    conf

  • DOI
    10.1109/ISCI.2011.5959017
  • Filename
    5959017
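
The scores reported in the abstract can be cross-checked with a short calculation. The sketch below assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function and variable names are illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the record's "F-score" is F1; the abstract does not say so.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_precision = 0.92
reported_recall = 0.54

f1 = f_score(reported_precision, reported_recall)
print(f"F-score: {f1:.0%}")  # prints "F-score: 68%"
```

Under that assumption the result is about 0.68, which agrees with the 68% stated in the abstract.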