• DocumentCode
    3581285
  • Title
    Arabic keyphrases extraction using a hybrid of statistical and machine learning methods
  • Author
    Ali, Nidaa Ghalib ; Omar, Nazlia
  • Author_Institution
    Fac. of Inf. Sci. & Technol., Univ. Kebangsaan Malaysia, Bangi, Malaysia
  • fYear
    2014
  • Firstpage
    281
  • Lastpage
    286
  • Abstract
    Keyphrases are single-word or multi-word lexemes that concisely and accurately describe the subject, or an aspect of the subject, discussed in a document. Manually assigning keyphrases is tedious and time-consuming, especially given Web proliferation. Thus, automatic keyphrase generation systems are urgently needed. This study proposes a keyphrase extraction method that combines several keyphrase extraction methods with machine learning approaches (linear logistic regression, linear discriminant analysis, and support vector machines). The proposed method uses the output of several keyphrase extraction methods as input features for a machine learning algorithm, which then determines whether each term is a keyphrase. Results show that the SVM algorithm achieves the best performance, with an F1-measure of 88.31%. This value is relatively high and comparable with those of previous keyphrase extraction models for the Arabic language.
  • Keywords
    feature extraction; learning (artificial intelligence); natural language processing; pattern classification; regression analysis; support vector machines; text analysis; Arabic keyphrase extraction; Web proliferation; linear discriminant analysis; linear logistic regression; machine learning method; statistical method; support vector machine; Data mining; Feature extraction; Information technology; Learning systems; Logistics; Semantics; Support vector machines; Arabic document machine learning; features; keyphrase extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and Multimedia (ICIMU), 2014 International Conference on
  • Type
    conf
  • DOI
    10.1109/ICIMU.2014.7066645
  • Filename
    7066645
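
The hybrid idea described in the abstract (scores from several keyphrase extraction methods used as input features for a classifier that decides whether each term is a keyphrase) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three base-extractor scores, the toy training data, and the plain-Python logistic regression trained by gradient descent are hypothetical and are not the authors' implementation, features, or data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, weights, bias):
    """Linear combination of the base-extractor scores for one candidate term."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent (illustrative only)."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(score(x, weights, bias)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def is_keyphrase(x, weights, bias, threshold=0.5):
    """Classify a candidate term from its base-extractor scores."""
    return sigmoid(score(x, weights, bias)) >= threshold


# Hypothetical toy data: each row holds the scores that three assumed base
# extraction methods assigned to one candidate term (scaled to [0, 1]).
TRAIN_X = [
    [0.9, 0.8, 0.7],
    [0.8, 0.9, 0.6],
    [0.7, 0.6, 0.9],
    [0.2, 0.1, 0.3],
    [0.1, 0.3, 0.2],
    [0.3, 0.2, 0.1],
]
TRAIN_Y = [1, 1, 1, 0, 0, 0]  # 1 = keyphrase, 0 = not a keyphrase

WEIGHTS, BIAS = train_logistic(TRAIN_X, TRAIN_Y)

if __name__ == "__main__":
    for candidate in ([0.85, 0.75, 0.8], [0.15, 0.2, 0.1]):
        print(candidate, is_keyphrase(candidate, WEIGHTS, BIAS))
```

In this sketch the classifier only ever sees the base methods' scores, which is the stacking arrangement the abstract describes; the abstract reports SVM as the best of the three classifiers, so logistic regression is used here purely because it is short to write out.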