Title :
Support vector machine based approach for quranic words detection in online textual content
Author :
Sabbah, Thabit ; Selamat, Ali
Author_Institution :
Fac. of Comput., Univ. Teknol. Malaysia (UTM), Skudai, Malaysia
Abstract :
The Quran is the holy book of Muslims around the world. Since it was revealed to the Prophet Muhammad (PBUH) about fourteen centuries ago, it has been preserved from distortion in every imaginable way. The rapid growth of digital media and Internet usage has led to the wide spread of Quranic knowledge, including Quranic verses, scripts, translations, and many other Quranic sciences, in digital form. Some online sources, websites, services, and social network users offer less authentic Quranic content, services, and applications, and ordinary users of such resources cannot detect or authenticate the Quranic verses provided. In this paper, we propose a machine learning approach to detect Quranic words in text extracted from online sources. The proposed approach uses a Support Vector Machine to generate a learning model of Quranic words by training the learner on a Quranic words dataset. The generated classification model is later used to classify words from online content. Experiments based on different feature categories, such as diacritic and statistical features, are performed, and a prototype is developed. Results show that the accuracy and other evaluation measures achieved by the proposed approach are higher than those previously reported in the domain. Future work will focus on incorporating more machine learning and optimization techniques to achieve higher evaluation measures.
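The pipeline described in the abstract (extract per-word features, train a Support Vector Machine, then classify words from online text) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two features (diacritic ratio and scaled base-letter count), the eight-word toy dataset, the hyperparameters, and every function name are assumptions made for the example. The linear SVM is trained with plain subgradient descent on the regularized hinge loss so that the block needs no external library.

```python
# Illustrative sketch only. Features, toy data, and hyperparameters are
# assumptions for this example, not the setup used in the paper.

def is_diacritic(ch):
    """True for Arabic vowel marks U+064B..U+0652 and superscript alef U+0670."""
    return "\u064b" <= ch <= "\u0652" or ch == "\u0670"


def features(word):
    """Map a word to [diacritic ratio, base-letter count / 10]."""
    marks = sum(1 for ch in word if is_diacritic(ch))
    base = len(word) - marks
    ratio = marks / len(word) if word else 0.0
    return [ratio, base / 10.0]


def score(model, word):
    """Signed decision value w.x + b for a word."""
    w, b = model
    x = features(word)
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(model, word):
    """+1 for the 'Quranic-like' class, -1 otherwise."""
    return 1 if score(model, word) >= 0 else -1


def train(samples, epochs=300, lr=0.1, lam=0.001):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for word, y in samples:
            x = features(word)
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move the boundary toward this sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


# Hypothetical toy data: fully diacritized words from Al-Fatiha (+1) versus
# undiacritized everyday web vocabulary (-1). A real dataset would be far
# larger and would not separate the classes this cleanly.
TOY_DATA = [
    ("بِسْمِ", 1),
    ("موقع", -1),
    ("اللَّهِ", 1),
    ("انترنت", -1),
    ("الرَّحِيمِ", 1),
    ("كمبيوتر", -1),
    ("الْعَالَمِينَ", 1),
    ("تحميل", -1),
]
```

A model would be obtained with `model = train(TOY_DATA)` and applied to each word of a tokenized page with `predict(model, word)`. Because the negative examples in this toy set carry no diacritics, the learned weight on the diacritic ratio is positive, so a diacritized word scores higher than the same word with its marks stripped. A realistic system would need further features to handle Quranic words quoted without diacritics.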
Keywords :
learning (artificial intelligence); pattern classification; support vector machines; text analysis; Quranic content; Quranic knowledge; Quranic sciences; Quranic words detection; Web sites; classification model; diacritics feature; learning model; machine learning approach; online textual content; social network; statistical feature; support vector machine; Accuracy; Authentication; Feature extraction; Prototypes; Support vector machines; Testing; Training; Arabic words; Quranic words; Support Vector Machine; classification; detection; learning model
Conference_Title :
Software Engineering Conference (MySEC), 2014 8th Malaysian
Conference_Location :
Langkawi
DOI :
10.1109/MySec.2014.6986038