Title of article :
Question Generator System of Sentence Completion in TOEFL Using NLP and K-Nearest Neighbor
Author/Authors :
Riza, Lala Septem (Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia); Pertiwi, Anita Dyah (Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia); Rahman, Eka Fitrajaya (Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia); Munir (Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia); Abdullah, Cep Ubad (Fakultas Pendidikan Ilmu Pengetahuan Sosial, Universitas Pendidikan Indonesia, Bandung, Indonesia)
Pages :
18
From page :
294
To page :
311
Abstract :
The Test of English as a Foreign Language (TOEFL) is a form of learning evaluation that requires high-quality questions. Preparing TOEFL questions by conventional means takes a great deal of time, and computer technology can be used to address this. This research was therefore conducted to automate the creation of TOEFL questions of the sentence completion type. The system consists of several stages: (1) collection of input data from foreign news media sites with excellent English grammar; (2) preprocessing with Natural Language Processing (NLP); (3) Part of Speech (POS) tagging; (4) question feature extraction; (5) separation and selection of news sentences; (6) determination and collection of values for seven features; (7) conversion of categorical data values; (8) classification of the target word for the blank position with K-Nearest Neighbor (KNN); (9) determination of heuristic rules from human experts; and (10) selection of answer options (distractors) based on the heuristic rules. In an experiment on 10 news articles, the system generated 20 questions. Evaluation by a human expert rated the generated questions as very good in quality, with a score of 81.93%, and 70% of the blank positions matched those in the historical TOEFL question data. It can be concluded that the generated questions have the following characteristics: the quality of the results depends on the training data drawn from historical TOEFL questions, and the quality of the distractors is very good because they are derived from the heuristics of human experts.
Keywords :
Machine Learning, Education Learning, K-Nearest Neighbor, Natural Language Processing, Automatic question generation
Journal title :
Indonesian Journal of Science and Technology
Serial Year :
2019
Record number :
2603031