DocumentCode :
2406652
Title :
The design and development of PELECAN: Pronunciation Errors from Learners of English Corpus and Annotation
Author :
Chotimongkol, Ananlada ; Thatphithakkul, Sumonmas ; Chootrakool, Patcharika ; Hansakunbuntheung, Chatchawarn ; Wutiwiwatchai, Chai
Author_Institution :
Nat. Electron. & Comput. Technol. Center (NECTEC), Pathumthani, Thailand
fYear :
2011
fDate :
26-28 Oct. 2011
Firstpage :
36
Lastpage :
41
Abstract :
This paper describes the design and construction of PELECAN (Pronunciation Errors from Learners of English Corpus and Annotation). PELECAN was created primarily to collect pronunciation errors from Thai learners of English in order to develop a pronunciation assessment tool better suited to Thai learners. A 2-phase data collection process is used to balance recording effort against coverage of the acoustic phenomena of interest. The data collected in the first phase contain 1.5 hours of speech from 30 Thai learners reading 2 English passages that cover all English phones. The recorded speech was annotated with 2 types of error annotation: a phonetic transcription of each incorrect pronunciation and the level of correctness of each phone. A contrastive list was used to guide the error analysis process. We found that many pronunciation errors are influenced by the L1 (Thai), e.g. incorrect pronunciations of suffixes and the deletion of /l/ and /r/ in consonant clusters. However, some errors, such as those involving schwa, may not be predictable from contrastive analysis alone. Hence, a data-driven approach could help identify errors that may not be foreseen from a purely linguistic point of view.
Keywords :
computer aided instruction; error analysis; linguistics; natural language processing; speech processing; 2-phase data collection process; English phones; PELECAN development; Thai learners; acoustic phenomena; consonant clusters; error analysis process; pronunciation assessment tool; pronunciation errors from learners of English corpus and annotation; recording effort; Acoustics; Data models; Error analysis; Materials; Speech; Speech recognition; Tongue; Spoken learner corpus; Thai learners of English; error tagging; pronunciation errors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Speech Database and Assessments (Oriental COCOSDA), 2011 International Conference on
Conference_Location :
Hsinchu
Print_ISBN :
978-1-4577-0930-2
Type :
conf
DOI :
10.1109/ICSDA.2011.6085976
Filename :
6085976