Title :
Combining Documentation and Research: Ongoing Work on an Endangered Language
Author :
Michaud, Alain ; Hardie, Andrew ; Guillaume, Serge ; Toda, Masayoshi
Author_Institution :
MICA, Hanoi Univ. of Technol., Hanoi, Vietnam
Abstract :
This paper is intended for an audience of speech technology specialists who believe that "automatic processing of under-resourced languages is a way to study language diversity with a multi-disciplinary view" (L. Besacier, keynote speech at this conference). It aims (i) to provide an illustration of the way in which data are collected in fieldwork on endangered languages, drawing attention to the quality of the transcriptions and annotations created by linguists, (ii) to present the contents and format of a set of endangered-language documents synchronizing sound and text, which are currently available online, and (iii) to sketch out some of the research purposes and applications to which these documents lend themselves, and which we intend to pursue in future work.
Keywords :
natural language processing; speech processing; text analysis; annotations quality; automatic under-resourced languages processing; data collection; endangered language; language documentation; language research; sound synchronization; speech technology specialists; text synchronization; transcriptions quality; Acoustics; Adaptation models; Documentation; Educational institutions; Laboratories; Pragmatics; Speech; Sino-Tibetan; Yongning Na; endangered languages; interlinear glossing; long-term preservation; multimedia corpora; online databases; spontaneous speech
Conference_Title :
Asian Language Processing (IALP), 2012 International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4673-6113-2
Electronic_ISBN :
978-0-7695-4886-9
DOI :
10.1109/IALP.2012.32