Title :
The C-ORAL-BRASIL Corpus: Methodological Basis for the Treatment of Spontaneous Speech
Author :
Mittmann, Maryualê M. ; Raso, Tommaso ; Mello, Heliana R.
Author_Institution :
Fac. de Letras, Univ. Fed. de Minas Gerais (UFMG), Belo Horizonte, Brazil
Abstract :
This paper highlights the primary methods employed in compiling the C-ORAL-BRASIL corpus, i.e., recording, transcribing, and segmenting oral texts. C-ORAL-BRASIL is a Brazilian Portuguese corpus of spontaneous speech, designed for the study of informational structure. It is representative of diaphasic variation, seeking to cover as many different communicative situations as possible. This paper presents and exemplifies the processes of transcription and segmentation of speech into prosodic units as employed in our ongoing research. It concludes with illustrations of some questions that the corpus will enable us to answer.
Keywords :
Context; Frequency; Humans; Monitoring; Natural languages; Software performance; Software quality; Speech analysis; Speech processing; Speech synthesis; Brazilian Portuguese; corpus; spontaneous speech;
Conference_Title :
Information and Human Language Technology (STIL), 2009 Seventh Brazilian Symposium in
Conference_Location :
Sao Carlos, Brazil
Print_ISBN :
978-1-4244-6008-3
DOI :
10.1109/STIL.2009.22