Regional Information Center for Science and Technology - Automatic Utterance Segmentation Tool for Speech Corpus

DocumentCode :

1910447

Title :

Automatic Utterance Segmentation Tool for Speech Corpus

Author :

Ozawa, Mitsuhiro ; Kita, Kenji ; Kuroiwa, Shingo ; Tsuge, Satoru ; Fukumi, Minoru ; Shishibori, Masami ; Ren, Fuji

Author_Institution :

Grad. Sch. of Adv. Technol. & Sci., Tokushima Univ., Tokushima

fYear :

2007

fDate :

Aug. 30 2007-Sept. 1 2007

Firstpage :

401

Lastpage :

406

Abstract :

We collect speech data to investigate intra-speaker speech variability over short and long time periods. In general, to reduce the load on speakers, the speech data are recorded as a single file from the start to the end of a collection session. Hence, this file contains noise, non-speech sections, and sections with mistakes. Consequently, the file must be segmented into individual utterances and the useful utterances selected, a process that requires a lot of time and effort. In this paper, we propose an automatic utterance segmentation tool for dividing the collected speech data. The proposed tool is composed of four processes: voice activity detection, speech recognition, DP matching, and correction of speech sections. To evaluate the proposed tool, we conduct experiments using a female speaker's speech data from our corpus. Experimental results show that the proposed method can reduce filing time by 90% compared to manual filing. In this paper, we first introduced the large speech corpus, which contains speech data collected from specific speakers over long and short time periods. We then explained the automatic utterance segmentation tool that we built for constructing the corpus, and examined its validity. As a result, the automatic utterance segmentation tool was shown to perform well. Furthermore, building the speech corpus was shown to become simpler when the tool is used.
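
The abstract names DP matching as one of the four processes but this record does not describe how it is implemented. As a rough illustration only, the sketch below assumes the matching step aligns the list of prompts the speaker was asked to read against the recognizer output for each detected speech section, using a plain edit-distance dynamic program. The function names `dp_align` and `usable_sections`, and the unit costs, are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def dp_align(script: List[str], recognized: List[str]) -> List[Pair]:
    """Align prompt items to recognized sections by minimum edit distance.

    Returns (script_index, recognized_index) pairs in order. None on the
    left marks an extra recognized section (noise, restart); None on the
    right marks a prompt with no matching section.
    Assumption: unit cost for substitution, insertion, and deletion.
    """
    n, m = len(script), len(recognized)
    # cost[i][j] = minimum cost of aligning script[:i] with recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if script[i - 1] == recognized[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match or substitution
                cost[i - 1][j] + 1,        # prompt left unmatched
                cost[i][j - 1] + 1,        # extra recognized section
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if script[i - 1] == recognized[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def usable_sections(script: List[str], recognized: List[str]) -> List[Tuple[int, int]]:
    """Keep only aligned pairs whose recognized text equals the prompt."""
    return [
        (i, j)
        for i, j in dp_align(script, recognized)
        if i is not None and j is not None and script[i] == recognized[j]
    ]
```

In this sketch, a recording where the speaker inserts a filler between two prompts would yield one unaligned recognized section, which a later step could discard or flag for manual review.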

Keywords :

speech recognition; automatic utterance segmentation tool; speech corpus; speech variability; voice activity detection; Background noise; Cellular phones; Information technology; Noise robustness; Speech analysis; Speech processing; Telecommunications

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on

Conference_Location :

Beijing

Print_ISBN :

978-1-4244-1610-3

Electronic_ISBN :

978-1-4244-1611-0

Type :

conf

DOI :

10.1109/NLPKE.2007.4368062

Filename :

4368062

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1910447