Title :
Multilevel annotation of speech signals using weighted finite state transducers
Author :
Paulo, Sérgio ; Oliveira, Luís
Author_Institution :
Spoken Language Syst. Lab, INESC, Lisbon, Portugal
Abstract :
The purpose of this work was to develop a set of tools that automate the multilevel annotation of speech signals while preserving the alignments between the different levels of the utterance's linguistic representation. Our goal is to build speech databases with multilevel relational annotations, using speech from non-professional speakers, that can be used to develop concatenative-based text-to-speech synthesizers or to train and test statistical models. The method is based on a linguistic analysis of the transcription of the spoken material, performed by a TTS system. The predicted phone sequence is then compared with the sequence produced by the speaker. The problem of aligning these two sequences is solved in a language-independent way using Weighted Finite State Transducers. After the alignment, a re-synchronization procedure is applied to the remaining levels to bring them into agreement with the spoken utterance.
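The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the transducer machinery for a plain dynamic-programming weighted edit distance, which gives the same result as the shortest path through the composition of a linear acceptor for the predicted phones, a weighted edit transducer, and a linear acceptor for the observed phones. The function name `align_phones`, the uniform costs, and the phone symbols are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One aligned pair: (predicted phone, observed phone); None marks a gap.
Pair = Tuple[Optional[str], Optional[str]]


def align_phones(
    predicted: List[str],
    observed: List[str],
    sub_cost: float = 1.0,  # assumed cost of substituting one phone for another
    ins_cost: float = 1.0,  # assumed cost of a phone the speaker added
    del_cost: float = 1.0,  # assumed cost of a phone the speaker dropped
) -> Tuple[float, List[Pair]]:
    """Return the minimum total cost and one lowest-cost alignment."""
    n, m = len(predicted), len(observed)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]

    # First column: only deletions; first row: only insertions.
    for i in range(1, n + 1):
        cost[i][0] = i * del_cost
        back[i][0] = "del"
    for j in range(1, m + 1):
        cost[0][j] = j * ins_cost
        back[0][j] = "ins"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = predicted[i - 1] == observed[j - 1]
            diag = cost[i - 1][j - 1] + (0.0 if same else sub_cost)
            up = cost[i - 1][j] + del_cost
            left = cost[i][j - 1] + ins_cost
            best = min(diag, up, left)
            cost[i][j] = best
            # Ties prefer a match/substitution, then a deletion.
            back[i][j] = "sub" if best == diag else ("del" if best == up else "ins")

    # Trace back from the final cell to recover the aligned pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        op = back[i][j]
        if op == "sub":
            pairs.append((predicted[i - 1], observed[j - 1]))
            i, j = i - 1, j - 1
        elif op == "del":
            pairs.append((predicted[i - 1], None))
            i -= 1
        else:
            pairs.append((None, observed[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs
```

In this sketch a pair such as `("i", None)` marks a predicted phone that the speaker did not produce. Positions like that are where a re-synchronization of the higher annotation levels would have to adjust its links. A transducer-based version would let the edit costs depend on the specific phone pair instead of being uniform.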
Keywords :
linguistics; speech processing; speech synthesis; statistical analysis; TTS system; concatenative-based text-to-speech synthesizers; linguistic analysis; linguistic representation; multilevel relational annotations; predicted phone sequence; speech databases; speech signals; statistical model testing; statistical model training; weighted finite state transducers; Natural languages; Performance analysis; Relational databases; Signal processing; Speech processing; Speech recognition; Speech synthesis; Synthesizers; Testing; Transducers;
Conference_Title :
Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002
Print_ISBN :
0-7803-7395-2
DOI :
10.1109/WSS.2002.1224384