DocumentCode :
2510009
Title :
Prosodically guided phonetic engine
Author :
Deekshitha, G. ; Mary, Leena
Author_Institution :
Dept. of Electron. & Commun. Eng., Rajiv Gandhi Inst. of Technol., Kottayam, India
fYear :
2015
fDate :
19-21 Feb. 2015
Firstpage :
1
Lastpage :
5
Abstract :
A Phonetic Engine (PE) is the first stage of an automatic speech recognition system; it converts input speech to a sequence of phonetic symbols. A baseline phonetic engine is created using a Malayalam speech database. A Graphical User Interface (GUI) is developed for the phonetic engine to perform real-time recognition of phonemes. It is known that higher-level speech information such as intonation, duration and intensity, collectively referred to as 'prosody', aids human speech recognition. Prosody helps to segment speech into sentences or phrases and to disambiguate the recognition process. This has motivated us to incorporate prosody to improve the baseline phonetic engine. However, incorporating prosody in automatic speech recognition is a challenging task. This paper describes an approach to automatic labeling of prosodic events and discusses the possibility of implementing a prosodically guided phonetic engine for Malayalam. Automatic phrase-like segmentation is realized by detecting long pauses with an Artificial Neural Network (ANN) based classifier. Broad phoneme classification is achieved using features derived from the speech at the signal level itself. A combination of broad phoneme transcription and pitch trend labels is used to obtain a temporal prosodic pattern. We illustrate the effectiveness of this temporal prosodic pattern in an audio search application.
Keywords :
natural language processing; neural nets; signal classification; speech processing; speech recognition; ANN based classifier; Malayalam speech database; artificial neural network; automatic phrase-like segmentation; automatic prosodic events labeling; automatic speech recognition system; broad phoneme classification; broad phoneme transcription; pitch trend labels; prosodically guided phonetic engine; temporal prosodic pattern; Artificial neural networks; Databases; Engines; Feature extraction; Hidden Markov models; Speech; Speech recognition; ANN classifier; HMM; Phonetic engine; features; pitch; prosody; segmentation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, Informatics, Communication and Energy Systems (SPICES), 2015 IEEE International Conference on
Conference_Location :
Kozhikode
Type :
conf
DOI :
10.1109/SPICES.2015.7091457
Filename :
7091457