Abstract :
Native speakers of any language draw on a knowledge base of prosodic features during speech production, acquired unconsciously in the course of language acquisition. With the help of these features, they can express both the meaning of an utterance and their emotional state. Bringing similar naturalness to artificial speech remains a challenging task. Doing so requires a model of sentence prosody, which involves two complementary phenomena: intonation and prominence. Both relate to the phonetic cues of F0, amplitude, duration, and pauses. The main objective of this paper is to determine the dependence of prominence on the phonetic cues (F0, amplitude, duration) in the form of a mathematical function. This function could be useful for Hindi TTS (text-to-speech synthesis) and Hindi ASR (automatic speech recognition) systems, where it could be adopted as a prosodic module.
Keywords :
feature extraction; natural language processing; speech recognition; speech synthesis; F0; Hindi ASR system; Hindi TTS; amplitude; artificial speech; automatic speech recognition systems; duration; emotional states; intonation; language acquisition; mathematical function; pause; phonetic cues; prominence detection; prosodic features; sentence prosody; speech production; text-to-speech synthesis systems; utterance; Computational intelligence; Scientific computing; Prominence; Prosody