Title :
S2S system for voice oriented tourism information delivery in Indian context
Author :
Mohanty, S. ; Swain, Basanta Kumar
Author_Institution :
Dept. of Comput. Sc. & Applic., Utkal Univ., Bhubaneswar, India
Abstract :
This paper presents a speech-to-speech (S2S) system for the Odia language that uses multi-pattern recognition approaches for voice-oriented tourism information delivery. The S2S system provides a spoken interface for human-machine communication in Odia. It incorporates an HMM-based continuous speech recognizer using a trigram model, a K-nearest neighbour (KNN) classifier, and a speech synthesis system that outputs speech related to Indian tourism information in Odia. Three databases are employed in the S2S system: a speech recognizer database consisting of 2500 continuous spoken tourism queries collected from 50 speakers, a labelled database for the KNN classifier, and a diphone corpus for the speech synthesizer. The overall accuracy of the system is measured in several ways that follow the architecture of the developed system. First, the output of the HMM-based speech recognizer was evaluated in terms of sentence and word accuracy rates for old users and new users. The sentence accuracy and word accuracy were 71.22% and 92.35%, respectively, for old users, whereas they were 67.41% and 88.44%, respectively, for new users. Second, KNN was applied to the filtered text output of the speech recognizer to find the answer to the tourist's query. The accuracy of the KNN classifier was 89.23%. Finally, the pattern selected by KNN was sent to a Unicode-based Odia speech synthesizer to produce speech output in Odia. The speech synthesizer was evaluated with a Mean Opinion Score (MOS) test; the average MOS value was 4.2.
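The abstract reports sentence and word accuracy rates without stating how they are computed. The following is a minimal sketch of one common convention, assuming word accuracy is (N - substitutions - deletions - insertions) / N over all reference words and sentence accuracy is the share of exactly matched sentences. The function names and the placeholder tokens are hypothetical and are not taken from the paper or its corpus.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference word sequence into the hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def accuracy_rates(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word accuracy %, sentence accuracy %) for a list of
    (reference transcript, recognizer output) pairs.

    Assumption: word accuracy = (N - S - D - I) / N, which may differ
    from the exact scoring used in the paper.
    """
    total_words = 0
    total_errors = 0
    correct_sentences = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        total_words += len(ref_words)
        total_errors += edit_distance(ref_words, hyp_words)
        if ref_words == hyp_words:
            correct_sentences += 1
    word_accuracy = 100.0 * (total_words - total_errors) / total_words
    sentence_accuracy = 100.0 * correct_sentences / len(pairs)
    return word_accuracy, sentence_accuracy


if __name__ == "__main__":
    # Placeholder tokens only; these are not queries from the paper's corpus.
    demo = [("w1 w2 w3 w4", "w1 w2 w3 w4"),
            ("w1 w2 w3 w4", "w1 wX w3 w4")]
    print(accuracy_rates(demo))
```

Under this convention a single wrong word makes the whole sentence count as incorrect, which is consistent with the sentence accuracy in the abstract being well below the word accuracy.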
Keywords :
hidden Markov models; human computer interaction; natural language interfaces; natural language processing; query processing; signal classification; speech recognition; speech synthesis; text analysis; travel industry; HMM-based continuous speech recognizer; Indian context; Indian tourism information; K-nearest neighbour classifier; KNN classifier; MOS value; Odia language; S2S system; diphone corpus; human machine communication; mean opinion score test; multipattern recognition approach; sentence accuracy rate; speech synthesis system; speech synthesizer; speech-to-speech system; spoken interface; text output filtering; tourism continuous spoken queries; trigram model; voice oriented tourism information delivery; word accuracy rate; Accuracy; Databases; Dictionaries; Speech; Speech recognition; Synthesizers; Vectors; mfcc; knn; speech recognition; speech synthesis
Conference_Title :
Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013 International Conference
Conference_Location :
Gurgaon
DOI :
10.1109/ICSDA.2013.6709848