DocumentCode :
3327324
Title :
Public speech-oriented guidance system with adult and child discrimination capability
Author :
Nisimura, Ryuichi ; Lee, Akinobu ; Saruwatari, Hiroshi ; Shikano, Kiyohiro
Author_Institution :
Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
The Takemaru-kun system is a real-world speech-oriented guidance system located at the Ikoma-City North Community Center. The system has operated daily since November 2002, providing visitors with a speech interface for information retrieval. It also serves as a field test of a speech interface and as a means of collecting actual utterance data. Analysis and evaluation of the collected utterances show that processing must adapt flexibly to the user's age group. When a system is installed in a public place, the growing number of child users cannot be disregarded. The paper proposes an automatic approach, based on statistical learning, for discriminating between adult and child speakers. This proposal enables flexible spoken dialogue with both adult and child users. As parameter vectors for machine learning, acoustic and linguistic properties extracted from speech recognition log-likelihood scores are used to discriminate a user's age group. Whereas GMM-based recognition uses only acoustic properties, this method can also take linguistic properties into account. In experiments with SVM-based screening, we obtained a 92.4% discrimination rate on actual users' utterances. The advantage of using linguistic properties is also shown. The paper also gives an overview of the Takemaru-kun system and the data collection status from the field test. Child speech recognition performance is evaluated using the collected utterances.
Keywords :
Gaussian processes; acoustic signal processing; information retrieval systems; interactive systems; learning (artificial intelligence); linguistics; natural language interfaces; speech recognition; speech-based user interfaces; statistical analysis; support vector machines; GMM; Ikoma-City North Community Center; SVM; Takemaru-kun system; acoustic properties; adult-child discrimination; child speech recognition; flexible spoken dialogue; information retrieval; linguistic properties; log likelihood scores; logarithm likelihood scores; machine learning; public guidance system; speech interface; speech-oriented guidance system; statistical learning; Animation; Data analysis; Information retrieval; Information science; Loudspeakers; Machine learning; Monitoring; Speech recognition; Statistical learning; System testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326015
Filename :
1326015