Regional Information Center for Science and Technology (RICeST) - Continuous vocal imitation with self-organized vowel spaces in Recurrent Neural Network

DocumentCode :

2388786

Title :

Continuous vocal imitation with self-organized vowel spaces in Recurrent Neural Network

Author :

Kanda, Hisashi ; Ogata, Tetsuya ; Takahashi, Toru ; Komatani, Kazunori ; Okuno, Hiroshi G.

Author_Institution :

Dept. of Intell. Sci. & Technol., Kyoto Univ., Kyoto, Japan

fYear :

2009

fDate :

12-17 May 2009

Firstpage :

4438

Lastpage :

4443

Abstract :

A continuous vocal imitation system was developed using a computational model that explains the process of phoneme acquisition by infants. Human infants perceive speech sounds not as discrete phoneme sequences but as continuous acoustic signals. One of the critical problems in phoneme acquisition is how to segment these continuous speech sounds. The key idea for solving this problem is that articulatory mechanisms such as the vocal tract help human beings perceive speech sound units corresponding to phonemes. To segment acoustic signals together with articulatory movements, we apply a segmentation method based on a Recurrent Neural Network with Parametric Bias (RNNPB) to our system. This method determines multiple segmentation boundaries in a temporal sequence using the prediction error of the RNNPB model, and the PB values obtained by the method can be encoded as a kind of phoneme. Our system was implemented using a physical vocal tract model called the Maeda model. Experimental results demonstrated that our system can self-organize the same phonemes in different continuous sounds and can imitate vocal sounds involving arbitrary numbers of vowels using the vowel space in the RNNPB. This suggests that our model reflects the process of phoneme acquisition.
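
The boundary-detection idea in the abstract (place a segmentation boundary where the model's prediction error becomes large, then keep one code per segment) can be illustrated with a minimal Python sketch. This is not the authors' RNNPB implementation: the predictor is assumed to be a simple running mean of the current segment, the per-segment mean stands in for the PB value, and the names `segment_by_prediction_error`, `segment_codes`, and `threshold` are hypothetical.

```python
import numpy as np


def segment_by_prediction_error(seq, threshold):
    """Return the start indices of segments in seq (shape T x D).

    Assumption: a running mean of the current segment acts as the
    predictor, standing in for the RNNPB model named in the abstract.
    A new segment starts when the squared prediction error for the
    next frame exceeds `threshold`.
    """
    boundaries = [0]
    start = 0
    for t in range(1, len(seq)):
        prediction = seq[start:t].mean(axis=0)
        error = float(np.sum((seq[t] - prediction) ** 2))
        if error > threshold:
            boundaries.append(t)
            start = t
    return boundaries


def segment_codes(seq, boundaries):
    """Return one code per segment (here simply the segment mean).

    Assumption: the mean vector plays the role of a PB-like code.
    """
    ends = boundaries[1:] + [len(seq)]
    return [seq[b:e].mean(axis=0) for b, e in zip(boundaries, ends)]


if __name__ == "__main__":
    # Toy "continuous sound": three steady states in a 2-D feature space.
    seq = np.vstack([
        np.zeros((5, 2)),
        np.ones((5, 2)),
        np.tile([0.0, 1.0], (5, 1)),
    ])
    bounds = segment_by_prediction_error(seq, threshold=0.25)
    print(bounds)
    print(segment_codes(seq, bounds))
```

In this toy setting, segments from different sequences that share the same steady state receive the same code, which loosely mirrors the abstract's claim that the same phonemes are self-organized across different continuous sounds.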

Keywords :

recurrent neural nets; speech processing; acoustic signal segmentation; continuous speech sounds; continuous vocal imitation system; human infants; parametric bias; perceive speech sounds; prediction error; recurrent neural network; segmentation boundaries; self-organized vowel spaces; temporal sequence; vocal tract; Computational modeling; Computer networks; Humans; Natural languages; Neuroscience; Pediatrics; Predictive models; Recurrent neural networks; Robotics and automation; Speech processing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Robotics and Automation, 2009. ICRA '09. IEEE International Conference on

Conference_Location :

Kobe

ISSN :

1050-4729

Print_ISBN :

978-1-4244-2788-8

Electronic_ISBN :

1050-4729

Type :

conf

DOI :

10.1109/ROBOT.2009.5152818

Filename :

5152818

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2388786