• DocumentCode
    639264
  • Title

    Arabic text to speech synthesis based on neural networks for MFCC estimation

  • Author

    Rebai, Issam ; Benayed, Yassine

  • Author_Institution
    MIRACL: Multimedia Inf. Syst. & Adv. Comput. Lab., Sfax Univ., Sfax, Tunisia
  • fYear
    2013
  • fDate
    22-24 June 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    As the number of users of text-to-speech applications grows, high-quality speech synthesis is increasingly required. However, little research has addressed Arabic text-to-speech, and the quality of synthesized Arabic speech remains poor compared with that of languages such as English and French. For these reasons, this paper proposes an Arabic text-to-speech synthesis system based on statistical parametric synthesis. Mel Frequency Cepstral Coefficients (MFCC), energy, and pitch are predicted using backpropagation artificial neural networks and then transformed into speech using a Mel Log Spectrum Approximation (MLSA) filter. In written Arabic, the short vowels, which are marked by diacritics, are often omitted, so a diacritization system is proposed to resolve this problem. Three unit sizes are considered in the speech database: phoneme, diphone, and triphone. The paper presents the MFCC neural network architecture and an objective evaluation using the MFCC distortion measure.
  • Keywords
    backpropagation; cepstral analysis; filtering theory; natural language processing; neural nets; speech synthesis; Arabic text to speech synthesis system; MFCC distortion measure; MFCC estimation; Mel frequency cepstral coefficients; Mel log spectrum approximation filter; backpropagation artificial neural networks; diacritic marks; diacritization system; diphone; phoneme; short vowels; speech database; statistical parametric synthesis; triphone; Artificial neural networks; Biological neural networks; Databases; Feature extraction; Speech; text diacritization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology (WCCIT), 2013 World Congress on
  • Conference_Location
    Sousse
  • Print_ISBN
    978-1-4799-0460-0
  • Type

    conf

  • DOI
    10.1109/WCCIT.2013.6618665
  • Filename
    6618665
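
The abstract names an MFCC distortion measure for objective evaluation but does not give its formula. As a rough illustration only, the sketch below computes mel-cepstral distortion (MCD), a definition commonly used in statistical parametric synthesis; whether the paper uses this exact form is an assumption. The function name `mel_cepstral_distortion`, the `skip_c0` flag, and the requirement that both sequences are already time-aligned are all hypothetical choices made for this example.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames, skip_c0=True):
    """Average mel-cepstral distortion (dB) between two aligned MFCC sequences.

    ref_frames, syn_frames: lists of frames, each frame a list of floats.
    skip_c0: if True, ignore the first coefficient (often an energy term).
    This is a generic MCD definition, not necessarily the paper's measure.
    """
    if len(ref_frames) != len(syn_frames):
        raise ValueError("sequences must be time-aligned and equally long")
    if not ref_frames:
        raise ValueError("need at least one frame")

    start = 1 if skip_c0 else 0
    scale = 10.0 / math.log(10.0)  # converts natural-log units to dB
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        if len(ref) != len(syn):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over the selected coefficients.
        sq_dist = sum((r - s) ** 2 for r, s in zip(ref[start:], syn[start:]))
        total += scale * math.sqrt(2.0 * sq_dist)
    return total / len(ref_frames)


if __name__ == "__main__":
    reference = [[0.0, 1.0, 0.5], [0.0, 0.8, 0.4]]
    predicted = [[0.0, 0.9, 0.5], [0.0, 0.8, 0.6]]
    print(mel_cepstral_distortion(reference, predicted))
```

Lower values indicate that the predicted coefficients are closer to the reference; identical sequences give 0. In practice the two sequences usually need alignment (for example by dynamic time warping) before this comparison, which the sketch leaves out.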