• DocumentCode
    3370925
  • Title

    Pitch Marking Using the Fundamental Signal for Speech Modifications via TDPSOLA

  • Author

    Ykhlef, Faycal ; Bendaouia, Lotfi

  • Author_Institution
    Div. Archit. des Syst. et Multimedia, Centre de Dev. des Technol. Av., Algiers, Algeria
  • fYear
    2013
  • fDate
    9-11 Dec. 2013
  • Firstpage
    118
  • Lastpage
    124
  • Abstract
    The quality of synthetic speech produced by pitch and duration modifications via the Time Domain Pitch Synchronous Overlap Add (TD-PSOLA) method relies on accurate positioning of pitch marks. In this paper, we propose a new pitch marking technique for voiced regions based on the fundamental signal of the speech waveform. By using the valleys of the fundamental signal, we locate a set of precise intervals where the exact instants of pitch marks are expected to be found. The fundamental signal is composed only of the fundamental frequency (pitch) of speech. It is represented by a specific signal named the "mean based signal" (MBS). The optimal pitch marks are found by extracting the set of global peak instants within the obtained intervals. To improve the performance of the proposed technique, we add a post-processing stage that corrects erroneous pitch marks that may occur due to synchronization problems. The proposed technique is evaluated on the CMU ARCTIC database using objective and subjective measures. The experiments demonstrate that the proposed technique allows high-quality pitch and duration modifications via TD-PSOLA.
  • Keywords
    speech synthesis; CMU ARCTIC database; MBS; TD-PSOLA; duration modifications; fundamental signal; fundamental speech frequency; mean based signal; pitch marking; pitch marks; pitch modifications; speech modifications; speech waveform; synthetic speech; time domain pitch synchronous overlap add method; voiced regions; Context; Databases; Dynamic programming; Estimation; Larynx; Speech; Synchronization; Duration modification; pitch marking; pitch modification; speech quality
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia (ISM), 2013 IEEE International Symposium on
  • Conference_Location
    Anaheim, CA
  • Print_ISBN
    978-0-7695-5140-1
  • Type

    conf

  • DOI
    10.1109/ISM.2013.28
  • Filename
    6746779
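
The valley-then-peak idea summarized in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the record does not define the mean based signal, so the code assumes a centred moving average about half a pitch period wide as a stand-in for the fundamental signal, assumes the search intervals run from one valley to the next, and omits the post-processing stage. The names `moving_average`, `local_minima`, `pitch_marks`, and the `mean_f0` parameter are hypothetical.

```python
import numpy as np


def moving_average(x, width):
    """Centred moving average of odd length `width` (zero-padded at the edges)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")


def local_minima(x):
    """Indices of interior samples lower than the left and not higher than the right neighbour."""
    i = np.arange(1, len(x) - 1)
    return i[(x[i] < x[i - 1]) & (x[i] <= x[i + 1])]


def pitch_marks(speech, fs, mean_f0):
    """Place one mark per valley-to-valley interval of the smoothed signal.

    Assumption: smoothing over roughly half a pitch period leaves a signal
    dominated by the fundamental; this stands in for the paper's MBS.
    """
    speech = np.asarray(speech, dtype=float)
    width = int(round(fs / mean_f0 / 2)) | 1  # force an odd window length
    fundamental = moving_average(speech, width)
    valleys = local_minima(fundamental)
    # Global peak of the waveform inside each interval between consecutive valleys.
    marks = [a + int(np.argmax(speech[a:b])) for a, b in zip(valleys[:-1], valleys[1:])]
    return np.array(marks, dtype=int)
```

Called as `pitch_marks(x, fs, mean_f0)` on a voiced segment `x`, the function returns sample indices of candidate pitch marks. A rough estimate of the mean fundamental frequency has to be supplied by the caller, which is a simplification of this sketch rather than something stated in the abstract.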