• DocumentCode
    336786
  • Title

    Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis

  • Author

    Stylianou, Yannis

  • Author_Institution
    SIPS, AT&T Bell Labs., Florham Park, NJ, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    377
  • Abstract
    In an effort to increase the naturalness of concatenative speech synthesis, large speech databases may be recorded. While it is desirable to have varied prosodic and spectral characteristics in the database, it is not desirable to have variable voice quality. We present an automatic method for assessing and, whenever necessary, correcting the voice quality of large speech databases for concatenative speech synthesis. The proposed method uses a Gaussian mixture model (GMM) to model the acoustic space of the database speaker and autoregressive filters for compensation. An objective method for measuring the effectiveness of the database correction, based on a likelihood function for the speaker's GMM, is presented as well. Both objective and subjective results show that the proposed method detects voice quality problems and corrects them successfully. Results show a 14.2% improvement of the log-likelihood function after compensation.
  • Keywords
    Gaussian processes; autoregressive processes; filtering theory; spectral analysis; speech intelligibility; speech synthesis; Gaussian mixture model; acoustic space; automatic method; autoregressive filters; compensation; concatenative speech synthesis; database correction; large speech databases; likelihood function; listening tests; log-likelihood function; objective method; objective results; prosodic characteristics; spectral characteristics; subjective results; voice quality assessment; voice quality correction; voice quality variabilities; Acoustic measurements; Filters; Labeling; Loudspeakers; Quality assessment; Signal processing; Smoothing methods; Spatial databases; Speech synthesis; Synthesizers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758141
  • Filename
    758141
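
The objective measure mentioned in the abstract (a likelihood function under the speaker's GMM, compared before and after compensation) can be illustrated with a minimal sketch. This is not the paper's implementation: the diagonal-covariance model, the toy parameter values, the hypothetical function names, and the definition of relative improvement are all assumptions made for illustration, and the autoregressive compensation filter itself is not modeled.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def average_log_likelihood(frames, weights, means, variances):
    """Mean per-frame log-likelihood of a set of feature vectors."""
    total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)


def relative_improvement(before, after):
    """Percent change in log-likelihood (assumed definition, not from the paper)."""
    return 100.0 * (after - before) / abs(before)


# Toy one-dimensional "speaker model" with two components (assumed values).
weights = [0.5, 0.5]
means = [[0.0], [4.0]]
variances = [[1.0], [1.0]]

# Hypothetical frames: "before" sits between the modes, "after" sits near them.
frames_before = [[2.0], [2.1]]
frames_after = [[0.1], [3.9]]

ll_before = average_log_likelihood(frames_before, weights, means, variances)
ll_after = average_log_likelihood(frames_after, weights, means, variances)
improvement = relative_improvement(ll_before, ll_after)
print(ll_before, ll_after, improvement)
```

In this sketch, frames that fit the speaker model better yield a higher average log-likelihood, so a positive relative change would indicate that a correction moved the recordings closer to the modeled acoustic space.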