• DocumentCode
    3434658
  • Title

    AHUMADA: a large speech corpus in Spanish for speaker identification and verification

  • Author

    Ortega-García, J. ; Gonzalez-Rodriguez, J. ; Marrero-Aguiar, V. ; Díaz-Gómez, J.J. ; García-Jiménez, R. ; Lucena-Molina, J. ; Sánchez-Molero, J.A.G.

  • Author_Institution
    DIAC, Univ. Politecnica de Madrid, Spain
  • Volume
    2
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    773
  • Abstract
    Speaker recognition is a major task when security applications through speech input are needed. Regarding speaker identity, several factors of variability must be considered: (a) factors concerning peculiar intra-speaker variability (manner of speaking, inter-session variability, dialectal variations, emotional condition, etc.) or forced intra-speaker variability (Lombard effect, cocktail-party effect), and (b) factors depending on external influences (kind of microphone, channel effects, noise, reverberation, etc.). To cope with all these variability sources, a specific speech database called AHUMADA has been designed and collected for speaker recognition tasks in Castilian Spanish. AHUMADA incorporates six different recording sessions, including both in situ and telephone speech recordings. A total of 104 male speakers uttered isolated digits, digit strings, phonologically balanced short utterances, phonologically and syllabically balanced read text, and more than one minute of spontaneous speech, so about 15 GB of speech material is available. Speaker verification results concerning the available variability sources are also presented.
  • Keywords
    natural languages; security; speaker recognition; AHUMADA; Castilian Spanish; Lombard effect; channel effects; cocktail-party effect; dialectal variations; digit strings; emotional condition; forced intra-speaker variability; in situ speech recording; inter-session variability; intra-speaker variability; isolated digits; large speech corpus; microphone; noise; phonologically balanced read text; phonologically balanced short utterances; recording sessions; reverberation; speaker identification; speaker verification; speaking manner; speech database; spontaneous speech; syllabically balanced read text; telephone speech recording; variability sources; Electrets; Frequency measurement; Loudspeakers; Microphones; Microwave integrated circuits; Painting; Phase measurement; Speech; Telephony; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type

    conf

  • DOI
    10.1109/ICASSP.1998.675379
  • Filename
    675379