• DocumentCode
    3419715
  • Title
    New attempts in sound diarization
  • Author
    Costin, Ciprian ; Costin, Mihaela
  • Author_Institution
    Dept. of Comput. Sci., Al. I. Cuza Univ., Iasi, Romania
  • fYear
    2009
  • fDate
    July 29 2009-Aug. 1 2009
  • Firstpage
    71
  • Lastpage
    76
  • Abstract
    The paper discusses a new hybrid method for sound diarization (the process of segmenting an audio file into chunks that each represent a single source and clustering the resulting segments into groups that represent the same source). Most recent results focus on identifying voices in telephone recordings. In the hybrid method proposed here, agglomerative clustering is applied first to construct the speaker models. Subsequently, once a substantial amount of data has been gathered, dedicated models are built using speaker factors. This approach performs well compared with the classical one, because the low-level Bayesian Information Criterion clustering scheme performs poorly on complex models, where speaker factors achieve very good precision. Speaker diarization improves speaker verification for multi-speaker audio (summed-channel telephone data, single-microphone interview data), is very important for speech recognition, and improves the readability of an automatic transcription by structuring the audio stream into speaker turns and, in some cases, by providing the identity of the speakers. Sound diarization also provides information of interest for multimedia document indexing, human-computer interaction, robotics, security systems, and similar applications.
  • Keywords
    pattern clustering; speaker recognition; agglomerative approach; multimedia document indexing; sound diarization; speaker diarization; speaker verification; speech recognition; telephonic recording; voice identification; Audio recording; Bayesian methods; Human robot interaction; Indexing; Loudspeakers; Microphones; Multimedia systems; Speech recognition; Streaming media; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing Applications, 2009. SOFA '09. 3rd International Workshop on
  • Conference_Location
    Arad
  • Print_ISBN
    978-1-4244-5054-1
  • Electronic_ISBN
    978-1-4244-5056-5
  • Type
    conf
  • DOI
    10.1109/SOFA.2009.5254874
  • Filename
    5254874
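
The first stage described in the abstract, agglomerative clustering driven by the Bayesian Information Criterion, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each cluster is modelled by a single full-covariance Gaussian over feature frames, and the names `delta_bic`, `agglomerative_bic`, and `penalty_weight` are hypothetical. The second stage (speaker-factor models) is not shown.

```python
import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC for modelling x and y as two Gaussians versus one.

    x, y: arrays of shape (frames, dim). A positive value favours keeping
    the two segments separate; a negative value favours merging them.
    """
    z = np.vstack([x, y])
    n_x, n_y, n_z = len(x), len(y), len(z)
    dim = z.shape[1]

    def logdet_cov(frames):
        # Small diagonal term keeps the covariance well conditioned (assumption).
        cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(dim)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Free parameters of one full-covariance Gaussian: mean plus covariance.
    n_params = dim + dim * (dim + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n_z)
    gain = 0.5 * (n_z * logdet_cov(z) - n_x * logdet_cov(x) - n_y * logdet_cov(y))
    return gain - penalty


def agglomerative_bic(segments, penalty_weight=1.0):
    """Greedily merge the closest pair of clusters while delta-BIC is negative.

    Returns a list of groups, each a sorted list of original segment indices.
    """
    clusters = [np.asarray(s, dtype=float) for s in segments]
    members = [[i] for i in range(len(clusters))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], penalty_weight)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:
            break  # no remaining pair is better explained by a single model
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        members[i] = members[i] + members[j]
        del clusters[j]
        del members[j]
    return [sorted(group) for group in members]
```

Calling `agglomerative_bic` on a list of per-segment feature matrices returns the segment indices grouped by hypothesised source. In practice `penalty_weight` would likely need tuning, and real feature frames (for example cepstral coefficients) would replace any synthetic input.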