Regional Information Center for Science and Technology - Application of the DYPSA algorithm to segmented time scale modification of speech

DocumentCode :

699815

Title :

Application of the DYPSA algorithm to segmented time scale modification of speech

Author :

Thomas, Mark R. P. ; Gudnason, Jon ; Naylor, Patrick A.

Author_Institution :

Imperial College London, London, UK

Year :

2008

Date :

25-29 Aug. 2008

Firstpage :

Lastpage :

Abstract :

This paper presents a method for speech time scale modification. Voiced speech is pseudo-periodic, allowing time scale modification by the repetition or removal of cycles as necessary. However, in the case of unvoiced speech and at the boundaries of voiced speech, no such periodicity exists, so the speech should not be modified. To address this issue, the proposed approach is novel in its use of the DYPSA algorithm to derive speech periodicity from glottal closure instants (GCIs), followed by a Gaussian mixture model-based voiced/unvoiced/silence (VUS) classifier. A listening test based on ITU-T P.800 has shown that, by employing VUS detection, the average mean opinion score of the perceptual quality of processed speech exceeds that of a method without VUS detection by 0.61 over a range of modification factors. Results are presented as a function of modification factor for normal and fast original talking rates. Reliable time scale modification of high audio quality enables many applications, such as time scale compression for fast scanning of recorded voicemail messages, slowing of talking rate for improved intelligibility in forensics, and lip synchronization in motion video.
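
The cycle repetition and removal idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes the GCI sample indices and a per-cycle voiced flag are already available (in the paper these would come from DYPSA and the VUS classifier, neither of which is implemented here), and it omits any smoothing at the joins. The function name `modify_time_scale` and its parameters are hypothetical.

```python
def modify_time_scale(samples, gcis, voiced, factor):
    """Repeat or drop whole pitch cycles in voiced regions only.

    samples: sequence of audio samples.
    gcis:    sorted sample indices of glottal closure instants (assumed given).
    voiced:  one boolean per cycle [gcis[i], gcis[i+1]) (assumed given).
    factor:  target output/input duration ratio for voiced cycles.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(gcis) < 2:
        # No complete cycle to work with: leave the signal untouched.
        return list(samples)
    if len(voiced) != len(gcis) - 1:
        raise ValueError("need exactly one voiced flag per cycle")

    # Samples before the first GCI are copied unchanged.
    out = list(samples[:gcis[0]])
    acc = 0.0  # fractional number of copies still owed to the output
    for i, is_voiced in enumerate(voiced):
        cycle = list(samples[gcis[i]:gcis[i + 1]])
        if is_voiced:
            acc += factor
            copies = int(acc)  # 0 drops the cycle, 2 or more repeats it
            acc -= copies
        else:
            copies = 1  # unvoiced or boundary cycles are not modified
        out.extend(cycle * copies)
    # Samples from the last GCI onward are copied unchanged.
    out.extend(samples[gcis[-1]:])
    return out


if __name__ == "__main__":
    demo = list(range(12))
    stretched = modify_time_scale(demo, [2, 5, 8, 11], [True, True, False], 2.0)
    print(len(demo), len(stretched))
```

With a factor of 2.0 each voiced cycle is emitted twice, and with 0.5 every other voiced cycle is dropped, while cycles flagged as not voiced pass through once regardless of the factor.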

Keywords :

Gaussian processes; mixture models; speech processing; DYPSA algorithm; GCI; Gaussian mixture model-based VUS classifier; ITU-T P800; average mean opinion score; fast original talking rate; forensics; glottal closure instants; high audio quality; lip synchronization; listening test; modification factor; motion video; normal talking rate; perceptual quality; pseudoperiodic speech; speech periodicity; speech time scale modification; voiced speech; voiced-unvoiced-silence classifier; voicemail messages; Abstracts; Speech;

Language :

English

Publisher :

IEEE

Conference_Title :

2008 16th European Signal Processing Conference

Conference_Location :

Lausanne

ISSN :

2219-5491

Type :

conf

Filename :

7080347

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=699815