DocumentCode :
1534335
Title :
Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm
Author :
Thomas, Mark R P ; Gudnason, Jon ; Naylor, Patrick A.
Author_Institution :
Electr. & Electron. Eng. Dept., Imperial Coll., London, UK
Volume :
20
Issue :
1
fYear :
2012
Firstpage :
82
Lastpage :
91
Abstract :
Accurate estimation of glottal closing instants (GCIs) and opening instants (GOIs) is important for speech processing applications that benefit from glottal-synchronous processing including pitch tracking, prosodic speech modification, speech dereverberation, synthesis and study of pathological voice. We propose the Yet Another GCI/GOI Algorithm (YAGA) to detect GCIs from speech signals by employing multiscale analysis, the group delay function, and N-best dynamic programming. A novel GOI detector based upon the consistency of the candidates' closed quotients relative to the estimated GCIs is also presented. Particular attention is paid to the precise definition of the glottal closed phase, which we define as the analysis interval that produces minimum deviation from an all-pole model of the speech signal with closed-phase linear prediction (LP). A reference algorithm analyzing both electroglottograph (EGG) and speech signals is described for evaluation of the proposed speech-based algorithm. In addition to the development of a GCI/GOI detector, an important outcome of this work is in demonstrating that GOIs derived from the EGG signal are not necessarily well-suited to closed-phase LP analysis. Evaluation of YAGA against the APLAWD and SAM databases shows that GCI identification rates of up to 99.3% can be achieved with an accuracy of 0.3 ms and GOI detection can be achieved equally reliably with an accuracy of 0.5 ms.
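The closed quotient mentioned in the abstract is, in the usual definition, the fraction of each glottal cycle during which the glottis is closed, that is, the time from a GCI to the following GOI divided by the time to the next GCI. The sketch below only illustrates that quantity; it is not the YAGA GOI detector, and the function name `closed_quotients` and its argument layout are assumptions made for this example.

```python
def closed_quotients(gcis, gois):
    """Return one closed quotient per glottal cycle.

    gcis: sorted glottal closing instants in seconds (length N).
    gois: glottal opening instants in seconds (length N - 1), where
          gois[k] is assumed to fall between gcis[k] and gcis[k + 1].
    Both the names and this pairing convention are illustrative assumptions.
    """
    if len(gois) != len(gcis) - 1:
        raise ValueError("expected exactly one GOI per pair of consecutive GCIs")

    quotients = []
    for k, goi in enumerate(gois):
        cycle_start, cycle_end = gcis[k], gcis[k + 1]
        if not cycle_start < goi < cycle_end:
            raise ValueError(f"GOI {goi} lies outside cycle {k}")
        closed_duration = goi - cycle_start   # closed phase: GCI to GOI
        cycle_duration = cycle_end - cycle_start
        quotients.append(closed_duration / cycle_duration)
    return quotients


if __name__ == "__main__":
    # Two 10 ms cycles with hypothetical opening instants.
    print(closed_quotients([0.000, 0.010, 0.020], [0.004, 0.015]))
```

A detector built on closed-quotient consistency would presumably prefer GOI candidates whose quotients vary smoothly from cycle to cycle, but how YAGA scores that consistency is not stated in this record.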
Keywords :
dynamic programming; speech processing; APLAWD database; N-best dynamic programming; SAM database; YAGA algorithm; all-pole model; closed-phase linear prediction; electroglottograph; glottal closing instants; glottal opening instants; group delay function; pitch tracking; prosodic speech modification; speech dereverberation; speech processing; voiced speech; yet another GCI/GOI algorithm; Adaptation model; Algorithm design and analysis; Delay; Estimation; Heuristic algorithms; Speech; Speech processing; Dynamic programming; electroglottograph (EGG); glottal closing instants (GCIs); glottal opening instants (GOIs); group delay function; multiscale analysis; speech processing;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2011.2157684
Filename :
5784321