DocumentCode :
2455079
Title :
The effect of pitch, intensity and pause duration in punctuation detection
Author :
Levy, Tal ; Silber-Varod, Vered ; Moyal, Ami
Author_Institution :
ACLP - Afeka Center for Language Processing, Afeka Tel Aviv Academic College of Engineering, Tel Aviv, Israel
fYear :
2012
fDate :
14-17 Nov. 2012
Firstpage :
1
Lastpage :
4
Abstract :
The purpose of this research is to automatically detect punctuation in speech using only prosodic cues. We aim to integrate prosodic elements such as pauses, changes in f0 and amplitude range, into an Automatic Speech Recognition engine in order to generate punctuation for read speech, without taking the context of the sentences into consideration. We trained acoustic models of the prosodic features of two Punctuation Marks (PMs): full-stop and comma, which we assume have distinct prosodic characteristics. A Neural Network was used to estimate the weights assigned to each prosodic feature that corresponds to a particular PM, later to be used by a PM classifier. Results show that 87% of full-stops were detected, with only 14% false alarms. Nevertheless, since most commas are realized with no pitch breaks, only 54% of the commas were detected, with 35% false alarms. Our results support the hypothesis that acoustic-prosodic cues provide useful evidence about phrases.
Keywords :
neural nets; speech recognition; PM classifier; acoustic models; acoustic-prosodic cues; automatic speech recognition engine; intensity duration; neural network; pause duration; pitch effect; prosodic elements; prosodic feature; punctuation detection; punctuation marks; Acoustics; Artificial neural networks; Feature extraction; Speech; Speech recognition; Standards; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2012 IEEE 27th Convention of Electrical & Electronics Engineers in Israel (IEEEI)
Conference_Location :
Eilat
Print_ISBN :
978-1-4673-4682-5
Type :
conf
DOI :
10.1109/EEEI.2012.6376934
Filename :
6376934