Title :
Long audio alignment for automatic subtitling using different phone-relatedness measures
Author :
Alvarez, Alfredo ; Arzelus, Haritz ; Ruiz, Pablo
Author_Institution :
Human Speech & Language Technol., Vicomtech-IK4, San Sebastián, Spain
Abstract :
In this work, long audio alignment systems for Spanish and English are presented in an automatic subtitling scenario. Pre-recorded content is automatically recognized at the phoneme level by language-dependent phone decoders. A dynamic-programming alignment algorithm finds matches between the automatically decoded phones and those in the phonetic transcription of the content's script. The accuracy of the alignment algorithm is evaluated with three non-binary scoring matrices, based respectively on phone confusion pairs from each phone decoder, on phonological similarity, and on human perception errors. Alignment results with the three continuous-score matrices are compared, at word and subtitle levels, to results with a baseline binary matrix. The non-binary matrices achieved clearly better results. Matrix samples are given on the project's website.
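The alignment step described in the abstract can be illustrated with a minimal sketch. It assumes a Needleman-Wunsch-style global alignment, which the abstract does not name. The function `align_phones`, the gap and mismatch penalties, the phone symbols and the similarity value are all hypothetical choices for illustration, not the authors' implementation or matrices. An empty similarity table reproduces a binary matrix (match or mismatch only), while a populated table gives related phones partial credit.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def align_phones(
    ref: List[str],
    hyp: List[str],
    sim: Dict[Tuple[str, str], float],
    gap: float = -1.0,
    mismatch: float = -1.0,
) -> Tuple[float, List[Pair]]:
    """Globally align reference phones with decoded phones.

    `sim` holds illustrative (assumed) relatedness scores for phone pairs;
    pairs missing from it fall back to `mismatch`, as in a binary matrix.
    """

    def score(a: str, b: str) -> float:
        if a == b:
            return 1.0
        return sim.get((a, b), sim.get((b, a), mismatch))

    n, m = len(ref), len(hyp)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap

    # Fill the table: substitution/match, deletion, or insertion.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1]),
                dp[i - 1][j] + gap,
                dp[i][j - 1] + gap,
            )

    # Trace back from the bottom-right cell to recover the phone pairing.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            pairs.append((ref[i - 1], None))  # reference phone not decoded
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # extra decoded phone
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


# Toy example: the decoder outputs "g" where the script has "k".
reference = ["k", "a", "s", "a"]
decoded = ["g", "a", "s", "a"]
binary_score, binary_pairs = align_phones(reference, decoded, sim={})
graded_score, graded_pairs = align_phones(reference, decoded, sim={("k", "g"): 0.5})
```

In this toy case both matrices pair the same phones, but the graded matrix penalizes the k/g substitution less. On long recordings with many decoding errors, such partial credit can change which path the dynamic program selects.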
Keywords :
audio coding; codecs; dynamic programming; matrix algebra; speech coding; English; Spanish; automatic subtitling; automatically decoded phones; baseline binary matrix; continuous-score matrices; dynamic-programming alignment algorithm; human perception errors; language-dependent phone decoders; long audio alignment systems; non-binary matrices; non-binary scoring matrices; phone confusion-pairs; phone-relatedness measures; phoneme level; phonetic transcription; prerecorded contents; Accuracy; Decoding; Hidden Markov models; Matrices; Signal processing algorithms; Speech; Speech processing; Long audio alignment; automatic subtitling; perceptual confusion matrices; phonological similarity matrices;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854812