Regional Information Center for Science and Technology (RICeST) - A factor automaton approach for the forced alignment of long speech recordings

DocumentCode :

3531263

Title :

A factor automaton approach for the forced alignment of long speech recordings

Author :

Moreno, Pedro J. ; Alberti, Christopher

Author_Institution :

Speech Res. Group, Google Inc., New York, NY

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4869

Lastpage :

4872

Abstract :

This paper addresses the problem of aligning long speech recordings to their transcripts. Previous work has focused on using highly tuned language models trained on the transcripts to reduce the search space. In this paper we propose the use of a factor automaton, a well-known method for representing all substrings of a string. This automaton encodes a highly constrained language model trained on the transcripts. We show results competitive with n-gram models in several testing scenarios. Preliminary experiments show perfect alignments at a reduced computational load and with a smaller memory footprint when compared to n-gram models.
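To make the central idea of the abstract concrete, the following is a minimal sketch of a factor automaton built over the words of a transcript. It is not the authors' implementation: it uses the standard online suffix-automaton construction and treats every state as accepting, so the automaton accepts exactly the contiguous word sequences (factors) of the transcript. The class name `FactorAutomaton`, the method `accepts`, and the sample transcript are illustrative assumptions, and the sketch is unweighted, with no connection to an acoustic model or decoder.

```python
class FactorAutomaton:
    """Accepts every contiguous word sequence (factor) of a token list.

    Illustrative sketch: a suffix automaton in which all states are
    treated as accepting. Names here are hypothetical, not from the paper.
    """

    def __init__(self, tokens):
        self.next = [{}]      # per-state transition maps: token -> state
        self.link = [-1]      # suffix links
        self.length = [0]     # length of the longest string reaching each state
        self.last = 0         # state reached by the whole prefix read so far
        for tok in tokens:
            self._extend(tok)

    def _new_state(self, length):
        self.next.append({})
        self.link.append(-1)
        self.length.append(length)
        return len(self.length) - 1

    def _extend(self, tok):
        cur = self._new_state(self.length[self.last] + 1)
        p = self.last
        # Add the new transition along the suffix-link chain.
        while p != -1 and tok not in self.next[p]:
            self.next[p][tok] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][tok]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split state q by cloning it with a shorter length.
                clone = self._new_state(self.length[p] + 1)
                self.next[clone] = dict(self.next[q])
                self.link[clone] = self.link[q]
                while p != -1 and self.next[p].get(tok) == q:
                    self.next[p][tok] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def accepts(self, tokens):
        """Return True if tokens occur contiguously in the source text."""
        state = 0
        for tok in tokens:
            state = self.next[state].get(tok)
            if state is None:
                return False
        return True


transcript = "the cat sat on the mat".split()
fa = FactorAutomaton(transcript)
print(fa.accepts("sat on the".split()))  # True: contiguous in the transcript
print(fa.accepts("cat on".split()))      # False: the words are not adjacent
```

The number of states in this construction grows linearly with the transcript length, which is consistent with the smaller memory footprint the abstract reports relative to n-gram models, although the sketch itself measures nothing.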

Keywords :

automata theory; learning (artificial intelligence); speech coding; constrained language model; encoding; factor automaton approach; long speech forced recording alignment; transcript; Automata; Data mining; Dictionaries; Indexing; Natural languages; Search engines; Sequences; Speech recognition; Video sharing; Vocabulary; finite state transducers; speech alignment

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960722

Filename :

4960722

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3531263