DocumentCode
730732
Title
Memory-aware i-vector extraction by means of sub-space factorization
Author
Cumani, Sandro ; Laface, Pietro
Author_Institution
Politecnico di Torino, Turin, Italy
fYear
2015
fDate
19-24 April 2015
Firstpage
4669
Lastpage
4673
Abstract
Most state-of-the-art speaker recognition systems use i-vectors, a compact representation of spoken utterances. Since the “standard” i-vector extraction procedure requires large memory structures, we recently presented the Factorized Sub-space Estimation (FSE) approach, an efficient technique that dramatically reduces the memory needed for i-vector extraction and is also fast and accurate compared to other proposed approaches. FSE is based on approximating the matrix T, which represents the speaker variability sub-space, by the product of appropriately designed matrices. In this work, we introduce and evaluate a further approximation of the matrices that contribute most to the memory costs of the FSE approach, showing that it is possible to obtain comparable system accuracy using less than half of the FSE memory, which corresponds to a memory reduction of more than 60 times with respect to the standard method of i-vector extraction.
Keywords
approximation theory; feature extraction; matrix decomposition; speaker recognition; FSE; factorized subspace estimation; i-vectors; matrix T approximation; memory costs; speaker variability subspace; standard i-vector extraction procedure; state-of-the-art speaker recognition systems; Continuous wavelet transforms; I-vector extraction; I-vectors; Probabilistic Linear Discriminant Analysis; Speaker Recognition; matrix rotation
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178856
Filename
7178856