Accurate client-server based speech recognition keeping personal data on the client

Author

Georges, Munir ; Kanthak, Stephan ; Klakow, Dietrich

Author_Institution

Automotive Speech R&D, Nuance Commun., Aachen, Germany

fYear

2014

fDate

4-9 May 2014

Firstpage

3271

Lastpage

3275

Abstract

This paper proposes a novel technique that recognizes speech on a server while processing all private knowledge on the client. Private knowledge could be address book entries, calendar entries, or medical patient data. The technique combines the advantages of a powerful server with almost unlimited memory and of locally available, user-dependent knowledge. On the server, a dynamic language model recognizes speech with the help of content-dependent acoustic fillers. The result is then recognized on a client, e.g., a smartphone, with the user-dependent knowledge included. We achieved a word error rate reduction of 17% on the Wall Street Journal Corpus.

Keywords

client-server systems; speech recognition; Wall Street Journal Corpus; book entry; calendar entry; client-server based speech recognition; content dependent acoustic filler; dynamic language model; medical patient data; smart phone; user dependent knowledge; word error rate reduction; Acoustics; Computational modeling; Grammar; Servers; Speech; Speech recognition; Transducers; Acoustic Filler; Client-Server Speech Recognition; Data Privacy; Dynamic Language Model

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854205

Filename

6854205