Elimination of person names in spoken documents for privacy protection

Author

Kawaguchi, Ryo ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi

Author_Institution

Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan

fYear

2014

fDate

9-12 Dec. 2014

Firstpage

1

Lastpage

4

Abstract

Sensor networks capable of capturing multimedia data, including audio, are increasingly used. Unfortunately, such data cannot be released publicly because it contains sensitive private information such as person and location names. Person name extraction (PNE), a widely investigated research topic, is an effective technique for addressing this problem. However, traditional PNE and PNE for privacy protection differ in an important way: traditional PNE often misses out-of-vocabulary (OOV) person names that do not occur in the training corpus, whereas PNE for privacy protection must also cover OOV person names. To address this, this study proposes a two-stage method: the first stage is speech recognition using a language model modified to over-extract person names, including OOV person names, and the second stage filters the over-extracted person names using a support vector machine (SVM). Experiments show that the method is effective in detecting and eliminating person names, and listening tests also show that its performance in removing person names is promising.

Keywords

data protection; multimedia computing; speech recognition; support vector machines; OOV person names; PNE; SVM; audio data; crucial privacy information; language model; multimedia data sensing; out-of-vocabulary person names; over-extracted person name filtering; person name elimination; person name extraction; privacy protection; sensor networks; spoken document; support vector machine; Acoustics; Mathematical model; Privacy; Speech; Speech recognition; Support vector machines; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)

Conference_Location

Siem Reap

Type

conf

DOI

10.1109/APSIPA.2014.7041603

Filename

7041603