• DocumentCode
    118064
  • Title
    Elimination of person names in spoken documents for privacy protection
  • Author
    Kawaguchi, Ryo ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan
  • fYear
    2014
  • fDate
    9-12 Dec. 2014
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Sensor networks capable of capturing multimedia data, including audio, are increasingly common. Unfortunately, such data cannot be released for public use because they contain sensitive private information such as person and location names. Person name extraction (PNE), a widely investigated research topic, is an effective technique for resolving this problem. However, traditional PNE and PNE for privacy protection differ in one important respect: traditional PNE often misses out-of-vocabulary (OOV) person names, which do not occur in the training corpus, whereas PNE for privacy protection must also cover OOV person names. To address this, the study proposes a two-stage method: the first stage is speech recognition using a language model modified to over-extract person names, including OOV person names, and the second stage filters the over-extracted person names using a support vector machine (SVM). Experiments show that the method is effective in detecting and eliminating person names, and listening tests also show that its performance in removing person names is promising.
  • Keywords
    data protection; multimedia computing; speech recognition; support vector machines; OOV person names; PNE; SVM; audio data; crucial privacy information; language model; multimedia data sensing; out-of-vocabulary person names; over-extracted person name filtering; person name elimination; person name extraction; privacy protection; sensor networks; spoken document; support vector machine; Acoustics; Mathematical model; Privacy; Speech; Speech recognition; Support vector machines; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
  • Conference_Location
    Siem Reap
  • Type
    conf
  • DOI
    10.1109/APSIPA.2014.7041603
  • Filename
    7041603