• DocumentCode
    1368411
  • Title

    A unified neural-network-based speaker localization technique

  • Author

    Arslan, Güner ; Sakarya, F. Ayhan

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Texas Univ., Austin, TX, USA
  • Volume
    11
  • Issue
    4
  • fYear
    2000
  • fDate
    7/1/2000 12:00:00 AM
  • Firstpage
    997
  • Lastpage
    1002
  • Abstract
    Locating and tracking a speaker in real time using microphone arrays is important in many applications such as hands-free video conferencing, speech processing in large rooms, and acoustic echo cancellation. A speaker can be moving from the far field to the near field of the array, or vice versa. Many neural-network-based localization techniques exist, but they are applicable to either far-field or near-field sources, and are computationally intensive for real-time speaker localization applications because of the wide-band nature of speech. We propose a unified neural-network-based source localization technique, which is simultaneously applicable to wide-band and narrow-band signal sources that are in the far field or near field of a microphone array. The technique exploits a multilayer perceptron feedforward neural network structure and forms the feature vectors by computing the normalized instantaneous cross-power spectrum samples between adjacent pairs of sensors. Simulation results indicate that our technique is able to locate a source with an absolute error of less than 3.5° at a signal-to-noise ratio of 20 dB and a sampling rate of 8000 Hz at each sensor.
  • Keywords
    acoustic radiators; acoustic signal processing; backpropagation; direction-of-arrival estimation; feedforward neural nets; microphones; multilayer perceptrons; acoustic echo cancellation; feature vectors; hands-free video conferencing; large rooms; microphone arrays; multilayer perceptron feedforward neural network structure; narrow-band signal sources; normalized instantaneous cross-power spectrum samples; real-time speaker localization; speech processing; unified neural-network-based speaker localization technique; wide-band signal sources; Acoustic applications; Acoustic arrays; Echo cancellers; Loudspeakers; Microphone arrays; Multilayer perceptrons; Narrowband; Speech processing; Videoconference; Wideband;
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type

    jour

  • DOI
    10.1109/72.857779
  • Filename
    857779
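
The feature extraction step named in the abstract (normalized instantaneous cross-power spectrum samples between adjacent sensor pairs) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the normalization, the frame length, or which frequency bins are kept, so the function name `cross_power_features`, the unit-magnitude normalization, and the `bins` parameter are all hypothetical choices made here. The resulting real-valued vector is the kind of input a multilayer perceptron could be trained on.

```python
import numpy as np

def cross_power_features(frames, bins):
    """Build a real feature vector from one time frame per sensor.

    frames: real array of shape (num_sensors, num_samples).
    bins:   indices of the DFT bins to keep (hypothetical parameter).
    """
    # One-sided spectrum of each sensor's frame.
    spectra = np.fft.rfft(frames, axis=1)
    # Instantaneous cross-power spectrum between adjacent sensor pairs.
    cross = spectra[:-1, bins] * np.conj(spectra[1:, bins])
    # ASSUMED normalization: scale each sample to unit magnitude so that
    # only the inter-sensor phase remains; the paper may normalize differently.
    cross = cross / np.maximum(np.abs(cross), 1e-12)
    # Stack real parts, then imaginary parts, into one real vector.
    return np.concatenate([cross.real.ravel(), cross.imag.ravel()])

# Usage: four sensors, each seeing the same signal circularly delayed
# by 2 samples relative to its neighbor (a synthetic test case).
rng = np.random.default_rng(0)
source = rng.standard_normal(256)
frames = np.stack([np.roll(source, 2 * m) for m in range(4)])
bins = np.arange(1, 33)
features = cross_power_features(frames, bins)
```

With this normalization the features depend only on the phase difference between neighboring sensors, which is what carries the delay information in the synthetic case above; source level and spectral shape drop out.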