• DocumentCode
    2874834
  • Title

    Spoken document retrieval: acoustic variability over the past 100 years

  • Author

    Hansen, John H L

  • Author_Institution
    Erik Jonsson Sch. of Eng. & Comput. Sci., Texas Univ., Dallas, TX
  • fYear
    2005
  • fDate
    27-27 Nov. 2005
  • Firstpage
    6
  • Lastpage
    7
  • Abstract
    Summary form only given. Reliable speech recognition for information retrieval is challenging when data is recorded across different media, with known or unknown equipment, and in different speaking environments. In this talk, we consider problems in audio stream phrase recognition for spoken document retrieval (SDR) from audio materials spanning the past 110 years. When considering audio transcription for SDR, what should be transcribed? Audio content for broadcast news includes commercials, competing speakers, radio call-in shows, and background music, over a wide range of recording conditions. This talk considers the evolution of SDR needed over the past 100 years, with emphasis on acoustic variability due to speaker, noise, and equipment; text-processing-based concepts are considered in the following presentation by Jerome Bellegarda, Apple Corp. Early recordings from the late 1890s and early 1900s were carefully structured and scripted, but employed Edison wax cylinder disk recording formats, resulting in reasonable speech structure but poor acoustic recordings. As the cost and ease of recording speeches, debates, and broadcast transmissions evolved, less structured audio content became more common, with a wider range of equipment. The explosion of audio materials, audio Web portals, and audio file-sharing frameworks makes cataloging and organizing audio content for SDR increasingly important and challenging. Varying audio formats for file sharing, as well as the need to ensure ownership through digital watermarking, introduce a number of issues that can also impact speech recognition performance for SDR. We consider a number of areas and approaches taken for effective SDR, and discuss directions for future information detection schemes for richer information retrieval in the next generation of SDR. Finally, as audio material continues to expand at a rapid pace, automatic transcription support for digital archives and libraries will be needed.
  • Keywords
    audio signal processing; information retrieval; peer-to-peer computing; speech recognition; watermarking; audio Web portals; audio file-sharing frameworks; audio materials; audio stream phrase recognition; automatic transcription support; digital watermarking; information retrieval; speech recognition; spoken document retrieval; Acoustic noise; Audio recording; Disk recording; Information retrieval; Loudspeakers; Music information retrieval; Radio broadcasting; Speech recognition; Streaming media; Text processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on
  • Conference_Location
    San Juan
  • Print_ISBN
    0-7803-9478-X
  • Electronic_ISBN
    0-7803-9479-8
  • Type

    conf

  • DOI
    10.1109/ASRU.2005.1566463
  • Filename
    1566463