• DocumentCode
    672847
  • Title

    Hindi speech corpora: A review

  • Author

    Nivedita ; Ahmed, P. ; Dev, Amita ; Agrawal, S.S.

  • Author_Institution
    Sch. of Eng. & Technol., Sharda Univ., Noida, India
  • fYear
    2013
  • fDate
    25-27 Nov. 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    A benchmark dataset provides insight into the phenomena that generate the data. Hence, it is an essential requirement for research that involves concept discovery from data. In this paper, we examine the current status of 26 (twenty-six) datasets for Hindi speech (Hindi speech corpora). The paper also studies their impact on the development of Hindi speech-based, computer-mediated applications. During this study, we discovered that researchers have paid little attention to issues relating to data collection from a realistic environment through mobile phones. Of the twenty-six Hindi speech corpora reviewed, only one was created for speaker recognition; in it, conversational speech samples are recorded through mobile phones under noisy as well as clear conditions.
  • Keywords
    natural language processing; speaker recognition; Hindi speech corpora; benchmark dataset; computer mediated application development; data collection; data discovery; mobile phone; speaker recognition; speech samples; Databases; Educational institutions; Microphones; Mobile communication; Mobile handsets; Speech; Speech recognition; recording environment; speech corpora;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013 International Conference
  • Conference_Location
    Gurgaon
  • Type

    conf

  • DOI
    10.1109/ICSDA.2013.6709872
  • Filename
    6709872