Title :
Hindi speech corpora: A review
Author :
Nivedita ; Ahmed, P. ; Dev, Amita ; Agrawal, S.S.
Author_Institution :
Sch. of Eng. & Technol., Sharda Univ., Noida, India
Abstract :
A benchmark dataset provides insight into the phenomena that generate the data, and is therefore essential for research that requires concept discovery from data. In this paper, we examine the current status of 26 datasets for Hindi speech (Hindi speech corpora) and study their impact on the development of Hindi speech-based, computer-mediated applications. During this study, we found that researchers have paid little attention to collecting data in realistic environments through mobile phones. Of the twenty-six Hindi speech corpora reviewed, only one was created for speaker recognition, with conversational speech samples recorded through a mobile phone in both noisy and clear conditions.
Keywords :
natural language processing; speaker recognition; Hindi speech corpora; benchmark dataset; computer mediated application development; data collection; data discovery; mobile phone; speech samples; Databases; Educational institutions; Microphones; Mobile communication; Mobile handsets; Speech; Speech recognition; recording environment; speech corpora
Conference_Title :
2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE)
Conference_Location :
Gurgaon
DOI :
10.1109/ICSDA.2013.6709872