DocumentCode
672847
Title
Hindi speech corpora: A review
Author
Nivedita ; Ahmed, P. ; Dev, Amita ; Agrawal, S.S.
Author_Institution
Sch. of Eng. & Technol., Sharda Univ., Noida, India
fYear
2013
fDate
25-27 Nov. 2013
Firstpage
1
Lastpage
6
Abstract
A benchmark dataset provides insight into the phenomena that generate the data. Hence, it is an essential requirement for research that involves concept discovery from data. In this paper, we examine the current status of 26 (twenty-six) datasets for Hindi speech (Hindi speech corpora). The paper also studies their impact on the development of Hindi speech-based, computer-mediated applications. During this study, we found that researchers have paid little attention to issues relating to data collection from realistic environments through mobile phones. Of the twenty-six Hindi speech corpora reviewed, only one was created for speaker recognition, in which conversational speech samples are recorded through mobile phone in noisy as well as clear conditions.
Keywords
natural language processing; speaker recognition; Hindi speech corpora; benchmark dataset; computer mediated application development; data collection; data discovery; mobile phone; speech samples; Databases; Educational institutions; Microphones; Mobile communication; Mobile handsets; Speech; Speech recognition; recording environment; speech corpora;
fLanguage
English
Publisher
IEEE
Conference_Titel
Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013 International Conference
Conference_Location
Gurgaon
Type
conf
DOI
10.1109/ICSDA.2013.6709872
Filename
6709872
Link To Document