Title :
High priority in highly ranked documents in spoken term detection
Author :
Konno, Keita ; Itoh, Yoshio ; Kojima, Keisuke ; Ishigame, Masaaki ; Tanaka, Kiyoshi ; Shi-wook Lee
Author_Institution :
Iwate Prefectural Univ., Iwate, Japan
fDate :
Oct. 29 2013-Nov. 1 2013
Abstract :
In spoken term detection, the retrieval of out-of-vocabulary (OOV) query terms is very important because query terms are likely to be OOV terms. To improve retrieval performance for OOV query terms, this paper proposes a re-scoring method that is applied after the candidate segments have been determined. Each candidate segment has a matching score and a segment number. Highly ranked candidates are usually reliable, and a user is assumed to choose query terms that are specific to the target documents and appear frequently in them. We therefore give high priority to candidate segments contained in highly ranked documents by adjusting their matching scores. We evaluated the proposed method on the open test collections for SpokenDoc-2 in NTCIR-10. The results show that the proposed method improved retrieval performance by more than 7.0 points on two test sets in the test collections, demonstrating its effectiveness.
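The re-scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual adjustment formula, so everything specific here is an assumption made for illustration: the `Candidate` fields, the convention that a higher score is better, the `top_n` cutoff used to decide which documents count as highly ranked, and the additive `bonus`. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str    # hypothetical: identifier of the document containing the segment
    segment: int   # segment number of the candidate
    score: float   # matching score; higher is ASSUMED to be better in this sketch


def rescore(candidates, top_n=3, bonus=0.1):
    """Boost candidates that lie in documents holding a top-ranked candidate.

    top_n and bonus are illustrative parameters, not values from the paper.
    """
    # Initial ranking by matching score, best first.
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    # Documents that contain at least one of the top_n initial candidates.
    top_docs = {c.doc_id for c in ranked[:top_n]}
    # Adjust the score of every candidate found in one of those documents.
    rescored = [
        Candidate(
            c.doc_id,
            c.segment,
            c.score + bonus if c.doc_id in top_docs else c.score,
        )
        for c in ranked
    ]
    # Re-rank with the adjusted scores.
    return sorted(rescored, key=lambda c: c.score, reverse=True)
```

With this sketch, a lower-scoring candidate that shares a document with a top-ranked candidate can overtake a candidate from an unrelated document, which is the effect the abstract describes. If the matching score were a distance (lower is better), the bonus would be subtracted and the sort order reversed.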
Keywords :
document handling; query processing; speech processing; vocabulary; NTCIR-10; OOV query terms retrieval; SpokenDoc-2; candidate segment determination; document priority; document ranking; matching score; open test collections; out-of-vocabulary; performance evaluation; re-scoring method; segment number; spoken term detection; target documents; Acoustics; Equations; Hidden Markov models; Mathematical model; Reliability; Speech; Speech recognition;
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
DOI :
10.1109/APSIPA.2013.6694116