DocumentCode :
2701065
Title :
Comparing Evaluation Metrics for Sentence Boundary Detection
Author :
Liu, Yang; Shriberg, Elizabeth
Author_Institution :
Dept. of Computer Science, University of Texas at Dallas, Richardson, TX, USA
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is partitioned into subunits. This paper compares alternative evaluation metrics, including the NIST error rate, classification error rate per word boundary, precision and recall, ROC curves, DET curves, precision-recall curves, and area under the curves, and discusses the advantages and disadvantages of each. Unlike many studies in machine learning, we use real data for a real task. We find benefit from using curves in addition to a single metric. Furthermore, we find that data skew affects the metrics, and that differences among system outputs are more visible in precision-recall curves. These results should help us better understand evaluation metrics and should generalize to similar language processing tasks.
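As a rough illustration of the contrast between the metrics compared in the abstract, the sketch below (not from the paper; the function name and the toy labels are hypothetical) computes a NIST-style error rate, the per-word-boundary classification error rate, and precision/recall from 0/1 labels assigned to each inter-word boundary.

```python
# Minimal sketch of boundary-detection metrics. reference/hypothesis are 0/1
# labels per inter-word boundary: 1 = sentence boundary, 0 = no boundary.

def boundary_metrics(reference, hypothesis):
    """Return NIST-style error rate, classification error rate, precision, recall."""
    assert len(reference) == len(hypothesis)
    tp = sum(r == 1 and h == 1 for r, h in zip(reference, hypothesis))
    fp = sum(r == 0 and h == 1 for r, h in zip(reference, hypothesis))
    fn = sum(r == 1 and h == 0 for r, h in zip(reference, hypothesis))

    num_ref_boundaries = sum(reference)    # reference sentence boundaries
    num_word_boundaries = len(reference)   # all inter-word positions

    # NIST-style error rate: errors normalized by reference boundaries (can exceed 100%).
    nist_error = (fp + fn) / num_ref_boundaries if num_ref_boundaries else float("nan")
    # Classification error rate: errors normalized by all word boundaries.
    cls_error = (fp + fn) / num_word_boundaries if num_word_boundaries else float("nan")

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return nist_error, cls_error, precision, recall


if __name__ == "__main__":
    # Toy, skewed example: sentence boundaries are rare relative to word
    # boundaries, so the two error rates diverge sharply.
    ref = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
    hyp = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    print(boundary_metrics(ref, hyp))  # -> (1.0, 0.2, 0.5, 0.5)
```

In the toy example, one false alarm and one miss against only two reference boundaries yield a 100% NIST-style error rate but only a 20% classification error rate over ten word boundaries, which is the kind of skew effect the paper examines.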
Keywords :
speech processing; speech recognition; DET curves; ROC curves; classification error rate; precision-recall curves; sentence boundary detection; word stream; Computer errors; Computer science; Error analysis; Event detection; Humans; Machine learning; NIST; Speech analysis; Speech recognition; System performance; ROC curve; precision; recall; sentence boundary detection;
fLanguage :
English
Publisher :
ieee
Conference_Title :
2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.367194
Filename :
4218068