DocumentCode :
2701065
Title :
Comparing Evaluation Metrics for Sentence Boundary Detection
Author :
Liu, Yang; Shriberg, Elizabeth
Author_Institution :
Dept. of Computer Science, University of Texas at Dallas, Richardson, TX, USA
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is partitioned into subunits. This paper compares alternative evaluation metrics, including the NIST error rate, classification error rate per word boundary, precision and recall, ROC curves, DET curves, precision-recall curves, and area under the curves, and discusses the advantages and disadvantages of each. Unlike many studies in machine learning, we use real data for a real task. We find benefit from using curves in addition to a single metric. Furthermore, we find that data skew affects the metrics, and that differences among system outputs are more visible in precision-recall curves. These results should help us better understand evaluation metrics and should generalize to similar language processing tasks.
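As a rough illustration of the contrast between the metrics compared in the abstract, the sketch below (not from the paper; the function name and the toy labels are hypothetical) computes a NIST-style error rate, the per-word-boundary classification error rate, and precision/recall from 0/1 labels assigned to each inter-word boundary.

```python
# Minimal sketch of boundary-detection metrics. reference/hypothesis are 0/1
# labels per inter-word boundary: 1 = sentence boundary, 0 = no boundary.

def boundary_metrics(reference, hypothesis):
    """Return NIST-style error rate, classification error rate, precision, recall."""
    assert len(reference) == len(hypothesis)
    tp = sum(r == 1 and h == 1 for r, h in zip(reference, hypothesis))
    fp = sum(r == 0 and h == 1 for r, h in zip(reference, hypothesis))
    fn = sum(r == 1 and h == 0 for r, h in zip(reference, hypothesis))

    num_ref_boundaries = sum(reference)    # reference sentence boundaries
    num_word_boundaries = len(reference)   # all inter-word positions

    # NIST-style error rate: errors normalized by reference boundaries (can exceed 100%).
    nist_error = (fp + fn) / num_ref_boundaries if num_ref_boundaries else float("nan")
    # Classification error rate: errors normalized by all word boundaries.
    cls_error = (fp + fn) / num_word_boundaries if num_word_boundaries else float("nan")

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return nist_error, cls_error, precision, recall


if __name__ == "__main__":
    # Toy, skewed example: sentence boundaries are rare relative to word
    # boundaries, so the two error rates diverge sharply.
    ref = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
    hyp = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    print(boundary_metrics(ref, hyp))  # -> (1.0, 0.2, 0.5, 0.5)
```

In the toy example, one false alarm and one miss against only two reference boundaries yield a 100% NIST-style error rate but only a 20% classification error rate over ten word boundaries, which is the kind of skew effect the paper examines.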
Keywords :
speech processing; speech recognition; DET curves; ROC curves; classification error rate; precision-recall curves; sentence boundary detection; word stream; Computer errors; Computer science; Error analysis; Event detection; Humans; Machine learning; NIST; Speech analysis; Speech recognition; System performance; ROC curve; precision; recall; sentence boundary detection;
fLanguage :
English
Publisher :
ieee
Conference_Title :
2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.367194
Filename :
4218068