• DocumentCode
    2329486
  • Title
    Unbiased discourse segmentation evaluation
  • Author
    Niekrasz, John; Moore, Johanna D.
  • Author_Institution
    Sch. of Inf., Univ. of Edinburgh, Edinburgh, UK
  • fYear
    2010
  • fDate
    12-15 Dec. 2010
  • Firstpage
    43
  • Lastpage
    48
  • Abstract
    In this paper, we show that the performance measures Pk and WindowDiff, commonly used for discourse, topic, and story segmentation evaluation, are biased in favor of segmentations with fewer or adjacent segment boundaries. By analytical and empirical means, we show how this results in a failure to penalize substantially defective segmentations. Our novel unbiased measure k-κ corrects this, providing a single score that accounts for chance agreement. We also propose additional statistics that may be used to characterize important properties of segmentations such as boundary clumping. We go on to replicate a recent spoken-language topic segmentation experiment, drawing conclusions that are substantially different from previous studies concerning the effectiveness of state-of-the-art topic segmentation algorithms.
  • Keywords
    natural language processing; Pk; WindowDiff; boundary clumping; spoken-language topic segmentation; story segmentation evaluation; unbiased discourse segmentation evaluation; unbiased measure k-κ; agreement measures; discourse analysis; evaluation; spoken conversation; topic segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2010 IEEE
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-7904-7
  • Electronic_ISBN
    978-1-4244-7902-3
  • Type
    conf
  • DOI
    10.1109/SLT.2010.5700820
  • Filename
    5700820