  • DocumentCode
    2126258
  • Title
    Increasing NER Recall with Minimal Precision Loss
  • Author
    Kuperus, Jasper; Veenman, Cor J.; van Keulen, Maurice

  • Author_Institution
    Sogeti Nederland B.V., Vianen, Netherlands
  • fYear
    2013
  • fDate
    12-14 Aug. 2013
  • Firstpage
    106
  • Lastpage
    111
  • Abstract
    Named Entity Recognition (NER) is widely used as a first step toward interpreting text documents. However, for many applications, such as forensic investigation, recall is currently inadequate, leading to the loss of potentially important information. Entity class ambiguity cannot be resolved reliably because context information is either lacking or not exploited. Consequently, entity classification introduces too many errors, leading to severe omissions in answers to forensic queries. We propose a technique based on multiple candidate labels, effectively postponing entity classification decisions to query time. Entity resolution exploits user feedback: a user is asked for feedback only on entities relevant to their query. Moreover, feedback can be stopped at any time once the query results are considered good enough. We propose several interaction strategies that increase recall with little loss in precision.
  • Keywords
    pattern classification; query processing; text analysis; NER recall; entity classification; entity resolution; forensic investigation; forensic queries; minimal precision loss; multiple candidate labels; named entity recognition; query time; text documents; Context; Europe; Forensics; Probabilistic logic; Probability; Proposals; Semantics; Ambiguity; NER; Named Entity Recognition; PNER; Precision; Probabilistic Named Entity Recognition; Recall; Reference Ambiguity; Semantic Ambiguity; Structural Ambiguity; Targeted Feedback
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 European Intelligence and Security Informatics Conference (EISIC)
  • Conference_Location
    Uppsala
  • Type
    conf
  • DOI
    10.1109/EISIC.2013.23
  • Filename
    6657133
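
The abstract's core idea (keep several candidate labels per entity, decide at query time, and ask the user only about entities relevant to the query) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Mention` class, the function names, the probabilities, the 0.2 threshold, and the "most uncertain first" ordering are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    text: str
    # Candidate labels with illustrative probabilities (assumed, not from the paper).
    candidates: dict = field(default_factory=dict)


def query(mentions, label, threshold=0.2):
    """Return mentions whose probability for `label` is at least `threshold`."""
    return [m for m in mentions if m.candidates.get(label, 0.0) >= threshold]


def apply_feedback(mention, confirmed_label):
    """Collapse the candidate set to the label the user confirmed."""
    mention.candidates = {confirmed_label: 1.0}


def refine(mentions, label, oracle, budget, threshold=0.2):
    """Ask for feedback only on uncertain query hits, at most `budget` times.

    `oracle` stands in for the user: it maps a mention to its true label.
    A small `budget` models stopping feedback early.
    """
    hits = query(mentions, label, threshold)
    uncertain = [m for m in hits if m.candidates.get(label, 0.0) < 1.0]
    # Hypothetical strategy: most uncertain first (probability closest to 0.5).
    uncertain.sort(key=lambda m: abs(m.candidates.get(label, 0.0) - 0.5))
    for m in uncertain[:budget]:
        apply_feedback(m, oracle(m))
    return query(mentions, label, threshold)
```

With mentions such as `Mention("Paris", {"LOCATION": 0.6, "PERSON": 0.4})`, a single-label classifier would file "Paris" under LOCATION only, so a PERSON query would miss it. Keeping both candidates lets the PERSON query return it, and feedback on just the returned mentions then removes the false positives.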