• DocumentCode
    3300635
  • Title

    The effects of high quality translations of named entities in cross-language information exploration

  • Author

    Wu, Dan ; He, Daqing ; Ji, Heng ; Grishman, Ralph

  • Author_Institution
    Sch. of Inf. Manage., Wuhan Univ., Wuhan
  • fYear
    2008
  • fDate
    19-22 Oct. 2008
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Named entities (NEs) are expressions in human languages that explicitly link notations in a language to entities in the real world. They play an important role in cross-language information retrieval (CLIR) because most users' requests have been found to contain NEs, and the majority of out-of-vocabulary terms are NEs. Therefore, missing their translations has a significant impact on retrieval effectiveness. In this paper, we examined the effect of high quality translations of NEs in event-driven information exploration, where NEs are even more common. With a focus on the effect of NE translations obtained by using information extraction (IE) techniques, we conducted several experiments using TDT test collections. Our results demonstrate that NEs and their translations play critical roles in improving CLIR effectiveness, and that using high quality translations of NEs obtained by IE techniques has a positive impact on CLIR.
  • Keywords
    information retrieval; language translation; natural language processing; TDT test collections; cross-language information exploration; cross-language information retrieval; human languages; information extraction techniques; named entities translations; Data mining; Dictionaries; Helium; Humans; Information analysis; Information management; Information retrieval; Radiofrequency interference; Testing; USA Councils; Named entity; cross-language information exploration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4515-8
  • Electronic_ISBN
    978-1-4244-2780-2
  • Type

    conf

  • DOI
    10.1109/NLPKE.2008.4906770
  • Filename
    4906770