Title :
Named Entity Disambiguation Using HMMs
Author :
Alhelbawy, Ayman ; Gaizauskas, Robert
Author_Institution :
Sheffield Univ., Sheffield, UK
Abstract :
In this paper we present a novel approach to disambiguating textual mentions of named entities against the Wikipedia knowledge base. The conditional dependencies between named entities across Wikipedia are represented as a Markov network. In our approach, named entities are treated as hidden variables and textual mentions as observations. Because the number of states and observations is huge, naively applying the Viterbi algorithm to find the hidden state sequence that emits the query observation sequence is computationally infeasible. Based on an observation that is specific to the disambiguation problem, we propose a tailored approximation that reduces the size of the state space, making the Viterbi algorithm feasible. Results show improved disambiguation accuracy relative to the baseline approach and to some state-of-the-art approaches. Our approach also shows how, with suitable approximations, HMMs can be used in such large-scale state space problems.
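The abstract does not spell out the approximation, so the following is only a minimal sketch of the general idea, under the assumption that each mention is restricted to its own short list of candidate entities instead of the full Wikipedia state space. The function name, the dictionary layout, the entity labels, and all probabilities are hypothetical illustrations, not the authors' implementation or data.

```python
import math


def viterbi_restricted(mentions, candidates, log_start, log_trans, log_emit):
    """Viterbi decoding in which each mention only considers its own candidates.

    mentions:   list of observed mention strings
    candidates: dict mention -> non-empty list of candidate entity ids (assumed given)
    log_start:  dict entity -> log initial probability
    log_trans:  dict (prev_entity, entity) -> log transition probability
    log_emit:   dict (entity, mention) -> log emission probability
    """
    floor = -1e9  # assumed log-score for events missing from the tables

    # best[i][e] = (score of the best path ending in entity e at position i,
    #               previous entity on that path)
    best = [{}]
    first = mentions[0]
    for e in candidates[first]:
        score = log_start.get(e, floor) + log_emit.get((e, first), floor)
        best[0][e] = (score, None)

    for i in range(1, len(mentions)):
        layer = {}
        for e in candidates[mentions[i]]:
            emit = log_emit.get((e, mentions[i]), floor)
            # Only the previous mention's candidates are scanned, not all states.
            prev, score = max(
                ((p, s + log_trans.get((p, e), floor))
                 for p, (s, _) in best[i - 1].items()),
                key=lambda pair: pair[1],
            )
            layer[e] = (score + emit, prev)
        best.append(layer)

    # Backtrack from the best final entity.
    last = max(best[-1], key=lambda e: best[-1][e][0])
    path = [last]
    for i in range(len(mentions) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    path.reverse()
    return path


# Toy data (invented for illustration only).
mentions = ["Paris", "France"]
candidates = {
    "Paris": ["Paris_(France)", "Paris_(Texas)"],
    "France": ["France"],
}
log_start = {"Paris_(France)": math.log(0.4), "Paris_(Texas)": math.log(0.6)}
log_trans = {
    ("Paris_(France)", "France"): math.log(0.9),
    ("Paris_(Texas)", "France"): math.log(0.1),
}
log_emit = {
    ("Paris_(France)", "Paris"): math.log(0.5),
    ("Paris_(Texas)", "Paris"): math.log(0.5),
    ("France", "France"): 0.0,
}

result = viterbi_restricted(mentions, candidates, log_start, log_trans, log_emit)
print(result)
```

With per-mention candidate lists of size k, each step costs on the order of k squared transitions instead of the square of the full number of entities, which is the kind of saving that makes Viterbi decoding tractable in this setting.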
Keywords :
Web sites; approximation theory; hidden Markov models; maximum likelihood estimation; natural language processing; text analysis; HMM; Markov network; Viterbi algorithm; Wikipedia knowledge base; computationally infeasible; hidden Markov model; hidden state sequence; hidden variables; named entity disambiguation; query observation sequence; tailored approximation; textual mentions; Approximation methods; Context; Electronic publishing; Encyclopedias; Internet; Entity Linking
Conference_Title :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4799-2902-3
DOI :
10.1109/WI-IAT.2013.173