Title :
Grammar-Based Automatic Extraction of Definitions
Author :
Iftene, Adrian ; Pistol, Ionut ; Trandabat, Diana
Author_Institution :
Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi, Romania
Abstract :
This paper describes the development and use of a grammar for extracting definitions from documents. One of its most important practical applications is the automatic extraction of definitions from web documents. Three evaluation scenarios were run, and the results of these experiments are the main focus of the paper. The first scenario uses an e-learning context and previously annotated e-learning documents. The second involves a large collection of unannotated documents (from Wikipedia) and tries to find answers to definition-type questions. The third performs a similar question-answering task, but on the entire web, using Google web search and the Google Translation Service. The results are convincing; further development of the definition extraction system and its integration into various related applications are already under way.
Keywords :
Internet; document handling; grammars; information retrieval; natural language processing; Google Translation Service; Google Web search; Web documents; definition extraction; definition type questions; e-learning context; e-learning documents; grammar-based automatic extraction; question-answering task; unannotated documents; Buildings; Computational linguistics; Computer science; Data mining; Dictionaries; Electronic learning; Scientific computing; Semantic Web; Web search; Wikipedia; E-learning; Google Search API; Grammar; Question Answering
Conference_Title :
10th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC '08), 2008
Conference_Location :
Timisoara
Print_ISBN :
978-0-7695-3523-4
DOI :
10.1109/SYNASC.2008.12