Title :
English Access to Structured Data
Author :
Richardson, Kyle D. ; Bobrow, Daniel G. ; Condoravdi, Cleo ; Waldinger, Richard ; Das, Amar
Author_Institution :
Palo Alto Res. Center, Palo Alto, CA, USA
Abstract :
We present work on using a domain model to guide text interpretation, in the context of a project that aims to interpret English questions as a sequence of queries to be answered from structured databases. We adapt a broad-coverage and ambiguity-enabled natural language processing (NLP) system to produce domain-specific logical forms, using knowledge of the domain to zero in on the appropriate interpretation. The vocabulary of the logical forms is drawn from a domain theory that constitutes a higher-level abstraction of the contents of a set of related databases. The meanings of the terms are encoded in an axiomatic domain theory. To retrieve information from the databases, the logical forms must be instantiated by values constructed from fields in the database. The axiomatic domain theory is interpreted by the first-order theorem prover SNARK, which identifies the groundings and then retrieves the values through procedural attachments semantically linked to the database. SNARK attempts to prove the logical form as a theorem by reasoning over the theory that is linked to the database, and returns the exemplars of the proof(s) to the user as answers to the query. The focus of this paper is more on the language task; however, we also discuss the interaction that must occur between linguistic analysis and reasoning for an end-to-end natural language interface to databases. We illustrate the process using examples drawn from an HIV treatment domain, where the underlying databases are records of temporally bound treatments of individual patients.
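As a rough illustration of the grounding step described in the abstract, the following Python sketch answers a conjunctive logical form by binding its variables against rows supplied by a "procedural attachment". It is a toy under stated assumptions and not the authors' system or SNARK: the table contents, the predicate name treated_with, the "?x" variable convention, and the functions unify and solve are all hypothetical, and a simple backtracking search stands in for first-order theorem proving.

```python
# Toy stand-in for a treatment database (invented rows, for illustration only).
TREATMENTS = [
    {"patient": "p1", "drug": "AZT", "start": 2001, "stop": 2003},
    {"patient": "p1", "drug": "3TC", "start": 2002, "stop": 2004},
    {"patient": "p2", "drug": "AZT", "start": 2005, "stop": 2006},
]


def treated_with():
    """Hypothetical procedural attachment: yields ground facts from the table."""
    for row in TREATMENTS:
        yield (row["patient"], row["drug"], row["start"], row["stop"])


# Maps predicate names used in logical forms to the code that reads the data.
ATTACHMENTS = {"treated_with": treated_with}


def is_var(term):
    """Variables are strings beginning with '?' (an assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def unify(args, fact, env):
    """Extend env so that args match fact, or return None if they cannot."""
    env = dict(env)
    for arg, value in zip(args, fact):
        if is_var(arg):
            if arg in env and env[arg] != value:
                return None
            env[arg] = value
        elif arg != value:
            return None
    return env


def solve(goals, env=None):
    """Yield every variable binding that satisfies all goals (backtracking)."""
    env = {} if env is None else env
    if not goals:
        yield env
        return
    (pred, args), rest = goals[0], goals[1:]
    for fact in ATTACHMENTS[pred]():
        new_env = unify(args, fact, env)
        if new_env is not None:
            yield from solve(rest, new_env)


# "Which patients were treated with both AZT and 3TC?" as a conjunctive form.
query = [
    ("treated_with", ("?p", "AZT", "?s1", "?e1")),
    ("treated_with", ("?p", "3TC", "?s2", "?e2")),
]
answers = sorted({binding["?p"] for binding in solve(query)})
print(answers)
```

In this sketch the bindings found for ?p play the role of the answers returned to the user; a real system would additionally reason over domain axioms (for example, temporal relations between treatments) rather than only matching rows.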
Keywords :
computational linguistics; natural language processing; query processing; question answering (information retrieval); theorem proving; vocabulary; English access; English question; NLP system; ambiguity-enabled natural language processing; axiomatic domain theory; broad-coverage natural language processing; deductive question answering; domain-specific logical form; end-to-end natural language interface; first-order theorem prover SNARK; higher-level abstraction; language task; linguistic analysis; query; reasoning; structured database; Bridges; Cognition; Databases; Drugs; Natural languages; Pragmatics; Semantics; Deductive question answering; HIV drug resistance database; Natural language interfaces to databases; Natural language processing; Theorem proving
Conference_Title :
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
Conference_Location :
Palo Alto, CA
Print_ISBN :
978-1-4577-1648-5
Electronic_ISBN :
978-0-7695-4492-2
DOI :
10.1109/ICSC.2011.67