PRECISE
A PRECISE is a Natural Language Database Interface System that is based on a Max-Flow Algorithm.
- Context:
- It was initially developed by Popescu et al. (2003).
- It includes the following system modules:
- a Lexicon,
- a Tokenizer,
- a Matcher,
- a Parser Plug-in,
- a Query Generator,
- an Equivalence Checker.
- …
- Counter-Example(s):
- See: User Interface, Natural Language Processing, Natural Language Understanding, Natural Language Generation, Question Answering Task, SQL, Parse Tree.
References
2003
- (Popescu et al., 2003) ⇒ Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. (2003). “Towards a Theory of Natural Language Interfaces to Databases.” In: Proceedings of the 8th International Conference on Intelligent User Interfaces. ISBN:1-58113-586-6 doi:10.1145/604045.604120
- QUOTE: In order for the sentence to be interpreted in the context of the given database, at least one complete tokenization must map to some set of database elements E as follows:
1) each token maps to a unique database element in E. This means that even if it matches more than one database element, the token refers to only one matching element in the context of a given sentence tokenization.
2) each attribute token corresponds to a unique value token. This means that (a) the database attribute matching the attribute token and the database value matching the value token are compatible and (b) the attribute token and the value token are attached (...)
3) each relation token corresponds to either an attribute token or a value token.
(...)
Given a question q, PRECISE determines whether it is semantically tractable and if so, it outputs the corresponding SQL query (queries). The problem of finding a mapping from a complete tokenization of q to a set of database elements such that the semantic constraints imposed by conditions 1 through 3 are satisfied is reduced to a graph matching problem. PRECISE uses the max-flow algorithm to efficiently solve this problem. Each max-flow solution corresponds to a possible semantic interpretation of the sentence. PRECISE collects max-flow solutions, discards the solutions that do not obey syntactic constraints, and retains the rest as the basis for generating SQL queries corresponding to the question q.
Figure 4: PRECISE System Architecture
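The reduction to max-flow described in the quote can be illustrated with a small, simplified sketch. It is not the PRECISE implementation: it covers only the assignment of value tokens to database attributes, and it assumes that each token has unit capacity and that each attribute can be claimed by at most one token. The function names (`max_flow`, `match_tokens`), the node labels, and the example tokens and attributes are hypothetical, chosen for illustration only.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity graph.

    Returns (flow value, residual capacities).
    """
    # Copy the capacities and add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left, so the flow is maximal

        # Walk back from the sink to recover the path edges.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]

        # Push the bottleneck amount along the path.
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck
    return total, residual


def match_tokens(candidates):
    """Assign each token to one candidate attribute (assumed: one token per attribute).

    candidates maps a token to the attributes it could refer to.
    Returns a token -> attribute dict, or None if some token cannot be placed.
    """
    capacity = {"SRC": {}}
    for token, attributes in candidates.items():
        t = ("tok", token)
        capacity["SRC"][t] = 1          # each token is used exactly once
        capacity[t] = {}
        for attribute in attributes:
            a = ("attr", attribute)
            capacity[t][a] = 1
            capacity.setdefault(a, {})["SINK"] = 1  # each attribute is used at most once

    value, residual = max_flow(capacity, "SRC", "SINK")
    if value < len(candidates):
        return None  # no complete assignment exists under these assumptions

    # A saturated token -> attribute edge carries one unit of flow.
    mapping = {}
    for token, attributes in candidates.items():
        for attribute in attributes:
            if residual[("tok", token)][("attr", attribute)] == 0:
                mapping[token] = attribute
    return mapping


if __name__ == "__main__":
    # "HP" is ambiguous between two attributes; "Unix" fits only one.
    print(match_tokens({"HP": ["Company", "Platform"], "Unix": ["Platform"]}))
```

In this toy input, "Unix" can only be assigned to Platform, so the flow forces the ambiguous token "HP" onto Company. A fuller model would also need the attribute-token and relation-token constraints from conditions 2 and 3, and would enumerate several max-flow solutions rather than return a single one.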