Title :
Incremental syntactic parsing of natural language corpora with simple synchrony networks
Author :
Lane, Peter C R ; Henderson, James B.
Author_Institution :
Sch. of Psychol., Nottingham Univ., UK
Abstract :
The article explores the use of Simple Synchrony Networks (SSNs) for learning to parse English sentences drawn from a corpus of naturally occurring text. Parsing natural language sentences requires taking a sequence of words and outputting a hierarchical structure representing how those words fit together to form constituents. Feedforward and simple recurrent networks have had great difficulty with this task, in part because the number of relationships required to specify a structure is too large for the number of unit outputs they have available. SSNs have the representational power to output the necessary O(n^2) possible structural relationships because SSNs extend the O(n) incremental outputs of simple recurrent networks with the O(n) entity outputs provided by temporal synchrony variable binding. The article presents an incremental representation of constituent structures which allows SSNs to make effective use of both these dimensions. Experiments on learning to parse naturally occurring text show that this output format supports both effective representation and effective generalization in SSNs. To emphasize the importance of this generalization ability, the article also proposes a short-term memory mechanism for retaining a bounded number of constituents during parsing. This mechanism improves the O(n^2) speed of the basic SSN architecture to linear time, and experiments confirm that the generalization ability of SSNs is maintained.
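The complexity argument in the abstract can be illustrated with a small counting sketch. This is not the authors' implementation and does not model an SSN; it only counts (word position, entity) output slots under a simplifying assumption that each word may introduce one new entity. The function name `count_outputs` and the parameter `memory_bound` are hypothetical, chosen for this illustration.

```python
def count_outputs(n_words, memory_bound=None):
    """Count (word position, entity) output slots for a sentence of n_words.

    Assumption (for illustration only): each word may introduce one new
    entity, so up to t entities exist after reading t words.
    """
    total = 0
    for t in range(1, n_words + 1):
        active = t  # entities available at step t without any bound
        if memory_bound is not None:
            # A bounded short-term memory keeps at most memory_bound entities.
            active = min(active, memory_bound)
        total += active
    return total


if __name__ == "__main__":
    for n in (10, 20, 40):
        # Columns: sentence length, unbounded count, count with a bound of 5
        print(n, count_outputs(n), count_outputs(n, memory_bound=5))
```

Without a bound the count is n(n+1)/2, which grows quadratically with sentence length. With a fixed bound k it is at most k*n, which grows linearly, in line with the linear-time claim for the bounded short-term memory mechanism.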
Keywords :
computational complexity; feedforward neural nets; generalisation (artificial intelligence); grammars; learning (artificial intelligence); natural languages; recurrent neural nets; text analysis; English sentence parsing; SSNs; constituent structures; corpus; entity outputs; generalization ability; hierarchical structure; incremental outputs; incremental representation; incremental syntactic parsing; natural language corpora; natural language sentences; naturally occurring text; output format; representational power; short-term memory mechanism; simple recurrent networks; simple synchrony networks; structural relationships; temporal synchrony variable binding; unit outputs; Feedforward systems; Natural language processing; Natural languages;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on