Recursive Structures & Learning Machines: Can compositional connectionism scale?
Abstract
The modelling and manipulation of recursive data structures in Artificial Intelligence has traditionally been provided via symbolic techniques. Indeed, symbolism, with its rich tool set of explicit pointers and search algorithms, is well suited to encoding tree data structures. However, symbolic encodings of such data lack elegant solutions for measures such as similarity, i.e., is one encoded structure like another? In contrast, although connectionism has a much shorter history in providing techniques to encode recursive structures (necessary for cognitive tasks such as natural language processing (NLP)), employing artificial neural networks brings a host of benefits, not least of which is the ability to encode a data structure in a vector representation. Suitably generated vector representations of recursive structures are useful because (a) they support similarity judgements, i.e., if vectors are close in high-dimensional space then statements can be made about the constituents of the data structure without explicitly searching the structure, and (b) they may be easily passed to additional connectionist modules for further processing. However, even though there exist connectionist techniques that can encode recursive data structures, these techniques have not been shown to scale adequately for use in real-world problem domains (such as parsing natural language). To address this, an existing [cutting-edge] hybrid (connectionist/symbolic) parsing architecture has been modified to make use of simplified Recursive Auto-Associative Memory ((S)RAAM). (S)RAAM is employed to encode parsed sentences (from the Lancaster Parsed Corpus) in recursive connectionist representations. These representations act as the final output from the parsing architecture and maintain the parse state during processing. The use of (S)RAAM removes the need for a symbolic stack mechanism in the parser and allows the first in-depth study of (S)RAAM using real-world data.
Overall, (S)RAAM shows significant promise in the learning phase, but limited generalisation properties due to overfitting of the data. However, the representations retain their useful similarity properties and, as such, may be useful in other NLP areas (such as identification of analogies) or data processing fields (such as vision).
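To make the idea of a recursive vector encoding concrete, here is a minimal Python sketch of a RAAM-style compressor. It is an illustration only, not the architecture used in this work: the weights are random and untrained (a real RAAM learns them by auto-association together with a decoder), the representation width `DIM`, the binary-tree format, and the names `leaf_vector`, `encode` and `cosine` are all assumptions introduced here. It shows how a tree of any depth collapses into one fixed-width vector, and how two such vectors can then be compared without searching either tree.

```python
import numpy as np

DIM = 8  # hypothetical width of every representation (leaf or subtree)

rng = np.random.default_rng(0)

# Untrained, randomly initialised compressor weights. A trained RAAM would
# learn these (and a matching decoder) by reconstructing its own inputs.
W_enc = rng.normal(scale=0.5, size=(DIM, 2 * DIM))
b_enc = np.zeros(DIM)

_leaf_cache = {}


def leaf_vector(symbol):
    """Return a fixed random vector for a terminal symbol (assumed coding)."""
    if symbol not in _leaf_cache:
        _leaf_cache[symbol] = rng.uniform(-1.0, 1.0, size=DIM)
    return _leaf_cache[symbol]


def encode(tree):
    """Recursively compress a binary tree into a single DIM-wide vector.

    A tree is either a string (leaf) or a pair (left, right) of trees.
    """
    if isinstance(tree, str):
        return leaf_vector(tree)
    left, right = tree
    children = np.concatenate([encode(left), encode(right)])
    return np.tanh(W_enc @ children + b_enc)


def cosine(a, b):
    """Cosine similarity between two representation vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    t1 = (("the", "cat"), ("sat", "down"))
    t2 = (("the", "dog"), ("sat", "down"))
    print(encode(t1).shape)                # every tree maps to the same width
    print(cosine(encode(t1), encode(t2)))  # similarity without tree search
```

With untrained weights the similarity scores carry no linguistic meaning; the sketch only demonstrates the mechanics of fixed-width recursive encoding and vector comparison.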