Abstract:
Planar languages offer an alternative to grammars and automata for representing languages. They are based on hyperplanes in a feature space associated with a string kernel, which corresponds to a set of linear equalities over features. This makes planar languages inherently learnable, in the sense of being identifiable in the limit from positive data, i.e. learnable in an unsupervised setting, even under strong constraints on the learner's behaviour and on the computational resources used. The position in the Chomsky hierarchy of any class of planar languages depends on the particular kernel it is based on. Even quite simple string kernels generate mildly context-sensitive languages, and so already have enough expressive power to deal with linguistic constructions. We extend the notion of planar languages to tree planar languages. The motivation for this is mainly linguistic: trees can be used to represent derivations, or meaning as logical forms, for example, but there are also applications in bioinformatics and web mining. In this paper the expressive power of some tree planar languages is analysed, and it is shown that some extend beyond the range of regular tree languages. They are thus able to capture a wide range of linguistic phenomena, making them potentially interesting for NLP applications.
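To make the core idea concrete, the following is a minimal sketch (not taken from the paper) of how a planar language over strings can be learned from positive data alone. It assumes a very simple feature map, the Parikh map (letter counts), standing in for a string kernel's feature space; a string is then accepted exactly when its feature vector satisfies all linear equalities (i.e. lies in the affine hull) determined by the positive examples. All function names here are illustrative, not from the paper.

```python
from fractions import Fraction

def parikh(s, alphabet="ab"):
    """Feature map: count of each symbol (a stand-in for a string kernel's feature space)."""
    return [Fraction(s.count(c)) for c in alphabet]

def rank(rows):
    """Rank of a rational matrix via Gaussian elimination (exact arithmetic)."""
    m = [row[:] for row in rows]
    r = 0
    cols = len(m[0]) if m else 0
    for col in range(cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def learn(positives):
    """Identify the affine hull of the examples' feature vectors; membership in that
    hull is exactly satisfaction of the linear equalities defining the planar language."""
    vecs = [parikh(w) for w in positives]
    base = vecs[0]
    diffs = [[v[i] - base[i] for i in range(len(base))] for v in vecs[1:]]
    r = rank(diffs)
    def member(w):
        d = [parikh(w)[i] - base[i] for i in range(len(base))]
        # w is accepted iff its feature vector adds no new direction to the hull
        return rank(diffs + [d]) == r
    return member

# Learning {w : |w|_a = |w|_b} from positive examples only:
member = learn(["ab", "aabb", "abab"])
print(member("aabbab"))  # True: equal numbers of a's and b's
print(member("aab"))     # False: violates the learned equality
```

With a richer kernel (e.g. substring counts) the same null-space computation yields the mildly context-sensitive classes mentioned above; the Parikh map is used here only because it keeps the example short.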
Keywords:
computational linguistics; context-sensitive languages; trees (mathematics); linguistic phenomena; string kernels; tree planar languages