The specification of subcategorisation structure is perhaps the most urgent and complex type of basic linguistic information which a lexicon suitable for NLP applications must provide. Urgency arises from the need to encode information concerning the local context of words in order to constrain the analysis and generation of natural language. Complexity results from the multidimensional aspect of subcategorisation structure. Because subcategorisation involves reference to diverse levels of grammatical description, the achievement of efficient standards in this area presupposes a minimal normalisation of
Work on subcategorisation has been carried out from a variety of perspectives by linguists, lexicographers and developers of NLP applications, and for many different languages. Consequently, the formulation of standards in this area necessitates a comparison of linguistic theories, practical NLP lexicons, dictionaries and bracketed corpora, as well as of language-specific aspects of subcategorisation structure. The main aim of this document is then to assess how, and to what extent, such a comparison can be carried out in order to achieve an inter-theoretical specification which captures grammatical equivalences among system- or theory-specific coding schemes for the main European languages.
In carrying out this work, our goals are to: