The typical objects described in this proposal are lemmas (although we do not deal here with lexical decisions as to what is to be considered a lemma) and wordforms at the morphosyntactic level. Such a description essentially includes information on the grammatical category or part of speech, the subtypes of these as found in lexicons, and inflectional phenomena to be encoded in attributes such as gender, number, tense, etc.
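As an illustration only, a description of this kind could be held in a small record type. The sketch below is not part of the proposal: the class name `MorphDescription`, its field names, and the attribute values are all hypothetical and chosen just to mirror the three kinds of information listed above (category, subtype, inflectional attributes).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MorphDescription:
    """Hypothetical record for one wordform; not an EAGLES-defined structure."""
    wordform: str
    lemma: str
    pos: str                      # grammatical category / part of speech
    subtype: str = ""             # subtype of the category, if the lexicon has one
    features: Dict[str, str] = field(default_factory=dict)  # inflectional attributes


# Illustrative values only: the English plural noun "children".
entry = MorphDescription(
    wordform="children",
    lemma="child",
    pos="noun",
    subtype="common",
    features={"number": "plural"},
)

print(entry.pos, entry.subtype, entry.features["number"])
```

Keeping the inflectional attributes in an open dictionary, rather than as fixed fields, is one way to leave room for language-specific attributes; a concrete lexicon might well choose a stricter layout.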
What we propose here is a basic set of core features, derived from a detailed analysis of the major European lexicon and corpus projects; we do not aim to give a completely worked-out set of specifications ready to be implemented as such. That task is left to the language-specific development of concrete application lexicons.
One problem we encounter, as far as the objects described are concerned, is how to deal with the two complementary phenomena of multiwords and contractions.
In general, EAGLES recommends handling multiwords as belonging to a single grammatical category, and contractions as two separate grammatical categories, but the option of a different treatment is left open.
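The default treatment just described can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the lookup tables `MULTIWORDS` and `CONTRACTIONS`, the function `categorise`, and the category labels are hypothetical, and the two examples (English "in spite of", French "du" = "de" + "le") are chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup tables; a real lexicon would supply these.
# A multiword maps to ONE grammatical category.
MULTIWORDS: Dict[str, str] = {
    "in spite of": "preposition",
}

# A contraction maps to its component words, each with its OWN category.
CONTRACTIONS: Dict[str, List[Tuple[str, str]]] = {
    "du": [("de", "preposition"), ("le", "article")],
}


def categorise(unit: str) -> List[Tuple[str, str]]:
    """Return (word, category) pairs for one orthographic unit."""
    if unit in MULTIWORDS:
        # Several orthographic words, one category.
        return [(unit, MULTIWORDS[unit])]
    if unit in CONTRACTIONS:
        # One orthographic word, two separate categories.
        return list(CONTRACTIONS[unit])
    # Anything else is outside the scope of this sketch.
    return [(unit, "unknown")]


print(categorise("in spite of"))
print(categorise("du"))
```

Since the recommendation leaves a different treatment open, the two tables could equally be swapped for a scheme that assigns a contraction a single, dedicated category.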