Up: Questions to be discussed
Previous: Main differences
A detailed version of the guidelines can hardly be produced before the syntax
group has discussed the status of syntactic information, which has to be settled
before an architecture proposal can be made, i.e. a proposal covering:
- The main entities of the model
- The articulation between the main entities
The architecture group will need to continue its activity as an
infrastructural service for the other groups over the whole period
of the EAGLES exercise, to ensure that these recommendations remain consistent
with those of the other task groups, and to amend them if necessary.
More work is needed, in both the morphosyntax group and the architecture
group, on the representation of multiword units, for the following two
reasons:
- even though CGPMU may provide a straightforward way of
representing certain types of multiword units, it is not certain that
this device is sufficient to cater for the wide variety of phenomena
likely to occur. It would be useful to draw up a list of the facts
about multiword items which need to be treated.
- EAGLES should not commit too early to a given descriptive
device for multiword units: it may be better to keep the guidelines
flexible and to be able to insert, later, a proposal based on a
broader and more detailed analysis of phenomena from different
languages.
The architecture group considers the introduction of classifications
and, along with these, of inheritance hierarchies a useful structuring
device on top of flat (and maybe redundant) dictionaries. However,
it is less evident whether EAGLES should "prescribe" the use of this
device or even of a given instance of it.
Proposal:
- EAGLES recommendations ideally come in two layers:
  - a flat layer of basic lexical descriptions, which may be
    somewhat redundant, because it does not make use of any
    classificatory devices;
  - a structuring layer, which introduces classifications,
    generalizations, etc., and which may involve certain
    representational ways of expressing these generalizations, e.g.
    through inheritance hierarchies.
- If an EAGLES user introduces classifications at the second level,
this should be done via specification of relations or generalization
rules which express a two-way correspondence between the flat
representations and the hierarchical ones. See the discussion of the
morphological dictionary (base level) and corpus analysis tagsets
(applied, second level) in the minutes of the morphology group.
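The two-layer proposal can be illustrated with a small Python sketch. It is not part of any EAGLES specification: the class names, the features and the functions expand and compress are hypothetical, and the hierarchy is assumed to be monotonic. It only shows what a two-way correspondence could look like: a structured entry is expanded into a flat one, and a flat entry is compressed back into a structured one.

```python
# Hypothetical class hierarchy (structuring layer): each class names its
# parent and the feature values it contributes.
CLASSES = {
    "word": {"parent": None, "features": {}},
    "noun": {"parent": "word", "features": {"cat": "N"}},
    "common_noun": {"parent": "noun", "features": {"type": "common"}},
}


def class_features(name):
    """Collect the features inherited along the path from `name` to the root."""
    features = {}
    while name is not None:
        node = CLASSES[name]
        for key, value in node["features"].items():
            # Assumed monotonic: classes on one path must not conflict.
            assert features.get(key, value) == value, "conflicting classes"
            features[key] = value
        name = node["parent"]
    return features


def expand(entry):
    """Structured entry -> flat (possibly redundant) entry."""
    flat = dict(class_features(entry["class"]))
    flat.update(entry["own"])
    flat["lemma"] = entry["lemma"]
    return flat


def compress(flat):
    """Flat entry -> structured entry, using the most specific matching class."""
    best, best_feats = "word", {}
    for name in CLASSES:
        feats = class_features(name)
        matches = all(flat.get(k) == v for k, v in feats.items())
        if matches and len(feats) > len(best_feats):
            best, best_feats = name, feats
    own = {k: v for k, v in flat.items()
           if k != "lemma" and k not in best_feats}
    return {"lemma": flat["lemma"], "class": best, "own": own}
```

In this sketch, compressing a flat entry and expanding the result gives back the original flat entry, which is the property the proposal asks of the relations or generalization rules.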
It is assumed that
- different descriptive levels may make use of classificatory
devices with different properties (e.g. monotonic vs.
non-monotonic inheritance hierarchies);
- different applications of EAGLES lexicons may later introduce
generalizations of their own.
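The difference between monotonic and non-monotonic inheritance can be sketched as follows. The function inherit, the VERBS hierarchy and its feature values are hypothetical illustrations, not taken from the EAGLES documents: under monotonic inheritance a subclass may only add information, whereas under non-monotonic (default) inheritance a more specific class may override an inherited value.

```python
def inherit(hierarchy, name, monotonic=True):
    """Return the features of class `name`, inherited from root to leaf."""
    # Build the path from `name` up to the root class.
    path = []
    while name is not None:
        path.append(name)
        name = hierarchy[name]["parent"]

    features = {}
    # Apply the classes from the most general to the most specific.
    for cls in reversed(path):
        for key, value in hierarchy[cls]["features"].items():
            if monotonic and key in features and features[key] != value:
                raise ValueError(
                    f"{cls} contradicts inherited {key}={features[key]!r}"
                )
            # Non-monotonic case: the more specific class wins.
            features[key] = value
    return features


# Hypothetical hierarchy in which a subclass overrides a default.
VERBS = {
    "verb": {"parent": None, "features": {"cat": "V", "past": "-ed"}},
    "strong_verb": {"parent": "verb", "features": {"past": "ablaut"}},
}
```

Calling inherit(VERBS, "strong_verb", monotonic=False) accepts the override, while the same call with monotonic=True raises a ValueError, because the subclass contradicts an inherited value.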
If we think in terms of larger actual dictionaries, it would evidently
be useful to have some generalization and classification already at
the level of basic lexical descriptions, e.g. to avoid redundancy in
the different word forms of a lemma (have lemma-specific information
instead, and additional idiosyncratic information on word forms
separately). The question there is which classifications may usefully
be proposed by EAGLES: evidently, there is no single best
generalization over a heterogeneous set of objects, which is why a
generalization process on a "data-only" basis is not possible.
Usually, generalizations at this level are based on the classificatory
assumptions of the approach advocated.
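As a minimal sketch of the lemma example, information shared by all word forms can be stored once at the lemma, with only idiosyncratic information stored per word form. The entry layout, the feature names and the function flat_word_forms are assumptions made for illustration only.

```python
# Hypothetical lemma entry: shared information is stated once, and each
# word form lists only what is specific to it.
LEMMA = {
    "lemma": "child",
    "shared": {"cat": "N", "type": "common"},
    "forms": {
        "child": {"number": "sg"},
        "children": {"number": "pl", "irregular": True},
    },
}


def flat_word_forms(entry):
    """Expand a lemma entry into one fully specified description per word form."""
    return [
        {"form": form, "lemma": entry["lemma"], **entry["shared"], **own}
        for form, own in entry["forms"].items()
    ]
```

The flat list returned by flat_word_forms repeats the shared features for every word form, which is exactly the redundancy that the lemma-level description avoids.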