- ...corpora?
- This point is relevant when
tagging texts for specific purposes, e.g. technical manuals for
maintenance, legal texts, etc. Do we have to retrain taggers each time
we switch to a new text type?
- ...sense
- A major effort in the evaluation of
annotation tools and techniques is the French project GRACE,
supported by, among others, ANPELF, and organized
by the CNRS.
- ...English
- Cf. [von Rekowski 1995] for French, [Teufel 1995a] for German,
[Monachini 1995] for Italian, [Teufel 1995b] for English.
- ...respectively
- ELM stands for EAGLES/Lexicon/Morphosyntax.
- ...much
- We would like
to thank in particular Prof. Geoffrey LEECH (Lancaster) and his
team for much help and feedback.
- ...Tübingen
- Particular thanks go to Christine THIELEN
and Helmut FELDWEG, of SNS, Universität Tübingen, and to the
ELWIS project at SNS, for many fruitful discussions and much practical
feedback on the versions of the tagset.
- ...vector
- The elements of the output vector are
better called scores, since they are not probabilities in the strict
mathematical sense. For example, they do not normally sum to 1.
- ...TreeBank
- It would have been necessary to adapt
it to German and to the actual tagset in order to include it in our
experiments, but this turned out to require more work than was
possible in the given time frame.
- ...etc.
- Future tests should be devoted to
very "deviant" material of this type;
in the present exercise, we restricted ourselves to testing fairy
tales vs. newspaper texts.
- ...%
- Similar improvements have been obtained with English
data, suggesting that this is not a language-specific effect.
- ...determined
- This is because there are only 6 modal verb
lemmata, which are easy to distinguish lexically.
- ...Ib
- We always compare the results of the
individual tests with the reference test, that is,
the corpora tagged according to current STTS practice.
- ...passive
- German: Zustandspassiv
- ...pronouns
- In a mostly
distribution-based tagset, special lemma-based classes like PBEID,
PALL, PVIEL, PNFL are not desirable. A distinction between adverbial and
pronominal function should be made.
- ...manches/PI.AT
- The dot indicates that the actual form found
in the corpus might have a letter at that position (PIDAT vs. PIAT),
as indefinite pronouns are further subdivided later.
This distinction, however, is not relevant here,
as we only consider demonstrative vs. indefinite pronouns.
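
The dot notation in the last note can be read as a simple wildcard over tag names. The following is only a minimal sketch under stated assumptions: it treats the dot in PI.AT as one optional capital letter, the helper name `matches_pi_at` is hypothetical, and the items in `examples` (including the demonstrative tag PDAT) are made up for illustration rather than taken from the test corpora.

```python
import re

# Assumption: the "." in "PI.AT" stands for one optional capital letter,
# so the pattern covers both tags named in the note (PIAT and PIDAT).
PI_AT = re.compile(r"PI[A-Z]?AT")


def matches_pi_at(token_with_tag: str) -> bool:
    """Return True if a 'word/TAG' item carries a tag covered by PI.AT."""
    # Split on the last "/" so that words containing "/" are still handled.
    word, _, tag = token_with_tag.rpartition("/")
    return bool(word) and PI_AT.fullmatch(tag) is not None


# Illustrative items only; they are not drawn from the actual corpus.
examples = ["manches/PIAT", "manches/PIDAT", "dieses/PDAT"]
covered = [item for item in examples if matches_pi_at(item)]
print(covered)
```

Under this reading, both indefinite variants fall under the single label PI.AT, while a demonstrative tag such as PDAT does not.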