Next: Practical setups Up: Tagset evaluation Previous: Purpose

Questions of relevance for tagset design, addressed in the tests

 

  1. Which categories from "traditional" grammar, semantics, and lexicon design can be used in tagsets? How much "linguistics" can there be in tagsets?
  2. Which features are error-prone in the tagging of a given language, i.e. which distinctions affect the overall error rate of the taggers?

    Example: GENDER is not relevant for French, i.e. the results are the same whether or not GENDER is annotated (cf. [Elworthy 1995] and [Chanod, Tapanainen 1995]). This question is especially interesting for applications of ELM.

  3. Which lexical ambiguities can be resolved by means of distributional classification?
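
The comparison behind question 2 can be sketched as follows: score the same tagger output once against the full tagset and once after a feature has been removed from every tag. This is a minimal illustration, not the setup used in the cited tests. The colon-separated tag format, the helper names `drop_feature` and `error_rate`, and the toy data are all assumptions made for the example, and the numbers say nothing about French. In a real test the tagger would normally also be retrained on the reduced tagset rather than having its output projected afterwards.

```python
# Hypothetical tag format: "CATEGORY:feature:feature", e.g. "NOUN:masc:sg".
GENDER_VALUES = {"masc", "fem"}


def drop_feature(tag, values):
    """Remove every feature value listed in `values` from a colon-separated tag."""
    return ":".join(part for part in tag.split(":") if part not in values)


def error_rate(gold, predicted):
    """Return the fraction of tokens whose predicted tag differs from the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    wrong = sum(g != p for g, p in zip(gold, predicted))
    return wrong / len(gold)


# Made-up gold annotation and tagger output for five tokens.
gold = ["DET:masc:sg", "NOUN:masc:sg", "VERB:sg", "DET:fem:sg", "NOUN:fem:sg"]
pred = ["DET:masc:sg", "NOUN:fem:sg", "VERB:sg", "DET:fem:sg", "VERB:sg"]

# Error rate with GENDER annotated.
full = error_rate(gold, pred)

# Error rate after GENDER has been removed from both gold and predicted tags.
reduced = error_rate(
    [drop_feature(t, GENDER_VALUES) for t in gold],
    [drop_feature(t, GENDER_VALUES) for t in pred],
)

print(f"with GENDER: {full:.2f}, without GENDER: {reduced:.2f}")
```

If the two rates are equal on real data, the feature does not contribute to the tagger's errors; if the reduced rate is lower, part of the error mass lies in that distinction.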

