Some kind of standardisation is becoming urgent, particularly in the area of morphosyntactic annotation. This is the area in which most annotation has been done, and morphosyntactic tagging is likely to be undertaken for many different languages in the next few years. In the interests of interchangeability and reusability of annotated corpora, and particularly for the development of multilingual corpora, it is important to avoid a free-for-all in tagging practices.
On the other hand, the varied needs and constraints which govern any annotation project, or which might govern such projects in the future, urge caution in setting out to achieve a rigid standardisation. Where possible, it is important to offer a default specification which can be adopted where there are no overriding reasons for departing from it. In this way, invariance will establish itself across different projects and languages, and a de facto standard will progressively come into being.
However, the need to go beyond a preferred standard -- a principle of extensibility -- should also be recognised. There will be a need to extend the specification to new phenomena and sometimes a need to represent different perspectives on the same data. Extensibility means, on the one hand, the ability to extend the specification to language-specific phenomena, and on the other, the ability to vary the degree of granularity for this or that annotation task.
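The two senses of extensibility can be illustrated with a minimal sketch. It assumes, purely for illustration, that a specification is modelled as a mapping from attributes to permitted values; the attribute names, the values, and the helper functions `extend_spec`, `validate` and `coarsen` are hypothetical and are not part of any actual guideline.

```python
# Hypothetical default specification: attribute -> permitted values.
# The attributes and values are illustrative only.
DEFAULT_SPEC = {
    "pos": {"noun", "verb", "adjective"},
    "number": {"singular", "plural"},
}


def extend_spec(base, extra):
    """Return a new specification with extra attributes or values added.

    The base specification is left unchanged, so the default remains
    available to projects that have no reason to depart from it.
    """
    merged = {attr: set(values) for attr, values in base.items()}
    for attr, values in extra.items():
        merged.setdefault(attr, set()).update(values)
    return merged


def validate(tag, spec):
    """Check that every attribute-value pair in a tag is allowed by spec."""
    return all(
        attr in spec and value in spec[attr] for attr, value in tag.items()
    )


def coarsen(tag, keep):
    """Reduce granularity by keeping only the listed attributes."""
    return {attr: value for attr, value in tag.items() if attr in keep}


# Extension to language-specific phenomena (hypothetical values).
LANG_SPEC = extend_spec(
    DEFAULT_SPEC,
    {"number": {"dual"}, "definiteness": {"definite", "indefinite"}},
)

tag = {"pos": "noun", "number": "dual", "definiteness": "definite"}

print(validate(tag, DEFAULT_SPEC))  # False: outside the default specification
print(validate(tag, LANG_SPEC))     # True: covered by the extension
print(coarsen(tag, {"pos"}))        # {'pos': 'noun'}
```

In this sketch a tag that conforms to the default specification also conforms to any extension of it, which is one way of keeping extended annotations comparable with those of projects that adopt the default unchanged.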
The use of the term "guidelines", in reference to the documentary specification of annotation standards, is salutary in suggesting that there is no absolute normative prescription of annotation practices, but at most a set of recommendations, from which the annotator may depart, or which the annotator may extend, where particular purposes justify it. Even the term "recommendations" is too strong a word in some cases: often we can only point out the range of practices which exist, without advising that one be preferred to another.
We consider, in the following three sections, the feasibility of achieving a measure of standardisation in three important areas.