
Recommendations

Dealing with underspecification

Underspecification is the phenomenon (sometimes called neutralisation) illustrated by the use of 0 in the Intermediate Tagset. It means that the distinction between the different values of an attribute is not relevant in this instance. One could also say that the particular attribute marked 0 is not applied to the textword under consideration. The possible reasons for this are threefold:

Language underspecifies:
The attribute does not apply to the part-of-speech in the language under consideration. For example, Gender does not apply to Nouns in English. Case does not apply to Adjectives in French.

Tagset underspecifies:
Although the attribute does apply to the part-of-speech in the language under consideration, the tagset is not fine-grained enough to represent it. For example, a particular tagset for English may omit representation of Gender (he, she) for pronouns.

Word underspecifies:
Although the attribute does apply to the part-of-speech in the language, and is represented in the tagset, it is not marked on this particular word, because the distinction is neutralised. For example, in French, the plural article les is unspecified for Gender. Invariable adjectives, such as German prima, are unspecified for Gender, Case and Number.
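
The three cases above can be told apart mechanically if one knows which attributes the language has for a part-of-speech and which of those the tagset represents. The following Python sketch illustrates this under stated assumptions: the tables LANGUAGE_ATTRS and TAGSET_ATTRS, the function underspec_reason, and the attribute inventories in them are hypothetical toy data invented for illustration, not part of any actual tagset.

```python
from enum import Enum
from typing import Optional


class Reason(Enum):
    LANGUAGE = "language underspecifies"
    TAGSET = "tagset underspecifies"
    WORD = "word underspecifies"


# Hypothetical toy tables (assumptions, not taken from any real tagset).
# Attributes that the language distinguishes for a given part-of-speech:
LANGUAGE_ATTRS = {
    ("en", "Noun"): {"Number"},
    ("en", "Pronoun"): {"Gender", "Number", "Case", "Person"},
    ("fr", "Article"): {"Gender", "Number"},
}
# Attributes that a particular (imaginary) tagset chooses to represent:
TAGSET_ATTRS = {
    ("en", "Noun"): {"Number"},
    ("en", "Pronoun"): {"Number", "Case", "Person"},  # Gender omitted
    ("fr", "Article"): {"Gender", "Number"},
}


def underspec_reason(lang: str, pos: str, attr: str,
                     value: Optional[str]) -> Optional[Reason]:
    """Return why `attr` is underspecified, or None if it is specified.

    `value` is the value carried by the word form itself; None means
    the word does not signal the attribute.
    """
    if attr not in LANGUAGE_ATTRS.get((lang, pos), set()):
        return Reason.LANGUAGE
    if attr not in TAGSET_ATTRS.get((lang, pos), set()):
        return Reason.TAGSET
    if value is None:
        return Reason.WORD
    return None


if __name__ == "__main__":
    print(underspec_reason("en", "Noun", "Gender", None))
    print(underspec_reason("en", "Pronoun", "Gender", "Feminine"))
    print(underspec_reason("fr", "Article", "Gender", None))
```

The checks are ordered from the most general cause to the most specific, so that a word-level neutralisation is reported only when both the language and the tagset would have allowed a value.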

There is room for different viewpoints on whether morphological syncretism should lead to underspecification of values, or whether values, even where they are not morphologically signalled, should be specified on the basis of context. There is also room for difference of opinion about whether the unmarked value of a binary attribute should be applied to the absence of the marked value. (E.g. should we mark all verbs which are not passive in Danish as active? Or should we leave Voice unspecified, except with those verbs for which the passive is an option?)

However, the important point to make here is that underspecification is normally signalled, in a tagset, simply by the absence of any indicator of the attribute. Alternatively, as in the Intermediate Tagset, a 0 is used to make the absence of an attribute explicit.
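
The two ways of signalling underspecification can be contrasted in a short Python sketch. It is only an illustration under stated assumptions: the part-of-speech code "ART", the attribute order in ARTICLE_ATTRS, and the digit codes in CODES are hypothetical and are not the actual encoding of the Intermediate Tagset. Only the use of 0 for an unspecified attribute is taken from the text.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical attribute order and value codes (assumptions for illustration).
ARTICLE_ATTRS = ("Gender", "Number")
CODES = {
    "Gender": {"Masculine": "1", "Feminine": "2"},
    "Number": {"Singular": "1", "Plural": "2"},
}


def encode_positional(pos_code: str, attrs: Sequence[str],
                      values: Mapping[str, Optional[str]]) -> str:
    """One character per attribute; '0' makes an unspecified attribute explicit."""
    digits = []
    for attr in attrs:
        value = values.get(attr)
        digits.append("0" if value is None else CODES[attr][value])
    return pos_code + "".join(digits)


def encode_sparse(pos_code: str, attrs: Sequence[str],
                  values: Mapping[str, Optional[str]]) -> str:
    """Unspecified attributes are signalled simply by leaving them out."""
    parts = [pos_code]
    for attr in attrs:
        value = values.get(attr)
        if value is not None:
            parts.append(f"{attr}={value}")
    return " ".join(parts)


if __name__ == "__main__":
    # French plural article "les": Number is marked, Gender is neutralised.
    les = {"Gender": None, "Number": "Plural"}
    print(encode_positional("ART", ARTICLE_ATTRS, les))  # ART02
    print(encode_sparse("ART", ARTICLE_ATTRS, les))      # ART Number=Plural
```

The positional form keeps every tag for a part-of-speech the same length, which is why it needs an explicit placeholder; the sparse form needs none because each value is labelled with its attribute.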

