Recommendations
In the phrase structure examples given in the following sections, we
have included punctuation within the bracketing. This again is a choice
to be made by the scheme designers, and will probably depend on the
format in which the corpus will finally be presented. However, in the
syntactic annotation of written texts, there is a tradition of treating
punctuation marks as 'words' for the purpose of parsing. This can be
advantageous for automatic parsing systems since punctuation is
typically used to mark major syntactic boundaries.
As for sentence-initial and sentence-final punctuation, it seems
sensible to enclose them within the parse bracketing, as in
85:
as opposed to 86:
As regards medial punctuation, the most generally
applicable guideline is to attach punctuation to the
highest available node in the parse tree, thus assigning
to medial punctuation symbols (especially commas) their
value as delimiters of major constituents, as in
87:
(87) [S [NP The words [PP at [NP level four NP] PP] NP] ,
     [PP on [NP the other hand NP] PP] ,
     [VP are [CO [ADJP relatively rare ADJP] , and
     [VP not often used [PP in speech PP] VP] CO] VP] . S]
However, this guideline does not always give a satisfactory
analysis of correlative punctuation, such as matching
commas or dashes to indicate the opening and closing of a
parenthetical constituent. Thus in 88,
it makes better sense to place the second comma inside the
NP, rather than to make it an immediate constituent of the
sentence:
(88) [S [NP the teacher , [CL-REL who arrived late CL-REL] , NP]
     [VP had noticed [NP nothing NP] VP] . S]
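The two attachment choices can be checked mechanically. The following is a minimal Python sketch, not part of any annotation scheme: it assumes the whitespace-separated labelled bracketing used in examples (87) and (88), where `[X` opens a constituent and `X]` closes it. The function name `punctuation_parents` and the constants `EX87` and `EX88` are hypothetical names introduced for illustration. For each punctuation token it reports the label of the constituent that immediately contains it.

```python
import string

def punctuation_parents(bracketing):
    """Return (mark, parent_label) pairs for each punctuation token.

    Assumes tokens are separated by whitespace, '[X' opens a
    constituent labelled X and 'X]' closes it (hypothetical helper).
    """
    stack = []    # labels of the constituents currently open
    result = []
    for token in bracketing.split():
        if token.startswith("[") and len(token) > 1:
            stack.append(token[1:])
        elif token.endswith("]") and len(token) > 1:
            label = token[:-1]
            if not stack or stack[-1] != label:
                raise ValueError(f"unbalanced bracket: {token}")
            stack.pop()
        elif all(ch in string.punctuation for ch in token):
            # A token made up only of punctuation characters.
            result.append((token, stack[-1] if stack else None))
    if stack:
        raise ValueError(f"unclosed constituents: {stack}")
    return result

EX87 = (
    "[S [NP The words [PP at [NP level four NP] PP] NP] , "
    "[PP on [NP the other hand NP] PP] , "
    "[VP are [CO [ADJP relatively rare ADJP] , and "
    "[VP not often used [PP in speech PP] VP] CO] VP] . S]"
)
EX88 = (
    "[S [NP the teacher , [CL-REL who arrived late CL-REL] , NP] "
    "[VP had noticed [NP nothing NP] VP] . S]"
)

print(punctuation_parents(EX87))
# [(',', 'S'), (',', 'S'), (',', 'CO'), ('.', 'S')]
print(punctuation_parents(EX88))
# [(',', 'NP'), (',', 'NP'), ('.', 'S')]
```

In (87) the first two commas are immediate constituents of S, the highest available node, whereas in (88) both correlative commas sit inside the NP. A check of this kind is one way to confirm that a corpus applies whichever punctuation policy the scheme documents consistently.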
However, our purpose here is not to dictate solutions. The
principle to be adhered to is simply to make explicit in the
annotation scheme whatever solution for the treatment of
punctuation is adopted by the annotator. (Sampson 1995: 153f
provides a considered treatment of punctuation issues.)