- ...here
- This means that the issue of tokenisation (segmentation
of text into words and sentences) is not dealt with in this report. For
example, the issues of whether to split off enclitics, and whether to
treat multi-word expressions such as compounds as a single token, belong
in part to morphosyntactic analysis and in part to text representation.
For the same reason, we do not deal here with the representation of
merged forms such as French du (=de + le).
- ...coextensive
- In spoken language, the `orthographic
sentence' plays no role, sentences being delimited in practice
on the basis of syntax and possibly intonation.
- ...annotation
- In a skeleton
parse, constituents are identified with brackets, but much
detail is omitted: for example, some brackets are left
unlabelled, and functional labels such as Subject and Object are
not applied.
- ...Corpus
- Some of these names refer to the parsing system
employed and others to the resulting treebank or syntactically
annotated corpus.
- ...brackets
- Currently the Helsinki Constraint
Grammar is also being extended to Basque, but since Basque is not an
official EU language, it will not be covered in this report.
However, as Basque is typologically distinct from the Indo-European
languages with which the EAGLES guidelines are initially largely
concerned, efforts at parsing it may produce interesting results.
- ...(cross-reference `deptree')
- But not always: note that a
dependency analysis may have crossing branches, or tokens which are
dependents of more than one head.
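
The two cases named in this note can be checked mechanically. The following is a minimal sketch, assuming (hypothetically) that a dependency analysis is encoded as a list of (head, dependent) pairs of token positions; this encoding and the function names are illustrative and are not taken from the guidelines.

```python
def crossing_arcs(arcs):
    """Return the pairs of arcs whose spans cross.

    Each arc is a (head, dependent) pair of token positions.
    """
    # Direction does not matter for crossing, so order each span left-to-right.
    spans = [tuple(sorted(arc)) for arc in arcs]
    crossings = []
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            # Two spans cross when exactly one end of one lies strictly inside the other.
            if a < c < b < d or c < a < d < b:
                crossings.append(((a, b), (c, d)))
    return crossings


def multi_headed(arcs):
    """Return the positions of tokens that depend on more than one head."""
    heads = {}
    for head, dependent in arcs:
        heads.setdefault(dependent, set()).add(head)
    return sorted(dep for dep, hs in heads.items() if len(hs) > 1)
```

An analysis for which both functions return an empty list can be drawn as a tree without crossing branches; either result being non-empty signals one of the exceptions described in this note.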
- ...exemplar
- In the sense that ENGCG is the only wide-coverage
parser. Other parsers using a dependency syntax exist: see the
references in Fraser & Hudson (1992).
- ...indices
- These logical relations are marked only in
the second phase of the Penn Treebank Project.
- ......
- Discourse notions like `utterance' will not be
dealt with in this report, although they are sometimes used to
mark the outer bounds of parse trees.
- ...letter
- Of
course, the initial capital letter may in some cases be preceded by
sentence-initial punctuation marks, such as the initial inverted
question mark preceding Spanish questions.
- ...[S ... S]
- By `exhaustive' we mean that all tokens which are part of
the verbal content of the text should be included in an [S ... S]. This
recommendation does not apply, of course, to words or symbols which are
part of the mark-up of the text, such as those representing (in spoken
transcriptions) pauses or the beginning and ending of a speaker's turn.
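
As an illustration of this sense of `exhaustive', the check below is a minimal sketch under stated assumptions: the text is a flat list of tokens, `[S` and `S]` appear as separate bracket tokens, and mark-up symbols are written in angle brackets (e.g. `<pause>`). None of these conventions is prescribed by the guidelines; they are hypothetical choices made for the example.

```python
def tokens_outside_s(tokens, is_markup=lambda t: t.startswith("<") and t.endswith(">")):
    """Return the verbal tokens that are not enclosed in any [S ... S] pair."""
    depth = 0       # current nesting depth of [S ... S] brackets
    outside = []
    for tok in tokens:
        if tok == "[S":
            depth += 1
        elif tok == "S]":
            depth -= 1
        elif depth == 0 and not is_markup(tok):
            # A verbal token outside every [S ... S] violates exhaustiveness.
            outside.append(tok)
    return outside
```

An annotation is exhaustive in the sense of this note when the function returns an empty list; mark-up tokens outside the brackets are deliberately ignored.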
- ...(cross-reference `issue')
- Throughout these
guidelines, as in this example, the syntactic annotations are purely
illustrative, and are not meant to serve as a model to be imitated in
all details.
- ...Subject
- At this time, the
representation of the Lexicon/Syntax Subgroup is still in progress, but
the properties of syntactic features such as case and
+/- predicative combine together to give grammatical function
information such as Object (from the combination of features:
-subject, -predicative, case=accusative), Subject Predicative
Complement (from the combination of features: +subject,
+predicative) etc.
- ...(cross-reference `exdepprop')
- Other columns can be added to hold
other types of information, such as POS tags.
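
To make the idea of an added column concrete, here is a minimal sketch. It assumes, purely for illustration, a table with one row per token and the columns token number, word, head number and relation; the column layout, the relation labels and the POS tags shown are all hypothetical and are not taken from the example this note refers to.

```python
# One row per token: (token number, word, head number, relation).
# Head number 0 marks the root. All values are invented for illustration.
rows = [
    (1, "The", 2, "det"),
    (2, "dog", 3, "subj"),
    (3, "barks", 0, "root"),
]


def add_column(rows, values):
    """Return new rows with one extra column appended to each row."""
    if len(rows) != len(values):
        raise ValueError("need exactly one value per row")
    return [row + (value,) for row, value in zip(rows, values)]


# Append a POS-tag column without disturbing the existing columns.
tagged = add_column(rows, ["DET", "NOUN", "VERB"])
```

Because each new column is simply appended, tools that read only the first four columns are unaffected by the addition.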
- ...annotation
- Since
we are here concerned with syntactic ambiguity only, we
ignore types of ambiguity (e.g. purely lexical ambiguity or
pragmatic ambiguity) which have no bearing on syntactic
annotation.
- ...speak
- A third type of use ambiguity occurs where
the human interpreter cannot make sense of the syntax of
the sentence. For example, the following sentence from a
computer manual cannot be readily interpreted or parsed by
a non-specialist in computing: `In this situation, the
operator must ready a spool volume and IPL again.'
- ...labelling
- We exclude from
consideration here ambiguities of part-of-speech tagging at
word level. A special case of labelling ambiguity, these
have been dealt with in draft guidelines for
morphosyntactic annotation (see EAGLES (1996a)).