The Categorial Grammar approach to subcategorisation will be exemplified with a description of the UCG framework used in the ACQUILEX project (Sanfilippo, 1993b). For ease of presentation, only a brief summary is given -- see Sanfilippo (1993b) for a full account.
Words and phrases are represented as (typed) feature structures where orthographic, syntactic and semantic information is simultaneously represented as a conjunction of attribute-value pairs forming a sign:
[ORTH: orth CAT: cat SEM: sem]
The category attribute of a sign is either basic or complex. Basic categories are binary feature structures consisting of a category type and a series of attribute-value pairs encoding morphosyntactic information:
[CAT-TYPE: cat-type M-FEATS: m-feats]
Three basic cat-types are used
cat-type[m-feats]
Morphosyntactic features are included only where needed.
Complex categories are recursively defined by letting the type `cat' instantiate a feature structure with attributes RESult, DIRection and ACTive. RESult can take as value either a basic or complex category, ACTive is of type `sign', and the direction attribute encodes order of combination relative to the active part of the sign (e.g. forward or backward):
[RES: cat DIR: dir ACT: sign]
In verbs, the active part of the category structure encodes the subcategorisation properties, e.g. subject and object for transitives:
[ORTH: < love > CAT:[RES:[RES:sent ACT:[np-sign CAT:np[nom]]] ACT:[np-sign CAT:np[acc]]]]
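The recursive definition of categories can be illustrated with a small Python sketch. It is not the ACQUILEX implementation: the class names `Basic`, `Complex` and `Sign`, the helper `active_signs`, and the encoding of the nominative subject as an np with the feature `nom` are all assumptions made for illustration, and the direction attribute is left unspecified because the example above does not give it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Basic:
    """Basic category: a category type plus morphosyntactic features."""
    cat_type: str                       # e.g. "sent" or "np"
    m_feats: Tuple[str, ...] = ()       # e.g. ("acc",); empty when not needed


@dataclass(frozen=True)
class Sign:
    """A sign reduced to the attributes needed here (SEM is omitted)."""
    cat: "Category"
    orth: Optional[str] = None


@dataclass(frozen=True)
class Complex:
    """Complex category with RESult, ACTive and DIRection attributes."""
    res: "Category"                     # basic or complex
    act: Sign                           # the active part is a sign
    direction: Optional[str] = None     # e.g. "forward" / "backward" (assumed labels)


Category = Union[Basic, Complex]


def active_signs(cat):
    """Return the active signs of a category, outermost first."""
    signs = []
    while isinstance(cat, Complex):
        signs.append(cat.act)
        cat = cat.res
    return signs


# The transitive verb "love": the object is the outermost active sign,
# the subject the innermost one, and the final result is `sent`.
subject = Sign(cat=Basic("np", ("nom",)))
obj = Sign(cat=Basic("np", ("acc",)))
love = Sign(orth="love",
            cat=Complex(res=Complex(res=Basic("sent"), act=subject), act=obj))

print([s.cat.m_feats for s in active_signs(love.cat)])
```

Walking down the RES attribute recovers the subcategorised arguments in order of combination, which is the sense in which the active part of a verb's category encodes its subcategorisation properties.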
The semantics of a sign is a formula. A formula consists of an index, a predicate and at least one argument, which can be either an entity or a formula (both are subsumed by sem):
[IND: entity PRED: pred ARG1: sem]
The index of a formula is an entity which provides partial information about the ontological type denoted by the formula, e.g. `e' for eventualities and `o,x,y,z' for individual objects. In addition, a contentless entity, `dummy', is employed in the semantic characterisation of pleonastic noun phrases, e.g. the subject of extraposition verbs. For ease of exposition formulas are linearised, e.g. the feature structure
[IND: [1] x PRED: book ARG1: [1]]
where [1] flags reentrant (i.e. identical) values, is abbreviated as <x1>book(x1), where x1 is a named variable.
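The linearisation convention can be made concrete with a short Python sketch. The classes `Entity` and `Formula` and the function `linearise` are hypothetical names introduced here, not part of the framework; reentrancy is modelled, as an assumption, by letting two attributes hold the very same Python object.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(eq=False)  # identity-based equality: sharing an object models reentrancy
class Entity:
    name: str


@dataclass(eq=False)
class Formula:
    ind: Entity                                   # IND
    pred: str                                     # PRED
    args: Tuple[Union[Entity, "Formula"], ...]    # ARG1, ARG2, ...


def linearise(sem):
    """Render an entity or formula in the <index>pred(args) notation."""
    if isinstance(sem, Entity):
        return sem.name
    inner = ", ".join(linearise(arg) for arg in sem.args)
    return f"<{sem.ind.name}>{sem.pred}({inner})"


# [IND: [1] x PRED: book ARG1: [1]] -- IND and ARG1 share one entity.
x1 = Entity("x1")
book = Formula(ind=x1, pred="book", args=(x1,))

print(linearise(book))  # <x1>book(x1)
```

Because IND and ARG1 point at one object, renaming the variable in one place renames it in the other, which is what the tag [1] expresses in the feature structure.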
The classification of subcategorisation types involves defining
Verbs are characterised as properties of eventualities, and thematic roles are relations between eventualities and individuals, e.g.
<e1>and(<e1>sleep(e1), <e1>agent(e1,john))
Following Dowty (1989), the semantic content of thematic relations is expressed in terms of prototypical cluster-concepts -- the proto-agent and proto-patient roles (`p-agt', `p-pat') -- determined for each choice of predicate through attribution of selected entailments which qualify the relative agentive strength and affectedness of event participants. Dowty's insights are augmented by introducing a third proto-role, `prep', for prepositional arguments (`semantically restricted' in LFG terms) and the contentless predicate `no-' to characterise the relation between a pleonastic NP and its governing verb. In addition, proto-roles are formalised as supersets of specific clusters of meaning components which are instrumental in the identification of semantic verb classes (Sanfilippo & Poznański, 1992; Sanfilippo, 1993b; Sanfilippo, 1993a) -- see examples.
A primary semantic classification of verb types is obtained in terms of argument arity. Further distinctions are made according to what kind of verbal arguments are encoded:
Here are some of the semantic structures distinguished:
STRICT-INTRANS-SEM <e1>and(<e1>pred(e1), <e1>p-agt(e1,x))
STRICT-TRANS-SEM <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x), <e1>p-pat(e1,y)))
OBL-TRANS/DITRANS-SEM <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x), <e1>and(<e1>p-pat(e1,y), <e1>prep(e1,z))))
P-AGT-SUBJ-INTRANS-XCOMP/COMP-SEM <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x), verb-sem))
Category structures are distinguished according to the values for the features RES and ACT. For example, the CAT of strict intransitives states that the result is a basic category of type `sent' and the active part is a noun phrase (i.e. there is only subject selection):
STRICT-INTRANS-CAT [RES: sent ACT: np-sign]
More complex category types can be built using more basic category types, e.g.
STRICT-TRANS-CAT [RES: strict-intrans-cat ACT: [np-sign CAT: np[acc]]]
DITRANS-CAT [RES: strict-trans-cat ACT: [np-sign CAT: np[acc]]]
OBL-TRANS-CAT [RES: strict-trans-cat ACT: [np-sign CAT: np[p-case]]]
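The way complex category types reuse simpler ones can be sketched in Python with nested dictionaries. This is an illustration under stated assumptions only: the helpers `complex_cat`, `np_sign` and `arity` are hypothetical, and a real typed feature structure system would use inheritance and unification rather than plain object reuse.

```python
def complex_cat(res, act):
    """Build a complex category from a result category and an active sign."""
    return {"RES": res, "ACT": act}


def np_sign(case=None):
    """An np-sign, optionally constrained to a case such as 'acc' or 'p-case'."""
    sign = {"type": "np-sign"}
    if case is not None:
        sign["CAT"] = f"np[{case}]"
    return sign


# Each template is defined in terms of the previous, simpler one.
STRICT_INTRANS_CAT = complex_cat("sent", np_sign())
STRICT_TRANS_CAT = complex_cat(STRICT_INTRANS_CAT, np_sign("acc"))
DITRANS_CAT = complex_cat(STRICT_TRANS_CAT, np_sign("acc"))
OBL_TRANS_CAT = complex_cat(STRICT_TRANS_CAT, np_sign("p-case"))


def arity(cat):
    """Count active signs by following RES until a basic category is reached."""
    count = 0
    while isinstance(cat, dict):
        count += 1
        cat = cat["RES"]
    return count


for name, cat in [("strict-intrans", STRICT_INTRANS_CAT),
                  ("strict-trans", STRICT_TRANS_CAT),
                  ("ditrans", DITRANS_CAT),
                  ("obl-trans", OBL_TRANS_CAT)]:
    print(name, arity(cat))
```

Ditransitive and oblique-transitive categories differ only in the case of their outermost active sign, so both have three arguments and share the same strict-transitive result.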
Control categories are used to describe the syntactic structure of both equi and raising verbs. All control categories follow (inherit from) the following pattern, where the reentrancy tag [1] says that the complement active sign (e.g. the complement subject) is controlled by the immediately preceding active sign (control is expressed by equating entities which partially describe the semantics of active signs):
CONTROL-CAT [RES: [RES: cat ACT: [sign SEM:ARG2: [1] entity]] ACT: [sign CAT:ACT: [sign SEM:ARG2: [1]]]]
The controlling argument can be the subject or the object according to whether the verb is intransitive or transitive (transitivity is determined by the presence of an accusative active np-sign).
INTRANS-CONTROL-CAT [RES: [RES: sent ACT: [sign SEM:ARG2: [1] entity]] ACT: [sign CAT:ACT: [sign SEM:ARG2: [1]]]]
TRANS-CONTROL-CAT [RES: [RES: strict-intrans-cat ACT: [np-sign CAT: np[acc] SEM:ARG2: [1] entity]] ACT: [sign CAT:ACT: [sign SEM:ARG2: [1]]]]
Actual control categories are built by adding further specialisations to the control descriptions above. For example, the category structure for intransitive equi verbs is defined as follows:
For intransitive control verbs which take an infinitive VP complement, e.g. ``Jon wants/seems to leave'':
INTRANS-VPINF-CONTROL-CAT Inherits from INTRANS-CONTROL-CAT [RES: strict-intrans-cat ACT: [vp-sign CAT:RES: sent[fin]]]
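The effect of the reentrancy tag [1] in the control categories can be mimicked in Python by sharing one object between the controller and the complement's active sign. The sketch below follows the shape of INTRANS-CONTROL-CAT; the `Entity` class and the function `intrans_control_cat` are hypothetical names, and shared object identity is only an approximation of structure sharing in a unification formalism.

```python
class Entity:
    """A semantic entity; one shared instance plays the role of the tag [1]."""

    def __init__(self, name=None):
        self.name = name


def intrans_control_cat():
    """Build the INTRANS-CONTROL-CAT pattern with a shared ARG2 entity."""
    shared = Entity()  # the reentrant value [1]
    # Inner active sign: the controller (the subject of an intransitive verb).
    controller = {"SEM": {"ARG2": shared}}
    # Outer active sign: the complement, whose own active sign (its
    # understood subject) carries the same entity.
    complement = {"CAT": {"ACT": {"SEM": {"ARG2": shared}}}}
    return {"RES": {"RES": "sent", "ACT": controller}, "ACT": complement}


cat = intrans_control_cat()
controller_arg = cat["RES"]["ACT"]["SEM"]["ARG2"]
complement_subject_arg = cat["ACT"]["CAT"]["ACT"]["SEM"]["ARG2"]

# Instantiating the controller instantiates the complement subject too.
controller_arg.name = "jon"
print(complement_subject_arg.name)  # jon
```

In a sentence such as ``Jon wants/seems to leave'', this is how the entity contributed by the matrix subject also ends up as the argument of the embedded verb's subject.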
Verb signs are defined by linking active signs in the category structure to argument slots in predicate-argument structures. This is done by means of reentrancy links, as indicated by the tag [1] in the structure for strict intransitive verbs below.
[strict-intrans-sign CAT:ACT: [np-sign SEM: [1] <e1>p-agt(e1,x)] SEM: [strict-intrans-sem <e1>and(<e1>pred(e1), [1])]]
Since only templates for verbs which have a maximum of 3 arguments are given, only two additional general linking patterns are needed:
[two-arguments-verb-sign CAT: [RES: [RES: sent ACT: [sign SEM: [1]]] ACT: [sign SEM: [2]]] SEM: <e1> and(and(pred(e1),[1]),[2])]
[three-arguments-verb-sign CAT: [RES: [RES: [RES: sent ACT: [sign SEM: [0]]] ACT: [sign SEM: [1]]] ACT: [sign SEM: [2]]] SEM: <e1> and(and(and(pred(e1),[0]),[1]),[2])]
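Both linking patterns follow one regularity: the verb's semantics is a left-nested conjunction of the predicate with the semantics of each active sign, taken from the innermost sign outwards. A minimal Python sketch of that regularity is given below; the function `verb_sem` is a hypothetical helper working over strings, not the typed feature structures of the framework.

```python
from functools import reduce


def verb_sem(pred, active_sems):
    """Conjoin a predicate with the semantics of its active signs.

    `active_sems` lists the SEM values of the active signs from the
    innermost sign (the subject) to the outermost one.
    """
    body = reduce(lambda acc, sem: f"and({acc},{sem})",
                  active_sems,
                  f"{pred}(e1)")
    return f"<e1> {body}"


# Reproduce the two general linking patterns, using the tags as placeholders.
print(verb_sem("pred", ["[1]", "[2]"]))
print(verb_sem("pred", ["[0]", "[1]", "[2]"]))
```

With the tags replaced by the actual semantics of the active signs, the same fold yields the predicate-argument structure of any verb with two or three arguments.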
To conclude, here are some sample two-arguments-verb-sign and three-arguments-verb-sign structures:
STRICT-TRANS-SIGN [CAT: strict-trans-cat SEM: strict-trans-sem]
SUBJ-EQUI-INTRANS-VPINF-SIGN [CAT: intrans-vpinf-control-cat SEM: p-agt-subj-intrans-xcomp/comp-sem]
DITRANS-SIGN [CAT: ditrans-cat SEM: obl-trans/ditrans-sem ]
OBL-TRANS-SIGN [CAT: [RES: strict-intrans-cat ACT: [np-sign CAT: np[p-case]]] SEM: intrans-obl-sem]
DITRANS-SIGN and OBL-TRANS-SIGN above) is the outermost sign in the category structure, even though only in ditransitives does it precede the `theme' object (the difference in word order is handled syntactically, see Sanfilippo (1993b) and references therein).