In Italian, pronouns (pronominal particles) can accompany verbs in order to form the:
As far as the lexicon is concerned, the possibility of taking clitics of different types (reflexive, reciprocal, etc.) has to be encoded in the entry, but depends on the syntactic type of the verb; these problems should therefore be dealt with on the syntactic level.
In these forms, the unstressed variant of the pronoun is, in general, graphically separated from the verb (it precedes the verb), except with the infinitive, the gerund, the imperative and, rarely, the past participle, where it is attached to the end of the verb:
``egli si lava''
`lavarsi', `lavandosi', `lavati', `lavatosi'.
At the corpus level, a clitic pronoun can be attached to the verb in three situations, when it expresses:
Sequences of a verb with more than one pronoun in a single graphical form occur when both the direct object and the indirect object of a pronominal verb form are represented by clitic pronouns, e.g. `dandomelo' (`dando' + `me' + `lo') -- `giving it to me'. Compounds of this type can show epenthesis, i.e. the insertion of a letter for euphonic reasons, e.g. `dandoglielo' (`dando' + `gli' + `e'(epenthetic) + `lo') -- `giving it to him'.
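The decomposition just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the specification. The function name `split_gerund_clitics` and the clitic inventory `CLITICS` are assumptions introduced here, and the inventory is deliberately incomplete. The sketch handles gerunds only and does not check that the verbal part is a real verb form.

```python
import re

# Illustrative, incomplete inventory of enclitic forms (an assumption of this sketch).
CLITICS = ("gli", "me", "te", "ce", "ve", "se",
           "mi", "ti", "ci", "vi", "si",
           "lo", "la", "li", "le", "ne")


def split_gerund_clitics(token):
    """Split a gerund followed by enclitics into its graphical parts.

    Returns a list such as ['dando', 'me', 'lo'], or None if the token
    cannot be analysed as gerund + clitics under the inventory above.
    """
    # Shortest prefix ending in 'ndo' is taken as the gerund.
    match = re.fullmatch(r"(.+?ndo)(.*)", token)
    if match is None:
        return None
    verb, tail = match.groups()
    parts = [verb]
    while tail:
        for clitic in CLITICS:
            if tail.startswith(clitic):
                parts.append(clitic)
                tail = tail[len(clitic):]
                # Epenthetic 'e' after 'gli' when another clitic follows.
                if clitic == "gli" and tail.startswith("e") and len(tail) > 1:
                    parts.append("e")
                    tail = tail[1:]
                break
        else:
            # No known clitic matches the remaining characters.
            return None
    return parts


print(split_gerund_clitics("dandomelo"))    # ['dando', 'me', 'lo']
print(split_gerund_clitics("dandoglielo"))  # ['dando', 'gli', 'e', 'lo']
```

A realistic segmenter would validate the verbal part against a lexicon, since the pattern above also accepts non-verbal strings ending in `ndo'.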
As for dictionary encoding, `verb-pronoun' compounds have to be represented; the encoding of specific `verb-more-than-one-pronoun' compounds concerns corpus encoding only.
The strategy adopted in tagging these is to assign different tags to the different parts forming the word-token, with a special mark which maintains the graphical links, thus permitting the recovery of the original single graphical form.
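A minimal sketch of this strategy follows, under stated assumptions. The text does not say which link mark or tag labels are used, so the mark `+` and the tags `V`, `PRON` and `EPEN` are hypothetical placeholders, as are the function names `tag_compound` and `recover_tokens`.

```python
LINK = "+"  # hypothetical link mark; the actual mark is not given in the text


def tag_compound(parts, tags):
    """Pair each part of a word-token with its tag.

    Every part except the last carries the link mark, signalling that it
    is graphically joined to the part that follows.
    """
    if len(parts) != len(tags):
        raise ValueError("one tag is required per part")
    last = len(parts) - 1
    return [(form + (LINK if i < last else ""), tag)
            for i, (form, tag) in enumerate(zip(parts, tags))]


def recover_tokens(tagged):
    """Rebuild the original graphical word-tokens from tagged parts."""
    tokens = []
    current = ""
    for form, _tag in tagged:
        if form.endswith(LINK):
            # Linked part: keep accumulating the same word-token.
            current += form[:-len(LINK)]
        else:
            # Unlinked part closes the current word-token.
            tokens.append(current + form)
            current = ""
    if current:
        # Trailing linked part with nothing after it.
        tokens.append(current)
    return tokens


tagged = tag_compound(["dando", "gli", "e", "lo"], ["V", "PRON", "EPEN", "PRON"])
print(tagged)
# [('dando+', 'V'), ('gli+', 'PRON'), ('e+', 'EPEN'), ('lo', 'PRON')]
print(recover_tokens(tagged))
# ['dandoglielo']
```

Storing the mark on the part itself keeps the annotation line-oriented: each part has its own tag, and concatenating linked parts restores the form found in the corpus.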