The tagset mapping exercise was carried out with the following objectives:
Mapping rules deal with 1:1, 1:n, n:1 and n:m cases. A small-scale exception lexicon was created for idiosyncratic cases (e.g. "that" tagged as 'IN' (= preposition) in the UPenn tagset).
The following are a few sample mapping rules, for the BNC:
    cqp_name(upenn, 'BNC').
    [pos = 'AJ0'] => [adj & po].
    [pos = 'AJC'] => [adj & comp].
    [pos = 'AJS'] => [adj & sup].
    [pos = 'AT0'] => [art].
    [pos = 'AV0'] => [adv & (general '|' degree)].
    [pos = 'CJC'] => [conj & coord].
    [pos = 'CJS'] => [conj & subord].
    [pos = 'CRD'] => [numeral & card].
    [pos = 'DT0'] => [det & (indf '|' dem) '|' pron & dem].
    [pos = 'DTQ'] => [det & wh].
    [pos = 'EX0'] => [unique & existential].
    [pos = 'NN0'] => [noun & com].
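As a rough illustration of what these rules encode, the sample can be held as a table from BNC tag to a list of alternative feature sets. This is only a sketch in Python, not the tool's own representation: the names `BNC_RULES` and `tags_for` are hypothetical, and the expansion of the bracketed alternatives assumes that `&` binds more tightly than the quoted `'|'`.

```python
# Sample BNC rules from the text, rewritten as tag -> list of alternatives.
# Each alternative is a set of features that hold together ('&');
# several alternatives in one list stand for the quoted '|' in the rules.
# Assumption: '&' binds more tightly than '|'.
BNC_RULES = {
    "AJ0": [{"adj", "po"}],
    "AJC": [{"adj", "comp"}],
    "AJS": [{"adj", "sup"}],
    "AT0": [{"art"}],
    "AV0": [{"adv", "general"}, {"adv", "degree"}],
    "CJC": [{"conj", "coord"}],
    "CJS": [{"conj", "subord"}],
    "CRD": [{"numeral", "card"}],
    "DT0": [{"det", "indf"}, {"det", "dem"}, {"pron", "dem"}],
    "DTQ": [{"det", "wh"}],
    "EX0": [{"unique", "existential"}],
    "NN0": [{"noun", "com"}],
}


def tags_for(features):
    """Return the tags with at least one alternative containing all features."""
    wanted = set(features)
    return sorted(
        tag
        for tag, alternatives in BNC_RULES.items()
        if any(wanted <= alt for alt in alternatives)
    )


if __name__ == "__main__":
    print(tags_for({"conj"}))  # tags whose description includes 'conj'
    print(tags_for({"det"}))   # tags whose description includes 'det'
```

Keeping each tag's description as a list of alternatives makes a tag such as DT0, which is ambiguous between determiner and pronoun readings, visible as more than one entry.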
The following are a few sample entries from the exception lexicon:
    [no] << [pos = 'AT0'] >> [det & indf].
    [that] << [pos = 'CJT'] >> [conj & subord '|' pron & rel].
    [of] << [pos = 'PRF'] >> [conj & subord].
    ['bound'] << [pos = 'VVN'] >> [verb & s_aux].
    ['going'] << [pos = 'VVG'] >> [verb & s_aux].
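One plausible reading of these entries is that a word-specific description takes precedence over the general rule for the same tag. The Python sketch below illustrates that reading only. The names `GENERAL_RULES`, `EXCEPTIONS` and `describe` are hypothetical, and both the precedence order and the lower-casing of word forms are assumptions rather than documented behaviour of the tool.

```python
# General rule taken from the sample mapping rules (only AT0 is needed here).
GENERAL_RULES = {"AT0": [{"art"}]}

# (word form, tag) -> alternative feature sets, copied from the sample lexicon.
EXCEPTIONS = {
    ("no", "AT0"): [{"det", "indf"}],
    ("that", "CJT"): [{"conj", "subord"}, {"pron", "rel"}],
    ("of", "PRF"): [{"conj", "subord"}],
    ("bound", "VVN"): [{"verb", "s_aux"}],
    ("going", "VVG"): [{"verb", "s_aux"}],
}


def describe(word, tag):
    """Return the feature alternatives for a tagged word.

    Assumption: an exception entry for (word, tag) overrides the general
    rule for the tag; word forms are compared in lower case.
    """
    key = (word.lower(), tag)
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    return GENERAL_RULES.get(tag, [])


if __name__ == "__main__":
    print(describe("no", "AT0"))   # exception entry applies
    print(describe("the", "AT0"))  # falls back to the general rule
```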
When these rules are used with the LIQUY tagset mapping tool, the following output is generated; the resulting queries retrieve BNC evidence, with ELM-EN serving as the query language:
    | ?- test(bnc).

    [noun]
    --------------------------------------------------------------
    [(pos = "NN0" | pos = "NN1" | pos = "NN2" | pos = "NP0")]

    [com & noun]
    --------------------------------------------------------------
    [(pos = "NN0" | pos = "NN1" | pos = "NN2")]

    [noun & com & pl]
    --------------------------------------------------------------
    %%%%%%% Warning: Constraint [pl] ignored
    %%%%%%% due to 1:n case in [com&noun]
    %%%%%%%
    [(pos = "NN0" | pos = "NN2")]

    [verb & fin & aux & 2]
    --------------------------------------------------------------
    %%%%%%% Warning: Noise to be expected:
    %%%%%%% [aux&fin&pres&verb&1](Due to tag "[pos=VHB]")!
    %%%%%%% Warning: Noise to be expected:
    %%%%%%% [aux&fin&pres&verb&1](Due to tag "[pos=VDB]")!
    %%%%%%% Warning: Noise to be expected:
    %%%%%%% [aux&fin&pres&verb&1](Due to tag "[pos=VBB]")!
    %%%%%%% Warning: Constraint [2] ignored
    %%%%%%% due to 1:n case in [aux&fin&past&verb]
    %%%%%%% Warning: Constraint [2] ignored
    %%%%%%% due to 1:n case in [aux&fin&pres&verb&pl]
    %%%%%%% Warning: Constraint [2] ignored
    %%%%%%% due to 1:n case in [aux&fin&impr&pres&verb]
    %%%%%%%
    [(pos = "VBB" | pos = "VBD" | pos = "VDB" | pos = "VDD" | pos = "VHB" | pos = "VHD")]
The last example shows how noise and silence are treated in non-1:1 situations: the tool warns when a constraint has to be ignored and when a tag is expected to introduce noise.
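The noun queries in the output can be approximated with a simple matching scheme, sketched here in Python. This is not the LIQUY algorithm, whose internals are not described in the text. It is a heuristic under stated assumptions: a tag is selected when its description contains every requested feature, and a tag whose description is strictly less specific than the request is also selected, with a warning that the extra constraint was dropped. Only the NN0 description comes from the sample rules. The descriptions of NN1, NN2 and NP0, the name `build_query` and the wording of the warning are hypothetical.

```python
# Tag descriptions. NN0 is taken from the sample rules; the entries for
# NN1, NN2 and NP0 are hypothetical, chosen to be consistent with the output.
NOUN_TAGS = {
    "NN0": {"noun", "com"},
    "NN1": {"noun", "com", "sg"},
    "NN2": {"noun", "com", "pl"},
    "NP0": {"noun", "prop"},
}


def build_query(features, tags=NOUN_TAGS):
    """Return (query string, warnings) for a set of requested features."""
    wanted = set(features)
    hits, warnings = [], []
    for tag in sorted(tags):
        description = tags[tag]
        if wanted <= description:
            # The tag carries every requested feature.
            hits.append(tag)
        elif description < wanted:
            # The tag is underspecified: keep it, but report what was dropped.
            hits.append(tag)
            ignored = " & ".join(sorted(wanted - description))
            warnings.append(f"Constraint [{ignored}] ignored for tag {tag}")
    body = " | ".join(f'pos = "{tag}"' for tag in hits)
    return f"[({body})]", warnings


if __name__ == "__main__":
    for request in ({"noun"}, {"noun", "com"}, {"noun", "com", "pl"}):
        query, notes = build_query(request)
        print(sorted(request), query, notes)
```

Under these assumptions the request for plural common nouns keeps NN0, which is not marked for number, and so returns some singular nouns as noise, while NN1 is excluded because its description conflicts with the request.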