For the speech community, transcription of a spoken corpus is linked to the notion of labelling. According to Barry & Fourcin (1992:2):
The 'labeling' of a recorded utterance involves the temporal definition and naming of its parts with reference to the acoustic signal. These 'parts' may be temporarily discrete or over-lapping, and may be defined in acoustic, physiological, phonetic or higher level linguistic terms.
It is clear from this definition that, apart from the orthographic representation, other levels are necessary in this particular domain.
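As a rough illustration of the definition quoted above, a label can be thought of as a name attached to a time interval of the signal. The sketch below is only one possible representation and is not taken from Barry & Fourcin (1992); the class name `Label`, its fields, the use of seconds as the time unit, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A named stretch of the signal (hypothetical structure, for illustration)."""
    name: str     # name given to the part, e.g. a word or a segment symbol
    start: float  # start time in seconds, relative to the start of the signal
    end: float    # end time in seconds

    def overlaps(self, other: "Label") -> bool:
        # Two intervals overlap when each one starts before the other ends.
        return self.start < other.end and other.start < self.end


# Invented example: one word label and two segment labels beneath it.
word = Label("two", 0.10, 0.42)
seg_t = Label("t", 0.10, 0.19)
seg_u = Label("u:", 0.19, 0.42)

print(word.overlaps(seg_t))   # word and segment share a stretch of signal
print(seg_t.overlaps(seg_u))  # adjacent segments are discrete in time
```

Labels that merely touch at a boundary are treated here as discrete, while a label at a higher level (the word) overlaps the labels at a lower level (its segments).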
Various levels of labelling have been defined and used in different projects. Two of these will be reviewed here, since they are the basis of the recommendations presented in this document.
Barry & Fourcin (1992) offer a comprehensive system consisting of five levels:
Moreover, prosodic labels should be added. As Barry & Fourcin (1992:11) point out, prosodic labels can be defined as a separate tier at each level, since different categories of prosodic events can be found at different levels.
Similar levels are found in other proposals, such as the levels of segmentation and labelling discussed by Tillmann & Pompino-Marschall (1993) and used in the German PHONDAT project. Again, five levels of representation are defined: orthographic, canonical word forms, actual word realizations, sound segments and sub-segmental acoustic-phonetic events. Also relevant in this context are the conventions developed within the German VERBMOBIL project (Kohler et al., 1994; Hess et al., 1995; more information on the project is found at URL http://www.dfki.uni-sb.de/verbmobil/overview-us.html and at URL http://www.ims.uni-stuttgart.de/projekte/verbmobil/index-en.html), those proposed by Autesserre et al. (1989), and the work carried out under the SAM project (chapter 5 of Fourcin et al. (Eds.), 1989).
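The five PHONDAT levels of representation named above could be modelled as parallel tiers aligned to the same time axis. The following is a minimal sketch under stated assumptions, not the format used in PHONDAT itself: the enum `Level`, its numeric ordering, the helper `labels_at`, and all label values and times are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """The five levels listed in the text; the numbering is an assumption."""
    ORTHOGRAPHIC = 1
    CANONICAL_WORD_FORM = 2
    ACTUAL_WORD_REALIZATION = 3
    SOUND_SEGMENT = 4
    SUBSEGMENTAL_EVENT = 5


# One tier per level; each entry is (start_s, end_s, label). Values are invented.
tiers = {level: [] for level in Level}
tiers[Level.ORTHOGRAPHIC].append((0.10, 0.42, "zwei"))
tiers[Level.CANONICAL_WORD_FORM].append((0.10, 0.42, "tsvaI"))
tiers[Level.SOUND_SEGMENT].append((0.10, 0.21, "ts"))
tiers[Level.SOUND_SEGMENT].append((0.21, 0.27, "v"))


def labels_at(tiers, time_s):
    """Return, per level, the label whose interval contains time_s (if any)."""
    return {
        level.name: label
        for level, entries in tiers.items()
        for (start, end, label) in entries
        if start <= time_s < end
    }


print(labels_at(tiers, 0.23))
```

Querying a single point in time returns one label from each tier that covers it, which shows how the levels describe the same stretch of signal at different granularities.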
The EAGLES Handbook on Spoken Language Systems (EAGLES Spoken Language Working Group, 1995) provides a careful discussion of transcription and labelling levels in the chapter devoted to corpus representation. The following types of transcriptions are described:
Labelling levels are also proposed by the EAGLES Spoken Language Working Group (1995) and are defined as follows: