This document has explored ways to achieve compatibility between the NERC and TEI proposals, developed within the corpus linguistics community, and the practices usually followed by the speech community, as documented in the EAGLES Handbook on Spoken Language Systems and in the other sources reviewed.
The recommendations suggested here are based on surveys of current practice and tend to rely on elements common to different traditions. For this reason, they are very general and will have to be developed further to cover more specific needs.
For the encoding of spoken texts, the following set of elements to be encoded is suggested (see 2.5.3):
The need to develop conversion software between a user-friendly system of transcription and the TEI encoding scheme is also acknowledged.
A proposal for transcription and labelling has been put forward, consisting of three levels (see 2.5.3):
Of course, all these levels have to be linked to the speech signal itself, and the use of automatic alignment techniques to do so is encouraged.
As far as the orthographic representation is concerned, the following recommendations can be suggested (see 4.1):
Punctuation remains one aspect that needs more in-depth discussion.
The rationale behind these recommendations is to make it possible to create an automatic link between the orthographic transcription and the phonemic representation in level S2.
Concerning the choice of a segmental transcription system (see 5.1.3), the IPA (International Phonetic Alphabet) is to be recommended. Whenever a machine-readable equivalent is necessary, SAMPA (SAM Phonetic Alphabet) is recommended for phonemic transcriptions such as those proposed at level S2, and the X-SAMPA extension is to be considered for a phonetic transcription such as the one proposed at level S3.
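As an illustration of what a machine-readable equivalent of the IPA involves, the following minimal sketch converts a SAMPA string into IPA characters. The symbol table is a small illustrative subset and not the full SAMPA inventory, and the function name `sampa_to_ipa` is hypothetical. Symbols that are not in the table are passed through unchanged, which is appropriate only for those letters that SAMPA and the IPA share.

```python
# Illustrative subset of SAMPA-to-IPA correspondences (not the full inventory).
SAMPA_TO_IPA = {
    "tS": "tʃ",   # voiceless postalveolar affricate
    "dZ": "dʒ",   # voiced postalveolar affricate
    "S": "ʃ",
    "Z": "ʒ",
    "T": "θ",
    "D": "ð",
    "N": "ŋ",
    "@": "ə",
    "{": "æ",
    "I": "ɪ",
}


def sampa_to_ipa(text: str) -> str:
    """Convert a SAMPA string to IPA using greedy longest-match lookup."""
    # Try longer symbols first so that "tS" is not read as "t" followed by "S".
    symbols = sorted(SAMPA_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for sym in symbols:
            if text.startswith(sym, i):
                out.append(SAMPA_TO_IPA[sym])
                i += len(sym)
                break
        else:
            # Not in the table: assume the character is shared with the IPA.
            out.append(text[i])
            i += 1
    return "".join(out)
```

A complete converter would need the full symbol table for the language concerned, and X-SAMPA would require additional handling of diacritics.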
The prosodic elements to be encoded are discussed in 5.2.2, where it is suggested that at least the two TEI elements Utterance and Pause be represented.
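The two elements can be produced automatically from a simpler working notation. The sketch below assumes a hypothetical transcription convention in which a timed pause is written as a number of seconds in parentheses, for example `(0.5)`, and turns one speaker turn into a `u` element containing `pause` elements. The attribute names `who` and `dur`, and the format of their values, are assumptions that should be checked against the version of the TEI Guidelines in use. The function name `turn_to_tei` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical working notation: a timed pause is written as "(seconds)".
PAUSE_PATTERN = re.compile(r"\((\d+(?:\.\d+)?)\)")


def turn_to_tei(speaker: str, text: str) -> str:
    """Encode one speaker turn as a <u> element with embedded <pause> elements."""
    # With one capturing group, split() returns text and durations alternately:
    # [text, duration, text, duration, text, ...]
    parts = PAUSE_PATTERN.split(text)
    u = ET.Element("u", {"who": speaker})
    u.text = parts[0]
    for duration, following in zip(parts[1::2], parts[2::2]):
        pause = ET.SubElement(u, "pause", {"dur": duration})
        # Text after a pause is stored as the tail of the pause element.
        pause.tail = following
    return ET.tostring(u, encoding="unicode")


if __name__ == "__main__":
    print(turn_to_tei("A", "well (0.5) I think so"))
```

Conversion in the opposite direction, from the TEI encoding back to the working notation, would follow the same correspondence between the two representations.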
The choice of a prosodic transcription system is also discussed in 5.2.2. ToBI (Tone and Break Indices) and SAMPROSA (SAM Prosodic Alphabet) - complemented by the X-SAMPA extension - are considered standard machine-readable systems, and the need to develop mappings between different systems is acknowledged. In general, the use of a multi-tiered, machine-readable and multilingual prosodic transcription system is recommended.
Some recommendations for data acquisition are also provided in 3, and can be summarised as follows:
These recommendations can only be provisional: although most of them are based on current practice in different scientific communities, they have to be validated and refined by applying them to different types of spoken materials. They are nonetheless intended as a first step towards a common set of working conventions which could improve the reusability of speech and spoken language resources.