As discussed in the introduction to this chapter, data acquisition
procedures differ fundamentally between speech research and corpus linguistics,
because the two communities have different aims. However, Sinclair
(1993:67) points out that:
For any level of transcription, a high quality
recording improves the efficiency of the transcription process: for anything
beyond Level Two the quality must be well above domestic.
In some cases, it would be practical for corpus linguistics work to follow some of the data
acquisition techniques traditionally used by speech scientists. This may
sometimes be impractical: field recordings may allow neither the use
of the standard SAM workstation with its associated software EUROPEC (SAM,
1992; Fourcin et al. (Eds.), 1989) nor the environmental conditions required.
Even so, some benefit can be
obtained from the experience acquired in speech research.
The chapter on corpus collection in the EAGLES Handbook on Spoken Language Systems
(EAGLES Spoken Language Working Group, 1995) contains recommendations concerning procedures for
the acquisition of spoken data. Its discussion of microphone types and of recording techniques and devices leads to the following
recommendations, which are also relevant to the collection of spoken corpora for
corpus linguistics and are therefore summarised here. More details can be found in that chapter.
- If acceptable in the recording environment, and for optimal acoustical quality,
use headset microphones
- The use of headset microphones is recommended in order to avoid problems found with other
types of microphones. Close-up microphones attached to the speaker's clothes can pick up noises such as
the rustling of clothing; table-top microphones, on the other hand, are sensitive to echoes in the
room, to occasional tapping on the table, to the movement of papers, and to overlaps in the recordings when more than
one speaker is present and the microphones are not properly spaced; finally, room microphones suffer from interference from surrounding noises.
However, some speakers might be uncomfortable with a headset, and
other alternatives can be considered if care is taken not to introduce extraneous noises into the
recording (see also Sinclair, 1996:29).
- The microphone should be placed slightly to the left or the right of the mouth and a little below
the lower lip to avoid breathing noises. Cables should not touch the microphone arm, and the speaker should be comfortable with the headset.
- Use digital recording devices
- This recommendation is based on the fact that analogue speech recordings
degrade in quality with repeated copying, offer a poor signal-to-noise
ratio, and are not easy to access when they need to be studied; the recording equipment is also subject to mechanical problems. DAT (Digital Audio Tape) is
therefore recommended as the recording medium. In a laboratory environment, the use of a computer to
make direct recordings on a hard disk is also strongly recommended, although this might not
be feasible in all corpus collection situations; when it is not, the use of DAT
is to be favoured (see also Sinclair, 1996:29).
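As an illustration of the signal-to-noise criterion mentioned above, the following minimal sketch estimates the ratio in decibels from two lists of samples. It assumes that a speech segment and a background-noise segment (for example, a pause in the same recording) have already been extracted; the function names `rms` and `snr_db` are hypothetical and do not come from the EAGLES Handbook.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB, computed from RMS amplitudes.

    Assumption: `speech` and `noise` are sample sequences taken from the
    same recording at the same gain, and `noise` is not all zeros.
    """
    return 20.0 * math.log10(rms(speech) / rms(noise))
```

A higher value indicates a cleaner recording; comparing the figure for an original and for a copy gives a rough measure of how much quality has been lost in copying.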
It is worth remembering that the documentation of the corpus should contain information
about the recording session (date and time, recording environment),
the microphone (make, type, position), and the recording equipment used.
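One possible way of keeping this documentation together with each recording is a small structured record. The sketch below is only an illustration: the class and field names (`RecordingSession`, `Microphone`, `environment`, and so on) are assumptions made for this example and are not prescribed by the EAGLES Handbook.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Microphone:
    make: str
    type: str      # e.g. "headset", "table-top", "room"
    position: str  # free-text description of the placement


@dataclass
class RecordingSession:
    date: str         # e.g. "1996-05-14"
    time: str         # local start time, e.g. "10:30"
    environment: str  # e.g. "quiet office"
    microphone: Microphone
    equipment: str    # e.g. "DAT recorder"


def to_json(session):
    """Serialise a session record, including the nested microphone, as JSON."""
    return json.dumps(asdict(session), indent=2, sort_keys=True)
```

A record created as `RecordingSession("1996-05-14", "10:30", "quiet office", Microphone("ExampleMake", "headset", "below lower lip, left of mouth"), "DAT recorder")` can then be written out with `to_json` and stored next to the audio file; all values shown here are invented placeholders.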
Legal issues in data acquisition are not discussed here; the reader is referred to the
chapter on corpus collection of the EAGLES Handbook on Spoken Language Systems (EAGLES Spoken
Language Working Group, 1995) for further details. A more extensive presentation of this topic
can be found in a booklet edited by the American Dialect Society (1992).