Similarly, any text composed to be presented in written form can be read out, but its expression need only change in ways required by the change of medium. It is, therefore, primarily a written text.
Our preference is for the latter interpretation of `spoken corpus'. There are doubtful areas whichever meaning is chosen -- how impromptu is impromptu, how informal is informal, etc. How does one know whether a composer intends a text to be written or spoken, or both? But to reserve the term for only one small class of spoken language texts seems to distort the meaning of the words involved.
The problem is that informal, impromptu speech is regarded by many scholars as the most important variety of all, closest to the core of language, revealing the characteristic patterns of a language in a way that no other variety does. It is also the most difficult and expensive to acquire, and difficult to classify and manage. The crudities of transcription make the spoken corpora held in most centres unsatisfactory, and there is as yet no consensus about the conventions of transcription. The nearest we have are the recommendations of NERC (NERC 1994).