The SPICE-Ireland Corpus is based on the spoken material of the International Corpus of English: Ireland Component (ICE-Ireland). That spoken component is a collection of text samples from 15 discourse types. These comprise private and public dialogues, ranging from casual Face-to-Face and Telephone conversation to Parliamentary Debates and Legal Cross-Examinations, and scripted and unscripted monologues, ranging from Spontaneous commentaries and demonstrations to broadcast news and scripted talks. This range thus covers formal and informal discourse situations and involves a total of 945 speakers. Each text used in the SPICE-Ireland corpus is approximately 2,000 words long; the total size of the corpus is 626,597 words. Each text category contains equal numbers of texts from Northern Ireland and the Republic of Ireland, so that the corpus provides a unique cross-border data resource, of a kind and on a scale never before made available for linguistic research on present-day Irish English and, now with its rich pragmatic and prosodic enhancement, for any variety of English.
SPICE stands for 'Systems of Pragmatic Annotation in the Spoken Component of ICE-Ireland'. The SPICE-Ireland Corpus builds on the transcriptions provided by the ICE-Ireland Corpus by adding annotations for pragmatic and prosodic features.
An example transcription from a face-to-face conversation – showing SPICE annotations in bold – is as follows:
<P1A-002$A> <#> That's like Malcolm Dunne <#>
Do you remember Malcolm Dunne said do you remember when we were working in X <,> he said+ one day <,> 1whY is it that 2gIrls you-know%* like* 1jUst get 2Orange% and put it round 2hEre% and then 2dOn't put any makeup on the rest of their 2fAce% <&> laughter
<#>
2MAlcolm%
<P1A-002$A> <#> Uh 2hUh% <#>
Cos he thought you-know% <,> you 1knOw the way people don't rub their makeup 2In%
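For users who want to search the plain-text files programmatically, the following is a minimal sketch in Python. It assumes only the surface patterns visible in the excerpt above: angle-bracket markup such as <,>, tokens that begin with a digit, and tokens that end in %, * or +. The function name summarise and the dictionary keys are hypothetical, and the sketch makes no claim about what each symbol means; the User's Guide remains the authority on the annotation system.

```python
import re

# One line copied from the transcription excerpt above.
LINE = ("Cos he thought you-know% <,> you 1knOw the way people "
        "don't rub their makeup 2In%")

# Surface patterns only (an assumption based on the excerpt, not on the Guide).
TAG = re.compile(r"<[^>]*>")           # angle-bracket markup, e.g. <#> or <,>
DIGIT_PREFIXED = re.compile(r"\d\S+")  # tokens such as 1knOw or 2In%


def summarise(line):
    """Group the tokens of one transcription line by their surface marking."""
    tags = TAG.findall(line)
    # Remove the markup, then split the remaining text on whitespace.
    tokens = TAG.sub(" ", line).split()
    return {
        "tags": tags,
        "digit_prefixed": [t for t in tokens if DIGIT_PREFIXED.match(t)],
        "percent_final": [t for t in tokens if t.endswith("%")],
        "star_final": [t for t in tokens if t.endswith("*")],
        "plus_final": [t for t in tokens if t.endswith("+")],
    }


if __name__ == "__main__":
    for key, values in summarise(LINE).items():
        print(key, values)
```

Applied to each line of a .txt file in turn, a function of this kind could give rough counts per text or per zone, which can then be checked against the figures reported in the Guide.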
The SPICE-Ireland CD-ROM offers each text in its own consecutively numbered file in both .doc and .txt formats. It also combines those texts into single composite zone files: one for all the texts from the North, one for all the texts from the South, and one for the corpus as a whole. The corpus likewise includes single composite text-category files, either for the whole of Ireland or in separate files for the North and South. Users may thus pick and choose those types and quantities of text which best suit their research enquiries. Copies of the SPICE-Ireland Corpus may be obtained upon request to the Directors.
SPICE-Ireland: A User's Guide provides comprehensive information on the corpus data, including the decisions made for ICE-Ireland about transcribing and copy-editing the spoken texts, and, wherever available, biodata for each of the 954 speakers, to facilitate sociolinguistic searches and numerous other enquiries. The Guide also provides the rationale for the SPICE-Ireland annotation system and a description of all its components, with copious exemplification and discussion, so as to leave the reader in little doubt about the thousands of analytic decisions which have been made. Finally, the Guide presents what we regard as definite primary results about the numbers of occurrences of each type of pragmatic annotation and all of their many realisations.
Figures are given for each speech act type and each tone movement in terms of their geo-political and text type distribution; for syntactic, lexical and phonological discourse markers; for major and minor types of quotatives; for variant and invariant sentence tags, and vocatives. All these Tables of Results will be invaluable to future researchers. For users of the corpus, the Guide is essential. ISBN 978 0 85389 938 9. Download the SPICE-Ireland: A User's Guide in PDF format.
Dr John M. Kirk (University of Vienna; formerly Queen's University Belfast) info@johnmkirk.co.uk
Dr Jeffrey L. Kallen (Trinity College Dublin) jkallen@tcd.ie