John M Kirk


The International Corpus of English

The International Corpus of English comprises a collection of text samples from the speech and writing of users of English from some 18 countries around the world where English is used as a mother tongue or as an official language. As a data resource, the corpus enables researchers to study the relationship between national standard varieties of English and the development of English as a world language. Each national ICE corpus comprises a database of 300 transcriptions of spoken language and 200 written text excerpts. Each text used in the ICE corpus (spoken or written) is approximately 2000 words long; the total size of each national corpus is approximately one million words. At the instigation of the late Sidney Greenbaum at the Survey of English Usage at University College London, ICE protocols have come to provide an internationally valid methodology for defining speakers of standard English, for collecting materials in a wide range of spoken and written genres, and for analysing corpora which result from data collection and transcription. For the ICE project website, see


The Ireland Component of the International Corpus of English (ICE-Ireland) makes available an unprecedented resource. It provides 300 transcribed spoken texts in 15 different discourse situations, ranging from casual Face-to-Face and Telephone conversation to Parliamentary Debates and Legal Cross-Examinations. It also makes available 200 written texts from 17 published and unpublished domains. This range thus covers formal and informal discourse situations and involves a total of 945 speakers.

Each text category contains equal numbers of texts from Northern Ireland and the Republic of Ireland, so that the corpus provides a unique cross-border data resource, of a kind and on a scale never before made available for linguistic research on present-day Irish English.

The ICE-Ireland CD-ROM

The ICE-Ireland CD-ROM offers each text in its own consecutively numbered file in both .doc and .txt formats. It also combines those texts into single composite zone files: one for all the texts from the North, one for the South, and one for the corpus as a whole. The corpus further includes single composite text-category files, either for the whole of Ireland or in separate files for the North and South. Users may thus pick and choose those types and quantities of text which best suit their research enquiries. Copies of the ICE-Ireland CD-ROM may be obtained on request from the directors.

ICE-Ireland: A User's Guide

ICE-Ireland: A User's Guide provides comprehensive information on the corpus data, including decisions made about transcribing and copy-editing the spoken texts, and, wherever available, speaker biodata, to facilitate sociolinguistic searches and numerous other enquiries. Bibliographical details for published works and statistical information about writers contributing to the written text categories of the corpus are also provided. ISBN 978 0 85389 923 5. ICE-Ireland: A User's Guide is available for download in PDF format.

ICE-Ireland Directors

Dr John M. Kirk (formerly Queen's University Belfast)
Dr Jeffrey L. Kallen (Trinity College Dublin)