
1. Introduction

© 2015 Greg Brooks, CC BY http://dx.doi.org/10.11647/OBP.0053.01

1.1 Context

English spelling is notoriously complicated and difficult to learn, and is correctly described as much less regular and predictable than any other alphabetic orthography. The 40+ distinctive speech sounds (phonemes) of the spoken language are represented by a multiplicity of letters and letter-combinations (graphemes) in the written language; correspondingly, many graphemes have more than one pronunciation. This is because English has absorbed words from many other languages (especially French, Latin and classical Greek) into its Germanic base, and mainly taken over spellings or transliterations of those words without adapting them to the original system. Two recent books (Crystal, 2012; Upward and Davidson, 2011) tell this story with wit and insight.

However, there is more regularity in the English spelling system than is generally appreciated. This book, based on a very detailed analysis of the relationships between the phonemes and graphemes of British English, provides a thorough account of the whole complex system. It does so by describing how phonemes relate to graphemes, and vice versa. It is intended to be an authoritative reference guide for all those with a professional interest in English spelling, including and especially those who devise materials for teaching it, whatever their students’ age and whether their own or their students’ mother tongue is English or not. It may be particularly useful to those wishing to produce well-designed materials for teaching initial literacy via phonics (for guidance on the phonetics which should underpin accurate phonics teaching see Burton, 2011), or for teaching English as a foreign or second language, and to teacher trainers.

The book is intended mainly as a work of reference rather than theory. However, all works of reference are based on some theory or other, whether or not explicitly stated for readers, and even if not consciously known to the writer. For the assumptions I have made and for discussion of technical issues, see Appendix A.

1.2 Aims

My aims are to set out:

1) the distinctive speech sounds (phonemes) of spoken English

2) the letters and letter-combinations (graphemes, spelling choices) of written English

3) how the phonemes of spoken English relate to the graphemes of written English

4) the mirror-image of that, that is, how the graphemes of written English relate to the phonemes of spoken English

5) some guidance on the patterning of those relationships.

The core of the book is the chapters in which I set out the relationships (correspondences) between phonemes and graphemes:

  • Consonants: phoneme-grapheme correspondences in chapter 3; grapheme-phoneme correspondences in chapter 9
  • Vowels: phoneme-grapheme correspondences in chapter 5; grapheme-phoneme correspondences in chapter 10

Although chapters 9 and 10 are concerned with how the graphemes of English are pronounced, those seeking guidance on how to pronounce (including where to stress) whole English words, given only their written form, should instead consult a pronouncing dictionary in which the International Phonetic Alphabet is used, e.g. the Cambridge English Pronouncing Dictionary, 18th Edition (Cambridge: Cambridge University Press, 2011). The phonetic transcription system used in this book (see chapter 2) is identical to the system used in that dictionary. A useful guide for those who are uncertain whether, for example, an English word beginning with a ‘yuh’-sound begins with the letter <y> or the letter <u> is the ACE (Aurally Coded English) Spelling Dictionary by David Moseley (1998).

I make only a few suggestions in this book about how to teach English spelling – my aim is mainly to set out my analysis of the system. However, some findings may have pedagogical applications – see especially chapter 11, section A.7 in Appendix A, and Appendix B. I also make no attempt to justify English spelling or summarise its history (see again Crystal, 2012; Upward and Davidson, 2011), and make only a very few remarks on changes that might be helpful. For a few other things which this book does not attempt to do see p.xii.

My analysis is confined to the main vocabulary of English – I almost entirely omit the extra complexities of spellings which occur only in personal, place- and brand-names (though I mention a few where they parallel rare spellings in ordinary words; see also section A.5 in Appendix A), archaic or obsolete words, words which occur in non-standard dialects of English but not in Standard English, culinary terms with spelling patterns which occur in no other word, words known only to Scrabble addicts, and new spellings in text messaging. And there are intricacies which I have glossed over or passed over in silence – if you want to go further consult one or more of the books listed in the references.

1.3 Some terminology

Some familiarity with linguistic and grammatical terminology is assumed, e.g. ‘indefinite article’, ‘noun’, ‘adjective’, ‘verb’, ‘adverb’, ‘content word’, ‘function word’, ‘singular’, ‘plural’, ‘third person’, ‘present’, ‘past’, ‘tense’, ‘participle’, ‘possessive’, ‘bound forms’, ‘affix’, ‘prefix’, ‘suffix’, ‘syllable’, ‘penultimate’, ‘antepenultimate’. Some terms, however, are used in different senses by different writers and/or are less familiar – most of those I find indispensable are explained in the remaining sections of this chapter (and various others in sections 2.3, 2.5, 3.6, 5.5.3, 6.4-6, 6.10, 7.1, 7.2).

Throughout the book,

  • I refer to the distinctive speech sounds of spoken English as ‘phonemes’ and show them between forward slashes; for example, /b/ is the first phoneme in the word bad;
  • I refer to the spelling choices of written English as ‘graphemes’ and show them between angled brackets; for example, <p> is the first grapheme in the word pad; and
  • I refer to the relationships between phonemes and graphemes (in both directions) as ‘correspondences’.

An asterisk before a word indicates that the word is misspelt, e.g. *accomodation, *hastle, *occured.

1.4 Phonemes

Phonemes are distinctive speech sounds, that is, they make a difference to the meanings of words. For example, the difference between /b/ and /p/ makes the difference in meaning between bad and pad. (There is of course much more to this – for some discussion, see Appendix A, section A.2).

In English, phonemes fall into two main categories, consonants and vowels. These terms may well be familiar to you as categories of letters, but the very familiarity of these labels for letters may cause confusion when thinking about phonemes. For one thing, there are many more phonemes in spoken English (44 or thereabouts) than there are letters in the English version of the Roman alphabet (26). For another, some graphemes are used to represent both consonant and vowel phonemes – the most familiar example being the letter <y>.

To phoneticians, the difference between consonant and vowel phonemes is that consonants require some obstruction of the airflow between lungs and lips, whereas vowels do not. For technical details on this see Peter Roach (2009) English Phonetics and Phonology, Fourth edition, Cambridge: Cambridge University Press. However, for practical purposes a test for distinguishing between consonant and vowel phonemes which works for English is that the indefinite article, when immediately followed by a word which begins with a consonant phoneme, takes its a form, but when immediately followed by a word which begins with a vowel phoneme takes its an form. So hand, union and one-off begin with consonant phonemes, but hour, umbrella and on-off begin with vowel phonemes.
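The a/an test just described can be written as a small lookup. The Python sketch below is illustrative only: the mini-lexicon `FIRST_PHONEME`, the set `CONSONANT_PHONEMES` and the function names are assumptions introduced here, and the transcriptions cover only the first phoneme of each of the six example words.

```python
# Hypothetical mini-lexicon: the first phoneme of each example word.
FIRST_PHONEME = {
    "hand": "h",
    "union": "j",      # begins with the 'yuh'-sound, despite the letter <u>
    "one-off": "w",    # begins with /w/, despite the letter <o>
    "hour": "aʊ",      # the letter <h> is not pronounced
    "umbrella": "ʌ",
    "on-off": "ɒ",
}

# Only the consonant phonemes needed for these examples, not the full inventory.
CONSONANT_PHONEMES = {"h", "j", "w"}


def begins_with_consonant_phoneme(word: str) -> bool:
    """True if the word's first phoneme (not its first letter) is a consonant."""
    return FIRST_PHONEME[word] in CONSONANT_PHONEMES


def indefinite_article(word: str) -> str:
    """Choose 'a' before a consonant phoneme and 'an' before a vowel phoneme."""
    return "a" if begins_with_consonant_phoneme(word) else "an"


for example in FIRST_PHONEME:
    print(indefinite_article(example), example)
```

The lookup is keyed on sound rather than spelling, which is why union and umbrella come out differently even though both begin with the letter <u>.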

Vowel phonemes can consist of one or two sounds. Those which consist of one sound are pure vowels, and those which consist of two sounds are diphthongs. When you pronounce a pure vowel, your jaw, lips, etc., remain relatively stationary; when you pronounce a diphthong, they move. Try saying the words awe (which consists in speech of one pure vowel) and then owe (which consists in speech of a diphthong), and feel the difference.

(For long and short vowels see sections 1.5 and 2.4).

In contrast, most consonant phonemes consist of only one sound, though they can of course occur in clusters, for example at the beginning and end of strengths. The only consonant phonemes in English which consist of two sounds are those at the beginning of chew and jaw – see the complex symbols for these phonemes in Table 2.1 in chapter 2.

(For consonant clusters and blends see section 1.7).

1.5 Long and short vowels

To many teachers, a short vowel is a sound related within the teaching approach known as phonics to one of the letters <a, e, i, o, u>, and a long vowel is a different sound related within phonics to one of the same five letters. In this book the terms ‘short vowel’ and ‘long vowel’ are not used in this way, but in the senses they have in phonetics. To phoneticians, a short vowel is a phoneme that takes only a few milliseconds to pronounce, and a long vowel is a phoneme that takes rather longer to pronounce. Both are pure vowels in the sense defined in section 1.4, and both categories are listed and exemplified in section 2.4, where it is shown that the English accent on which this book is based has seven short pure vowel phonemes and five long pure vowel phonemes.

Five of the short pure vowels are indeed the sounds associated with the letters <a, e, i, o, u> in phonics teaching, but there are two more short vowels in the phonetic sense: the sound represented by letter <u> in put, and the sound represented by letter <a> in about. And of the five so-called long vowels associated with the letters <a, e, i, o, u> in phonics teaching, only the name of letter <e> is a long pure vowel in the phonetic sense; three are diphthongs (the names of <a, i, o>), and the name of <u> is a sequence of two phonemes, the sound of letter <y> when it begins a word followed by the sound of the exclamation ‘Oo!’.

However, the sounds which are the names of the letters <a, e, i, o, u>, plus the phoneme whose sound is ‘oo’ (phonetic symbol /uː/), do have some useful spelling properties as a set. I make use of this fact in chapters 5 and 6, where you will find them grouped together as the ‘letter-name vowels plus /uː/’. See also section 1.10 below.
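The classification of the five letter names given above can be tabulated in a few lines of Python. This is a sketch, not part of the analysis itself: the dictionary and function names are invented for illustration, and the IPA symbols are the usual ones for this accent, supplied here as an assumption rather than quoted from chapter 2.

```python
# Phonemic make-up of the names of the five vowel letters (assumed IPA symbols).
LETTER_NAMES = {
    "a": ["eɪ"],
    "e": ["iː"],
    "i": ["aɪ"],
    "o": ["əʊ"],
    "u": ["j", "uː"],   # the sound of word-initial <y>, then 'Oo!'
}

# Only the phonemes needed for this illustration, not the full inventories.
DIPHTHONGS = {"eɪ", "aɪ", "əʊ"}
LONG_PURE_VOWELS = {"iː", "uː"}


def classify_letter_name(letter: str) -> str:
    """Say what kind of sound the name of a vowel letter is, in the phonetic sense."""
    phonemes = LETTER_NAMES[letter]
    if len(phonemes) == 2:
        return "sequence of two phonemes"
    if phonemes[0] in DIPHTHONGS:
        return "diphthong"
    if phonemes[0] in LONG_PURE_VOWELS:
        return "long pure vowel"
    raise ValueError(f"unclassified phoneme: {phonemes[0]}")


for letter in "aeiou":
    print(letter, classify_letter_name(letter))
```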

1.6 Graphemes

I define graphemes as single letters or letter-combinations that represent phonemes. (Again, there is more to it than this – for some discussion, see Appendix A, section A.4).

Graphemes come in various sizes, from one to four letters. I call graphemes consisting of one, two and three letters ‘single-letter graphemes’, ‘digraphs’ and ‘trigraphs’ respectively. Where it is necessary to mention four-letter graphemes (of which there are 19, in my analysis – see Tables 8.1-2), for example <ough> representing a single phoneme as in through, I call them ‘four-letter graphemes’ (and not ‘tetragraphs’ or ‘quadgraphs’). Graphemes of all four sizes are used in English to spell both consonant and vowel phonemes.

1.7 Consonant clusters and ‘blends’

As already illustrated with the word strengths, consonant phonemes (and letters) can occur in groups. Many teachers use the term ‘blend’ for such groups, but I have observed that it is often used to cover not only groups of consonant phonemes or letters, but also digraphs and trigraphs representing single consonant phonemes – which can and does create two sources of confusion. First, using ‘blend’ in this way means that letters and sounds are being muddled up; it is a central tenet of my approach that graphemes and phonemes must be carefully distinguished.

Secondly, it encourages some teachers to think that ‘blends’ need to be taught as units, rather than as sequences of letters and phonemes. For example, it makes more sense to teach <bl> at the beginning of the word blend itself as two units, <b> pronounced /b/ and <l> pronounced /l/ (segmentation, in the terminology of synthetic phonics), and then merge them into /bl/ (blending (!), again in the terminology of synthetic phonics, where this term is entirely appropriate). For both analytical and teaching purposes the two categories of clusters and multi-letter graphemes are best kept apart. I therefore stick with the term ‘clusters’ for groups of consonant phonemes or letters, and avoid the term ‘blend’ completely.

1.8 Split digraphs and ‘magic <e>’

In English spelling there are six digraphs which are not written continuously but are interrupted by a consonant letter (or occasionally two consonant letters or a consonant letter plus <u>). These digraphs have one of the letters <a, e, i, o, u, y> as the first letter and <e> as the second letter, and in most cases that <e> marks the first vowel letter as having what teachers call its ‘long’ sound (if we accept, which never seems to be pointed out, that the ‘long’ sound of <y> when used as a vowel letter is the same as that of <i>). For example, in bite the ‘eye’ sound is represented by the letters <i, e> even though they are separated by the <t>. I call digraphs which consist of two separated letters ‘split digraphs’. To symbolise split digraphs I write the two relevant letters with a dot between them; for example, the split digraph representing the ‘eye’ sound in bite is written as <i.e>. In my analysis, the full set of split digraphs is <a.e, e.e, i.e, o.e, u.e, y.e>. I have not found it necessary to posit more complicated graphemes such as <ae.e> (‘split trigraphs’?) – see section A.6 in Appendix A.
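The written shape of a split digraph, as described above, can be approximated with a pattern match on the end of a word. The following Python sketch is purely orthographic and deliberately rough: the names are invented here, the test words other than bite are illustrations chosen for this sketch, and the pattern over-matches, since it cannot know whether the analysis in this book actually treats a given word as containing a split digraph or how the word is pronounced.

```python
import re
from typing import Optional

# Consonant letters for this sketch; <y> is treated as a vowel letter here.
CONSONANT_LETTERS = "bcdfghjklmnpqrstvwxz"

# A vowel letter, then one or two consonant letters (or a consonant letter
# plus <u>), then <e> at the very end of the written word.
SPLIT_DIGRAPH_SHAPE = re.compile(
    rf"([aeiouy])(?:[{CONSONANT_LETTERS}]{{1,2}}|[{CONSONANT_LETTERS}]u)e$"
)


def split_digraph_candidate(word: str) -> Optional[str]:
    """Return e.g. 'i.e' if the word ends in the written shape of a split digraph."""
    match = SPLIT_DIGRAPH_SHAPE.search(word.lower())
    return f"{match.group(1)}.e" if match else None


for word in ["bite", "type", "taste", "vague", "bit"]:
    print(word, split_digraph_candidate(word))
```

A result from this function is only a candidate: deciding whether the <e> really forms a digraph with the earlier vowel letter needs the pronunciation as well as the spelling.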

Split digraphs occur only towards the end of written stem words. They have no place in conventional alphabetical order, so when I need to include them in alphabetical lists, I place them immediately after the digraph consisting of the same two letters but not split, for example <a.e> comes after <ae> (or sometimes where the unsplit digraph would be, if it happens not to be needed in a particular list).

(I also posit four graphemes containing apostrophes: <e’er, e’re, ey’re, ou’re> – these I place in lists as if the apostrophe were a 27th letter of the alphabet. See section A.9 in Appendix A).
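The two ordering conventions just described (a split digraph immediately after its unsplit counterpart, and the apostrophe as a 27th letter) can be sketched as a sort key. This Python fragment is one possible way to reproduce the ordering, with names invented for the purpose; it is not a description of how the lists in this book were actually compiled.

```python
# The 26 letters followed by the apostrophe, treated as a 27th letter.
ALPHABET = "abcdefghijklmnopqrstuvwxyz'"


def grapheme_sort_key(grapheme: str):
    """Sort key: letters in alphabetical order, split digraph after the unsplit one."""
    normalised = grapheme.replace("’", "'")   # accept typographic apostrophes
    is_split = "." in normalised
    letters = normalised.replace(".", "")
    # False sorts before True, so <ae> precedes <a.e> when the letters are equal.
    return (tuple(ALPHABET.index(ch) for ch in letters), is_split)


graphemes = ["ai", "a.e", "e’er", "ae", "ey’re", "a", "ee"]
print(sorted(graphemes, key=grapheme_sort_key))
```

If the unsplit digraph is absent from a list, the same key places the split digraph where the unsplit one would have been.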

Many teachers refer to the split digraph use of <e> as ‘magic <e>’. While this seems perfectly valid pedagogically (and I use the expression occasionally in this book), I mostly use the term ‘split digraph’ because not all occurrences of the split digraphs contain ‘magic <e>’ in the sense that the other vowel letter has its usual ‘long’ pronunciation. (See the entries for <a.e, e.e, i.e, o.e, u.e, y.e> in chapter 10, sections 10.4/17/24/28/38/40).

For a more technical discussion of split digraphs see section A.6 in Appendix A, and for a pedagogical discussion of ‘magic <e>’ rules see section 11.4.

1.9 Stem words and derived forms

Stem words are those which are indivisible into parts which still have independent meaning; derived forms are all other words, i.e. those which contain either a stem word and one or more prefixes or suffixes, and/or two (or more) stem words combined into a compound word. This book is mainly concerned with stem words, but some sections apply specifically to derived forms (e.g. section 4.2 on the rule for doubling stem-final consonant letters before suffixes beginning with a vowel letter). I try throughout to indicate where rules or correspondences differ between stem words and derived forms, sometimes in separate lists, sometimes by using brackets round prefixes and suffixes; and I often refer to derived forms as ‘derivatives’.

1.10 Positions within words

Many correspondences are specific to particular positions in words, some to the beginnings of words (‘word-initial position’), some to the middle of words (‘medial position’), some to the ends of words (‘word-final position’). In chapters 3-7, that is, all the chapters concerned with the sound-to-symbol direction, I have tried to be consistent in using ‘initial’, ‘medial’ and ‘final’ only in terms of phonemes (or, where specifically indicated, syllables – see third and fourth paragraphs below). For example, the phoneme /j/ (the sound of letter <y> at the beginning of a word) is in word-initial position in both yell and union.

Word-final position applies to consonant phonemes even when the letters representing them occur within split digraphs, e.g. the /t/ phoneme in bite is in word-final position even though the letter <t> is not. Correspondingly, vowel phonemes and diphthongs spelt by the split digraphs are never word-final – as I’ve just implied with the example of bite, there is always a consonant phoneme after the vowel phoneme or diphthong, even though the letter <e> is at the end of the written word. In section 5.5.3 (only) I also refer to ‘pre-final’ position, that is, the phoneme immediately preceding the last phoneme in a word.

In chapters 3-7 I frequently use the term ‘word-final position’ to mean the end of stem words. For instance, when I say that the grapheme <sh> is the regular spelling of the ‘sh’ phoneme in word-final position I include its occurrences in both fish and fishing. Even more generally, when I say that a particular correspondence occurs in a stem word, this also applies to words derived from it, unless otherwise stated.

Other correspondences are specific to particular syllables in words; some are specific to monosyllabic words and the final syllables of polysyllabic words – I call these collectively ‘final syllables’ – and others to syllables before the last one in words of more than one syllable (‘non-final syllables’). In sections 10.27 and 10.36 I also distinguish between penultimate and antepenultimate syllables, that is, those immediately before the final syllable and immediately before that in words with enough syllables; and in section 10.42 antepenultimate syllables reappear, along with the fourth syllable from the end of a word.

The largest set of exceptions to analysing phoneme-grapheme correspondences according to initial, medial and stem-final phonemic positions within words relates to the letter-name vowels, plus /uː/. As will be shown in section 5.1, these need instead to be analysed according to final v. non-final syllables.

Some authors use the terms ‘polysyllables’ and ‘polysyllabic words’ to refer to words of three or more syllables, and therefore distinguish systematically between monosyllables, disyllables (two-syllable words) and polysyllables. However, in my analysis I have mainly found it unnecessary to distinguish between disyllables and longer words, and therefore use the terms ‘polysyllables’ and ‘polysyllabic words’ to refer to words of two or more syllables. On the few occasions when a process operates specifically in words of two syllables (see especially the second part of the main consonant-doubling rule, section 4.2) I refer to them as two-syllable words, and similarly for longer words.

In chapters 9 and 10, which deal with the grapheme-phoneme direction, the meanings of ‘initial’, ‘medial’ and ‘final’ referring to positions in words necessarily change: there they refer to positions in written words. So, for instance, there the ‘magic <e>’ in split digraphs is described as being in word-final position, and consonant letters enclosed within split digraphs are in medial position.
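The shift in meaning between the two directions can be made concrete with the word bite. In the Python sketch below, the same positional test is applied once to the word's phonemes and once to its letters; the function name and the representation of the word as two lists are assumptions made for illustration.

```python
def position(sequence, item):
    """Classify the first occurrence of item as initial, medial or final."""
    index = sequence.index(item)
    if index == 0:
        return "initial"
    if index == len(sequence) - 1:
        return "final"
    return "medial"


# The word bite as a sequence of phonemes and as a sequence of letters.
bite_phonemes = ["b", "aɪ", "t"]
bite_letters = list("bite")

print(position(bite_phonemes, "t"))   # position of the phoneme /t/
print(position(bite_letters, "t"))    # position of the letter <t>
print(position(bite_letters, "e"))    # position of the 'magic <e>'
```

Counted in phonemes, /t/ is word-final; counted in letters, <t> is medial and the <e> of the split digraph is word-final.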

1.11 Open and closed syllables

Many vowel correspondences differ between open and closed syllables. Open syllables end in a vowel phoneme; closed syllables end in a consonant phoneme. The distinction is clearest in monosyllabic words; for example, go is an open syllable, goat is a closed syllable.

For more on syllables in general, see section A.3 in Appendix A.

1.12 ‘2-phoneme graphemes’

In English spelling, the letter <x> frequently spells /ks/, which is a sequence of two phonemes, /k/ and /s/. An example is the word box. So when <x> spells /ks/ I call it a ‘2-phoneme grapheme’. (Carney, 1994: 107-8 has a rather different approach to ‘two-phoneme strings’.) My analysis has uncovered 36 of these in all (see Tables 8.1-2).

When dealing with phoneme-grapheme correspondences in chapters 3 and 5, I mention each 2-phoneme grapheme in two places, one for each of the phonemes it spells. For example, you will find <x> spelling /ks/ under both /k/ and /s/ (sections 3.7.1, 3.7.6). However, in chapters 9 and 10, which deal with grapheme-phoneme correspondences, each multi-phoneme grapheme is mentioned in only one place, under its leading letter.

One of the 2-phoneme graphemes – <u> spelling /juː/ (the sound of the whole words ewe, yew, you and the name of the letter <u>) – is so frequent that I have infringed my otherwise strictly phonemic analysis to accord the 2-phoneme sequence /juː/ special status as a quasi-phoneme that is important enough to have its own entry – see Table 2.2 and section 5.7.5 – as does Carney (1994: 200-2).

Two of the 2-phoneme graphemes also function as 3-phoneme graphemes: <x> spelling /eks/ in X(-ray), etc., and <oir> spelling /waɪə/ (the pronunciation of the whole word wire) only in choir. Logically, therefore, each of these is dealt with in several places in chapters 3 and 5 (but in chapters 9 and 10, only once, under <x> and <o> respectively).
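The difference between the two ways of listing multi-phoneme graphemes can be sketched as two indexes built from the same data. The Python below uses only the correspondences mentioned in this section; the data structure and function names are invented for illustration, and the division of each pronunciation into separate phonemes is written out as an assumption.

```python
from collections import defaultdict

# (grapheme, phonemes it spells) for the examples mentioned in this section.
CORRESPONDENCES = [
    ("x", ("k", "s")),          # as in box
    ("x", ("e", "k", "s")),     # as in X(-ray)
    ("u", ("j", "uː")),         # the name of the letter <u>
    ("oir", ("w", "aɪ", "ə")),  # only in choir
]


def index_by_phoneme(correspondences):
    """Phoneme-grapheme direction: one entry under each phoneme spelt."""
    index = defaultdict(list)
    for grapheme, phonemes in correspondences:
        for phoneme in phonemes:
            index[phoneme].append((grapheme, "".join(phonemes)))
    return dict(index)


def index_by_leading_letter(correspondences):
    """Grapheme-phoneme direction: one entry only, under the first letter."""
    index = defaultdict(list)
    for grapheme, phonemes in correspondences:
        index[grapheme[0]].append((grapheme, "".join(phonemes)))
    return dict(index)


by_phoneme = index_by_phoneme(CORRESPONDENCES)
by_letter = index_by_leading_letter(CORRESPONDENCES)
print(by_phoneme["k"])
print(by_letter["o"])
```

In the first index <x> spelling /ks/ appears under both /k/ and /s/; in the second, each grapheme appears exactly once.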

For what I have called ‘2-phoneme graphemes’ Haas (1970: 49, 70) suggested the term ‘diphone’, to parallel ‘digraph’ – but it never caught on (though ‘diphone’ is used in phonetics to mean a sequence of two sounds or the transition between them). If it had caught on, my identification of 3-phoneme graphemes would logically have required coining ‘triphone’ (which also exists in phonetics and means ‘a sequence of three phonemes’). I have stuck with my terminology.

1.13 ‘Regular’ correspondences

I refer to many correspondences as ‘regular’. This does not mean that they apply always and without exception. Very few spelling correspondences in English have no exceptions. (At least in the main system – many minor correspondences have no exceptions, but are very restricted in scope. One example is the grapheme <aigh>, which is always pronounced like the name of letter <a> – but since it occurs only in the word straight, this is not much help.) So I use the word ‘regular’ to mean ‘predominant’, the major tendency.

Where lesser generalisations are possible I try to state only those that are helpful. For instance, Carney (1994: 185), in the course of analysing the correspondences of the vowel phoneme /ɔː/ (the sound of the word awe) shows that only spellings with <or> occur before four particular consonants or consonant clusters, and that spellings with <or> never occur before six others. But these generalisations only cover just over 30 words, so I have ignored them. For a contrast, see Table 3.5, where I organise spellings of word-final /s/ as in hiss into 11 subcategories – justified by the very large number of words with this final consonant phoneme and relatively small amounts of overlap between the subcategories.

Also, some words which seem quite irregular in the phoneme-grapheme (spelling) direction are less so in the grapheme-phoneme (reading aloud) direction, for example ocean. This is partly irregular in the phoneme-grapheme direction: every other word which ends in the sound of the word ocean is spelt <-otion>, so in ocean the spellings of the ‘sh’ phoneme as <ce> and the following schwa vowel (see chapter 2) as <a> are unusual in this context. However, in the grapheme-phoneme direction ocean is entirely regular: all words ending in <-cean> have the stress on the preceding vowel, which has its letter-name pronunciation, ‘Oh’ in this case, and the <-cean> ending, though rare, is always pronounced roughly like the word shun.

On the other hand, when I speak of ‘regular verbs’ the word ‘regular’ has its usual sense – these are the verbs (the great majority) that form both past tense and past participle (in writing) by adding <-ed> (see sections 3.5.2, 3.5.7, 5.4.3 and 10.15 for the phonetic equivalents). Some oddities can be noted here: the past tense and participle forms of the verbs lay, pay are pronounced regularly as /leɪd, peɪd/ but are spelt irregularly: laid, paid (regular spellings would be *layed, *payed – which do appear occasionally – see sections 3.7.1, 5.7.1 and 6.5). Similarly, regular spellings of the adverbs daily (also an adjective), gaily would be *dayly, *gayly (see again section 6.5). Conversely, there is one plural noun with a regular spelling but an irregular pronunciation: houses, which is pronounced /ˈhaʊzɪz/ with irregular change of the stem-final consonant from /s/ to /z/ (if its pronunciation were regular it would be /ˈhaʊsɪz/ ‘haussiz’).

But those quirks are tiny compared to the overall irregularities in the relationships between pronunciation and spelling. For many languages the complete set of both phoneme-grapheme and grapheme-phoneme correspondences could be set out on one page. The complexities of English spelling, especially of vowels, which make this book so large, are a measure of the task facing learners who wish to write correctly-spelt English and (try to) derive accurate pronunciations of English words from their written forms.