The Essential Linnea Ehri

A Research-Based Explanation of How Children Learn to Read Words

With 160 published papers, Linnea Ehri has been one of the most influential and cited reading scientists of the past 40 years. One of 14 members of the U.S. National Reading Panel (1997-2000), she served as Chair of both the Phonemic Awareness and the Phonics subgroups of the Panel. (The full report of the Panel can be found here.)

Ehri earned her Ph.D. in Educational Psychology from the University of California, Berkeley. She is a Distinguished Professor Emerita at the Graduate Center of the City University of New York, where she was a member of the faculty in Educational Psychology and in Speech, Language, and Hearing Sciences. Since the 1970s she has conducted scientific research on how children learn to read. Her findings have provided support for the theory described below. For a portrayal of the course of her research findings and their implications for instruction, see here.

This blog will focus on Ehri’s most important contribution to reading science, that is, her Phase Theory for automatic sight word creation. In doing so, we’ll need to investigate what Ehri means by “sight words,” and by the term “orthographic mapping” – a term she coined in 2014. We’ll finish by taking a look at some implications of Ehri’s Phase Theory for early reading instruction.

Sight Words

Ehri distinguishes 4 ways to read words:

“The first three ways help us read unfamiliar words. The fourth way explains how we read words we have read before. One way is by decoding, also called phonological recoding. We can either sound out and blend graphemes into phonemes, or we can work with larger chunks of letters to blend syllabic units into recognizable words. Another way is by analogizing. This involves using words we already know to read new words – for example, using the known word, bottle, to read throttle. Another way is by prediction. This involves using context and letter clues to guess unfamiliar words. The fourth way of reading words is by memory or sight. This applies to words we have read before. We can just look at the words and our brain recognizes them.” [1] [boldface mine]

[Note: If you’re unfamiliar with the terms “decode,” “blend,” “encode,” “segment,” “grapheme,” or “phoneme,” you can find easy-to-understand definitions here. Understanding these terms is a necessity for reading this blog.]

Since the first 3 strategies for reading (above) all involve conscious effort and time, using any of them will impede reading comprehension. Reading by sight, however, requires no conscious effort, so all the brain’s resources can be directed toward comprehending the text.

“Given that there are multiple ways to read words, consider which way makes text reading most efficient. If readers know words by sight and can recognize them automatically as they read text, then word reading operates unconsciously. In contrast, each of the other ways of reading words requires conscious attention. If readers attempt to decode words, to analogize, or to predict words, their attention is shifted from the text to the word itself to identify it, and this disrupts comprehension, at least momentarily. It is clear that being able to read words automatically from memory is the most efficient, unobtrusive way to read words in text. Hence, building a sight vocabulary is essential for achieving text-reading skill.” [2]

If you’re a skilled reader, you’ll likely read every word in this blog effortlessly – by sight. A mere glimpse of each word will immediately link to that word’s pronunciation and meaning. The brain’s ability to do this is astonishing. How does it happen?

A traditional view of how sight words are created holds that beginners memorize some type of association between a visual characteristic of the word (perhaps its overall shape) and its meaning. The pronunciation of the word is activated only after the meaning of the word has been retrieved. Ehri calls this notion “incorrect.”

“Consider the feat that skilled readers perform when they read words by sight. They are able to recognize in an instant any one of many thousands of words. They recognize one unique word and bypass many other similarly spelled words. For example, consider all the words that must be overlooked to read the word “stick” accurately: not only stink, slick, and slink, which have similar shapes as well as letters, but also sting, sling, string, as well as sick, sing, and sink. Moreover, skilled readers can remember how to read new sight words with very little practice. Memorizing arbitrary associations between the shapes and meanings of words cannot explain how skilled readers do what they do. Sight word reading must involve remembering letters in the words. These alone are the distinctive cues that make one word different from all the others.” [3]

But without connecting that specific sequence of letters to something already in the brain, sight word learning would be a difficult and onerous task, something akin to memorizing sequences of arbitrary digits, such as phone numbers. Here’s where orthographic mapping comes into play.

Orthographic Mapping

Orthographic Mapping (hereafter: OM) might be an unknown term for some readers. The prefix ortho means “correct,” while graphy has to do with writing. So a word is orthographic if it is spelled or written according to accepted convention (Cambridge Dictionary). The word “mapping” implies a word’s spelling is connected or associated with something else – and that is indeed the case.

For Ehri, OM is essentially the connection-making process that makes sight words possible. The connections that must be made, at least initially, are between the individual graphemes (seen in a word’s spelling), and the individual phonemes symbolized by those graphemes (and heard in that word’s pronunciation). If these grapheme-phoneme connections can be made, consciously and explicitly, by a reader just a few times, the word will automatically become a sight word for her – a word that will never need to be decoded, analogized, or predicted again.

“Words that have become sight words are read from memory. Sight of the word immediately activates both pronunciation and meaning. To build sight words, OM is required. Readers must form connections between spellings and pronunciations of specific words by applying knowledge of the general writing system. When readers see a new word, and say or hear its pronunciation, its spelling becomes mapped onto its pronunciation and meaning. It’s the connections that serve to ‘glue’ spelling to pronunciations in memory.” [4]

Evolution has made spoken language (the pronunciations and meanings of words) easy for most children to learn. Not so for written language. Our brains never evolved to accept language, through the eyes, as a sequence of letter symbols. For written letter sequences to become part of the brain’s already-functioning spoken language system, spellings must be linked to pronunciations at the grapheme-phoneme level.

“It is in performing this grapheme-phoneme analysis for individual words that the spellings of words penetrate and become attached to readers’ knowledge of spoken words in a way that links written language to the central mechanisms [in the brain] governing spoken language.” [5]

In her recent study showing how grapheme-phoneme analysis is critical even for a transparent language (Portuguese), Ehri sums up the situation for learning to read English this way. Note the emphasis on decoding:

“Learning to decode involves transforming graphemes into phonemes and blending them to pronounce words with recognizable meanings (Beck, 2006). Share (2008b) described decoding as a self-teaching mechanism that readers can apply to unlock the identities of unfamiliar words as they are reading text. Ehri (1992, 1998, 2014) proposed a connectionist theory and presented evidence to show how decoding enables readers to store written words in memory so that the words can be read by sight. When readers apply their grapheme–phoneme knowledge to decode new words, connections are formed between graphemes in written words and phonemes in spoken words. This bonds the spellings of those words to their pronunciations and meanings and stores all of these identities together as lexical units in memory. Subsequently, when these words are seen, readers can read the words as single units from memory automatically by sight. Decoding letter by letter is no longer needed to read the words.” [boldface mine] [6]

It’s important to understand that for Ehri, the term “sight word” does not apply simply to high-frequency words, or irregular words, or Dolch words, or Fry words. It applies to all words. If a word is irregular, how does OM work? It turns out that most letters in irregularly-spelled words do conform to standard grapheme–phoneme conventions. Hence they can still be secured in memory as sight words by the same connections as regularly spelled words, with only the exceptional letters unconnected and unsecured.

Examples given by Ehri are the words ISLAND and SWORD where only the S and W, respectively, remain unsecured [7]. Elsewhere [8], Ehri suggests that assigning spelling pronunciations to words with silent letters or unstressed vowels can help to connect and secure these letters in memory, for example, pronouncing LISTEN as “lis – ten” rather than “lis – in,” or CHOCOLATE as “choc – o – late” rather than “choc – lut.”

OM helps to explain a remarkable fact. Most skilled readers recognize, at a glance, somewhere between 50,000 and 70,000 sight words. Fifty to seventy thousand words! No one would claim to have consciously memorized even 10 percent of these totals. And yet, for each of these words, a mere glance instantaneously calls up both pronunciation and meaning.

Amalgamation Theory

Before a child ever begins the process of learning how to read, thousands of what Ehri calls amalgams already exist in the child’s memory – one amalgam for every word in the child’s spoken or listening vocabulary. Each amalgam contains, at a minimum, two identities: the word’s pronunciation and the word’s (semantic) meaning. Some amalgams may already contain morphological (word roots and affixes) and syntactic (grammatical function in sentences) identities as well. OM can be viewed as a process whereby written words (spellings) are added to the amalgams already in memory.

“In order for written words to be added to the amalgams in memory, readers must bond spellings to pronunciations by applying their knowledge of letter–sound relations to connect letter units to sound units within specific words...The first few times a student reads a word, these connections are formed and stored in memory. Subsequently, when the word is seen, these connections are activated in memory to read the word. Once words’ identities are amalgamated in memory, readers can read them as whole units quickly and automatically with all of their identities activated. When practiced in this way, words become recognized from memory by sight. This supplants the need for guessing or decoding words.” [9]

Sight words, properly learned via OM, penetrate the brain’s language system and provide a visual, but now alphabetic, representation of speech phonemes:

“According to amalgamation theory, when students learn to read and spell words, a visual alphabetic representational system for speech is acquired and used to store words in memory. Letters in spellings come to penetrate and represent phonemes in pronunciations in the brain.” [10]
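For readers who like to see an idea as a data structure, here is a minimal sketch of an amalgam. It is my own illustration, not anything Ehri has published: the class name `Amalgam`, the function `add_spelling`, and the informal ASCII pronunciation label `"spu:n"` are all assumptions made for the example. The point it tries to capture is that the pronunciation and meaning are already in memory, and OM adds the spelling to that existing entry rather than creating a new one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Amalgam:
    """One word in memory (hypothetical model for illustration only)."""
    pronunciation: str               # already known from spoken language
    meaning: str                     # already known from spoken language
    spelling: Optional[str] = None   # absent until the written form is mapped

    def is_sight_word(self) -> bool:
        # In this sketch, a word counts as a sight word once a spelling
        # has been bonded to the existing pronunciation and meaning.
        return self.spelling is not None


def add_spelling(lexicon: dict, pronunciation: str, spelling: str) -> None:
    """Bond a spelling to an amalgam that is already in the lexicon."""
    lexicon[pronunciation].spelling = spelling


# A pre-reader already "has" the spoken word spoon.
lexicon = {"spu:n": Amalgam(pronunciation="spu:n", meaning="utensil for eating")}
print(lexicon["spu:n"].is_sight_word())   # False

# After orthographic mapping, the same entry now carries its spelling.
add_spelling(lexicon, "spu:n", "SPOON")
print(lexicon["spu:n"].is_sight_word())   # True
```

The sketch leaves out the morphological and syntactic identities mentioned above; they could be added as further optional fields on the same entry.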

Now let's take a look at the skills necessary for specific spellings to be added to the amalgams already in memory. In other words, let’s look at what skills are needed for OM to take place.

What Skills are Necessary for OM to Occur?

OM will begin to occur spontaneously as long as the beginner has been taught some non-negotiable prerequisite skills. These skills are:

Letter recognition (shapes, names, upper versus lower case).
Knowledge of letter-sound relationships, not only for the 24 consonant sounds of English, but also for the 20 vowel sounds. In her research, Ehri refers to these relationships as “alphabetic knowledge,” “grapheme-phoneme correspondences,” or “grapheme-phoneme relations.”
Two specific phonemic awareness skills needed for decoding and spelling words: blending separate phonemes to form whole spoken words, and segmenting whole spoken words into constituent phonemes. Decoding requires blending the sounds of letters to read words. Encoding requires segmenting the phonemes in spoken words to spell them. [Note: For Ehri, the ability to segment spoken words into phonemes is facilitated by helping beginners detect the sequence of mouth positions and movements involved in producing those phonemes, for example, segmenting “bat” by feeling the lips closed for /b/, then open for the vowel, then the tongue tapping the roof of the mouth for /t/.]
Repeated practice using decoding and encoding skills to read and spell words.

Here’s Ehri in her own words. Note her reference to David Share, one of Ehri’s peers. (The two have known each other throughout their careers and have communicated on a regular basis.) Share would certainly agree with the above 4 prerequisites. (See my blog on his Self-Teaching Hypothesis here.)

“To form connections and retain words in memory, readers need some requisite abilities. They must possess phonemic awareness, particularly segmentation and blending. They must know the major grapheme-phoneme correspondences of the writing system…Then they need to be able to read unfamiliar words on their own by applying a decoding, analogy, or prediction strategy. Application of these strategies activates orthographic mapping to retain the words’ spellings, pronunciations, and meanings in memory. David Share referred to this as a self-teaching mechanism. With repeated readings that activate orthographic mapping, written words are retained in memory to support reading and spelling.” [11] [boldface mine]

Six years later, Ehri says it this way:

“Acquisition of sight word learning ability requires that students learn regularities of the writing system beginning with grapheme–phoneme relations, phoneme segmentation, and decoding skill, so graphemes become connected to phonemes within spellings of specific words in memory.” [12]

Ehri specifies the type of instruction necessary to ensure that the skills needed for OM are explicitly taught. Notice the phrase “produced or heard.” The student himself can produce the word, by decoding it, or the word can be produced for him, in which case he will need segmentation skills.

“Use of alphabetic [letter-sound] knowledge to connect spellings to pronunciations and retain sight words in memory is an internal process that is activated spontaneously when words are seen and their pronunciations are produced or heard…What sort of instruction produces spontaneous OM? First, it entails teaching students the knowledge and skills that enable connections to be activated when words are seen and read. This includes teaching the writing system, beginning with grapheme-phoneme relations, teaching phonemic segmentation, and teaching a decoding strategy for reading novel words. Second, it entails having students practice using these skills to read and spell words.” [13]

Commenting again on Share’s Self-Teaching Hypothesis, Ehri describes why knowledge of grapheme-phoneme correspondences is the key to sight word learning:

“Share (2004) reported that even one exposure to words enabled Israeli third graders to retain information about the spellings of specific words in memory, and this memory persisted a month later. Sight word learning this rapid and lasting is possible only because readers possess a powerful mnemonic system in the form of alphabetic knowledge that is activated when words are read.”

“To summarize, readers learn to process spellings of words as phonemic maps that lay out elements of their pronunciations visually. Beginners become skilled at computing these mapping relations spontaneously when they read new words. This is the critical event for sight word learning. Grapheme-phoneme connections provide a powerful mnemonic system. They provide the glue that bonds letters in written words to their pronunciations in memory along with meanings. Once the alphabetic mapping system is known, readers can build a vocabulary of sight words easily.” [14] [boldface mine]

Phases of Development in Learning to Read and Spell

Naturally, the above skills needed for OM take time to develop. And, at least initially, sight words accruing from OM build up slowly, word by word. Ehri describes 4 phases that students typically go through on their journey from immature to mature forms of reading and spelling. Take a look at how Ehri portrays these 4 phases in her most recent research paper:

“Ehri (2005) portrayed the development of decoding and sight word reading as a sequence of overlapping phases, each characterized by the predominant type of connection readers form to link spellings of words to their pronunciations in memory.”

“Development begins with small graphophonemic units that later become consolidated into larger syllabic units. In the pre-alphabetic phase, nonphonological visual cues may be formed but are idiosyncratic and easily forgotten. Use of systematic alphabetic cues emerges in the partial alphabetic phase when readers form partial grapheme–phoneme connections, such as initial and final letters, to store words in memory. However, readers in this phase lack the ability to decode novel words.”

“In the full alphabetic phase, knowledge of the major grapheme–phoneme relations and decoding skill are acquired and enable readers to form more complete grapheme–phoneme connections to fully bond spellings to pronunciations in memory. This makes word reading much more accurate. In the consolidated alphabetic phase, grapheme–phoneme subunits are consolidated into larger graphosyllabic and graphomorphemic units that readers can use to decode multisyllabic words and to form connections that secure the words in memory.” [15] [boldface mine]

Note how she views these phases as applying not only to sight word development, but to decoding skill as well. Decoding, as we’ll see, is the primary (though not the only) way orthographic mapping occurs.

Note, too, there is one non-phonetic, pre-alphabetic, initial phase, followed by 3 alphabetic phases: partial, full, and consolidated. Let’s take a look at each of these phases in turn.

The Pre-Alphabetic Phase

I won’t spend much time here. Ehri herself describes this phase as the “default” phase of learning to read. It’s the way children will attempt to “read” before they’ve had any competent instruction. Here are some examples:

A child “reads” the word TRASH when the word is affixed to a trash can, but can’t read the word in isolation.
A child “reads” the word “McDonald’s” if it’s attached to golden arches, but can’t read it if the arches are absent.
A child “reads” the word STOP but can do so only if it’s embedded in a red octagon.
A child “reads” an entire favorite story – but does so by heart.
A child remembers the word SPOON as the word with 2 “eyes” in the middle, but then may “misread” words such as MOON, SOON, BROOM, GOOF, FOOD, and WOOD as the word SPOON.
A child “reads” a word by naming an accompanying picture on the page. For example, the word WHEELS might be depicted by a car, but then a child “misreads” WHEELS as CAR.

What’s true about pre-alphabetic reading is that, however it’s done, it makes no use of the sound value of any letters in the word’s spelling.

“Because most written words do not contain easily remembered cues, children in this phase are essentially nonreaders. Of course, they can pretend read stories they have heard many times, and they can guess words from pictures. However, all of their feats of reading are performed by using cues that do not involve the alphabetic system.” [16]

Let’s move on to Ehri’s 3 alphabetic phases. It’s here that genuine reading begins to occur, and that genuine sight words begin to accrue.

The Partial Alphabetic Phase

What allows children to progress from the pre-alphabetic phase to the partial alphabetic phase? The answer is simple. They learn the names and/or sounds of some letters, and they acquire some basic phonemic segmentation skill. Then they use this information to (imperfectly) remember how to read specific words and to invent partial-sound spellings of words. In describing results of a phonemic segmentation training study, Ehri reports:

“Segmentation training enabled students to function at the partial alphabetic phase in their word reading. In contrast, students who received no segmentation training showed little ability to read words on posttests and, hence, remained at the pre-alphabetic phase. These results support the claim that letter knowledge and phoneme segmentation skill are central in enabling readers to move from the pre-alphabetic phase to the partial alphabetic phase of word reading development.” [17]

Remember how the pre-alphabetic child (above) might remember SPOON as the word with two “eyes”? In the partial phase, the scenario might look like this. The child comes across the word SPOON in a simple text. The child knows some letter-sound correspondences, including those for S and N. Let’s say the child predicts the word SPOON successfully from context, or from a picture, or from the word’s first and last letters. (Or, if all that fails, perhaps the teacher simply tells him the word.)

In any case, now the child has the pronunciation of SPOON in his head and the spelling of SPOON right in front of him. Will SPOON become a sight word for him? It might, but only if he has some basic segmentation skills. He needs to be capable of hearing the “sss” sound at the beginning of the word and the “nnn” sound at the end. If he can hear those two sounds, and make the connections between them and the letters S and N, SPOON can become an imperfect sight word for him.

“Imperfect” [18] because the child has made only partial grapheme-phoneme connections. The middle letters, P-O-O, and the sounds that P and OO symbolize, are lost to the child. Those connections are not made. As a result, the child may confuse many words that start with S, and end in N, with the word SPOON – words like STAIN, SPAIN, SPIN, SHEEN, SOON, SEVEN, SPURN, SPAWN, and dozens of others.

Worse, if the child predicts the word incorrectly (and receives no correction) or if he does not have simple segmentation skill which allows him to hear /s/ and /n/ in the spoken word, he will have learned nothing in this encounter with the written word SPOON. Next time he sees it, he’ll be starting from scratch. Ehri says it this way:

“Children progress to the partial alphabetic phase when they learn the names or sounds of alphabet letters and use these to remember how to read words. However, they form connections between only some of the letters and sounds in words, often only the first and final letter sounds, which are easier to detect, for example, the letters s and n to read SPOON…They may confuse similarly spelled words such as SPIN and SKIN having the same boundary letters. They are limited to forming partial connections because they are unable to segment the word’s pronunciation into all of its phonemes. Also they lack full knowledge of the alphabetic system, especially vowels. Because of this, partial phase readers have much difficulty decoding unfamiliar words.” [19]
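The confusion described above can be made concrete with a toy model. This is only a sketch under a strong simplifying assumption of my own: that a partial-phase reader stores nothing but a word's first and last letters, while a full-phase reader stores every letter. The function names (`boundary_cue`, `full_cue`, `confusable_with`) and the word list are hypothetical and chosen just for the illustration.

```python
def boundary_cue(word: str) -> tuple:
    """Partial-phase cue (illustrative): keep only the first and last letters."""
    return (word[0], word[-1])


def full_cue(word: str) -> str:
    """Full-phase cue (illustrative): every letter, in order, is retained."""
    return word


def confusable_with(target: str, candidates: list, cue) -> list:
    """Return the candidates that share the target's stored cue."""
    return [w for w in candidates if w != target and cue(w) == cue(target)]


words = ["STAIN", "SPAIN", "SPIN", "SKIN", "SOON", "SEVEN", "SPOT", "MOON"]

# With only S...N stored, six of the eight words collide with SPOON.
print(confusable_with("SPOON", words, boundary_cue))
# ['STAIN', 'SPAIN', 'SPIN', 'SKIN', 'SOON', 'SEVEN']

# With the complete spelling stored, nothing collides.
print(confusable_with("SPOON", words, full_cue))
# []
```

Real readers obviously do not work like a lookup table, but the sketch shows why a cue built from boundary letters alone cannot single out one word.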

It’s easy for a child to get stuck in this partial alphabetic phase – even for years at a time. How? He can be in a reading program that doesn’t teach all the major grapheme-phoneme correspondences in a systematic manner – especially those having to do with the 20 vowel sounds of English. He can be in a program that delays, teaches poorly, or never teaches, the strategy of transforming graphemes into phonemes, and then blending those phonemes to decode unknown words into complete pronunciations with recognizable meanings.

Such children are forever “predicting” what the word might be, based on partial alphabetic cues or context cues. And such word-guessing can quickly become an unbreakable habit. “Not only novice beginning readers but also older children with a reading disability qualify as partial alphabetic phase readers.” [20] The relevant question here is: Do readers get stuck in the partial alphabetic phase due to poor instruction, and/or due to some congenital condition such as dyslexia? And, in either case, how do we help such children?

Ehri has no expectation for how long this partial alphabetic phase might last. “If beginners quickly acquire the skills necessary for the next phase, they may not exhibit partial alphabetic stage reading.” In other words, “Instructional method influences how long beginners show evidence of the partial alphabetic stage.” [21]

The Full Alphabetic Phase

The full alphabetic phase emerges when beginners acquire more complete knowledge of the major grapheme-phoneme correspondences of the writing system, plus the crucial skill of decoding [22] [23] [24]. With these 2 skills firmly in place, complete connections can be made between all the letters seen in the written forms of words and all the phonemes detected in their pronunciations.

A distinguishing characteristic of full phase readers is “the ability to decode words never read before, by blending letters into a pronunciation. This knowledge enables full-phase readers to form fully connected sight words in memory.” [25] Sixteen years later, Ehri says it this way: “Decoding letters into blended sounds helps readers figure out words they have not read before. Reading them a few times moves the words into memory so they can be read by sight.” [26] [Note: Ehri states some version of this in nearly every one of her research papers. Decoding – blending all a word’s sounds, left to right, into a complete pronunciation – leads to orthographic mapping, and to fully-connected sight words in memory.]

The advantages to students in this phase are enormous. Sight words are no longer “imperfect.” SPOON won’t be confused with SKIN, or SPIN, or any of the other similarly-spelled words listed above. That’s because now, all 4 graphemes in SPOON (S, P, OO, N) have consciously been linked to the 4 phonemes which, when blended together, form the word SPOON.

Recall that, for Ehri, OM consists in deliberate connection-making between letters seen in a word’s spelling and sounds detected in the word’s pronunciation. This connection-making is precisely what decoding does. The student connects a phoneme to each of a word’s graphemes, and then blends those phonemes to pronounce (read) the word. Naturally, she can hear all the phonemes because she just identified them and successfully blended them together to say the word. And she does this while studying the spelling. For most students, performing this decoding process 2-5 times is all that is required for that word to become a sight word.
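Here is a small sketch of the decoding step just described: split the spelling into graphemes, connect a phoneme to each one, then blend. The table `GPC` is a deliberately tiny, hypothetical set of grapheme-phoneme correspondences that I made up for this example (with informal ASCII labels such as `"u:"` standing in for phoneme symbols), and the greedy left-to-right split is my own simplification, not a claim about how children or any reading program actually parse words.

```python
# Hypothetical, deliberately tiny grapheme -> phoneme table (an assumption
# for illustration; real English correspondences are far richer).
GPC = {"S": "s", "P": "p", "OO": "u:", "N": "n", "K": "k", "I": "i"}


def split_graphemes(spelling: str) -> list:
    """Greedy left-to-right split, preferring two-letter graphemes like OO."""
    graphemes = []
    i = 0
    while i < len(spelling):
        pair = spelling[i:i + 2]
        if len(pair) == 2 and pair in GPC:
            graphemes.append(pair)
            i += 2
        elif spelling[i] in GPC:
            graphemes.append(spelling[i])
            i += 1
        else:
            raise ValueError(f"no known grapheme at position {i} in {spelling!r}")
    return graphemes


def decode(spelling: str) -> tuple:
    """Connect each grapheme to a phoneme, then blend the phonemes in order."""
    connections = [(g, GPC[g]) for g in split_graphemes(spelling)]
    pronunciation = "".join(phoneme for _, phoneme in connections)
    return connections, pronunciation


connections, pronunciation = decode("SPOON")
print(connections)    # [('S', 's'), ('P', 'p'), ('OO', 'u:'), ('N', 'n')]
print(pronunciation)  # spu:n
```

Notice that the decoder cannot produce a pronunciation without first making all four connections. That is the sense in which decoding forces the complete grapheme-phoneme mapping that partial-phase reading skips.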

Does segmenting also have a role in this full alphabetic phase? It does. In the previous phase, partially-connected (imperfect) sight words could be created only if the student had some segmentation skill. That’s because decoding was not yet a possibility in the partial phase. The question here is this: Can a student segment her way into a fully-connected sight word instead of using decoding as just described?

She can. But if decoding is not used, then someone other than the student herself must pronounce the word (or she must correctly guess it). Assuming this is done, and assuming she has full segmenting ability (can detect all a word’s phonemes) and has knowledge of the graphemes symbolizing these phonemes, she can then make the complete connections necessary for accurate sight word creation. The very act of seeing graphemes in a word’s spelling helps her detect the separate phonemes in the word’s pronunciation.

In reading Ehri’s many articles it would be easy to selectively pick quotes here and there to make the case that, according to Ehri, OM depends exclusively on either segmenting or on decoding. I wanted to be sure of my own interpretations in this blog, so I asked Ehri about this issue directly this past July (2021). Here's the relevant part of our exchange:

Parker: In reading your various articles, my understanding of OM is that it is neither tied exclusively to decoding (blending/synthesis), nor to encoding (segmenting/analysis). Rather, OM is a connection-making process. The connections need to be at the phoneme/grapheme level (at least initially, prior to the consolidated phase) and the connections that need to be made explicit are between the graphemes seen in a printed word’s spelling and the phonemes heard in that word’s pronunciation.

Ehri: Correct.

Parker: In the partial phase, Phase 2, decoding is not yet possible, so naturally, segmentation is the only way the partial connections can be made. The teacher says the word, and the student must have enough letter knowledge and phonemic awareness segmentation skill to make “some” salient connections – typically between the initial letter in the printed word and the initial phoneme heard.

Ehri: Or between the final letter-sound as well. Yes.

Parker: This is an improvement over Phase 1 but is still subject to all kinds of errors, chiefly because the connections are only partial. (A lot of words, for example, start with B, or start with B and end in T)

Ehri: Yes

Parker: When (and if) a student finally becomes capable of through-the-word decoding, Phase 3 begins to open up for him. Now, for the first time, complete “connections” can be explicitly examined by him – indeed, must be explicitly examined by him so that he can pronounce the word. In Phase 3, it seems to me – and in line with David Share’s Self-Teaching Hypothesis – that blending/synthesis becomes the chief mechanism for OM, rather than the segmenting/analysis of the previous partial phase.

Ehri: Yes. When readers decode words on their own, this enables them to form connections and retain the spelling-pronunciation bonds in memory. But also if readers encounter a new word and someone reads it to them, or the context allows them to predict the word accurately, connections can be formed as well, provided students' grapheme-phoneme knowledge is activated and applied to that word. In our orthographic facilitation of vocabulary learning studies, we found that simply exposing students to the spellings of new words as they heard their pronunciations and meanings, without drawing attention to spellings, enhanced their ability to remember pronunciations when no spellings were present. Students recalled pronunciations much better in this condition than when they were not shown spellings as they studied the words. In the spelling exposure condition, decoding was not necessary because the words were pronounced as soon as they were seen. We argue that grapheme-phoneme connections were activated spontaneously when the words were heard and seen. In one study though, we found that having students actively decode the spellings produced greater memory for pronunciations than simple exposure to the words. So decoding enhances memory. [boldface mine]

Directly above, Ehri makes reference to “orthographic facilitation of vocabulary learning studies.” In one such study [27], low-SES urban first graders were compared to see how well they learned new vocabulary under 3 different conditions:

“In the decoding condition, students sounded out and blended spellings during study and feedback periods but not when memory for pronunciations and meanings was tested. In the exposure-only condition, spellings were shown, but no attention was drawn to them. In the no exposure condition, words were learned without spellings but were spoken extra times. Students practiced recalling words over several test trials with feedback. Results revealed that students who decoded spellings learned pronunciations and meanings better than students who were only exposed to spellings. Seeing spellings enhanced learning more than not seeing them.”

“These findings support the theory that exposure to spellings activates grapheme–phoneme connections to better secure spellings to pronunciations along with meanings in memory. These connections are activated implicitly when spellings are simply exposed, but the connections are strengthened when spellings are explicitly decoded…Students should be shown spellings and should decode them.” [28] [Boldface mine]

So having students decode, that is, sound out and blend spellings, produced better results than pronouncing the word for the students – a procedure which avoided the need to decode, but allowed segmentation to occur. And both these groups (the decoding group and the segmenting group) did better in remembering new vocabulary than the students who were not shown spellings.

But here’s the primary reason decoding must take precedence over segmenting for sight word creation in the full alphabetic phase and beyond. Recall that a skilled reader will eventually recognize 50,000-70,000 sight words. The vast majority of those sight words will be created by the reader as she decodes them for herself while reading authentic text. She decodes them, makes the necessary grapheme-phoneme connections, and then adds those words to her sight word mental dictionary. While our intrepid reader could ask someone else for an unknown word’s pronunciation, and then use her segmenting skills to make the necessary grapheme-phoneme connections, it’s barely possible to imagine this happening for 50,000 or more sight words.

Decoding, not segmenting, is the key to building a large sight word vocabulary. Decoding, not segmenting, is the key to David Share’s Self-Teaching Hypothesis. While it’s true that either decoding or segmenting (in conjunction, of course, with letter-sound knowledge) can create a fully-connected sight word in the full alphabetic phase, only decoding is capable of creating the 50,000 sight word lexicon of the skilled reader.

The full alphabetic phase gradually (and automatically) morphs into the consolidated alphabetic phase as more and more fully-connected sight words are retained in memory.

The Consolidated Alphabetic Phase

As sight words accumulate for a given reader, letter patterns that recur in multiple words start to consolidate in memory. These letter strings represent the same phoneme blend when they’re decoded. Soon, the reader begins to recognize them by sight. Though not words, these fragments become consolidated units in memory. Such units can be rimes (-EST, -ANK, -OAST, -UST), prefixes (ANTI-, PRE-, NON-, UN-), or suffixes (-ING, -MENT, -NESS, -TION).

Here’s an example. Suppose the words BEST, WEST, and TEST have earlier (in the full alphabetic phase) become fully connected sight words for the student. Under such conditions, the rime EST has now been fully analyzed and decoded at the grapheme-phoneme level for 3 different words. EST will now likely become consolidated for this reader. (It becomes a sight fragment, if you will.)

Suppose the same student subsequently comes across the unknown word CHEST. Decoding will now happen faster because only two entities must be blended, /ch/+/est/, rather than four, /ch/+/e/+/s/+/t/. Quickly, CHEST becomes a new sight word for this student. The same thing will happen when PEST, NEST, REST, and ZEST are first encountered.
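
For readers who like to see the arithmetic, the saving in blending effort can be sketched in a few lines of Python. This is only an illustration, not a model taken from Ehri's work: the function name blend_units, the toy sets of known units, and the greedy longest-match rule are all assumptions made for the example.

```python
def blend_units(word, known_units):
    """Split `word` into the longest known units, left to right (greedy).

    Illustrative assumption: the reader always grabs the largest chunk
    of letters that is already stored in memory.
    """
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        # Try the longest candidate first so consolidated chunks
        # win over single graphemes.
        for end in range(len(word), i, -1):
            if word[i:end] in known_units:
                units.append(word[i:end])
                i = end
                break
        else:
            # No known unit starts here: fall back to the single letter.
            units.append(word[i])
            i += 1
    return units


# Hypothetical grapheme knowledge of a reader in the full alphabetic phase.
graphemes = {"ch", "e", "s", "t", "b", "w", "n", "p", "r", "z"}
# The same reader after the rime EST has consolidated.
consolidated = graphemes | {"est"}

print(blend_units("chest", graphemes))     # ['ch', 'e', 's', 't']
print(blend_units("chest", consolidated))  # ['ch', 'est']
```

With EST stored as a single unit, every new word in the family (PEST, NEST, REST, ZEST) also drops to two units to blend.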

In this phase, learning to read (and store) multi-syllable words becomes much easier. For example, decoding the word IN-TER-EST-ING is greatly simplified if 2 or more of these four syllables have already been consolidated.

You can see how this consolidation process will allow sight words to start accumulating rapidly in this fourth and final phase. The student is now on his way to becoming a skilled reader – someone fully capable of building a 50,000 sight word mental dictionary by the time he’s off to college.

Ehri summarizes her 4 phases in a useful table [29], reproduced in full, below. Note how rows 6, 7, and 8 in the table correspond to decoding, analogizing, and predicting – the 3 ways Ehri identifies for reading unfamiliar words. But there is now a clear hierarchy for using these 3 strategies. Analogizing most easily occurs after the student becomes adept at decoding. And predicting (from context or first letters) occurs only before decoding is a possibility. Once the student can decode, predicting from context is used only to confirm or disconfirm the decoding.

Implications for Reading Instruction

Where to begin? Since this is already a lengthy blog, I will focus on just two areas.

1) The Futility of Early Onset-Rime Instruction

Having read thus far, the reader can now perhaps better appreciate why trying to teach children to read via onset-rime before teaching them to analyze and map those rimes at the phoneme-grapheme level is not supported by reading science. Onset-rime involves reading words using larger, consolidated chunks of graphemes forming the rime unit, for example, -ENT in TENT, RENT, and SENT. It’s a form of reading by analogy. It belongs in Ehri’s fourth and final stage where rimes become “consolidated” after having been analyzed on the grapheme-phoneme level.

Nonetheless, in many Balanced Literacy classes, onset-rime is taught in the partial alphabetic phase, before decoding is even a possibility. How do educators suppose this can work? Let’s look at an example of what would actually have to happen.

First, the child must somehow memorize a word (let’s say it’s TEACH) as an unanalyzed whole, that is, before any work is done at the grapheme-phoneme level. T is considered the onset; the vowels and everything following the vowels are the rime – in this case EACH. TEACH is now supposed to act as the “keyword” that will unlock the entire rhyming EACH family.

Suppose the child now comes across the unknown word BEACH. How is she to read it? She must recall the sight word TEACH, subtract the T sound (tuh?), replace it with a B sound (buh?) and read the new word. But how is she to recall TEACH? She doesn’t (yet) know the sound of BEACH, so she can’t recall TEACH by its rhyming relationship to BEACH. She must recall TEACH by noticing that the two words both end in the same letter sequence E-A-C-H. How likely is that to happen? Also, how likely is it that a student in the partial alphabetic stage will have the phonemic awareness skill, or the working memory capacity, to substitute a B sound for the T sound?
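
The steps this procedure demands of the child can be laid out as a rough Python sketch. Everything in it is an assumption made for illustration: the helper names split_onset_rime and read_by_analogy, the keyword store, and the simplification that only A, E, I, O, and U count as vowel letters in a single-syllable word.

```python
VOWELS = set("aeiou")  # simplification: Y is treated as a consonant


def split_onset_rime(word):
    """Return (onset, rime): the letters before the first vowel, then the rest."""
    word = word.lower()
    for i, letter in enumerate(word):
        if letter in VOWELS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found


def read_by_analogy(word, keywords):
    """Find the memorized keyword whose rime spelling matches `word`.

    Returns (new_onset, keyword), or None when no keyword has been
    memorized for this rime family.
    """
    onset, rime = split_onset_rime(word)
    keyword = keywords.get(rime)
    if keyword is None:
        return None
    return onset, keyword


# Hypothetical store: one memorized keyword per rime family,
# indexed by the spelling of the rime.
keywords = {"each": "teach"}

print(split_onset_rime("teach"))           # ('t', 'each')
print(read_by_analogy("beach", keywords))  # ('b', 'teach')
print(read_by_analogy("back", keywords))   # None
```

Note that the lookup works on the spelling of the rime, not its sound, and that the sketch stops before the hardest step: swapping the spoken onset of the keyword for the new one.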

And that’s just the EACH rime family. What about the ACK, OOP, and UNK families? At this point, you might find yourself wondering just how many rime families are out there. Most teachers who use early onset-rime don’t realize there are over 300 rime families in English. One sight word, acting as the pronunciation key, must be memorized for each rime family.

It gets worse. This 300 total covers only single-syllable words. Many more rimes exist only in multi-syllable words (e.g. ULT in ADULT, RESULT, and CONSULT; ECT in DEFECT, RESPECT, and SELECT).

Having students memorize a rime that has not been analyzed at the grapheme-phoneme level is the functional equivalent of having them memorize whole words. In fact, many rimes are whole words: ITCH, AT, IN, AIM, EACH, ALL, ARCH, EAST, IMP, OAK, OIL, OUT, UMP.

Why do teachers attempt early onset-rime instruction? Many simply don’t know better. They’re unfamiliar with Ehri’s research. Worse, their professors at whatever teacher college they attended were also unaware of Ehri. Many teachers genuinely believe that since children can hear and distinguish syllables more easily than they can hear and distinguish phonemes, syllables (including rimes) should be taught prior to grapheme-phoneme decoding.

The problem with thinking this way is that English, like many of the world’s languages, is based on the Latin alphabet. Because alphabetic writing systems represent speech at the level of phonemes rather than at the level of syllables, instruction is more effective when students are taught to detect these phonemes at the start of reading instruction. All Ehri’s research has concluded that teaching students to make full connections between phonemes and graphemes (Phase 3) is the key to reading rimes efficiently (Phase 4).

Portuguese has a more transparent writing system than English. Portuguese syllable structure is simpler than English syllable structure. There are fewer monosyllables in Portuguese, and the boundaries between syllables are clearer, so it is easier for beginners to detect the separate syllables in spoken words and map them onto spellings. Nonetheless, Ehri’s most recent study showed that, even in Portuguese, “teaching students to decode using small grapheme–phoneme units was much more efficient than teaching students to decode using larger syllabic units.” [30] In other words, “small grapheme–phoneme units emerge first and are needed to form larger syllabic units.” [31]

2) Is the Partial Alphabetic Phase Necessary?

Here are other ways of asking this question. Must children go through what could become a lengthy partial alphabetic phase, where they guess words based on pictures, context, or first letters? Must they spend lots of time inventing wildly inaccurate spellings, and learning partially connected sight words which they will then easily confuse with words having similar spellings? Must they spend this time acquiring what, for many of them, will be a difficult-to-break habit of word-guessing?

Here’s what Ehri thinks:

“It is important to note that sight word reading during the partial alphabetic phase is an imperfect process that occurs among beginners who lack full knowledge of the alphabetic system and phonemic segmentation skill. There are no expectations about how long this phase will last. If beginners quickly acquire the skills necessary for the next phase, they may not exhibit phonetic cue [partial stage] reading.” [32]

Ehri calls attention to the fact that German-speaking Austrian children show little evidence of undergoing the partial alphabetic phase. She speculates:

“This may be the case because German is a transparent writing system, so decoding is relatively easy to learn. In addition, from Day 1 in school, Austrian children receive systematic phonics instruction that teaches them the alphabetic system.” [33]

Ehri also notes that Portuguese-speaking children taught with a whole-word method did exhibit partial alphabetic reading, while those taught with a phonics method did not. The latter group started right in with decoding.

“Instructional method influences how long beginners show evidence of the partial alphabetic phase and how quickly they acquire use of full grapheme-phoneme connections.” [34]

Referring to her own phase theory, Ehri says:

“We expected that teaching students in the pre-alphabetic phase to decode [simple] syllables by sounding out and blending grapheme–phoneme constituents would enable these students to read more like readers in the full alphabetic phase, whereas teaching students to decode whole syllables and single grapheme–phoneme relations would limit these students’ movement to the partial alphabetic phase.” [emphasis added] [35]

Ehri’s expectations were confirmed in her recent instructional study comparing three ways Brazilian beginners are taught to read words. The three ways are: decoding at the grapheme-phoneme level, decoding at the level of whole syllables, and teaching grapheme-phoneme relations, but without decoding. Those taught how to decode at the grapheme-phoneme level far surpassed the other two groups:

“Present findings support these views. Students taught to use grapheme–phonemes to read CV syllables far outperformed the other two groups on posttests in decoding longer pseudowords, in remembering how to read a set of 12 longer words over several learning trials, and in recalling their spellings more accurately.” [36]

So the partial alphabetic phase, according to Ehri, can be short-to-nonexistent, or it can be quite lengthy. The main determinant is “instructional method.”

What would an instructional method look like if the goal was to maximize the time children spend as partial alphabetic readers? It would have many of the following characteristics:

“Sight words” would be understood as words to be learned consciously through rote-memorization, rather than learned automatically through OM. Lists of such words would be given out to students and their families to be memorized.
“Look at the picture” would be viewed as a “strategy for word-solving.”
Beginners would be expected to read whole words, whole syllables, and rimes before they acquired knowledge of grapheme-phoneme correspondences and decoding skill.
Phonemic awareness training would be done without letters, even though both reading and spelling obviously involve letters.
Beginners would be encouraged to guess the identity and pronunciation of a word based on its context, position in the sentence, and/or its first letter. It would be called three-cueing.
Children would be provided with reading materials that have lots of words they can’t possibly read. That way they could practice their three-cueing strategy.
Meaning, rather than the alphabetic code, would be emphasized at the start of instruction even though knowledge of that code forms the basis for extracting meaning from print.
“Discovery” would take precedence over explicit teaching when aspects of the code are finally taught.
In teaching the code, more emphasis would be placed on easier-to-learn consonant sounds than is placed on those 20 harder-to-learn vowel sounds, even though vowel sounds occur in every syllable.

How would Ehri evaluate such an instructional method? Well, as Chair of the Phonemic Awareness Subcommittee for the U.S. National Reading Panel, she produced a final report that said, no less than 15 times: phonemic awareness training should consist in blending and segmenting – and both should be done with letters.

Ehri also says this:

“Our findings challenge instructional approaches claiming that beginners can learn to read whole words before they have acquired knowledge of grapheme–phoneme relations. Without this knowledge, students would remain in the pre-alphabetic phase. Our findings challenge the view that prereaders will move into reading through exposure to and practice in reading authentically written, meaningful storybooks without much attention paid to teaching them foundational skills. Without this, progress will be halting and limited. Students may not function beyond the partial alphabetic phase. Our findings challenge the strategy of teaching students that guessing words using syntactic and semantic cues is better than decoding words using graphic cues. Guessing does not build students’ lexical memory to support word-reading accuracy and automaticity.” [37]

What, then, would an instructional method look like if the goal was to minimize, or even eliminate, the time children spend as partial alphabetic readers? We can begin to answer this question using a simple process of elimination. The instructional method we seek would not, per Ehri, employ word-guessing. (Even Balanced Literacy guru Jennifer Serravallo recently came to this conclusion.) It would not, per Ehri, use early onset-rime or any other type of analogy phonics.

What about analytic phonics? Analytic phonics requires the beginner to have something to analyze. That something is a large cache of sight words. But these sight words have to be consciously memorized by children long before they’ve been taught the skills needed to create proper, fully-connected sight words. Nothing in Ehri’s writings suggests this would be acceptable.

So the instructional method we’re looking for – if our goal is to minimize time spent in the partial phase – can’t include guessing from pictures, guessing from context, guessing from the first letter, early onset-rime, analogy phonics, or analytic phonics. What remains? I can think of only a single method: systematic synthetic phonics.

Taught using synthetic phonics, students waste no time on word-guessing, on rote-memorization of sight words, on phonemic awareness training without letters, or on discovery learning. Instead, they start learning letter-sound relationships, explicitly taught by a competent teacher, on Day 1. As soon as some (say, 7-10) of those relationships are mastered, students are taught how to decode words using those letters and sounds.

In a relatively short amount of time, students find themselves in the full alphabetic phase, creating fully-connected sight words, and identifying unknown words on their own. “Teaching letter sounds and decoding necessarily occupies a larger portion of instructional time until students master foundational skills. This enables students to function at the full and consolidated alphabetic phases.” [38]

Segmentation is also taught in synthetic phonics, not so much for its use in the “imperfect” partial alphabetic stage, but for its use in spelling and, as Ehri points out, for its use in creating fully-connected sight words when the word is not decoded, but rather is pronounced for the student.

One of the prerequisites for OM to occur is practice. This, too, is an essential part of synthetic phonics – a method that fully embraces decodable readers. Here's Ehri: “Decodable books provide beginners with practice in applying the grapheme–phoneme relations that they have learned to decode words and to build their sight vocabularies.” [39]

I should note Ehri has expressed a reservation concerning synthetic phonics:

“The ability to decode new words marks entry into the full alphabetic phase. A synthetic procedure for decoding words is to say the phoneme corresponding to each grapheme and then blend them to pronounce the word. Learning this procedure is hindered when schwa vowels are added to stop consonants and have to be deleted during blending.” [40]

(For more information on the schwa sound, see here.) Here’s an example of what Ehri is saying: To successfully read the word CAT, a student might attempt to blend “cuh” + “ah” + “tuh.” But CAT is not a blend of these 3 sounds. The student must eliminate the “uh” part of both “cuh” and “tuh” in order to successfully decode CAT. Some students have difficulty doing this. So Ehri and a colleague, Selenid M. Gonzalez-Frey, conducted a study [41] to see if this difficulty might be overcome. Note that this study was done with kindergartners (Year 1 of formal schooling in the US).

“We compared two methods of teaching decoding to kindergartners in the partial phase who knew letter sounds but could not decode nonwords. Students were taught to decode CVC nonwords containing continuant consonants, which allowed phonemes to be stretched and connected without interruption from schwa vowels (“sssaaannn”). Students in the connected condition were taught to stretch and pronounce phonemes without breaking the speech stream before blending. Students in the segmented condition were taught to stretch and say each phoneme but to break the speech stream between phonemes (“sss-aaa-nnn”) before blending. Following learning to criterion, students completed a transfer test to decode 20 CVCs with stop consonants that are harder to blend because of intrusion from schwa vowels when stops are pronounced in isolation. Results showed that during training, kindergartners who received connected practice learned to decode the nonwords more quickly, and on the transfer test, they read nonwords with stops more accurately than the segmented group. An error analysis revealed that breaking between phonemes caused students in the segmented condition to forget initial phonemes during blending. These findings suggest how to teach decoding more effectively to help students move into the full alphabetic phase.” [42] [Boldface mine.]

One country (so far) in the English-speaking world has formally adopted synthetic phonics as the way to teach beginners to read. That country is England. The decision was due to the 2006 Rose Report which concluded:

“Because our writing system is alphabetic, beginner readers must be taught how the letters of the alphabet, singly or in combination, represent the sounds of spoken language (letter-sound correspondences) and how to blend (synthesize) the sounds to read words, and break up (segment) the sounds in words to spell. They must learn to process all the letters in words and ‘read words in and out of text’. Phonic work should teach these skills and knowledge in a well defined and systematic sequence.” [Paragraph 45]

“Having considered a wide range of evidence, the review has concluded that the case for systematic phonic work is overwhelming and much strengthened by a synthetic approach, the key features of which are to teach beginner readers: • grapheme/phoneme (letter/sound) correspondences (the alphabetic principle) in a clearly defined, incremental sequence • to apply the highly important skill of blending (synthesizing) phonemes in order, all through a word to read it • to apply the skills of segmenting words into their constituent phonemes to spell • that blending and segmenting are reversible processes.” [Paragraph 51]

I’ll end this blog by asking Ehri to sum up her own long and successful career researching how children learn to read. I especially like this quote because it so clearly shows the centrality of decoding in Ehri’s thinking.

“In sum, the theory and research presented in this article show that teaching students to decode unfamiliar words and enabling students to store spellings of familiar words bonded to their other identities in memory should be central goals of beginning reading instruction. Decoding is a means of getting spellings of words into memory so they can be read by sight. Being able to connect letters in spellings to sounds in pronunciations spontaneously when spellings of words are seen and heard also serves to retain words in memory. Both decoding and letter–sound mapping skills require knowledge of the alphabetic writing system. Gradual acquisition of this knowledge propels students through the alphabetic phases to become skilled readers.” [43] [Boldface mine.]

Stephen Parker

Boston

October 2021

Please note: Stephen Parker’s FREE stand-alone phonics books for reading teachers and parents can be downloaded here.

Special thanks to:

Dr. Linnea Ehri who read the initial drafts of this blog and offered many suggestions, all of which have been incorporated into this final draft.

Dr. Pamela Snow, friend and mentor these past 3 years. Without her, there would be no Parker Phonics. Dr. Snow also offered many helpful suggestions for this blog.

Sir Jim Rose, for his constant encouragement and for his assistance promoting this blog throughout his circles of influence - especially in England.

Sources:

[1] Ehri, L.C. (2005). Learning to read words: Theory, findings and issues. Scientific Studies of Reading 9, 167-188, p168.

[2] Ehri, (2005) “Learning to Read Words,” p170.

[3] Ehri, L.C. (1998). Grapheme-phoneme knowledge is essential for learning to read words in English. In J. Metsala & L. Ehri (Eds.), Word Recognition in Beginning Literacy (pp. 3-40). Mahwah, NJ: Erlbaum, p12-13.

[4] Ehri, L.C. (2014). Orthographic mapping in the acquisition of sight word reading, spelling memory, and vocabulary learning. Scientific Studies of Reading 18(1), 5-21. p6.

[5] Ehri, (1998) “Grapheme-Phoneme Knowledge,” p14.

[6] Sargiani, R., Ehri, L.C., & Maluf, M.R. (in press). Teaching beginners to decode consonant-vowel syllables using grapheme-phoneme subunits facilitates reading and spelling compared to teaching whole syllable decoding. Reading Research Quarterly. https://doi.org/10.1002/rrq.432. p2-3.

[7] Ehri, (2005) “Learning to Read Words,” p171.

[8] Ehri, L.C. (2020). The science of learning to read words: A case for systematic phonics instruction. Reading Research Quarterly, 55(1), S45-S60. Special Issue: The Science of Reading: Supports, Critiques, and Questions. https://ila.onlinelibrary.wiley.com/doi/epdf/10.1002/rrq.334. p4.

[9] Ehri, (2020) “The Science of Learning to Read Words,” p2-3.

[10] Ehri, (2020) “The Science of Learning to Read Words,” p10.

[11] Ehri, (2014) “Orthographic Mapping in the Acquisition,” p7.

[12] Ehri, (2020) “The Science of Learning to Read Words,” p3.

[13] Ehri, (2014) “Orthographic Mapping in the Acquisition,” p19.

[14] Ehri, (2005) “Learning to Read Words,” p172.

[15] Sargiani et al. (in press) “Teaching Beginners to Decode,” p3.

[16] Ehri, (2005) “Learning to Read Words,” p173.

[17] Ehri, (2020) “The Science of Learning to Read Words,” p10.

[18] Ehri, L.C. (2005). Development of sight word reading: Phases and findings. In M. Snowling & C. Hulme (Eds.), The science of reading: A handbook (pp. 135-154). UK: Blackwell. p145.

[19] Ehri, (2005) “Learning to Read Words,” p173.

[20] Ehri, (2005) “Learning to Read Words,” p174.

[21] Ehri, (2005) “Development of Sight Word Reading,” p145.

[22] Ehri, (2005) “Development of Sight Word Reading,” p146.

[23] Ehri, (2005) “Learning to Read Words,” p184.

[24] Sargiani et al. (in press) “Teaching Beginners to Decode,” p16.

[25] Ehri, (1998) “Grapheme-Phoneme Knowledge,” p21.

[26] Ehri, (2014) “Orthographic Mapping in the Acquisition,” p5.

[27] Chambre, S., Ehri, L.C., & Ness, M. (2020). Phonological decoding enhances orthographic facilitation of vocabulary learning in first graders. Reading and Writing: An Interdisciplinary Journal, 33(5), 1133-1162.

[28] Ehri, (2020) “The Science of Learning to Read Words,” p11.

[29] Ehri, (2014) “Orthographic Mapping in the Acquisition,” p8.

[30] Sargiani et al. (in press) “Teaching Beginners to Decode,” p15.

[31] Sargiani et al. (in press) “Teaching Beginners to Decode,” p16.

[32] Ehri, (2005) “Development of Sight Word Reading,” p145.

[33] Ehri, (2005) “Development of Sight Word Reading,” p145-146.

[34] Ehri, (2005) “Development of Sight Word Reading,” p146.

[35] Sargiani et al. (in press) “Teaching Beginners to Decode,” p4.

[36] Sargiani et al. (in press) “Teaching Beginners to Decode,” p16.

[37] Ehri, (2020) “The Science of Learning to Read Words,” p13.

[38] Ehri, (2020) “The Science of Learning to Read Words,” p13.

[39] Ehri, (2020) “The Science of Learning to Read Words,” p11.

[40] Ehri, (2020) “The Science of Learning to Read Words,” p9.

[41] Gonzalez-Frey, S.M., & Ehri, L.C. (2021). Connected phonation is more effective than segmented phonation for teaching beginning readers to decode unfamiliar words. Scientific Studies of Reading, 25(3), 272-285.

[42] Ehri, (2020) “The Science of Learning to Read Words,” p8.

[43] Ehri, (2020) “The Science of Learning to Read Words,” p13-14.
