December 17, 2017

The lexical approach revisited

"Beginning in the 1980s, computer-based studies (mainly of English) began to provide us with powerful insights into the workings of our language. Linguists fed millions of English documents into software programs to scan them and see what they might yield about their patterns of behavior. These studies were known as ‘corpora’ studies. From the beginning, the corpora studies began to reveal surprising insights into how words interact and behave with each other.

The studies offered empirical data, based on a very broad range of English language sources. They allowed us to take a given word or expression and look at how it behaved over the course of thousands of examples - how it was used grammatically, where it was likely to be used, with whom it was most likely to keep company, etc. The results were often startling and they began to challenge traditional ideas about the role of grammar and even about how we defined grammar.

One outgrowth of these studies was the development of the ‘lexical approach’ to language teaching. The first description of a lexical approach is attributed to Michael Lewis, who wrote a book of that title in 1993. This book became a classic amongst language teachers and I myself have been greatly influenced by it over the years. I am convinced that the lexical approach (with some revisions) offers very useful insights into how we might approach the study of Mandarin, so let me explain a little about what it is.

The most striking revelation from the corpora concerns how words tend to associate strongly with other words in the form of chunks, fixed expressions, collocations, etc. As an example, let’s take a look at collocation. The word ‘collocation’ refers to the tendency amongst words to collocate, or ‘co-locate’ (appear close to) certain other words. Some random examples (out of millions of possibilities):

seriously ill
serious problem
serious accusation
common cold
unfair advantage
decisive action
strong tea
join hands
commit a crime

If you typed the word ‘seriously’ into the corpora software, it would yield thousands of sentences (taken from original documents) and show you the words that ‘seriously’ was most likely to appear next to. In this case, ‘seriously’ occurred much more frequently with the word ‘ill’ than with any other word. We can therefore say that ‘seriously’ collocates with ‘ill’. The word ‘serious’, meanwhile, is more likely to appear next to ‘problem’ or ‘accusation’ than next to any other words, and so on.
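
For readers who like to see the mechanics, here is a minimal sketch in Python of the kind of lookup described above: it counts which word most often follows a given word. The seven sentences and the function name right_collocates are invented for illustration, not taken from any real corpus or corpus tool, and real software typically works on millions of sentences and uses statistical association measures rather than raw counts.

```python
import re
from collections import Counter

# Hypothetical mini-corpus; a real corpus would hold millions of sentences.
SENTENCES = [
    "Her father was seriously ill last winter.",
    "He became seriously ill after the trip.",
    "She was seriously ill for a month.",
    "They are seriously considering the offer.",
    "This is a serious problem for the school.",
    "Nobody expected such a serious problem.",
    "That is a serious accusation to make.",
]


def right_collocates(node, sentences):
    """Count the words that appear immediately after `node`."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Walk over each adjacent pair of words.
        for current, following in zip(tokens, tokens[1:]):
            if current == node:
                counts[following] += 1
    return counts


print(right_collocates("seriously", SENTENCES).most_common(2))
# [('ill', 3), ('considering', 1)]
print(right_collocates("serious", SENTENCES).most_common(2))
# [('problem', 2), ('accusation', 1)]
```

Even on this toy data the pattern from the list shows up: ‘ill’ is the most frequent right-hand neighbour of ‘seriously’, while ‘serious’ pairs with ‘problem’ and ‘accusation’.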

The other phrases on the list above are everyday expressions (or collocations) that every native speaker of English knows. But here’s the really interesting thing: even advanced level non-native speakers are unlikely to know these expressions! In fact, a non-native speaker is more likely to make a mistake when using such expressions than to use bad grammar. (If ever you are in doubt about whether someone is a native speaker of English, just test his/her knowledge of these kinds of expressions.)

To non-language teachers, the examples of collocations I offer may seem trite, but let me tell you that they set off a firestorm of innovation and debate in the language teaching world that has continued unabated to this day. (Actually, while we’re at it, ‘to this day’ is a nice fixed expression, while the word ‘unabated’ tends to occur with ‘floods’ or ‘firestorms’ or things like that, for some reason!)"




  1. Michael says:

    I don’t know if the lexical approach per se set off a firestorm so much as the combined weight of all forms of data-driven language research, which has changed the face of how we teach languages, especially English.

    The Lexical Approach had one big disadvantage: you couldn’t build a sequenced curriculum around the results. It was darn hard ordering the phrases by difficulty and it was difficult teaching fixed multi-phrase chunks in a communicative setting.

    Do you have any insight into how to order a curriculum or teach multi-word chunks using a lexical approach?

  2. Orlando Kelm says:

    I really like your question, Michael. It got me thinking about class yesterday. (I’m teaching an advanced Portuguese class this semester.) We often review transcripts from hundreds of video clips that I have recorded of Brazilians talking about different things. I like it because it isn’t common to have a written transcript of oral speech. The transcripts give us a way to see how people really talk. Our approach in class is usually something like, “I know that you can already say these things in Portuguese, but let’s look at how native speakers convey those same ideas.”
    Yesterday the students caught on to a phrase that included “continua gostando” (keep on liking) and I believe that it finally sank in that although book learners learn how to say “continua a gostar” (continue to like), no Brazilian would actually say that. Hopefully they have a new chunk.
    Anyway, to keep this reply brief, the “curriculum”–you might say–is to analyze how Brazilians use language through our study of the transcripts of their natural speech. I don’t really feel a need to sequence the lexical chunks or limit how many they are exposed to. In case you are interested, here’s the URL to the Brazilian video clips:

  3. Michael says:

    Pretty impressive Orlando. I particularly liked the way you had things organized.

    In class, do you point this language out to students (so content serves to support your teaching) or do you have them look at the raw data and draw conclusions on their own?

    I believe both approaches are currently being used. I am more interested in the instructed approach. But, I recently came across a journal article that recommends the second. You might be interested.

    Language Learning and Technology, October 2008; The Pedagogical mediation of a developmental learner corpus for classroom-based language instruction.

  4. Ken Carroll says:


    I agree that lexis is no silver bullet for the teaching of languages. (I think we all agree that there is no silver bullet.) For me, however, lexis has opened up new possibilities, such as Orlando’s empirical approach to selecting learning content on the basis of what really happens when native speakers of L2 communicate.

    Clearly there are patterns in spoken language and learning them is better than a discrete-point, word by word approach to learning the language. Those patterns could be organized on the basis of frequency or relevance to the context of study. Ultimately, though, I’m not sure this stuff can be organized in textbook form. I have no idea how a university professor would choose which phrases belong in a course, and which ones do not, or the rationale for how they would be presented. The only solution I see is a dynamic syllabus, updated and developed according to the needs of the learners. (Maybe this is why we’ve ended up with 1,000 lessons on CPod!) I’m starting to think that the network approach is the only answer. More on this later.


  5. Michael says:

    Actually, I am a big supporter of data driven language learning. Like you (I think), I also strongly believe that a lexically driven curriculum can and should access specialized lexical fields (e.g. the language of cooking) as often as it should turn to broad general fields (BNC corpus for example).

    In terms of Chinese this decision is easier because of a lack of any comprehensive, generally accepted corpus that underpins materials creation. Insofar as Chinese is concerned, any movement down this path is positive and much needed.

    But my question is not whether a lexically informed approach is needed but rather how to use the “stuff” that we create or we observe in the speech of others. If we order language teaching in a hierarchy moving from basic to advanced, how do we integrate a hierarchical approach with a lexical approach? Should a lesson on the language of cooking precede a lesson on the language of scuba diving, and if so, why or why not? Should a lesson on the language of cooperation precede a lesson on the language of disagreement? Should a lesson on the lexical uses of “get” precede a lesson on the lexical uses of “make” or “know” or “banish”? This takes on a great deal of importance if we agree that a lot of grammar is only disguised lexis. Indeed, is making decisions like these more an art than a science?

    So how do we make these decisions? Should we leave these decisions entirely up to the interests of students? And what happens when we are not teaching self-directed learners that are capable of making smart choices? What if we are teaching a class full of students who need to follow a common thread?

    Personally I have a list of rather interesting questions I want answered. Maybe Orlando could chime in. How do we turn lexical phrase reception (passive) into production (active)? Are phrases learned and remembered differently than single words? Do phrases need to be actively produced in context to be stored for future reuse? How many times do they need to be heard or produced before they are remembered?

    I’m interested in how you see a network approach as an answer. Of course, given that this is your blog, I should allow for other questions as well, so I am curious about what questions you hope to answer.

  6. I’ve had my own small version of Orlando’s experience. A good part of my language practice occurs in Second Life, which as a virtual world has “residents” from many countries.

    Many of my online francophone friends prefer to speak in text chat, rather than in voice. Especially in one-to-one conversations, I see how they actually use language, and discover many practices like the continua gostando / continua a gostar situation Orlando mentions. (It took me two weeks to realize that no one used demeurer to say, “I live in Paris”; it would be more like “I reside.”)

    So, with the approval of the person I’m talking to, I’ve sometimes turned on a feature that records a text conversation, giving me idiomatic evidence of French being used.

  7. Hanyu Man says:

    I like Michael’s questions.

    I am not an educator, just a weak and slow learner. I spent my first 1.5 years of Chinese language study relying almost exclusively on Chinesepod, so was immersed in the lexical approach. I think it mostly is very good. From my perspective, the weakness with Chinesepod’s specific tack is that it doesn’t systematically expand and reinforce the use of lexical chunks across lessons, since each lesson is an independent module, with no expected sequence or explicit layering between them.

    The very most common lexical constructs and patterns that occur in the language are adequately reinforced by Chinesepod, since they naturally occur across a broad swath of different lessons. For example, you can’t help but run into “zen3 me2 yang4” (= “what about”) multiple times. But, it is mostly random hit and miss as to whether any but the most common chunks will ever be reinforced enough to develop into a lasting component of the student’s vocabulary.

    For example, I may come across “seriously ill” in an individual lesson, but will I ever see it again? Will I ever come across related chunks like “serious accident” or “serious accusation” in a future module? Who knows. If not, the value of time spent learning this chunk will mostly be lost. It will rapidly disappear from memory.

    During the podcast for a dialogue which includes the chunk “seriously ill”, the hosts may talk about related variants, like “serious accident” and “serious accusation”. This is good, but since it isn’t part of the actual lesson dialogue, it is unlikely that I will retain this information. Only material included in the lesson dialogue receives enough of my focus and repetition to have the potential for setting in.

    A very good feature that Chinesepod added sometime in the last year or so is the ability to search for the use of a specific word in other lessons. This is a help. A student might be able to use this to design their own lesson sequence that reinforces sets of lexical chunks. I hope to eventually return to Chinesepod and try this someday.

  8. Peter Easton says:

    The drawback with the lexical approach is that it’s cliché-forming. Remember Orwell’s rule:

    ‘Never use a metaphor, simile or other figure of speech which you are used to seeing in print.’

    This holds for turgid discourse markers and other unimaginative phrases. Corpora are too formulaic and encourage poor style and lazy thinking. Language is far more flexible than a few thousand standard pairings thrown out of a corpus. I can think of more creative things to do with my classroom time than teaching hackneyed collocations – which learners can acquire passively through general exposure to the language outside of the classroom. The ‘real world’ is the best place to learn vocab, not the classroom. The classroom is really the place to develop and polish what vocab and grammar they already have.

    You have to teach everything in chunks, in other words, with some syntax, even if it’s just an article tagged on to a word, but teachers and students should lean towards making their own chunks where possible.

  9. Ken Carroll says:


    I’m generally less concerned about literary sensibilities than with learner needs. Orwell’s essay was not written for the ESL learner.

    I’m glad that you find creative ways to spend classroom time, which is to say that you are doing your job. Again, however, creativity is one thing, while the issue of what our students need to learn is another. In my experience, most Chinese learners simply want to learn to communicate in English through email and conversation. They also want to sound like something that approximates native speech. All of this requires learning high frequency language and using it where appropriate. If they do not learn to identify these patterns they will indeed produce uniquely creative syntax. You won’t have to teach that.


  10. DAVID says:

    a great deal of research has recently taken place in the USA and Great Britain regarding “language Corpora” which has led to the resurgent interest in the lexical approach. what do you believe are the three biggest developments in this area?


    • Don Diego says:

      hey, that’s interesting David, I came to this site with exactly the same question in mind, word for word…. so what did you find out?

      if anything this indicates that the “recent research” took place before 2009…

      Ken I loved the article by the way, thanks

      • Ken Carroll says:

        Hey Don,

        I haven’t been following these studies that closely in recent times, but they’re having a much bigger effect on our world than I had seen when the post was written. Imagine Google without the lexical approach to search.
        And Google also uses lexis – not grammar, not semantics – in its translation tools. It’s all done on the probabilities of collocation. Amazing stuff.


        • Don Diego says:

          indeed, I never thought of that, that is pretty big… thank you very much for bringing this aspect to my attention Ken

          • Andrew says:

            Thanks for that Ken and Don,

            Don I guess you’re doing the same or similar course as I am, word for word!

            Ken you’re spot on about Google, that explains the often very natural sentences in very unnatural situations often found in services like Google Translate.

            Thanks for the useful information and broadening my mind 😉

  11. clarohme says:

    “Especially in one-to-one conversations, I see how they actually use language, and discover many practices like the continua gostando / continua a gostar situation Orlando mentions. (It took me two weeks to realize that no one used demeurer to say, “I live in Paris”; it would be more like “I reside.”)”
    Can you say more about it?

  12. Nice post! Good to remind ourselves and revisit the Lexical Approach, alongside the use of corpus data to help us understand the way collocations work.
    So now for the quiz question…what are the strongest collocates of the following words:

    Brazen ______
    _____ amok

    ; )

  13. Gebre says:

    Hello dears! I have read Michael Lewis’s books because I am now doing my dissertation in this regard. However, there are very confusing ideas in them. For instance, if the Lexical Approach is a development of the Communicative Approach, what basic elements does it bring? It says grammar and vocabulary should be taught together, but in what way? Why is teaching vocabulary with chunks more advantageous than teaching it with single words? Which elements should be focused on for vocabulary teaching?

