Language acquisition:

A critique on "A corpus driven study of the potential for vocabulary learning through watching movies"

Raymond Cheng

Photo © Pavel Losevsky

About the journal paper

The following is a critique of a journal paper in the area of language acquisition (specifically, vocabulary learning) by Dr. Stuart Webb (2010), Senior Lecturer in the School of Linguistics and Applied Language Studies at Victoria University of Wellington. The paper, titled "A corpus driven study of the potential for vocabulary learning through watching movies," appeared in the International Journal of Corpus Linguistics, 15(4), pp. 497-519 [1]. It attempted to justify the potential for significant incidental vocabulary learning through watching movies regularly over time by comparing the frequency of words appearing in the transcripts of 143 movies with Nation's (2004) 4th to 14th 1,000-word BNC lists. The paper claimed that movies may be a valuable resource for incidental vocabulary learning.
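
To make the corpus-driven approach concrete, the comparison between a transcript and frequency-banded word lists can be sketched roughly as follows. This is only an illustration under stated assumptions, not Webb's actual procedure: the word lists below are tiny made-up stand-ins rather than Nation's (2004) BNC lists, the function names `band_coverage` and `encounters` are hypothetical, and raw word forms are matched instead of word families.

```python
import re
from collections import Counter


def tokenize(transcript):
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def band_coverage(transcript, band_lists):
    """Return the share of running words that falls into each frequency band.

    band_lists maps a band label to a set of words (hypothetical stand-ins
    for 1,000-word frequency lists). Words on no list count as "off-list".
    """
    tokens = tokenize(transcript)
    if not tokens:
        return {}
    word_to_band = {w: band for band, words in band_lists.items() for w in words}
    counts = Counter(word_to_band.get(t, "off-list") for t in tokens)
    return {band: n / len(tokens) for band, n in counts.items()}


def encounters(transcript, band_words):
    """Count how often each word of one band is met in the transcript."""
    return Counter(t for t in tokenize(transcript) if t in band_words)


# Toy data (assumption): two miniature "bands" and a two-sentence "transcript".
toy_bands = {"1k": {"the", "saw"}, "4k": {"wizard", "dragon"}}
toy_transcript = "The wizard saw the dragon. The dragon fled."

print(band_coverage(toy_transcript, toy_bands))
print(encounters(toy_transcript, toy_bands["4k"]))
```

Counts like these only show how often a viewer could meet a word; they say nothing about whether the viewer was paying attention at that moment.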

But before I start, I need to disclose my gut feeling toward the paper: I love going to the movies and, personally speaking, I have learnt a great deal of vocabulary from movies of all sorts (including Hollywood, British, French, etc.) over the years. So basically, I would consider myself a believer in Webb's research. But this time I will try to be a bit more critical and see if it really supports what I have always believed.

Starting from the cognitive perspective

Unless you have better equipment at home, find the 3D experience a turn-off, or hate bumping into laugh-at-anything, text-through-the-movie rebels, human IMDBs [2] who spew out spoilers, or latecomers who make noisy entrances, going to a movie for entertainment may not be a bad choice. However, how attentive cinema-goers can remain during the average one-and-a-half-hour movie is a different issue, especially when it comes to incidental vocabulary learning. Built upon earlier psychology models [3], contemporary cognitive theories [4a] of attention (Treisman & Gelade, 1980; Wolfe, 1994) tell us that people become more attentive to tasks when there is more than one type of stimulus or modality, which explains why people are drawn in and stay focused during movies. However, our attention (or cognitive load) not only varies with culture (Correa-Chavez & Rogoff, 2011) but can also be selective [4b] (Eriksen & Hoffman, 1972; Eriksen & St James, 1986) and change as we age (Lavie, Hirst, de Fockert & Viding, 2004). Younger people are able to process multiple stimuli but find it more difficult to differentiate between relevant and irrelevant information, whereas older people find it easier to identify what is relevant yet become less capable of processing multiple stimuli (p.341). In terms of incidental vocabulary learning through watching movies, this would mean that younger people may fail to identify the "relevant" moments for learning vocabulary because they are more likely to be distracted by their keener perception of other stimuli, e.g. actions on the screen, background music, special effects, etc., whereas older people may well be concentrating on the "relevant" development as well as the underlying message and philosophy of the film, hence bypassing the opportunity to learn a new term or word.

In other words, the claim that the potential for incidental vocabulary learning can be established by comparing movie transcripts with Nation's (2004) BNC word lists is, in terms of cognitive theories, somewhat problematic [5a]. This is all the more so when most of the usual components of "incidental learning", e.g. task accomplishment, interpersonal interaction, sense of the organized environment and trial-and-error experience (Marsick & Watkins, 2001, p.25) (obviously for adults, in this case), are simply absent, which leaves questions about the paper's research methodology that require further clarification and explanation [5b].

But even if the methodology had withstood the challenge, the criteria by which the data were collected would still have compromised the research's internal validity. A check of the 143 movies studied in the paper against the top 10 films [6] with the highest gross revenues of each decade (in Table 1) as well as those with the highest UNESCO film popularity scores [7] (see Figure 1) reveals that the author's choice of movies was neither commercially geared toward the box office, nor statistically favored in terms of actual admission headcounts [8], let alone preferentially selected for analyzing the effects of any particular movie sequel or trans-media production [9]. While the author attributed this choice to the "availability of movie scripts" (Webb, 2010, p. 504), that explanation is obviously ungrounded, given that text-formatted subtitles can now be easily extracted from any DVD or downloaded instantly from online subtitle databases [10]. And, given that some of these movies date back to as early as the 1930s [11], one may question the generalizability of the results, since we know that the use of a language changes and evolves over time.

1980-1989
Star Wars: The Empire Strikes Back (1980)
Raiders of the Lost Ark (1981)
E.T.: The Extra-Terrestrial (1982)
Star Wars: Return of the Jedi (1983)
Ghostbusters (1984)
Indiana Jones and the Temple of Doom (1984)
Beverly Hills Cop (1984)
Back to the Future (1985)
Batman (1989)
Indiana Jones and the Last Crusade (1989)

1990-1999
Home Alone (1990)
Jurassic Park (1993)
Forrest Gump (1994)
The Lion King (1994)
Independence Day (1996)
Titanic (1997)
Men in Black (1997)
Star Wars: Episode I - The Phantom Menace (1999)
The Sixth Sense (1999)
Toy Story 2 (1999)

2000-2009
Spider-Man (2002)
The Lord of the Rings: The Return of the King (2003)
Shrek 2 (2004)
Spider-Man 2 (2004)
The Passion of the Christ (2004)
Star Wars: Episode III - Revenge of the Sith (2005)
Pirates of the Caribbean: Dead Man's Chest (2006)
The Dark Knight (2008)
Avatar (2009)
Transformers: Revenge of the Fallen (2009)

Table 1. Top 10 movies of each decade, 1980-2009 (in terms of gross revenue)

Figure 1. Popularity scores [12] of top 20 featured films 2007-2009. Source: UNESCO Institute for Statistics, January 2012. Note that the data-splits for each of the years are characterized by their respective color dotted lines.

Figure 2. Frequency of attendance per capita [13] for the top 10 countries (population aged 5 to 79), 2006-2009. Source: UNESCO Institute for Statistics, January 2012.

Note 1: Also available electronically, doi 10.1075/ijcl.15.4.03

Note 2: IMDB, Internet Movie Database, see http://www.imdb.com/

Note 3: See Wickens' (1984) Multiple Resource Theory (MRT) model.

Note 4a: Please refer to the highly influential Feature Integration Theory (1980) developed by Anne Treisman and Garry Gelade and the Guided Search Theory (1993) by Jeremy Wolfe.

Note 4b: For details on selective attention, see the Spotlight model (Eriksen & Hoffman, 1972) and the Zoom-lens model (Eriksen & St James, 1986).

Note 5a: The author of the journal paper, Dr. Stuart Webb, pointed out the following in an email to me: "Corpus-driven studies of incidental vocabulary learning provide an indication of how the occurrence of vocabulary in a text type might affect learning. Thus, they indicate what might happen in empirical studies and suggest that the research is followed up with empirical studies doing this. In my study, it is not about the specific movies that were analyzed. These sets of movies provide an indication of the likely distribution of vocabulary occurrence in a certain amount of viewing time. Thus, if we replaced one set with a completely different set, we may find a similar distribution of words according to frequency."

Note 5b: Dr. Stuart Webb also mentioned that I had misunderstood the nature of corpus-driven methodologies and disagreed that there had been methodological flaws. The study, according to Webb, was not an empirical study reporting that people would learn a certain number of words through watching certain sets of movies.

Note 6: See the all-time box office hits (by decade and year) at http://www.filmsite.org/boxoffice2.html

Note 7: According to the United Nations Educational, Scientific and Cultural Organization (UNESCO), film popularity is the measure of cinema admissions (UNESCO, 2012).

Note 8: Because cinema ticket prices vary across countries, box office records should not be presumed to accurately reflect admission headcounts. In addition, according to a research paper (Saptadi, 2009) published by The Nippon Foundation, Hollywood blockbusters now account for at least 75% of the European market, 96% of box office receipts in Taiwan, approximately 78% in Thailand, 65% in Japan, and more than 60% in mainland China (Jensen, 2012). In fact, Asia is Hollywood's fastest growing regional market, and it is predicted that within 20 years Asia could be responsible for as much as 60 percent of Hollywood's box-office revenue. In short, we do need to look at the box office because people are going to those movies!

Note 9: Movie sequels and trans-media films (i.e. those whose characters, settings and storylines are developed across print, film and web-based media) are much more popular than the average movie (UNESCO, 2012).

Note 10: There are websites that allow the general public to download movie subtitles in a variety of languages free of charge, for instance, the Open Subtitles website at http://www.opensubtitles.org/

Note 11: Out of the 143 movies selected for Webb's (2010) journal paper, 36 of them, i.e. over 25% of all movies, were released before the 1970s.

Note 12: Source: UNESCO Institute for Statistics, January 2012. Note that the data-splits for each of the years are characterized by their respective color dotted lines.

Note 13: In 2006, Ireland appeared in the list due to the exceptional success of the Irish movie "The Wind That Shakes the Barley."