Over the last 12 months, I’ve been reading a fair bit about educational ‘measurement’ and, while doing so, this well-known George Orwell quote has come to mind many times:
So much of left-wing thought is a kind of playing with fire by people who don’t even know that fire is hot.
The line immediately preceding this describes a kind of ‘amoralism’ which “is only possible, if you are the kind of person who is always somewhere else when the trigger is pulled.”
These lines appear in a 1940 piece, Inside the Whale, about the “warmongering to which the English intelligentsia gave themselves up in the period 1935-9” and, as such, were part of a strident criticism of gravely important matters. I don’t want to suggest that there’s a parallel between the build-up to World War II and language assessment beliefs, discourse and practices. But Orwell’s lines did come to mind again tonight when I read about the Duolingo English Test in ELTjam’s latest blog post, ‘8 ELTjam predictions for 2017’.
In the interests of living a more positive and fulfilling life, I’d put Duolingo out of my mind since reading in 2015 about their ridiculous partnership with Uber. But I’ve been thinking for some time about writing about language assessment, and so much of what is wrong with assessment in ELT is reflected in this ELTjam post that it’s had a galvanising effect on me.
ELTjam’s Nick Robinson writes that “Duolingo’s assessment product features cloze activities that are clearly generated from out-of-copyright texts that happen to include a requisite number of target items (see screenshots below).”
I’ve seen (and been responsible for) a lot of poorly designed language test items but this is one of the very worst. It might be adequate if it were intended to test English majors’ knowledge of verbs used by James Joyce on page 6 of Ulysses.
But it’s not intended to test English majors’ knowledge of verbs used by James Joyce on page 6 of Ulysses. It appears to be a genuine part of what Duolingo describe as a test which is “scientifically designed to provide a precise and accurate assessment of real world language ability” and “significantly correlated with the TOEFL iBT (a standardized English test)”.
Every part of that claim is utter bullshit and the above sample test item is ample evidence. If you’re not convinced, here are some items from the Duolingo ‘Practice Test’ I took a few minutes ago (after signing up of course: you have to exchange a marketing lead and your data for the privilege of determining whether it might be worth spending $49 to take the full test and receive an “official score and certificate”).
The first one is more of the same kind of weirdness that users of the Duolingo app are already familiar with. The second one is from the ‘History of the New York Jets’ Wikipedia page. That last one is based on a passage from a Sherlock Holmes story. They’re all completely appalling and no amount of flimsy ‘validity studies’ will compensate for such a cavalier approach to item construction.
Any company charging $49 for such a test and then pretending to be ‘partners’ with 45 universities and colleges, including Yale, should be dismissed as a joke. Instead we have influential folk like ELTjam meekly stating that “it’s just a matter of time” before the technology is “where it needs to be in order to create the level of content quality that an ELT publisher would expect” and anyway “how many students actually know or care” if the proprietors’ claims about their technology are utter bullshit, “especially if a product is free” [except that in this case it costs $49].
But people like Duolingo’s Luis von Ahn and ELTjam don’t seem to understand the complexity and seriousness of what it is they’re playing with. ‘Automated content creation’ in language assessment is just another step further along “a continuum that started with the switch from royalties to fees and ends with the use of algorithms to create learning materials that publishers would have had to pay authors to produce in the past.”
In an effort to crystallise my own thoughts and deepen my understanding, I’m aiming this year to write more about the complexity and seriousness of assessment and some of the problematic beliefs and practices I’ve come across in ELT.
3 thoughts on “Playing with fire when you don’t even know it’s hot: ELTjam, Duolingo and irresponsible practices in language assessment”
Wow, that’s truly horrid!
Do you mean the Duolingo English Test or my writing? 😉
Definitely the former!!!