What is automated scoring?
- Computer software that automatically assigns scores to writing or speaking samples.
- Essays can be assigned scores instantly by computer.
- Test takers can call a testing centre and take an oral test without speaking to a human.
- Scores can be reported instantly.
- Some level of feedback is given to test takers.
- There is a variety of software available.
1. Natural Language Processing (NLP)
- software identifies and counts linguistic features.
- software does not attempt to gauge content in any way.
- used for testing writing.
- software compares the speech sample to a large database of responses to the same test questions.
- faster responses are 'more fluent'.
- used for testing speaking.
- automated scoring of timed essays
- uses NLP
- currently used in a limited way to rate TOEFL and GRE essays
- used for formative assessment (e.g. TOEFL Practice Online)
- individual assessment
- students submit essays, receive scores and rewrite them as many times as they want in order to improve their scores
- the number of words
- the number of sentences
- the number of paragraphs
- sentence length
- the number of unique words used versus the total number of words (lexical diversity)
- the number of low-frequency words (lexical depth)
- the number of prompt-specific words (topic appropriateness)
- dependent/independent clauses
- passive voice
- subject-verb agreement
- sequencing words
- logical relations
- mechanics (punctuation, for example)
What is a good E-rater essay?
- It's long - longer is always better!
- It has a standard structure.
- It has many longer sentences with a lot of dependent clauses.
- It has many explicit organisational words.
- It has a lot of obscure vocabulary - for example, 'indubitably' would score much higher than 'surely'!
- It has a wide range of vocabulary.
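The countable features listed above (number of words, sentences and paragraphs, sentence length, lexical diversity, low-frequency words and prompt-specific words) can be illustrated with a short Python sketch. This is only an illustration of how such surface counts might be computed, not E-rater's actual implementation. The function name `surface_features` and the two word lists passed to it (`common_words` and `prompt_words`) are hypothetical, and the sentence splitting is deliberately naive.

```python
import re


def surface_features(essay, common_words, prompt_words):
    """Count simple surface features of an essay (illustrative only).

    common_words: hypothetical set of high-frequency words; anything
                  outside it is treated as 'low-frequency'.
    prompt_words: hypothetical set of words tied to the essay prompt.
    """
    # Paragraphs are blocks of text separated by a blank line.
    paragraphs = [p for p in re.split(r"\n\s*\n", essay) if p.strip()]
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", essay.strip()) if s]
    # Lower-cased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", essay.lower())

    n_words = len(words)
    return {
        "words": n_words,
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
        "mean_sentence_length": n_words / len(sentences) if sentences else 0.0,
        # Unique words versus total words (lexical diversity).
        "lexical_diversity": len(set(words)) / n_words if n_words else 0.0,
        # Words not in the common list (lexical depth).
        "low_frequency_words": sum(1 for w in words if w not in common_words),
        # Words taken from the prompt (topic appropriateness).
        "prompt_specific_words": sum(1 for w in words if w in prompt_words),
    }


if __name__ == "__main__":
    sample = "Cats sleep. Cats eat fish.\n\nDogs bark!"
    print(surface_features(sample, {"cats", "eat", "sleep", "dogs"}, {"cats", "dogs"}))
```

Note that nothing in these counts looks at whether the essay makes sense, which is why a long essay full of rare words can score well regardless of the quality of its argument.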
What does E-rater not notice?
- Grammatical errors
- Lexical errors
- Flawed arguments
This is an E-rater application designed for in-class use. Students' essays are instantly scored using E-rater software. Students are given individual scores and extra resources to refer to about their errors.
Versant is the first fully automated oral language test used commercially. It is a Pearson product. The test is taken in a computer lab or over the phone (speaking to a computer). The computer automatically rates the speech and produces scores. It is used widely in business and increasingly in schools. There are many versions for different purposes and languages, including one for the aviation industry.
The test is fifteen minutes long and includes:
- repeating sentences
- scrambled sentences
- oral multiple choice
- sentence mastery
What is a good Versant response?
- It's fast (fluency score)
- It's clear
- It's accurate
- It has native-like pronunciation
What Versant doesn't measure:
- the range of vocabulary used
- extended speaking
- pragmatics - cultural awareness, for example
- the ability to interact with others
- computers don't get tired
- computers aren't biased for or against individuals
- scores are more consistent than with human raters
- it's less expensive than using human raters
- scores and feedback are obtained instantly
Automated tests can be 'gamed' or tricked. Versant scores, for example, can be quickly raised by coaching.
Positive effects on teaching
- Students can get more and faster feedback.
- The form of the test can influence what happens in the classroom.
- Teachers tend to focus on what is tested at the expense of communicative teaching.
- There can be a decreased focus on the quality of the content.
- There can be an increased focus on grammatical accuracy and low-frequency vocabulary.
- There is more oral repetition in order to increase the students' speed of response.
- There is less time spent on developing critical thinking.
- There is a decreased focus on the pragmatic.
Despite the obvious drawbacks, computer-scored testing is in all our futures.