Automated Essay Scoring
Automated Essay Scoring (AES) is an emerging area of assessment
technology that is gaining the attention of Canadian educators and
policy leaders. It involves training computer engines to rate essays
by considering both the mechanics and the content of the writing.
Even though it is not currently being practiced, or even tested on a
wide scale, in Canadian classrooms, the scoring of essays by computers
is fueling debate and creating a need for independent research to
inform decisions on how this technology should be handled.
However, independent research on automated essay scoring is hard to
come by because much of the existing research is conducted by and for
the companies producing the systems. For that reason SAEE, through the
Technology Assisted Student Assessment Institute (TASA), commissioned
Dr. Susan M. Phillips to scan and analyze the current research on this
topic across a variety of disciplines, including writing instruction,
computational linguistics, and computer science. The purpose of the
report, Automated Essay Scoring: A Literature Review, is to communicate
a balanced picture of the state of AES research and its implications
for K-12 schools in Canada. The review is broad in scope, encompassing
a wide range of perspectives, and is designed to be of interest to
teachers, assessment specialists, developers of assessment technology,
and educational policy makers.
Most AES systems were initially developed for summative writing
assessment in large-scale, high-stakes settings such as the Graduate
Management Admission Test (GMAT). More recent developments, however,
have expanded the potential application of AES to formative assessment
at the classroom level, where students can receive immediate, specific
feedback on their writing while still being monitored and assisted by
their teacher.
Numerous software companies have developed different techniques to
predict essay scores by correlating measurable qualities of the text
with the scores human raters assign. First, the system needs to be
trained on what to look for. This is done by feeding it a set of
essays, all written on the same prompt or question and already marked
by human raters. The trained system can then examine a new essay on the
same prompt and predict the score a human rater would give it. Some
programs claim to mark for both style and content, while others focus
on one or the other.
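To make this train-then-predict workflow concrete, here is a minimal sketch in Python. It is purely illustrative, not any vendor's actual engine: the scikit-learn pipeline, the TF-IDF features, the ridge-regression scorer, and the sample essays and scores are all assumptions chosen for the example, and commercial systems use far richer linguistic features.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Training data: essays written on ONE prompt, each already scored
# by human raters. Real systems are trained on hundreds of essays
# per prompt; these three are invented for the example.
training_essays = [
    "Recycling programs reduce waste because communities reuse materials.",
    "I think recycling is good. It is good because it is good.",
    "Recycling conserves resources, although programs vary in cost.",
]
human_scores = [5, 2, 4]  # scores assigned by trained human raters

# Train: learn a mapping from surface features of the text to the score.
model = make_pipeline(TfidfVectorizer(), Ridge())
model.fit(training_essays, human_scores)

# Predict: estimate the score a human rater would give a NEW essay
# written on the SAME prompt the model was trained on.
new_essay = "Communities benefit when recycling programs reuse materials."
predicted_score = model.predict([new_essay])[0]
print(f"Predicted score: {predicted_score:.1f}")
```

The sketch also makes a key constraint visible: the model only knows the prompt it was trained on, so every new essay question requires a fresh set of human-scored training essays.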
In terms of their reliability, Phillips (2007) cautions: "to date, there
seems to be a dearth of independent comparative research on the
effectiveness of the different AES engines for specific purposes, and
for use with specific populations... While it would appear that one
basis of comparison might be the degree of agreement of specific AES
engines with human raters, this also needs to be scrutinized as
different prompts, expertise of raters, and other factors can cause
different levels of rater agreement."
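As one illustration of how agreement between an engine and human raters can be quantified, the sketch below computes quadratic weighted kappa, a statistic commonly reported in AES research. The choice of metric and the scores shown are assumptions for the example, not taken from the review.

```python
from sklearn.metrics import cohen_kappa_score

# Invented scores for eight essays, rated on a 1-5 scale by a human
# rater and by an AES engine.
human_scores   = [4, 3, 5, 2, 4, 3, 5, 1]
machine_scores = [4, 3, 4, 2, 5, 3, 5, 2]

# weights="quadratic" penalizes large disagreements more heavily
# than near-misses; 1.0 means perfect agreement, 0 means agreement
# no better than chance.
kappa = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {kappa:.2f}")
```

As the quoted caution notes, a single agreement figure like this can mask the effects of prompt difficulty and rater expertise, which is why comparisons across engines need scrutiny.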
AES has great potential. It can be more objective than human scoring
because the computer does not suffer from fatigue or favoritism: the
assessment criteria are applied in exactly the same way whether it is
the first or the thousandth essay marked on a given prompt. The
potential for immediate feedback is also viewed positively when AES is
used as a formative assessment tool, because it allows students to work
at their own level and at their own pace, receiving feedback on
specific problem areas.
This rapid feedback also allows for more frequent testing, creating
greater learning opportunities for students. Using computers to grade
essays reduces teachers' marking load, freeing time for professional
collaboration and student-specific instruction. Since computers are
increasingly used as a learning tool in the classroom, computer-based
testing places assessment in the same milieu as learning and provides
more accessible statistical data to inform instruction.
However, adopting AES in Canadian schools requires careful
investigation of the potential risks. Some say that it removes human
interaction from the writing process: according to the National Council
of Teachers of English, writing is a form of communication between an
author and a specific audience, and using AES violates the social
nature of writing (Phillips, 2007, p. 25). Other concerns relate to
whether the systems can adequately detect copied or nonsense essays.
Currently, systems need to be trained on specific prompts, which limits
the ability of educators to modify or create their own essay questions
and potentially widens the separation between learning and assessment.
Additionally, implementing AES in schools involves not only providing
access to computers and software, likely purchased from private
companies, but also the technical support and professional development
needed to sustain its use.