As third year students, we are beginning to write our dissertations. This piece of work, which will require hours of researching, writing and evaluating, will hopefully be something we are extremely proud of. How, then, would you feel if you were told your dissertation would never be read? Your work would never inspire a reader; it would simply be uploaded to a machine that churns out your grade. This controversial method is automated grading, which I introduced last week as a potential solution to the subjectivity within grading.
It is important to note the arguments in favour of automated grading. Computerised marking is consistent and fairer than human marking, due to the removal of human error and subjectivity (Valenti, Neri & Cucchiarelli, 2003; Jordan, 2012). Additionally, as we have seen from several students’ blogs, technology has become an integral part of our education system. We already use automated marking for MCQs, so perhaps automated marking of essays is the next logical step for technology. Supporting this, initial research into automated grading of short-answer questions has suggested that automated marking is just as effective as human marking (Butcher & Jordan, 2010).
However, short-answer questions often have a right or wrong answer. Conversely, there is often no right answer in university essays, which are also of much greater length. This is where the significant flaw in this marking style becomes evident. Consider language translators such as ‘Google Translate’. This automated programme can translate text into many languages quickly and accurately. However, the translation is taken very literally, which can lead to fundamental issues with the overall meaning. Similarly, automated grading cannot recognise the wider picture as a human can, and ask ‘What is the purpose of this essay?’ This relates to my discussion last week regarding creativity, where a student produces a novel idea. Logical, original thinking is the most important quality of effective writing, yet automated marking overlooks it (Byrne, Tang, Tranduc & Tang, 2010). In a module such as this, where creativity is a necessity within our blogs, automated grading would be futile.
It is undeniable that these programmes are extremely clever, using artificial intelligence and complex statistics to determine a student’s grade (Valenti, Neri & Cucchiarelli, 2003). However, some aspects of writing, such as fluency of knowledge, cannot be directly measured. As a result, programmes measure correlates of fluency, such as essay length (Valenti, Neri & Cucchiarelli, 2003). One must question whether a concept as complex as fluency can really be reduced to something as simple as essay length. Students may try to ‘beat the system’, producing long essays (regardless of content) in order to achieve a higher grade. Research has shown that students’ work is strongly influenced by what they believe the system expects from them (Jordan, 2012), encouraging destructive compliance rather than valuable creativity.
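To make the worry concrete, here is a deliberately simplified toy sketch (my own illustration, not how any real grading system works): if a scorer treats word count as a stand-in for fluency, padding an essay with empty filler inflates the grade, while a concise piece is penalised regardless of quality.

```python
# Toy illustration of scoring by a correlate (word count) rather than
# the quality it is meant to stand in for. Purely hypothetical.

def naive_score(essay: str, target_words: int = 500) -> float:
    """Return a 0-100 score based solely on word count relative to a target."""
    words = len(essay.split())
    return min(100.0, 100.0 * words / target_words)

concise = "A short but insightful essay. " * 20        # 100 words of real content
padded = "Filler sentence with no real content. " * 100  # 600 words of padding

print(naive_score(concise))  # 20.0 — low score despite possible quality
print(naive_score(padded))   # 100.0 — full marks for empty padding
```

The point is not that real systems are this crude, but that any proxy measure invites exactly this kind of gaming.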
Despite research suggesting automated marking is fairer than human marking, this overlooks the issues faced by children with dyslexia or those for whom English is a second language (Jordan, 2012). Automated programmes often fail to identify misspelt words, or to analyse a sentence if it is poorly constructed (Mitchell, Russell, Broomhead & Aldridge, 2002). Computerised errors can also benefit the student unfairly, failing to identify incorrect statements when key words appear within the sentence (Mitchell, Russell, Broomhead & Aldridge, 2002). This evidence suggests automated marking isn’t as accurate as it initially appears.
Unsurprisingly, considerable debate has emerged around this controversial method, with a petition created against the use of automated systems. Regardless of the method’s effectiveness, I believe it is important to question the effect it would have on students if it were universally implemented. What would motivate students to engage in their learning if their work was never read? Students would become robots, striving for the right answer with no motivation to discover novel ideas beyond the guidelines. If automated grading is to be integrated into education in any form, it should be used alongside human marking as a check on accuracy, and certainly not as a stand-alone pedagogical tool (Markoff, 2013).
Although my first two weeks of blogging have focused on the flaws of the grading system, I believe there are solutions. I will focus on viable solutions in a later blog; automated marking, however, isn’t one of them. Ultimately, the only way to arrive at an effective solution is to eliminate flawed alternatives. Grading has become an integral aspect of our education system, yet its effectiveness is questionable. As psychologists we believe nothing and question everything, and this approach should be (but often isn’t) applied to pedagogy.
I end this blog with a question: if we are not machines, why should we be evaluated by one?