Showing posts with label Examinations.

Tuesday, 8 April 2014

Reliability of public examinations

Image: freeimages
What makes for a reliable exam series? And who actually knows or cares?


A community effort?

Baird, Greatorex and Bell (2004) set out to study the effect that a community of practice could have on marking reliability. Their study concluded that discussion between examiners at a standardisation meeting had no significant effect on marking reliability, and on that basis the value of a community of practice was called into question. I have to challenge this conclusion, because the artificial, synchronous conditions of a standardisation meeting are a poor stand-in for the ongoing, asynchronous nature of a community of practice. The authors didn't take this into account, which is curious given that they cite Wenger in their references. Later in the paper they do seem to acknowledge that tacit knowledge already acquired in a community of assessment practice might explain why the format of the standardisation meetings had no significant effect - a reading much more in line with the effect I would expect a community of practice to have.



What do these people know anyway?

Later research by Taylor (2007) sought to investigate public perceptions of the examining process. The study involved students, parents and teachers, used a range of interview techniques, and looked at issues such as how papers are marked, the reliability and quality of marking, and the procedure for re-marks. The level of awareness of these issues varied somewhat, but it seemed very few people (even teachers) had full knowledge of the entire process.

There seemed to be a perception among students and parents that several examiners would mark each script (p.6) and arrive at a final mark through consensus, while teachers generally felt that a single examiner marked each paper, but this was based on perception rather than firm knowledge. Students and parents did not seem to have any real knowledge of how examiners might arrive at a mark, and even teachers were not aware of the hierarchical method, although they agreed that it seemed sensible when it was explained to them. All groups believed that the system would work better with multiple examiners, although they did acknowledge that this might be unrealistic given time and financial constraints. When questioned about the possible merits of having multiple examiners mark a single paper, examiners commented that any gains would be minimal in comparison to the present hierarchical system, and the cost would be prohibitive.

Members of the public seemed to have a better awareness of the concept of reliability (p.7), with teachers having a similar perception to examiners about the potential for marks to differ between examiners within a band of ability. Students and parents also understood that marks could differ, although their expectations were more optimistic than the real situation. There was much less understanding of how the quality of marking was assured (p.8), with perceptions ranging from an assumption that there was no checking at all to a belief that far more scripts were checked than is actually the case. All parties agreed that quality checking was desirable, and examiners appeared to support the current system. Re-marking was more widely known about by all the participants (p.10), although there was little knowledge of the precise system used.

The report considered whether attempts to increase public understanding of the exam system would promote public trust - although some believe that greater transparency might actually invite criticism, other literature seems to suggest that revealing the workings of the system would lead to more realistic expectations and improved engagement, rather than a focus on failings. Establishing a clearer link between understanding of the exams process and public confidence was suggested for future research.

More on reliability

Chamberlain (2010) set out to build on the research into public awareness of assessment reliability - which also ties in well with some of the points raised by Billington (2007) about whether the general public are acting on good information or misinformation.
The research into public awareness used focus groups as a means of drawing out understanding, since this bypasses some of the possible problems of researcher bias (the researchers were AQA employees) and can help foster a collaborative environment that encourages the sharing of opinions and ideas (p.6). One remaining source of bias was that the groups were small and composed primarily of people with a particular interest in exams.

Research with several groups of people, including a number of teachers and trainee teachers, suggested little general understanding of the concept of reliability. Secondary teachers had more understanding, often through the experience of requesting re-marks for their pupils. Promisingly, most of the participants cared enough to indicate they would like to be better informed about the overall process of examinations, although there was no support for any quantification of reliability - most people already seemed to accept that there would inevitably be some variance in how marks were awarded, but placed their trust in examiners to act as professionals. The exception is when a very public failure occurs, but even then attention rapidly fades once the problem seems to be 'fixed', with little real gain in understanding. The video below is one attempt to get some of this understanding out into the public domain:


Making The Grades - A Guide To Awarding



References:


Sunday, 30 March 2014

Trust

Trust (image: http://www.sxc.hu/)

In December 2011, a scandal emerged from the exposure of corrupt practices by two senior examiners from an examination board (Newell, 2011; Orr, 2011; Watt, 2011), who were captured on video telling teachers at a paid-for seminar which topics would appear on upcoming exam papers. The ensuing public outcry and reaction from government officials led to accusations that the whole exams system was compromised by profit motives (Garner, 2011), and even to calls to abolish the exam boards and replace them with a single body.


What causes examiners, with such a degree of trust placed in them, to break with the values of the education system in this way? How badly eroded is the trust in our public education system? Was this an isolated incident, or was it just a flashpoint for a growing sense of discontent with education in our country?


Four years prior to this incident, Billington (2007) reviewed the literature on trust in public institutions, with a view to understanding how issues of trust apply to examinations and examination standards. Many public institutions appear to suffer from what is described as either a 'crisis of trust' or a 'culture of suspicion', characterised by an expressed lack of faith in the institution - although this does not always equate to a lack of trust in the individual professionals within that system, particularly when there is an immediate need for their skills.


There are also distinct shifts in the way that institutions and members of the public relate to one another, largely driven by the impact of information and communications technology (Billington, 2007, p.2), which alters perceptions of and demands for equality. Professionals have access to increasingly powerful tools and knowledge designed for their use, whilst the general public also have access to a wider body of information - and misinformation. People often feel the need to gain a little control over professionals whose power affects their lives, and they will use any tools at their disposal to get a sense of equality. When a great deal of accountability data is available, members of the public may treat that accountability as a substitute for trust, which can have negative consequences.


In the case of public examinations, one consequence has often been that schools gravitate towards exam boards that offer less demanding specifications. There was also a significant move towards the International GCSE and International Baccalaureate qualifications without any formal recognition from government bodies, which seems indicative of an erosion of trust. Education scandals in previous years had already raised public awareness of examination standards, and education has become increasingly politicised, with the goals of the government and the exams watchdog QCA apparently at odds with one another. It seems, then, that the media coverage of the cheating examiners acted as a flashpoint for a growing sentiment of suspicion.


The qualifications environment in England has become increasingly similar to a free market economy (Jones, 2011); language such as 'gold standard' and 'currency of qualifications' suggests by metaphor that our understanding of and behaviour towards qualifications is strongly shaped by market philosophy. Allowing market forces into an area of public service can affect perceptions of trust - Billington (2007, pp. 6-7) mentions the medical profession as an example of public trust being undermined by suspicion about the motives of those who represent the profession. When qualifications are treated as currency, we risk losing sight of their real educational value - just as actual currency no longer has any real relation to a physical 'gold standard'.

I realise that I am starting to veer off into a new matter entirely, so I'll end this blog post here!

References: