Saturday 16 August 2014

PISA: leaning tower?

Image: MeiTeng via freeimages.com

The PISA tests attract a great deal of attention across the world, having quickly become the most influential rankings in international education, with students from around the world taking part every three years (Coughlan, 2013 a).

Are they a fair test, or even a necessary one? Andreas Schleicher of the OECD is adamant that the tests represent a fair way of assessing what comes out of a country's education system, rather than simply how much money is poured into it (Coughlan, 2013 b). He believes they should serve as a wake-up call to our complacency; are we giving students a false sense of security that will quickly be destroyed in the face of global (or even local) competition? Even in his own country of Germany, the PISA tests damaged faith in the system, as the results did not match up with the perceptions of the nation; while in Italy, the tests suggested a real inequality in national test scores between north and south. In the UK, the relentless rise in top grades achieved at A-level seems to have stalled in recent years (Coughlan, 2014) - perhaps this is an indication that the government have been forced to rein in their confidence?

The UK government has certainly been taking the stagnation in PISA ratings seriously (Coughlan, 2013 a), with changes made to the curriculum and attempts to give extra financial support to poorly performing pupils. However, the government's reaction has been criticised by teachers, as it does not place importance on their professionalism. They point out that higher performing countries achieve their results by letting schools and teachers collaborate, instead of having ministers try to make judgements based on assumptions about market forces. Similar concerns have been raised about reactions in the US (AFTHQ, 2013), as both government and pressure groups try to apply forceful, top-down solutions to quickly bolster the test scores, instead of looking deeper into the data and trying to emulate the methods and philosophies of the top-performing nations.

We always seem to come down to how competitive we can be as a society when reacting to these test scores, but perhaps we are ignoring a deeper and growing trend towards cooperation, with individuals taking advantage of technology to share knowledge in ways previously considered impossible, and often in ways that will actually benefit their competitors as well (Price, 2013).

Sunday 4 May 2014

Method in the madness?

So I've got my research plan ready, and now just need some final feedback from my supervisor and an ethics check for the sound research plan that I've put together. It is sound - I'm sure, aren't I? Having spent a long time reading and re-reading books on research methods, constructing, de-constructing and re-constructing my world view, I should be able to put together a set of research questions that covers all bases for both my own project and the needs of my company. Having said that, I can't help but wonder if I've caught all the main areas; if I haven't left out some critical parts under the time pressure to get a final set of questions 'out there' in time for the project going live.

So how am I going about this? I've been balancing the needs for my own learning by phrasing many of the questions in an open-ended manner that should draw on individual experiences, along with some questions aimed at evaluating how the examiners actually use the resources. I'll also be inferring some evaluation from the kind of responses that examiners give to the open-ended questions. However, there does seem to be a need for more concerted evaluation now that I look back over the questions, so I'm considering how I might go about drawing out some extra data without imposing too much on busy people!

Sunday 13 April 2014

Looking the other way

I've done quite a lot of searching through the research library to summarise some of the key findings of my colleagues, but what does everyone else think? There are other exam boards, and undoubtedly many independent researchers in the field of GCSE and A-level examining, so I've done a search for some alternative viewpoints on the matter.


Marking and cognitive psychology

How do examiners actually go about the process of marking, and does it actually matter? Greatorex & Suto (2006) looked at cognitive approaches taken by examiners when undertaking marking, and identified five distinct approaches that were used - rarely in isolation. The approaches identified were matching, scanning, evaluating, scrutiny and no response. These were related to the 'System 1' (quick, associative) and 'System 2' (slow, rule-governed) thought processing models described by psychologists (Kahneman & Frederick, 2002).

The study involved having examiners mark papers and 'think out loud' about the approach they were taking. When marking short-form answers that could be easily distinguished by single words or numbers, examiners used the 'matching' approach (System 1) to quickly assign marks by pattern recognition. Some longer answers could be marked through the use of 'scanning' to pick out either key words (System 1, pattern recognition) or distinct phrases or stages of calculations (System 2, semantic processing). For more detailed answers, examiners moved to the 'evaluating' approach to assess the candidate's response, usually drawing on a variety of sources and comparing these to their own knowledge and the mark scheme (entirely System 2). Where responses deviated noticeably from the mark scheme, examiners would engage in 'scrutinising' to identify whether the response was worthy of credit, for instance an unexpected but valid response; this approach also draws entirely on System 2. In the case of 'no response', examiners would use a simple System 1 approach to check that material had not been written elsewhere.

The researchers then went on to analyse how frequently the different approaches were used in different subjects. There was a marked difference between Mathematics and Business Studies papers: Mathematics responses called for a high level of matching, with slightly less evaluation, and relatively small amounts of scanning and scrutinising; Business Studies drew heavily on evaluating, with matching as the secondary approach, and small amounts of the other approaches. Most importantly, the study showed that the different approaches were used across multiple subjects.

There was notably no relation between marking strategy and marking reliability - multiple approaches could be equally valid and successful. There was also no significant difference between marking approaches for novice and experienced markers. Senior examiners went on to suggest that new examiners could benefit from some explicit advice about their approach to marking, possibly with screen recordings overlaid with commentary. The researchers also noted that there did not seem to be any difference in cognitive approach between paper-based marking and on-screen marking, although this had yet to be confirmed by direct study.

References:
  • Greatorex, J. & Suto, W.M.I., 2006. An empirical exploration of human judgement in the marking of school examinations. In International Association for Educational Assessment Conference. Singapore. Available at: http://iaea.info/documents/paper_1162a2471.pdf
  • Kahneman, D. & Frederick, S., 2002. Representativeness revisited: Attribute substitution in intuitive judgment. In T. Gilovich, D. Griffin, & D. Kahneman, eds. Heuristics and biases: The psychology of intuitive judgment. New York, NY, US: Cambridge University Press, pp. 49–81.

Tuesday 8 April 2014

Reliability of public examinations

What makes for a reliable exam series? And who actually knows or cares?


A community effort?

Baird, Greatorex and Bell (2004) set out to study the effect that a community of practice could have on marking reliability. Their study concluded that discussion between examiners at a standardisation meeting had no significant effect on marking reliability, and so the effect of a community of practice was called into question. I have to challenge this conclusion, because the artificial and synchronous conditions of a standardisation meeting most definitely do not represent the potential of the asynchronous nature of a community of practice. The authors didn't take this into account - interesting, given that they cited Wenger in their references. Later on in the paper they do seem to acknowledge that tacit knowledge already acquired in a community of assessment practice might explain why the format of standardisation meetings seemed to have no significant effect - this seems more in line with the effect that I would expect a community of practice to have.



What do these people know anyway?

Later research by Taylor (2007) sought to investigate public perceptions of the examining process. The study involved a range of participants (students, parents and teachers), used a range of interview techniques, and looked at issues such as how papers are marked, the reliability and quality of marking, and the procedure for re-marks. The level of awareness about these issues varied somewhat, but it seemed very few people (even teachers) had full knowledge of the entire process.

There seemed to be a perception among students and parents that several examiners would mark each script (p.6) and arrive at a final mark through consensus, while teachers generally felt that a single examiner marked each paper, but this was based on perception rather than firm knowledge. Students and parents did not seem to have any real knowledge of how examiners might arrive at a mark, and even teachers were not aware of the hierarchical method, although they agreed that it seemed sensible when it was explained to them. All groups believed that the system would work better with multiple examiners, although they did acknowledge that this might be unrealistic given time and financial constraints. When questioned about the possible merits of having multiple examiners mark a single paper, examiners commented that any gains would be minimal in comparison to the present hierarchical system, and the cost would be prohibitive.

Members of the public seemed to have a better awareness of the concept of reliability (p.7), with teachers having a similar perception to examiners about the potential for marks between examiners to differ within a band of ability. Students and parents seemed to understand the potential for marks to differ, although their expectations were more optimistic than the real situation. There was much less understanding of how quality of marking was assured (p.8), with perceptions varying from an assumption that there was no checking at all to a belief that far more scripts were checked than is the case. All parties agreed that quality checking was desirable, and examiners appeared to support the current system. Re-marking was more common knowledge to all the participants (p.10), although there was little knowledge of the precise system used.

The report considered whether attempts to increase public understanding of the exam system would promote public trust - although some believe that greater transparency might actually invite criticism, other literature seems to suggest that revealing the workings of the system would lead to more realistic expectations and improved engagement, rather than a focus on failings. Establishing a clearer link between understanding of the exams process and public confidence was suggested for future research.

More on reliability

Chamberlain (2010) set out to build on the research into public awareness of assessment reliability - which also ties in well with some of the points raised by Billington (2007) about whether the general public are acting on good information or misinformation.
The research into public awareness used focus groups as a means of drawing out understanding, as this bypasses some of the possible problems of researcher bias (since the researchers were AQA employees), and can help foster a collaborative environment to trigger the sharing of opinions and ideas (p.6). One remaining source of bias was that the groups were small and were composed primarily of people with a particular interest in the exams.

Research with several groups of people, including a number of teachers and trainee teachers, suggested little general understanding of the concept. Secondary teachers had more understanding, often due to exposure to requesting re-marks for their pupils. Promisingly, most of the participants cared enough to indicate they would like to be better informed about the overall process of examinations, although there was no support for any quantification of reliability - most people already seemed to accept that there would inevitably be some variance in how marks were awarded, but placed trust in examiners to act as professionals. The exception to the rule is when a very public failure occurs, but even in this case attention rapidly fades once the problem seems to be 'fixed', with little real gain in understanding. The video below is one attempt to get some understanding out into the public domain:


Making The Grades - A Guide To Awarding




Sunday 30 March 2014

Trust


In December 2011, a scandal emerged from the exposure of corrupt practices by two senior examiners from an examination board (Newell, 2011; Orr, 2011; Watt, 2011), who were captured on video telling teachers at a paid-for seminar what subjects would appear on upcoming exam papers. The ensuing public outcry and reaction from government officials led to accusations that the whole exams system was compromised by profit motives (Garner, 2011), and even calls to abolish the exam boards and replace them with a single body.


What causes examiners, with such a degree of trust placed in them, to break with the values of the education system in this way? How badly eroded is the trust in our public education system? Was this an isolated incident, or was it just a flashpoint for a growing sense of discontent with education in our country?


Four years prior to this incident, Billington (2007) reviewed the literature on trust in public institutions, with a view to building an understanding of how to look at issues of trust within examinations and examination standards. Many public institutions appear to suffer from what is described as either a 'crisis of trust' or a 'culture of suspicion', characterised by an expressed lack of faith in the institution - although this does not always equate to a lack of trust in individual professionals within that system, particularly when there is an immediate need for their skill.


There are also distinct shifts in the way that institutions and members of the public relate to one another, largely relating to the impact of information and communications technology (Billington, 2007, p.2), which alter perceptions of and demands for equality. Professionals have access to increasingly powerful tools and knowledge designed for their use, whilst the general public also have access to a wider body of information - and misinformation. People often feel the need to gain a little control over professionals whose power affects their lives, and they will attempt to use any tools at their disposal to get a sense of equality. When there is a great deal of accountability data available, members of the public may use this accountability as a substitute for trust, which can lead to negative consequences.


In the case of public examinations, the consequences have often been that schools gravitate towards exam boards that offer less demanding specifications. There was a significant move towards the International GCSE and International Baccalaureate qualifications without any formal recognition from government bodies, which seems indicative of an erosion of trust. Education scandals in previous years had already brought a higher public awareness of examination standards. There has also been an increased politicisation of education, with an apparent mismatch between the goals of the government and those of the exams watchdog, the QCA. It seems then that the media coverage of the cheating examiners has sparked a strong response to a growing sentiment of suspicion.


The qualifications environment in England has become increasingly similar to that of a free market economy (Jones, 2011); language such as 'gold standard' and 'currency of qualifications' suggests by metaphor that our understanding of and behaviour towards qualifications is strongly shaped by market philosophy. Allowing market forces to enter an area of public service can affect perceptions of trust - Billington (2007, pp. 6-7) mentions the medical profession as an example of public trust being undermined by suspicion about the motives of those who represent the profession. When qualifications are treated as currency, we risk losing sight of the real educational value - just as actual currency no longer has any real relation to a physical 'gold standard'.

I realise that I am starting to veer off to a new matter entirely, so I'll end this blog post here!

Wednesday 12 February 2014

Re-affirming the project outline

My project outline has remained a little vague, so I thought I would try to add more details to it...

We currently provide secondary qualifications for 14-19 year olds. This work is dependent on a network of some 35,000 teachers and other experts to help set and mark examinations, with technology playing an increasingly pivotal role in ensuring fast and scalable transfer of marks. We are continually exploring new technologies for mark capture and transfer to ensure the best possible service for candidates.

Trialling and adopting a new marking technology requires training provision for large numbers of examiners. Previous technology adoptions have been dependent on government funding for their initial success, but this funding is no longer available. Training provision has increasingly moved towards the creation of online software demonstration videos and interactive simulations, now hosted on a secure Learning Management System (LMS).

We are currently piloting a new marking technology with a very small number of examiners who have received face-to-face training, and are now looking to move to exclusively online training as soon as possible. With previous technology adoptions the online provision has been developed largely through internal discussion after face-to-face training and released without a live test for examiners. For this project, the online learning materials are being developed alongside the first live pilot of the technology, with an opportunity for early testing and feedback.

We will be using action research methods to inform improvements to the online learning materials and identify additional support methods prior to general release. This research will be expanded for the live use of the software during the summer examination series, with a view to providing both evaluation of success and action research for practitioner development.

Sunday 19 January 2014

Five questions


I'm currently working through some questions for framing qualitative research (Mason, 2002), which I hope will help me to better frame my research.

The social 'reality': Your Ontological Perspective

What is the nature of the phenomena, or entities, or 'social reality' that I wish to investigate?

Elements from Table 1.1 that appeal:
  • People, social actors, motivations, identities, cultural or social constructions
  • Experiences, development, behaviours, interactions, social processes
  • Institutions, markets, societies, organisation, connectedness, multiple realities or versions + tribes, networks
These are more relevant to my overall world-view, so I will generate an additional set specific to the current project:
  • People, understandings, perceptions, attitudes, thought
  • Experiences, development, actions, interactions, situations, rules
  • The 'material', groups, organisation

Knowledge and Evidence: Your Epistemological Position

What might represent knowledge or evidence of the entities or social 'reality' that I wish to investigate?

Experiences of users will allow me to link my design approach to how people perceive the materials and support framework - affects my ability to rationally design or re-design materials and approaches in response to user feedback.

The research would benefit from some objective measures of performance to show that knowledge about systems and procedures has successfully transferred, rather than asking for user ratings - allows for genuine accountability in our training approach.

Your Broad Research Area

What topic, or broad substantive area, is the research concerned with?

This research topic is concerned with understanding what experiences examiners have when using online resources and support to adopt new software for marking and standardisation, and any different procedures that must be adopted. My focus is on interpreting results in such a way as to allow continuous improvement of the design of such materials, and to understand how much effect the strategy has on learning, as evidenced through performance. I do not have control over the precise methods for measuring performance, but am able to access such data.

Your Intellectual Puzzle and Your Research Questions

What is the intellectual puzzle?

How do I show that my learning intervention has had the desired impact, and how can I rationally design better approaches in future, or re-design to off-set any shortcomings?

What do I wish to explain or explore?

I wish to explain how the support approaches link to performance and attitudes among the examiners, and explore how to improve both in the future.

What type of puzzle is it?

This is a causal / predictive puzzle.

Your Research Questions

What are my research questions?

For gaining insights from examiners taking part in the live pilot I'm thinking along the lines of:
  • Describe your experiences of using the software to carry out standardisation of marking
  • Describe your experiences of using the software to record and submit your marks
  • Describe your experiences of communicating with your senior examiner during the marking period
For improving on my online materials:
  • Describe your experience of using the online resources to prepare you for marking
  • Describe your experience of using the online resources and printable materials during the marking period
More of these to come - from previous experience I find that it's best to mull over these and look at what I've written down with a fresh perspective later.

Your Aims and Purpose

What is the purpose of my research? What am I doing it for?

The purpose of this research is to ensure the timely and accurate delivery of high quality marking for candidates sitting national examinations. The research will be done on behalf of my employer, to benefit examiners whilst they deliver the marking and to help maintain motivation through consideration of the support offered. By achieving these objectives it is also expected that examiner performance will improve, which will be judged through the assessments from senior examiners and staff. The timeliness and accuracy of results could also be interrogated relative to expected deadlines and the number of examiners stopped or re-marked.

References

  • Mason, J., 2002. Qualitative researching, London: SAGE Publications.

Sunday 12 January 2014

Architecture revisited


I'm going to take another look at the Learning Architect approach for my project, but this time from the perspective of staff offering support to examiners. Now this covers a wide variety of job roles, so I'll try to cover as many angles as I can, and think of how we might make best use of the different areas. This post will probably come across as a little scattered at the time of writing - it's primarily intended for some reflection after the event!

From the top down:

Experiential
Staff will naturally be taking part in performance appraisals so it will be worth considering how their involvement with the online marking will be judged - relating this to our cornerstone behaviours. Making sure that good ideas are credited will help to embed good behaviours.

On-demand
Producing performance support materials that relate to specific roles - these could be as simple as one-sheet reference guides or checklists for particular tasks that can be printed out. These will include the materials that are provided for examiners so that staff can advise examiners who are under pressure, and could also be expanded to include common problems that are encountered by the team leading the initiative.

Non-formal
Staff will have access to the same rapid e-learning as examiners via the LMS, although perhaps there may be a case for adding in some modules that relate to specific roles? This could also be covered in mini-workshops for staff - though this possibly blurs into on-the-job training. The main focus should be on discussing particular problems, with the knowledge transfer aspect left to e-learning to be looked at before the sessions.
Other possibilities are webinars to cover discussion of emergent issues, although this is probably more likely to be done on an individual basis and difficult to capture.

Formal
Classroom courses will undoubtedly be offered in some form - although these might actually be better described as mini-workshops. The opportunity for collaboration is limited, and the focus is likely to be on knowledge transfer, which is better left to the e-learning modules. Whether these are regarded as formal self-study or rapid is a debate for elsewhere!
Our main nod to formal learning should be to ensure that business goals are made explicit, and that staff know how their role relates to achieving them. Assessment will be based on accomplishing objectives, and nothing more!

From the bottom up:

Experiential
Encouraging personal reflection or reflection with others within or across work teams will help to ensure that lessons are learned well. This will be dependent on engaging effectively with line managers.

On-demand
The use of forums and wikis is one area where we might be able to advance the information sharing between colleagues, although there would probably need to be some moderation of comments and content to ensure accuracy, and many colleagues would probably prefer to stick to the more formal channels that they are used to monitoring. The use of a wiki may be worth pursuing in future, but effective guidance would have to be in place, and this would have to take a back seat until other top-down measures are in place.

Non-formal
Nothing here for now

Formal
Nothing here for now

Overall, I believe the need for bottom-up learning is less immediate in this context - the tasks and goals are generally quite fixed, rather than fluid. Our key focus for bottom-up learning should centre around making tacit knowledge transfer more readily to explicit knowledge. This will stem primarily from the staff leading the project, but careful consideration of affected parties will help us know where best to direct our efforts.

Sunday 5 January 2014

Re-examining my worldview

Image: ariss via freeimages

Research Philosophy

How do I identify with four of the main worldview areas, as discussed by Newby?

Scientism and Positivism

This school of thought demands predictable cause and effect - not easy to establish where people are concerned. Once people realise that others are able to predict their behaviour in any way, they tend to change their behaviour to avoid manipulation! It may be beneficial to look for patterns if you have a critical interest, such as a business issue, although this may actually overlap with more humanistic traditions (see below) because you will be seeking a model that is 'fit for purpose' rather than trying to establish an objective truth.

This is the case for the research carried out by Meadows (2004) and Tremain (2011, 2012), where the critical factors affecting examiner retention and job satisfaction needed to be identified. Establishing an all-encompassing theory or truth (positivism) may be too ambitious, and would be clouded by personal interests; instead, identifying and monitoring the relevant factors was achieved through careful selection of questions and interpretation of data.


Future reading: Discussion of the work of Karl Popper (Newby, Ch.3)

Humanism, Phenomenology and Existentialism


Whilst positivism demands an objective external truth independent of human influence, humanism treats 'truth' as a social construction, where one culture's truth may not be another's. This ties in well with my understanding of social evolution (Hobson, 2012; Ronfeldt, 1996, 2012a,b) whereby societies develop distinct cultures of varying complexity through the addition of different forms of organisation. Conflicts arise between or even within societies due to disagreements about how society should be ordered (or not) by hierarchical institutions and free market policies. Ronfeldt (2013) acknowledges that even the term 'tribe' in his T-I-M-N framework is frequently contested by others, and progresses the model by engaging with these disagreements and incorporating them.

Phenomenology focuses on individual and collective experiences to form a rational basis for future action, by probing the differences between the 'perceived' and 'experienced' worlds. Methods include description, observation, reporting and reflection.


Existentialism centres around seeking to understand the world from a personal perspective, driven by conviction and desire. Methods could extend to asking participants for other forms of 'data' such as pictures, videos and stories of their lives - anything that conveys their viewpoint in richer detail than a questionnaire could. Existentialism has been applied to curriculum design, by designing a curriculum that centres around self-discovery.


Critical theory

This theory is concerned with political beliefs, particularly those that are left-of-centre and seeking to change society by making people aware of their circumstances in order to liberate them. Research carried out within a critical theory framework seeks to observe and expose individual facts which can then be combined to form an argument for change.


Aside: I am particularly interested in the references to Karl Marx, as this led to a model for social evolution around the struggle for control of wealth - somewhat akin to Ronfeldt's description of the transition between biform (T+I) and triform (T+I+M) societies.


Critical theory has been applied to education by academics to attack the present model for education as being primarily based around servicing the needs of the capitalist economy, and thus serving only to reinforce the inequalities that it inevitably generates. However, positivists would undoubtedly attack this approach because it proposes a hypothesis and seeks to prove it.

Postmodernism

This approach distinguishes itself from the 'modernist' approaches by rejecting the 'modernist' assumption that there is a single explanation for things, which leads to a natural order. For education, this almost rejects the need for theories of education, since such theories are rooted in modernist approaches! This approach does overlap with my TIMN world-view, because it is by nature multi-layered, and acknowledges that 'people and organisations can play several and sometimes conflicting roles', in keeping with the different types of organisations and the interfaces between them.


Postmodernism bears a great deal of similarity to the network (+N) principles described by Ronfeldt, with researchers seeking to act more as nodes in a network, unpicking their assumptions and sharing data that reflects their local situation, with the ideal that the value of the network increases not simply with the number of nodes but the number of connections between them.


Future reading: Steven Johnson: Future Perfect 

Summary


My adoption of the T-I-M-N framework in my outlook leads me to cross several of the boundaries of research philosophy, but perhaps I identify more strongly than I realised with the post-modernist approach for rejecting the established order with its demanded polarisation of political outlooks.
