psmythblog


Author Archives: Phil Smyth

Peer assessment: why doesn’t it appear to work?

Peer assessment is often promoted as a beneficial strategy for student learning. There are concerns, however, about its effectiveness. Students are often underwhelmed by the experience and teachers sometimes wonder whether it is worth the effort. So why does peer assessment often not work?

The lack of success of many attempts to implement peer assessment is, I believe, largely a question of teachers’ beliefs and understandings. To many teachers, peer assessment, like feedback and exemplar use, is a technical skill to be mastered: once we know how to do it, we can add it to our existing teaching and derive the maximum benefits from it. Yet peer assessment is not a task or a skill, but rather an approach to helping students learn for themselves. It is therefore not just technical but empowering. A peer assessment task, or feedback given to students, is not empowering in and of itself. It becomes empowering when the teacher’s attitude and belief is that students can and should be able to identify what they need to learn, what good quality looks like, and what they can do to turn their work into work of good quality. This empowerment needs to occur consistently throughout a teacher’s teaching, not just when it comes to feedback or a peer assessment task.

How can we know as teachers whether we are being empowering? If we are doing peer assessment right, the assessor learns more than the assessed. The assessor comes to appreciate what quality is and can see where peers’ work falls short in comparison. Being able to ‘assess’ quality for themselves is the real learning that comes from peer assessment. But this won’t come from doing just one peer assessment task. Empowering students to assess for themselves needs to be a core aim of all our teaching.

Is there a difference between peer assessment and peer feedback?

Although typically we think of peer assessment as assigning grades and peer feedback as involving no grades, I ultimately find separating feedback and assessment unhelpful. There are plenty of assessments that happen in the real world with no numbers or grades, PhD vivas and job interviews being just two examples. The idea that assessment equals grades seems to be unique to education. When education does things differently to the learning we do in our normal lives, I tend to wonder why. I’m not sure the distinction between feedback and assessment really carries any weight beyond educational institutions. The separation we insist on within education, however, seems to have consequences for the ways we view learning and how we ‘do’ feedback.

The distinction between feedback and assessment implies that feedback is learning and assessment is assessment of learning. Assessment is something ‘tacked’ on to the end of learning in order to see if students have learnt. Feedback happens independently of assessment. Yet the idea of teachers teaching, students learning exactly what the teacher has taught, and students then demonstrating that learning in assessment is a conceptualization of the learning process that is difficult to sustain given what we know today. What we know about learning today is that the best learners are ‘self-regulated’ and that learning can and does frequently occur in spite of the teacher. Students do not ‘passively’ acquire the knowledge told to them, but rather are active participants using their prior knowledge and personal agency to learn. Viewing learning in this more ‘constructivist’ light blurs the lines between feedback, learning and assessment, and should make us question what feedback is for.

In light of what we know about student learning, the ultimate aim of feedback should be to help students assess themselves. This aim is not well served by feedback that merely ‘tells’ students what they are doing well and what they should improve. Such feedback is teacher-directed, ignores student agency and suggests that what the teacher does is somehow more important than what the student does. Feedback, when done well, is fundamentally about students learning to assess themselves. Students give and receive feedback, they learn to seek it, they learn how to process it and they learn what to do next based on it. These are fundamentally assessment acts that lead to learning. The separation of feedback and assessment, I think, masks this assessment-feedback link.

The relationship between a PhD candidate and a supervisor is a good example of feedback as assessment. The supervisor is not only giving feedback so the PhD candidate can write a better thesis. They want the candidate to be able to assess for themselves what is expected, and what they need to do next. The supervisor hopes that the need for feedback becomes obsolete as the candidate becomes better able to assess themselves and make good decisions about what to do next. An overly helpful supervisor telling the candidate what to do and what not to do might be harming the candidate’s ability to function effectively after they receive their doctorate.

The cynic in me asks who benefits from this separation of feedback and assessment. It certainly isn’t the student. By separating feedback and assessment there is a danger that those with assessment and testing expertise claim power over what goes on in classrooms because of their knowledge of statistics and of testing and assessment methods. We mustn’t let them. Teachers need to be empowered to assess in their classrooms, not only with grades (in fact rarely with grades), but with informal, unplanned assessments that give teachers information about what to do next and, more importantly, allow students to judge what to do next. This can help students develop the assessment capabilities to judge for themselves what they need to do, how they need to do it, and what they need to work on in order to do something successfully. This continual assessment on the part of the student is rich learning, whether it is learning a sport, doing a PhD, learning a musical instrument or learning in our classrooms.

Grading peer assessment

There are two schools of thought with regard to peer assessment. One is to see it as students making judgements about, and assigning grades to, finished pieces of work, hopefully learning something about quality along the way. On this view, peer assessment entails trying to train students to rate written work reliably and to achieve high agreement between the student rater and the teacher. The fact that this all comes at the end of teaching and learning, when there is no real further opportunity to apply insights and learning, makes the whole enterprise summative.

In this form of peer assessment we are concerned only with the end result of work. The focus is on the correct grade and student effort is directed to this aim. We know that when marks are assigned, students are not interested in comments. With the award of grades the whole exercise becomes very high-stakes. The stakes of the exercise, the awarding of grades, and the timing of the assessment all seem to work to undermine the promise of effective peer assessment. There’s something about this exercise that feels contrived and unnatural to students. It is hard to envisage students ever needing to go through this exercise again in their lives, so they may wonder what the payoff for all their hard work is. And it is a lot of work. Even 5% of a course grade does not seem commensurate with the amount of work a student has to put in.

There is another way of looking at peer assessment, though: to view it as inseparable from teaching and learning and an integral part of the learning process. This view of assessment and learning accords with what we do outside the classroom in real life. We constantly assess. Am I talking so much in my classroom that students are not getting enough time to interact? Am I spending too much time at work writing on forums when I should be enjoying family time? Is my writing appropriate for the purpose and audience I have in mind? In all these scenarios we are assessing and monitoring what we do, and we make changes while we still can; before the piece of writing is finished, before the semester finishes, before my life finishes. In real life we always assess but we are not always able to do it well. We often need our friends to point out that we talk too much in class or that we’re neglecting our family. Similarly, peers in a classroom can help students judge if their writing is appropriate based on the learning they have done up to that point in the class.

Within the classroom, teachers and students are constantly assessing. Through constant assessment, teachers and students can learn what is not working and what can be done about it. The assessments we make lead directly to improvements. We teach better when we assess that students are struggling with what we’ve taught and we change our lesson plan. Students can also write better when they are frequently assessed, often by peers in a low-stakes environment. Improvements in writing do not only lead to better quality outcomes; they also lead to learning. In our teaching we can learn why a particular classroom activity did not work and learn to do something different another time. In writing, through peer assessment, students can learn what they tend to do poorly and what they might focus on before an assignment is due. The whole process, carried out in a way congruent with real life, is likely to be a good starting point for thinking about peer assessment.

Peer assessment is most likely to lead to learning when carried out formatively and frequently, in a low-stakes environment, with exemplars to help students come to realize what they’re aiming for, and without grades. This is how assessment works in real life, so why should it be different in the classroom? If we can show our students that peer assessment helps improve the quality of their work, and that there is learning of a real graduate workplace and real-life skill into the bargain, they are likely to want to give it a try.

The problem with ‘best practice’ in exemplar research

Typically, conceptions of best practice in teaching and learning in higher education focus on the desirability of student-centred over teacher-centred approaches. Student-centred approaches are associated with higher quality learning outcomes, while approaches that ‘transmit’ content are said to result in surface approaches to learning. Yet such conceptions often seem to downplay the importance of the teaching context: the subject matter, the students, and the strengths and weaknesses of the teacher. If such contextual features are taken into account, the concept of a ‘universalist’ best way of teaching is challenged. Good teaching is sensitive to the teaching context, and good teachers consider the culture and the educational backgrounds of their students. Best practice may entail teachers adopting various approaches as and when required.

In exemplar research, unlike research on teaching approaches, there is no very clear ‘best practice’. What has been suggested is that exemplars are best used in a dialogic mode where students can discuss exemplars with their teacher and other students. But even here, context seems to be under-explored, especially the agency of individual teachers. Teachers’ beliefs about teaching, learning and assessment, their attitudes to exemplar use, and their strengths and weaknesses as teachers are likely to influence the choices they make about how best to exploit exemplars. Any attempt to define best practice should therefore be contingent on why the teacher does what he or she does. Understanding this agency is likely to be critical to understanding the best use of exemplars in higher education.

What is perhaps needed is further research that explores the interplay between the agency of individual lecturers using exemplars (or choosing not to) and the environment of the teaching context. Such research would not be able to evaluate ‘best practice’. Rather, by exploring how teachers use exemplars, and relating this use to teacher agency and environmental factors, a deeper level of understanding could be reached about teachers’ choice of approach with regard to exemplars. This understanding could inform a model that individual lecturers might draw on when considering their teaching practice, and could also be used for staff development.

What does it mean to be a language tester?

It is these days a given that language tests have a social and a political dimension. Tests with little or no educational value are frequently used to decide immigration status and to certify suitability for professional work (e.g. for air traffic control). Yet to what extent do language testers engage with the social and political nature of tests? McNamara (2014) feels there is a general conservatism in language testing. One example is testers’ acceptance of the administrative, policy-friendly character of the Common European Framework of Reference (CEFR), even when it is under attack both from psycholinguists (Hulstijn, 2007, 2011) and from sociolinguists with an interest in English as a Lingua Franca (ELF). The ELF critique is particularly problematic for the CEFR as it questions the privileging of the native speaker in the level descriptions and the equivalence of ELF to other languages’ use implied in the CEFR (McNamara, 2014). The ELF critique suggests a need to reconceptualize the framework, a move that would be deeply unpopular with administrators and policymakers.

Questions of alignment to the CEFR clearly have sociopolitical implications, but can and should language testers defy policy-makers and their insistence on alignment to the CEFR? Does being a language tester imply a deeper engagement with sociopolitical issues like those illustrated in this post? The answer to both questions, I think, is yes. Technical competence and sociopolitical awareness taken together imply a need for a broader competence in the language tester than is commonly taught on language testing modules of postgraduate degree programmes. Testers need tools to understand sociopolitical issues more clearly, and they need to engage with these issues if test use is to be a fair enterprise.

The use of exemplars for assessment and learning of writing

Exemplars of good quality work are essential to student-centred learning environments. They make ‘real’ what standards of work are expected for assessment and help students understand the marking criteria (e.g. Handley and Williams, 2011; Orsmond, Merry and Reiling, 2002). Students are highly appreciative of opportunities to see and discuss exemplars of good quality work and often feel they get a better idea of what the teacher wants. These exemplar ‘products’, and the opportunities to discuss them, have been shown to help students appreciate what is expected of them, and can lead to higher levels of engagement with feedback. This suggests that, in Sadler’s terms, students are able to see what quality looks like through good use of quality exemplars.

It is not clear, however, if and how students are able to use exemplars to improve their writing, bridging the ‘gap’ from the standard of their own work to the quality of the exemplar. In other words, we do not know if students are able to use exemplars for process purposes, to improve their writing while it is in production. The available evidence suggests that it is ‘difficult’ for students to learn from exemplars. This could be due to cognitive difficulties in ‘noticing’ the features of exemplars that constitute good writing, or in evaluating the merits of exemplar text in relation to their own language use. There could also be social aspects to this dilemma, with students perhaps not trusting student exemplars as having anything to teach them about good writing. There may also be a legitimate student fear of plagiarizing by adopting structures and phrasing from exemplars for use in their own work.

There is still much we do not know about student learning from exemplars in student-centred environments, and yet their use is becoming ubiquitous. More research would appear to be needed, particularly in the area of second language writing, to begin to judge whether and how student learning about writing can be facilitated through exemplar use.


Constructive alignment and the assessment of academic writing

What impact, if any, might the idea of intended learning outcomes (Biggs and Tang, 2011) have on the assessment of academic writing? The biggest impact may well be the specification of aligned learning activities that help students achieve the intended learning outcomes. But that raises the question: what learning activities will help students write better assignments? The question is deceptively simple, yet the response gets at the heart of what teachers think they are teaching and assessing in academic writing.


The response to the question about learning activities depends to some extent on whether academic writing can be learned with a surface approach. Many teachers seem to hold the view, implicitly (or in some cases explicitly), that this is possible. Teachers who exclusively use classroom activities that deal with stance, hedging, topic sentences and paragraphing in de-contextualized tasks and exercises, for example, implicitly feel that when these tasks are added together, they should leave students capable of writing academic text that persuades their already knowledgeable audience that they have something to say. Most teachers seem to acknowledge, however, that the results of student writing following these learning activities are pretty disappointing.

So what would it mean to learn academic writing with a deep approach? For starters, it would probably mean moving away from the idea of EAP classes as training and making them more educative. Learning activities would need to revolve around not just the acquisition of information on aspects of academic writing, but rather conceptual change in what it means to write effectively in the academy and other real-world settings. This would be no easy task in an already crowded curriculum, as time (and probably lots of it) would need to be set aside for students to effect conceptual change through working collaboratively with their peers and the teacher. As Biggs and Tang (2011, p. 23) point out, “good dialogue elicits those activities that shape, elaborate and deepen understanding”. For EAP classes, therefore, time set aside for the study and discussion of exemplar texts, assessment criteria, collaborative writing, and peer critique of students’ own writing might be worthwhile learning activities that are likely to encourage conceptual change in what it means to be a successful writer.

Apart from taking an active role in suitable learning activities, students would also need to know what was expected of them, hence the need for a clear and unambiguous learning outcome that encourages the deep learning of academic writing. Clearly, an intended learning outcome that merely states students will be able to produce acceptable academic text is very unlikely to lead to suitable learning activities no matter how inspiring the teacher is. An outcome that specifies awareness of audience, purpose or register could lead to more defined learning tasks, but without meaningful dialogue it is unlikely that the learning activities will help in achieving the desired outcome. 

The ideas of a deep approach to learning and the need for conceptual change in order to achieve deeper learning suggest a need for a rethink of how EAP classes deal with academic writing. Clearly some of what is implied above is a focus on how we can encourage more collaboration, more interest and motivation in the area of writing, and how we can design our courses and curricula to foster the kind of learning that we would like to see as teachers.  

The blurring of formative and summative assessment

To what extent can assessment evidence collected for a summative purpose be used for formative purposes? And can evidence collected formatively be used summatively? Such questions are at the heart of issues surrounding the choice of language assessments in tertiary education.

Plans for an exit proficiency test of English for graduating students, a test with a clear summative purpose of certifying graduating students’ proficiency levels in English, need to include consideration of how the assessment might also serve the purpose of supporting student learning. Without such consideration, the assessment is unlikely to lead to positive washback on teaching and learning in which students are actively engaged in improving their own learning. A summative test such as this one is likely to be limited in its suitability for formative use by the frequency and the nature of the tests or tasks. Even a system of assessment that aimed to ‘track’ students with repeated instances of summative assessment across occasions is likely to run into the problem of the assessment tasks not being fully consistent with good formative practice. The ability of assessment with a summative purpose to be used formatively would therefore appear to be limited.

A fundamental change in the nature of assessment would be needed if it is to be designed to serve both formative and summative purposes from the start. Rather than use summative assessments formatively, it would appear better to use formative assessments summatively. Both formative and summative purposes might be served by evidence collected during the regular teaching of a curriculum, provided that a distinction is made between the evidence and the interpretation of that evidence (Harlen, 2012). Moving from day-to-day learning tasks to a summary of achievement in terms of grades requires assurance that a) the evidence used is valid and adequately reflects learning goals, and b) judgements are reliable. Since formative use of evidence depends on teachers’ judgements, additional quality assurance procedures will be needed when the information is used for a summative purpose.

With quality assurance in place, it could be argued that formative assessment can fulfil the purposes of summative assessment. The reverse appears not to be true, however, as summative assessment rarely allows for principles of formative assessment to be fulfilled. The fact that formative assessment appears able to fulfil some or most of the purposes of summative assessment, albeit with extra quality assurance and other modifications, suggests that it is possible to blur the distinction between formative and summative assessment. The relationship between formative and summative assessment might be better described as a dimension rather than a dichotomy. 

Harlen, W. (2012). On the relationship between assessment for formative and summative purposes. In J. Gardner (Ed.), Assessment and learning (2nd ed.). London: Sage Publications.


How specific can (should) academic purpose writing assessments be?

Attempting to assess student writing skills in academic settings presents something of a dilemma: the more specific an assessment becomes, the more difficult it is to generalize about language ability in other contexts. So, how specific should we be in designing writing assessments for use in EAP courses?

In answering the question, it is important to ask what we would like to make inferences about in our academic purpose writing assessments. Occasionally, we are interested in making inferences purely about language ability, for example whether or not students can complete an essay with correct spelling, grammar and vocabulary. Such inferences are, however, unlikely to be particularly useful in academic contexts, as we are more interested in inferring whether students can use their language ability (and background knowledge) to write academic assignments that are likely to be acceptable to their professors and teachers than in whether they can form grammatically acceptable strings of words. It is widely accepted that language performances vary with context and that specific purpose language is precise (in terms of its lexis, semantics and syntax). It would therefore be almost impossible to generalize from an assessment of pure language ability to the inferences we want to make about students’ ability to write and make meaning in academically acceptable ways in their discipline.

It is highly likely, therefore, that we would be interested in making inferences about specific purpose language ability to some degree. A defining feature of specific purpose writing assessment is that assessment performance is interpreted from the point of view of language users (Douglas, 2000). That is to say, the criteria that really count when we make inferences are the criteria used by academics in assessing student work (however implicit they may be!). In such circumstances, content is more likely to be the main area of interest to academics. When academics make judgements about student work, background knowledge and language ability are combined and language is not separated from content in assessments. Assessing in this combined way is one possible option for EAP specialists. Another option, perhaps most relevant for EAP practitioners, is to make inferences about language ability and specific purpose background knowledge separately. Both of these options would give some evidence of communicative language ability (in varying degrees) with reference to the target situation.

So, which option is likely to be most useful for EAP specialists assessing academic writing in university contexts where students are acquiring the disciplinary ways of knowing through their EAP courses and writing assessments? The answer is likely to depend on where students are in their disciplinary contexts. Are they novices in their field or can specific purpose background knowledge be taken for granted as students get closer to graduation?  What is clear is that the level of specificity required needs to be made clear to teachers and students from the outset so that students know what is expected of them and teachers can grade assessments consistently.

The purpose of a university exit test: Political expediency or educational benefit?

Despite calls for a university exit test of English language proficiency here in Hong Kong, little attention seems to have been paid to what purpose(s) the test would serve and what desirable outcomes could be achieved. Purposes for such large-scale tests are rarely explicitly stated when they are externally mandated. One purpose that is often talked about is the need to give employers in Hong Kong information about graduating students’ abilities in English. This does not seem too unreasonable, given that when undergraduates enter the workforce, the last major language exam they would have taken would be the Hong Kong Diploma of Secondary Education (HKDSE), the major school-leaving examination in Hong Kong. The question is whether that is the only purpose.

Glenn Fulcher (2010) has talked of the politicization of assessment. Authorities are frequently concerned with the efficient operation of the economy, and part of that concern is producing the human resources required by business. In Hong Kong’s case, the human resource would be a graduate with a proficient command of English. Shohamy (2000) lists several other temptations that policy makers might have for using tests. One such temptation sees testing as a cost-effective and efficient way of making policy: a test forces students to improve their English while at university, and would be cheaper than, for example, making money available for further language enhancement initiatives at university. Another temptation to test comes from the appearance of action having been taken to address a problem. Hong Kong’s media and politicians constantly lament the declining standards of students’ English (even though neither can ever show evidence for this supposed decline), and the imposition of a test gives the appearance of decisive action, which would therefore have the support of business. A test would also give policy makers authoritative power over aspects of a university curriculum, forcing the university to direct resources to language enhancement initiatives. With these somewhat political purposes for testing, it is hardly surprising that a test’s aims are not always made explicit.

But what if the purpose(s) for an exit test could be viewed differently? Could an assessment focus primarily on educational benefit for students (rather than political expediency) and still inform employers about student abilities without a standardized measure of English proficiency? Such an assessment (or, more likely, an assessment programme) might have several beneficial effects that a standardized test would be unlikely to achieve. An assessment that focused on educational benefit would be more ‘achievement test’ than ‘proficiency test’. An ‘achievement’ assessment could align with the quite considerable amount of English language learning that goes on while students are at university. Such an assessment would also be more ‘ecologically sensitive’ and would have positive washback on student learning. The alignment of the assessment programme to the curriculum could help raise language standards in a way that has not happened with the existing exit test mechanisms, IELTS and the GSLPA. The different paradigm of assessment proposed here raises two fundamental questions: Can a well-planned and well-constructed achievement assessment programme yield enough of what employers want (i.e. easily interpretable data about students)? And if it can, would the powers-that-be buy it?

Fulcher, G. (2010). Practical language testing. London: Hodder Education.
Shohamy, E. (2000). Using language tests for upgrading knowledge. Hong Kong Journal of Applied Linguistics, 5(1), 1-18.