IAEA 2004 Conference Paper – Philadelphia

How Can Assessment be used to Improve Student Learning in a High-Stakes Environment?

Dr John Bennett
Office of the Board of Studies NSW

Introduction

The developments in assessment for learning are a welcome direction in education. Using information gained from assessment activities to assist teachers and students in understanding how students are progressing in their learning and to plan their next steps has tremendous potential for improving the quality of learning. Does such an approach have any place in a high-stakes assessment environment, however, where the focus has tended to be on achieving the best possible final result for students in order to maximize their opportunities for further study or employment?

This paper begins by explaining assessment for learning and the practices associated with it. It compares assessment for learning with assessment of learning. The paper then describes a particular high-stakes environment, namely the New South Wales Higher School Certificate (HSC). It explains the nature and organisation of the HSC with a particular focus on the approaches used to assess and report student achievement and on the purposes for which the results are used. The paper then goes on to show how the materials developed as part of the assessment of learning approach used in the determination of students’ HSC achievement are being used to support student learning.

By drawing on the approaches introduced as part of the changes to the New South Wales Higher School Certificate, this paper shows not only that assessment for learning is a reasonable expectation, but also that such an approach can enhance students’ opportunities for maximizing their final levels of achievement. The paper looks at how the principles of assessment for learning are being operationalised as part of the standards-based Higher School Certificate.

What is assessment for learning?

Although there are several slightly different definitions of assessment for learning, the one used in this paper is the one proposed by the Assessment Reform Group. This definition states that:
“Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there.” (Assessment Reform Group 2002)
That is, assessment for learning has the purpose of collecting information on a student’s achievement that can be used to improve and progress their learning.

When looking at this definition one might well ask “how is assessment for learning different from what is commonly referred to as formative assessment?” The answer is that these terms are often used interchangeably. “Formative assessment is taken to refer to all those activities, undertaken by both teachers and students, that provide the information used as feedback to modify the teaching and learning.” (Harrison and Swaffield, 2003) The use of the term assessment for learning and the principles and practices underlying it send a clear message that assessment can be an integral part of the teaching and learning process, not just something ‘tacked on’ at the end.

The work conceptualising assessment for learning identifies ten related principles. These state that assessment for learning:
• is part of effective planning
• focuses on how students learn
• is central to classroom practice
• is a key professional skill
• is sensitive and constructive
• fosters motivation
• promotes understanding of goals and criteria
• helps learners know how to improve
• develops the capacity for self-assessment
• recognises all educational achievement.
(Assessment Reform Group 2002)

This list raises some important implications for classroom practice. It identifies the importance of assessment as part of normal classroom practice, it emphasises the importance of feedback aimed at motivating students and leading to improvement, and it indicates the importance of showing students how to realistically assess the value of their own work.

One might ask the question “isn’t this just good classroom practice?” Given that the answer is obviously “yes”, it leads us to the following questions:
• “How can we ensure that assessment for learning becomes common practice?” and
• “How can it operate in a high-stakes situation when the focus of learning seems to be maximising the end of course results in order to impress potential employers or gain a place in a highly competitive university course?”

The answer to the first of these questions is that the majority of assessment activity that teachers perform actually occurs during the teaching process and forms an integral part of teaching. Assessment is not just the test or assignment given at the end of the topic or course. Whenever teachers question their students, note their responses or make observations about their work they are collecting assessment information. Such information gathered informally can be placed side by side with information collected from more formal assessment activities to gain an understanding of where the student is in their learning and what they need to do to improve and progress. This approach provides a more natural link between what is taught, how it is taught and how well students have learnt it. Once this is understood, pre-service and in-service professional development activities for teachers can be directed at activities that develop these skills by modeling good practice.

The answer to the question “how can assessment be used to improve student learning in a high-stakes environment?” is addressed later, following the description of a particular high-stakes environment.

The term assessment of learning is synonymous with summative assessment, that is, judgments made about students’ achievement at some key point such as the end of the course or unit of work. Such assessment is usually focused on reporting students’ achievements at the end of the program of study, not only to the student, but to other parties such as parents, employers, the next year’s teacher, and so on.

The point needs to be made, however, that assessment for learning is not something completely divorced from assessment of learning. The assessment activities conducted and the information collected during the teaching of a course that are used to improve student learning can in most cases be used to provide a measure of the standard of student achievement at the end of a course. This claim is explored further below.

The NSW Higher School Certificate

The NSW Higher School Certificate (HSC) is the credential awarded at the end of secondary school. During Year 12, the final year of secondary school, students typically study courses in five or six subjects. Most courses are of two units in duration signifying 120 hours of study in one year. To be awarded a Higher School Certificate students must satisfactorily complete courses totalling a minimum of 10 units in Year 12. At least two units of English must be studied.

Student achievement in the courses studied in Year 12 is assessed through two components – an external examination and a school-based assessment.

The examinations are closely based on course curricula, and employ a variety of different item types, as appropriate. Most examinations consist of written response-type items that are scored polytomously. Some examinations also include multiple-choice and short answer items. Written components generally consist of short or extended responses or the solution to a mathematics problem. However, in some courses the examinations include other substantial manifestations of student work. For example, in Visual Arts, students submit for assessment a piece of artwork they have created. In the examinations for foreign languages, items that assess listening and speaking skills are employed. In Music and Drama, students are assessed on the quality of their performance of pieces of music or short plays they have prepared.

The school assessment mark determined by schools is based on a program of assessment activities developed and administered by the school over several school terms according to requirements and guidelines provided by the Board of Studies.

The final HSC mark that is reported and used to determine students’ level of achievement is the average of the examination mark and the statistically moderated school assessment mark. The examination mark and the assessment mark are also shown.

Since 2001 the initial mark from the examination and the assessment mark submitted by the school after statistical moderation have been ‘aligned’ to a standards-based performance scale in order to obtain the marks reported to the students. The alignment process consists of the application of a structured, multi-stage Angoff-based standards-setting procedure involving teams of highly experienced teachers, referred to as ‘judges’ (Angoff, 1971). The judges determine what examination marks each year correspond to the borderlines between the different levels of achievement, which are referred to as ‘performance bands’. A multilinear mapping process, which adjusts these cut-off marks to the borderline marks of 50, 60, . . . , 90 used as part of the reporting scale, is then applied to all the examination marks for a course. In this way students’ HSC results are related to the knowledge, skills and understandings they have achieved in each course. The standards-setting procedure was especially developed to suit the nature and form of the HSC examinations (Bennett, 1998). A more detailed explanation of this procedure is provided in the appendix.

Students’ performances in the HSC are also used in the calculation of the students’ Universities Admission Index (UAI). Their initial examination marks and initial school assessment marks, after the statistical moderation is applied to the assessments (that is, before the alignment to the performance scales), are re-scaled by the Universities Admissions Centre during the process of determining the UAI rank used in the selection of students for tertiary courses. It is on the basis of this rank that those students interested in proceeding to university are offered places in particular courses at particular universities.

Following the 2001 and 2002 examinations, HSC standards packages were produced. These packages consist of a CD-ROM containing the descriptions of the levels of achievement or standards that are part of the performance scale, the examination paper and marking guidelines, and, for each examination question or task, the responses or works of several students who received the mark that the judges believed would be scored on that question by students at the borderline of each pair of levels of achievement.

These standards packages encapsulate the standards of performance that have been created for each course. They are used by the teams of judges each year when they are determining what examination marks represent the cut-off mark between the different levels of achievement (or performance bands) in that year. In this way, although the examination paper, the marking schemes and student rates of achievement may all vary from year to year, we can be confident that those charged with the responsibility of establishing the cut-off marks each year will be basing their decisions on the same standards of performance.

How Can Assessment for Learning Operate in this Context?

Copies of the standards packages were also provided to each school to assist teachers to understand the standards created for the courses they teach. There is strong evidence that teachers are engaging with the contents of the packages and are internalising the standards. Many schools put the packages on their computer file servers and make them available to students as well as teachers. Students and members of the public can purchase the packages.

Some professional development activities were conducted in 2002 and 2003 to provide teachers with a structured method for using the packages to become familiar with the standards themselves and to incorporate them into activities involving their students.

While many tend to focus on the summative assessment of achievement (assessment of learning) that is an important aspect of the HSC, it is quite feasible to incorporate assessment for learning approaches into this high-stakes program. Given that the approaches espoused as part of assessment for learning are good teaching practice, there is nothing to prevent the effective use of such approaches through the senior secondary years. Teaching techniques that are aimed at:
• identifying the goals of learning;
• observing learning and analysing and interpreting evidence of learning and giving meaningful feedback and guidance to students; and
• motivating students by providing a supportive environment,
are applicable in any learning context.

The changes that were made to the Higher School Certificate in 2001 to assess and report student achievement in terms of standards, together with some of the materials that were developed to support this initiative, have provided some significant opportunities in relation to assessment for learning.

As indicated above, it is a requirement of the HSC school assessment programs that, for each course they teach, schools must establish a program of assessment tasks. These tasks are conducted throughout Year 12 and each has a weighting determined by the school within guidelines provided by the Board. Each task enables teachers to collect information about students’ achievement in relation to several outcomes, award marks in accordance with structured marking guidelines, and provide constructive feedback to students on their performances that highlights their strengths and where they could make improvements.

The effectiveness of this feedback can be significantly improved now in the standards-based system used for the HSC. Teachers can work through some of the materials in the HSC standards packages with their students. For example, when teaching a topic they might identify the questions that were related to that topic in the 2001 or 2002 HSC examination. They can discuss the requirements of that question with their students, show them the marking guidelines that were used to allocate marks and then show them a number of student responses that represent different levels of achievement. By working through these responses the teacher can highlight the important features of the responses, including their strengths and any shortcomings. This approach could be even more effective if the teacher gives a student the opportunity to compare a piece of work they have produced on the same topic with the works in the standards packages. While initially such an approach might best be undertaken with the involvement of the teacher, at a later point it is quite likely that many students would be capable of undertaking such an activity as part of their self-assessment.

To further support this approach, the statistical feedback provided to schools on the performance of their students following the HSC examinations assists in further consolidating teachers’ understanding of the performance standards. For example, teachers are able to see how each of their students performed in the various major components of the examination. In Drama, for instance, teachers can see the proportion of their total mark each student received for the Written Examination component, the Group Performance component and the Individual Performance component.

A consequence of using these materials and information is that teachers will be in a better position to apply the principles of assessment for learning with the next cohort of students. In this way teachers’ feedback to students can be targeted and focused on helping students to improve.

Conclusion

The principles and practices of assessment for learning are applicable in any teaching/learning context, whether that context is leading to what is typically regarded as a high-stakes situation or not.

What this paper shows is that assessment for learning should be regarded as sound teaching practice that is equally at home in any situation. Such an approach can exist comfortably in a context such as the NSW Higher School Certificate – the classic high-stakes environment where student results are the most critical element in the selection for further study or employment.

References
Angoff, W. (1971) Scales, Norms and Equivalent Scores. In R.L. Thorndike (Ed.), Educational Measurement (2nd ed., pp 508-600), American Council on Education, Washington, DC.
Assessment Reform Group (2002) Assessment for Learning: 10 Principles.
Black, P. et al. (2003) A Successful Intervention – Why Did it Work. Paper presented at AERA, Chicago.

Bennett, J. (1998) A Procedure for Equating Curriculum-based Public Examinations Using Professional Judgment Informed by the Psychometric Analysis of Response Data and Student Scripts Unpublished doctoral thesis, University of New South Wales.

Bennett, J. and Taylor, C. (2003) Is Assessment for Learning in a High-Stakes Environment a Reasonable Expectation? Paper presented at ACACA Conference, Adelaide.

McGaw, B. (1997) Shaping Their Future: Recommendations for Reform of the Higher School Certificate. Department of Training and Education Co-ordination, New South Wales.

Harrison, C. and Swaffield, S. (2003) Formative Assessment in Action. In Whither Assessment, Qualifications and Curriculum Authority.

Appendix

The New South Wales Government, in its policy document Securing Their Future, released in August 1997, adopted the recommendation made by McGaw (1997) that “a standard-referenced approach to assessment be adopted for the Higher School Certificate by developing achievement scales for each subject” (p.97). McGaw recommended that examination data be used to clarify performance scales on which student achievement and item difficulties can be represented, and to develop descriptions of what the scales measure in broad bands so as to amplify the meaning of the bands. The Government determined that from the 2001 HSC examinations student achievement would be reported using a standards-referenced approach.

Teams of experienced teachers, using data and student responses to past examinations, met to prepare statements describing five different levels of achievement in their course. These statements were the first component of the standards that were to be set for each course. The levels were designated as Band 2 to Band 6, with Band 6 being the highest level. A sixth level of achievement, Band 1, which was considered to be below the minimum standard expected, did not have a description.

The procedure that was developed to relate student examination performance to these standards was based on the work of Bennett (1998). This research developed and tested a multi-staged Angoff-based standards-setting procedure that could be applied in the context of the NSW Higher School Certificate examinations. It is a procedure that uses teams of highly experienced teachers employing professional judgment informed by certain appropriate statistical data and student examination responses to determine what examination marks correspond to the borderlines between the different performance bands established for that course.

The way the procedure operated in 2001, the initial year, was as follows:

For each course a team of experienced teachers was created. These teachers, referred to as judges, were given special training for this task. They were also given a copy of the band descriptions for their course, a copy of the examination paper and specially designed recording sheets.

Stage 1

Working independently from his or her colleagues, each judge read the band descriptions carefully and developed an “image” of the knowledge and skills of students whose achievements would place them in each performance band in that course. The judges then used these images to develop images of students whose achievements would place them on the borderline between two bands.

Having done this, each judge recorded what mark for each examination question a borderline band 5/band 6 student would receive. Adding up these individual question marks gave the total examination mark that the judge believed corresponded to the borderline or cut-off mark between band 5 and band 6. Averaging the cut-off marks between band 5 and band 6 proposed by all the judges produced the first estimate of what examination mark would represent the borderline between band 5 and band 6. The judges followed the identical procedure for the band 4/band 5, band 3/band 4, band 2/band 3 and band 1/band 2 borderlines.
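
The arithmetic of this first stage can be sketched in a few lines of Python. This is only an illustration of the summing and averaging described above, not the software used by the Board. The judge names, the number of questions and all of the marks are invented for the example.

```python
from statistics import mean

# Hypothetical data: the mark each judge expects a borderline
# band 5/band 6 student to receive on each of four questions.
judge_question_marks = {
    "judge_a": [8, 12, 17, 40],
    "judge_b": [9, 11, 18, 42],
    "judge_c": [8, 13, 16, 40],
}


def judge_cutoff(question_marks):
    """Total examination mark one judge expects of the borderline student."""
    return sum(question_marks)


def team_cutoff(marks_by_judge):
    """First estimate of the cut-off: the mean of the judges' totals."""
    return mean(judge_cutoff(marks) for marks in marks_by_judge.values())


individual_cutoffs = {
    name: judge_cutoff(marks) for name, marks in judge_question_marks.items()
}
first_estimate = team_cutoff(judge_question_marks)
```

With these invented figures the three judges propose cut-off marks of 77, 80 and 77, and the first estimate of the band 5/band 6 cut-off is their average, 78. The same calculation would be repeated for each of the other borderlines.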

Stage 2

The judges came together and discussed the decisions they had made individually. At the same time they were given specially designed statistical reports that were very effective in showing how students of different abilities had performed on each question in the examination. The judges worked through and discussed this information. During this process a judge had the opportunity to modify any of the decisions he or she recorded during the first stage. Through this stage the team started to develop a common image of students who would be at the borderlines between bands.

The judges’ recording sheets were again collected and processed as in Stage 1. This resulted in a new set of band cut-off marks.

Stage 3

The examination responses of samples of students who had achieved the marks for an examination question that were equal to the band cut-off marks identified by the judges for that question were collected. The judges then met again and reviewed and discussed these examination responses. The judges were asked to confirm that the responses produced by these students were typical of what they would expect of students placed at the borderline between bands. The judges also reviewed student works slightly above and below their proposed cut-off marks. During this process the judges had the opportunity to further refine their band cut-off marks.

When they had completed this third stage the average of the decisions made by the judges became the cut-off examination marks necessary to achieve each performance band.

These marks were then used to finalise the marks that were to be reported to students. This was done by adjusting the mark that was adjudged to be the borderline between Band 6 and Band 5 to 90, the mark adjudged to be the borderline between Band 4 and Band 5 to 80, and so on. Marks between these borderlines were simply adjusted using linear interpolation.
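
The adjustment just described amounts to a piecewise linear mapping, which might be sketched as follows. The reported borderlines of 50 to 90 are taken from the paper. The cut-off marks in the example are invented, and the treatment of the two ends of the scale (a raw mark of 0 reported as 0, and the maximum raw mark reported as 100) is an assumption made for this sketch, since the paper does not state how marks below the lowest cut-off or above the highest are handled.

```python
from bisect import bisect_right

# Reported marks at the band 1/2, 2/3, 3/4, 4/5 and 5/6 borderlines.
REPORTED_BORDERLINES = [50, 60, 70, 80, 90]


def align_mark(raw_mark, raw_cutoffs, max_raw=100):
    """Map a raw examination mark to the reporting scale.

    raw_cutoffs holds the judges' cut-off marks for the five borderlines,
    lowest first. Anchoring 0 -> 0 and max_raw -> 100 is an assumption.
    """
    if len(raw_cutoffs) != len(REPORTED_BORDERLINES):
        raise ValueError("one cut-off mark is needed for each borderline")
    xs = [0, *raw_cutoffs, max_raw]
    ys = [0, *REPORTED_BORDERLINES, 100]
    if any(a >= b for a, b in zip(xs, xs[1:])):
        raise ValueError("cut-offs must increase and lie between 0 and max_raw")
    if not 0 <= raw_mark <= max_raw:
        raise ValueError("raw mark is outside the examination mark range")
    # Locate the segment [xs[i], xs[i + 1]] that contains the raw mark.
    i = min(bisect_right(xs, raw_mark), len(xs) - 1) - 1
    x0, x1 = xs[i], xs[i + 1]
    y0, y1 = ys[i], ys[i + 1]
    # Linear interpolation between the two neighbouring borderlines.
    return y0 + (raw_mark - x0) * (y1 - y0) / (x1 - x0)


# Hypothetical cut-off marks for one course in one year.
hypothetical_cutoffs = [38, 52, 65, 78, 88]
aligned = [align_mark(mark, hypothetical_cutoffs) for mark in (45, 78, 83)]
```

Under these invented cut-offs a raw mark of 78, the band 4/band 5 borderline, is reported as 80, and a raw mark of 83, halfway between the cut-offs of 78 and 88, is reported as 85.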

The final activity undertaken in 2001 was to “capture the standards”. This involved the development of a “Standards Package” for each course. The packages, produced on CD-ROM, contained the band descriptions, the 2001 examination paper, samples of responses of students at each borderline and other statistical information. The material is presented in such a way that teachers, students and others can most effectively develop a clear understanding of the standards that have been developed for each course.

The way the procedure operated in subsequent years

The Standards Packages are an essential part of the standard setting procedure for the examinations from 2002. From 2002 onwards, it has been the job of the teams of judges to become thoroughly familiar with the material and to apply the same standards in determining the band cut-off marks each year. In this way, while the actual cut-off marks may vary from year to year for a number of reasons, the standards used to report students’ achievement will not vary.

Hence, the teams of judges follow basically the same procedure as in the initial year, with the exception that the standards of performance against which they make their judgments are clearly exemplified in the standards packages. That is, they do not simply attempt to visualise the standards from the descriptions of the levels of achievement. Essentially the question the judges ask themselves is, “what mark in each question in this year’s examination paper would be achieved by the students at the borderlines of the different performance bands whose works are included in the standards packages?”