Abstract of PhD Thesis

A Procedure for Equating Curriculum-based Public Examinations Using Professional Judgment Informed by the Psychometric Analysis of Response Data and Student Scripts

One of the greatest concerns facing those responsible for conducting large-scale educational programs is whether funds used for such purposes are leading to an increase over time, or at the very least to no decline, in the proportions of students who achieve course outcomes. In order to answer this question in educational systems around the world, students are often given an examination at the conclusion of their course. A number of methods may then be employed to equate an examination to those administered in previous years. Once this is done, it is possible to determine whether more students have reached the desired performance standards than in the past. Equating techniques generally use common items or common persons to establish links between the examinations.

In a number of high-stakes educational programs the examinations are substantial measures of the knowledge and skills students have learnt from studying courses based on traditional subject disciplines. Such curriculum-based examinations commonly employ a variety of item types appropriate to the curriculum outcomes being assessed. While some of the items may be scored dichotomously, it is not uncommon for the majority of items to be scored polytomously using a holistic scoring key. It is also usual in such cases for the examinations to be made available for public consideration after they are administered. Students use past examination papers to practise for their own examination. In such circumstances, traditional equating methods employing common items or common students cannot be used.

Following a review of the literature on standard setting and equating, it was decided that an Angoff-based approach would be an appropriate way to equate such examinations. It was reasoned that a team of appropriately qualified judges could develop a set of performance standards based on one examination. These standards could then be described and exemplified using items and student responses. Once this was done, it would be possible for a similarly qualified team of judges to internalise those standards and equate examinations administered in different years by determining the scores on a subsequent examination that corresponded to the standards set on the initial examination. The examinations in three courses from the New South Wales Higher School Certificate were used to test the procedure developed for this study.
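The arithmetic behind an Angoff-style judgment can be sketched briefly. In a common modified-Angoff form, each judge estimates the mark a student at the borderline of a performance standard would earn on each item; each judge's estimates are summed over items, and the judges' totals are averaged to give the cut score. The sketch below is purely illustrative: the judges, items, and ratings are invented, and the function is not the equating procedure developed in the thesis.

```python
# Illustrative modified-Angoff cut-score calculation.
# All ratings below are invented for illustration; they are not data
# from the study described in this abstract.

def angoff_cut_score(ratings_by_judge):
    """Average, across judges, of each judge's summed item ratings."""
    totals = [sum(items) for items in ratings_by_judge]
    return sum(totals) / len(totals)

# Three hypothetical judges rating a five-item examination: each value is
# the mark a borderline student is expected to earn on that item.
ratings = [
    [2.0, 3.5, 1.0, 4.0, 2.5],   # judge A (total 13.0)
    [2.5, 3.0, 1.5, 3.5, 2.0],   # judge B (total 12.5)
    [2.0, 3.0, 1.0, 4.5, 2.5],   # judge C (total 13.0)
]
cut = angoff_cut_score(ratings)
print(cut)  # prints 12.833... (mean of 13.0, 12.5 and 13.0)
```

In practice the thesis's procedure supplements such judgments with psychometric feedback and script review rather than relying on ratings alone.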

To provide information to the judges to assist them in their task, student performance data were analysed using the Extended Logistic Model, a Rasch measurement model. The results of this analysis were presented to the judges in a manner best suited to understanding how students of different ability levels had performed on the items in such comprehensive curriculum-based examinations. The feedback provided by this analysis proved effective in assisting judges to refine their views. A review of student scripts also assisted in this regard.
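The kind of output such a Rasch analysis yields can be illustrated with a minimal sketch. The Extended Logistic Model is a polytomous Rasch model; the sketch below uses the equivalent partial-credit parameterisation, in which the probability that a student of ability theta scores in category x of an item with thresholds delta_1..delta_m is proportional to exp of the cumulative sum of (theta - delta_k). The threshold values here are invented, and this is an assumed textbook formulation, not the thesis's own analysis or software.

```python
import math

def category_probabilities(theta, thresholds):
    """Score-category probabilities for one polytomous Rasch item.

    theta: person ability in logits.
    thresholds: item thresholds delta_1..delta_m (partial-credit
    parameterisation of the polytomous Rasch model, assumed here).
    Returns a list of probabilities for categories 0..m.
    """
    # Numerator for category x is exp of the cumulative sum of
    # (theta - delta_k) for k = 1..x; category 0 has an empty sum (= 0).
    numerators = [math.exp(sum(theta - d for d in thresholds[:x]))
                  for x in range(len(thresholds) + 1)]
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical item with three thresholds (four score categories) and a
# student of moderate ability; the probabilities sum to 1.
probs = category_probabilities(theta=0.5, thresholds=[-1.0, 0.0, 1.5])
```

Tabulating such probabilities across the ability range is one way to show judges how students of different ability levels tend to perform on each item.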

This study shows that the procedure developed for equating two curriculum-based examinations is effective. The multi-stage procedure, based on the application of informed professional judgment and utilising the Extended Logistic Model to provide pertinent feedback on student performance, together with consideration of student scripts, delivers promising results when applied to a sample of courses from the NSW Higher School Certificate. The results indicate that, while certain refinements may strengthen the process, the procedure is sufficiently flexible that it could be used with virtually any form of examination or test.