Year 6 pupils' performance - but is the blue line suspect?
An investigation into the validity of national school test results in England found there was no conspiracy to inflate standards over time. It said maintaining standards year on year was a very difficult job that had largely been achieved.
But it raised concerns that the English tests taken by 11-year-olds in 1999 and 2000, and the maths tests taken by 14-year-olds in 2001, were easier than in previous years.
The researchers were working for the qualifications watchdog, the QCA.
They examined tests between 1996 and 2001 and completed the work two years ago - but the QCA has only now published it.
"The evidence shows how difficult it is to determine standards and gives the lie to any theory of conspiracy to undermine them," they conclude.
The main findings were revealed in October by the lead researcher, Alf Massey, head of examinations research at the University of Cambridge Local Examinations Syndicate (Ucles).
At the time, however, he declined to say which results might have been suspect.
That is now clear, as the QCA has published the 249-page report. Overall, though, the tests are given the all-clear.
Main findings
Key Stage 1 reading comprehension, 1996 v 1999:
"The evidence ... strongly supports a view that performance levels in schools have risen."
Key Stage 1 maths 1996 v 2000:
"... the improvements observed in results at a national level ... seem more than merited - reflecting, indeed under-estimating, learning gains in schools."
Key Stage 2 English 1996 v 1999 and 1996 v 2000:
"... entirely consistent with the view that there has been a substantial improvement in children's performance. But these data do also suggest that national results might over-estimate the rate of progress."
"The second study replicated the earlier findings in some detail .... If anything the gap between the standards applied in the 2000 and 1996 versions was marginally wider than that observed a year earlier".
"The improvement in national results over this period is thought to stem largely from improved performance in reading. If valid, our experimental data question the extent of this improvement."
Key Stage 2 maths 1996 v 1999:
"The experimental evidence ... provides no reason to challenge the validity of the improvements".
Key Stage 2 science 1996 v 2001:
" ... provides support for the view that there has been a great improvement in children's performance ... there are signs that a small part of the very large improvement in national test results reported between 1996 and 2001 may be a product of a shift in test standards."
Key Stage 3 English 1996 v 2001:
"There is no basis here on which to challenge the improvements in KS3 English levels recorded nationally".
Key Stage 3 maths 1996 v 2000:
"... the experimental evidence showed that pupils taking the later (2000) version of the test obtained better results ... than those allocated to the 1996 version ... relative lenience in the 2000 version is greater for lower ability pupils."
Key Stage 3 science 1996 v 2001:
"These data ... suggest that the quite substantial gains in KS3 Science test results reported nationally between 1996 and 2001 were merited; reflecting improvements in teaching and learning in schools."
And the report adds: "Like the similar conclusions regarding almost all the curriculum areas at all three key stages investigated, perhaps this should be recognised as the most important inference we have been able to make."
A QCA spokesman said it did not accept that gains made in Key Stage 2 English since 1996 were untrustworthy. "Standards are maintained every year and the research does not say that our systems for maintaining standards are unsound.
"The most important of the research findings is the data which provides sound evidence that, since the advent of national tests, achievement levels in schools have improved substantially in almost all curriculum areas and key stages investigated - not just Key Stage 2."
And a spokesperson for the Department for Education said the research conclusions were contradicted by the evidence used by the QCA each year when it decided how many marks were needed to earn the different levels in the tests.
The independent Rose Panel in 1999 had validated that process.
"We have full confidence in the QCA to maintain standards in the tests and exams and we are absolutely confident that the QCA achieved this successfully in the period covered by the report, and subsequently."
The latest research was done in three ways:
- pupils took tests from a number of years
- comparative data was collected from local education authorities which had done their own, large-scale standardised tests
- children were shown sample questions from the 1996 tests and the most recent versions and asked what they thought of them.
The researchers say a complicating factor is that tests have become more "accessible" or user-friendly. The efforts to make them clearer to understand and easier to respond to were welcomed by children and, the report says, "must have helped to soothe their initial anxieties".
"Further efforts to improve them remain desirable."
But this raised "interesting questions" about standards.
Did such changes make tests easier?
If so, should the mark thresholds be raised to compensate for this - or were such things a valid way of recognising performance?