Eleven-year-olds reaching the expected level - or did they?
Some national tests have been overstating or understating children's school progress, research suggests. Efforts to make the tests more "child-friendly" and accessible may have made some easier - inflating the results, according to a large-scale study.
But some appear to have become harder over time, the researchers say.
The study, dated January 2002, was commissioned by the QCA watchdog. The Department for Education has insisted the tests are reliable.
'Good news and bad news'
The man in charge of the research was Alf Massey, head of examinations research at the University of Cambridge Local Examinations Syndicate (Ucles).
The study covered the English and maths tests taken by seven-year-olds and the English, maths and science tests at ages 11 and 14.
"The outcome is a mixed picture," Mr Massey said.
"In several cases the QCA seem to have performed the miracle and have got things right.
"In a number of cases, if anything the tests appear to have become a little more severe over time."
This meant the results "slightly underestimated" the progress that had been made in schools.
'Not a conspiracy'
"There are a couple of cases where the opposite is true and the more recent tests look a little more generous."
So the national test results would have overestimated progress.
Mr Massey declined to say which tests were which.
But he added: "It's certainly not a conspiracy to defraud anyone - it's the difficulty of doing the job."
There were three strands to the research, which began in 1999:
- pupils took tests from a number of years
- comparative data was collected from local education authorities which had done their own, large-scale standardised tests
- children were shown sample questions from the 1996 tests and the most recent versions and asked what they thought of them
"It's difficult with small children but even seven-year-olds were able to appreciate the differences and express a view," Mr Massey said. The evidence from these different strands was "very convincing in the way that everything pointed in the same direction".
MANDATORY TESTING: In England, tests are taken at ages 7, 11 and 14.
He said it was clear the tests had been evolving, which raised almost philosophical issues. "If you make it something children can engage with more, and tidy up the language - and don't interfere through language with their demonstrating what they know or can do - you are making the test more accessible.
"So would you expect results to improve, or would you penalise the ones taking the later test - because they have been helped?
"There's an issue there which needs to be raised."
Recommendations
Mr Massey stressed that the tests were "really very, very good".
But work to improve them could be better managed.
A major recommendation is to keep things steady for several years then, if changes are to be made, make them all in one go and monitor the effect.
The study team also evaluated the way levels were set each year and made "a range of recommendations" about how that might be improved.
He believed some of the ideas had been included in the tests from 2003.
Asked why the findings had not been published, a spokesman for the QCA described the research as "ongoing".
"The empirical part is done and we are looking at preparing that for publication by the end of the year."
'World class'
The Department for Education and Skills appeared to reject the findings.
"Our world class standards have been vigorously maintained year on year. Independent research clearly shows the robustness of the Key Stage 2 tests," a spokesperson said.
Independent evidence from a recent international PIRLS study - for which children were not prepared - "showed that our 10-year-olds are the third most able readers in the world".
"The independent Rose Review in 1999 backed the test procedures and confirmed that the test data from Key Stage 2 was a reliable measure of how well pupils do."
The education officer for the National Union of Teachers, John Bangs, said this was further evidence of the tests' unreliability.
"They provide no useful information, yet each year they are used to create league tables by which to judge schools.
"Far too much reliance is placed on them, and when linked with government targets which have been plucked out of the air they are even less valuable."