Daily Bulletin

The Conversation

  • Written by Kevin Watson, Associate Professor of Science Education and Director of Research, University of Notre Dame

Recent political commentary suggests Australia has not performed as well as expected in the latest round of NAPLAN testing. Such commentary rests on the belief that NAPLAN scores can be compared from one year to the next.

But new research, to be published next month in the journal Quality Assurance in Education, shows that NAPLAN results cannot be compared across years. It is therefore not reasonable for politicians to say NAPLAN results have plateaued, because year-to-year comparisons are not reliably accurate.

The study – which used NAPLAN scores from 2008 to 2012 – questions the reliability of NAPLAN as a tool for charting individual student progress across school years, let alone that of whole year groups.

The study collected data for nearly 10,000 students in over 110 primary schools in Australia. It looked at the influences on student performance of gender, language background and the NAPLAN test itself (the nature of the actual test in a particular year).

While gender was found to be the most influential, followed by language background, the test itself was also found to be a factor, with the average score attained by students fluctuating significantly from year to year. This variability, the study concluded, was likely to be a consequence of differences in the tests themselves rather than a reflection of student performance.

Flaws in the test

There are two conflicting views about the reliability and use of NAPLAN scores to compare individuals from one year to the next.

NAPLAN tests have questions that are in common from one year to the next. Student performance in these questions can then be used to standardise the test as a whole. This then provides the mechanism for comparing the test in one year with that of the next.

Professor Margaret Wu from the University of Melbourne is sceptical of the capacity for NAPLAN scores to be used to compare individuals or schools from year to year.

She argues that comparisons of national cohorts are problematic due to the large random fluctuations and error margins implicit in such comparisons.

Each NAPLAN test is short, at only 40 questions, so the common questions available for standardising one test against another are too few to do the job reliably.

As with any test, there is an expected error in measurement, which can arise for a number of reasons. One is that the keyed answer to a question may not in fact be the best answer, which confuses students. A test's error rate is a measure of how good the instrument (the test) is at producing the same result if it were taken again.

In the case of NAPLAN, that would mean the same group of students getting the same result, the school getting the same result and even any system (Department of Education or Catholic Education Office or even the DOE in different states) getting the same result.

In the NAPLAN test the measurement error is large mainly because the test is short. Even if the test from one year could be compared with the test from another, the errors inherent in individual test scores would mean such a comparison would be unreliable.

Margaret Wu states that the fluctuation in NAPLAN scores can be as much as ±5.2 points. This follows from a standard error of measurement of about 2.6 score points.

This means there is 95% confidence that, if the same students were to complete the same test again (without new learning between tests), the results would differ from the original score by as much as ±5.2 (2 × 2.6) points. This represents nearly 12% variability for each individual score.
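The arithmetic above can be sketched in a few lines. This is illustrative only: the SEM of 2.6 score points is the figure quoted from Margaret Wu's work, the factor of 2 matches the article's "2.6 x 2" approximation for a 95% interval, and the example score of 44 is hypothetical, chosen because it makes the relative variability land near the 12% the article mentions.

```python
# Sketch of the ~95% confidence band around an observed NAPLAN score.
# SEM and the doubling factor come from the figures quoted in the
# article; the observed score of 44.0 is a hypothetical example.

SEM = 2.6  # standard error of measurement, in score points (quoted figure)

def score_band(observed, sem=SEM, factor=2):
    """Return the (low, high) band the true score likely falls within."""
    margin = factor * sem
    return observed - margin, observed + margin

low, high = score_band(44.0)
print(f"observed 44.0 -> true score likely in [{low:.1f}, {high:.1f}]")
print(f"relative variability: {2 * SEM / 44.0:.1%}")
```

The band is ±5.2 points either side of the observed score, which for a score in the mid-40s is close to the "nearly 12%" the article describes.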

The standard error of measurement depends on the test reliability, meaning the capacity of the test to produce consistent and robust results.
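The link between reliability and measurement error can be made concrete with the standard psychometric relation SEM = SD × sqrt(1 − r), where SD is the standard deviation of scores and r is the reliability coefficient. This formula is a textbook relation, not one stated in the article, and the SD and reliability values below are hypothetical, chosen only to show how a less reliable (for example, shorter) test inflates the SEM.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - r).

    sd          -- standard deviation of observed scores (hypothetical here)
    reliability -- reliability coefficient r, between 0 and 1 (hypothetical)
    """
    return sd * math.sqrt(1.0 - reliability)

# Higher reliability -> smaller SEM; lower reliability -> larger SEM.
print(sem(10.0, 0.93))
print(sem(10.0, 0.80))
```

With these made-up numbers, dropping reliability from 0.93 to 0.80 pushes the SEM from roughly 2.6 to roughly 4.5 score points, which is the mechanism by which a short test produces the large error margins discussed here.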

Some researchers argue that the NAPLAN test's large margin of error makes comparisons across years inaccurate.

For example, if one student scores 74 on a test and another scores 70, and the error margin is 5 points, the first mark is essentially 74 ± 5 and the second is 70 ± 5.

The two ranges, 69 to 79 and 65 to 75, overlap substantially. So it is not really possible to say a score of 74 is meaningfully different from a score of 70.
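The 74-versus-70 example can be written as a quick interval check: two scores are distinguishable at a given error margin only if their bands do not overlap.

```python
# Check whether two observed scores are distinguishable given an
# error margin. The scores (74, 70, 60) and margin (5) mirror the
# worked example in the text; nothing here is NAPLAN-specific.

def bands_overlap(score_a, score_b, margin):
    """True if the (score - margin, score + margin) bands overlap."""
    lo_a, hi_a = score_a - margin, score_a + margin
    lo_b, hi_b = score_b - margin, score_b + margin
    return lo_a <= hi_b and lo_b <= hi_a

print(bands_overlap(74, 70, 5))  # True: 69-79 overlaps 65-75
print(bands_overlap(74, 60, 5))  # False: 69-79 is clear of 55-65
```

Only when the gap between scores exceeds twice the margin do the bands separate, which is why small year-to-year movements in average scores fall inside the noise.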

The implication is that when you take this into account over a whole cohort of students, it is difficult to say categorically that one set of marks is any different from another.


There are various implications for using NAPLAN results to compare students, schools or even state performance.

The “My School” website data, for example, should be viewed with caution by parents when making decisions about their children’s schooling.

Teachers and principals should not be judged on NAPLAN findings. And, as others have argued, more formative measures (assessment during learning), rather than summative ones (assessment at the end of a learning cycle), should be explored for providing teaching and learning feedback.

NAPLAN is not good for the purpose for which it was intended. However, it makes politicians feel they are doing something to promote literacy and numeracy.


Read more http://theconversation.com/naplan-data-is-not-comparable-across-school-years-63703
