Wednesday, 14 December 2016

The longitudinal study, the 80/20 rule, and the reification of the average [update 2]


Averages are not themselves real things. Two men eating two steak dinners between them does not mean they ate one steak each (and the vegetarian of the pair would probably be offended that you even assumed it).

So what’s the problem with this Radio NZ headline and sub-heading then:

‘Researchers can predict 3 year olds' future problems: Researchers can predict which three year olds will grow up to be criminals or beneficiaries with poor health and a high chance of becoming obese.’

In fact, the researchers in question do not say that, nor should they. The result is not a prediction; it describes a correlation. Correlation and causality are not the same – the latter may be teased out from the former, but that is a task for deduction and further research, not a job for reporters on the fly.

So what have the researchers done? They have just released more data from the ongoing Dunedin longitudinal study, which has been following 1,037 children born in Dunedin in 1972 and 1973. A good sample size. A good long time span. And based on correlations within that data they note that

three year olds who had been assessed [with low scores] on their language, motor skills and social behaviour … make up just 20 percent of the study's subjects but account for 81 percent of criminal convictions and 66 percent of welfare benefits.

So it’s a more likely outcome, not a certainty. The figures give the group's share of the totals, not a forecast for any one child, and some in that group do manage to overcome the odds. (Wouldn’t it be interesting to hear more about them?)
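The distinction between a group's share of the convictions and the chance that any one member of the group is convicted can be made concrete with a little arithmetic. What follows is a minimal sketch only: the head counts and conviction counts are made up for illustration and are not taken from the Dunedin study, and the `Group` and `summarise` names are hypothetical. The numbers were chosen so that a group making up 20 percent of a cohort accounts for 81 percent of its convictions.

```python
from dataclasses import dataclass


@dataclass
class Group:
    people: int       # number of people in the group
    convicted: int    # people with at least one conviction
    convictions: int  # total convictions across the whole group


def summarise(high_risk: Group, rest: Group) -> dict:
    """Compare the group's share of totals with an individual's chances."""
    total_people = high_risk.people + rest.people
    total_convicted = high_risk.convicted + rest.convicted
    total_convictions = high_risk.convictions + rest.convictions
    return {
        # How big the group is relative to the cohort.
        "share_of_cohort": high_risk.people / total_people,
        # The headline figure: the group's share of all convictions.
        "share_of_convictions": high_risk.convictions / total_convictions,
        # What a "prediction" needs: chance that a group member is convicted.
        "chance_member_convicted": high_risk.convicted / high_risk.people,
        # Of everyone convicted, how many came from the group.
        "share_of_convicted_people": high_risk.convicted / total_convicted,
    }


# Hypothetical numbers, NOT from the Dunedin study.
high_risk = Group(people=200, convicted=60, convictions=810)
rest = Group(people=800, convicted=95, convictions=190)

stats = summarise(high_risk, rest)
for name, value in stats.items():
    print(f"{name}: {value:.0%}")
```

With these invented figures the group supplies 81 percent of the convictions, yet only 30 percent of its members are ever convicted, so "predicting" a conviction for every low-scoring three-year-old would be wrong far more often than right. The real proportions may well differ; the point is only that a share of a total says nothing by itself about any individual.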

Nonetheless, there are interesting results in the data set that should be properly highlighted by headline writers – one, for example, that should (but won’t) fascinate those claiming to be concerned about child poverty: that this result for this 20 percent holds regardless of family wealth. As the BBC reports:

When the researchers took out children below the poverty line in a separate analysis they found that a similar proportion of middle class children who scored low in tests when they were three also went on to experience difficulties when they were older.

In other words, regardless of whether they grow up in poverty or not,

about 20 percent of the population will account for up to 80 percent of the cost to [taxpayers].

So the old 80/20 rule in play again. And just to say it again (because the claim it helps debunk is rolled out so frequently): this is regardless of whether they grow up in poverty or not.

Burrowing down a bit more into the findings:

This group made up only 22% of their age cohort, but they left a big footprint on costs of service delivery. By midlife, this small group was convicted for 81% of the crimes charged to the cohort; had filled 78% of all prescriptions for pharmaceutical drugs; accounted for 77% of the years in which their children were growing up fatherless; had received 66% of the cohort’s welfare benefit payments; occupied 57% of their cohort’s nights spent in a hospital bed; smoked 54% of the cohort’s tobacco cigarettes; carried 40% of the cohort’s kilograms of obese weight; and made 36% of injury insurance claims.

Yet while “researchers [themselves] stress that children's outcomes are not set at the age of three,” politicians and reporters keep using the word “predict,” just as if correlation did mean causality, and the two-man steak dinner did mean a steak for each man. Radio NZ is not alone in the error – the Daily Mail, The Guardian, and the BBC all blunder the same way.

Naturally, however, the government is keen to make the most of this ‘prediction,’ both to push the importance of early childhood education (despite it not even starting until three for most children) and to launch what it now likes to call “wraparound care” from government based on “risk modelling.” The Press summarises:

On the face of it, the results were not a surprise…. The numbers showed that about 20 per cent of the study’s participants accounted for 80 per cent of its economic burden…
Incoming Prime Minister Bill English has [already] led a wholesale reform of the public service from the finance portfolio, based in part on smarter spending through data analytics [such as these]. Predictive risk modelling has already been proffered as a way to identify and help vulnerable children and stop abuse and welfare dependency turning into criminality. It wasn’t always popular – a plan to apply modelling to newborns was particularly controversial – but the change mantra has stuck. After English, its biggest champion has been his new deputy, Paula Bennett, a veteran of a host of social services portfolios.

So the politicians will be using this report to argue for more targeted welfare. David Farrar’s bland conclusion is probably their own:

The risk factors are now well known. The solutions are not free tertiary education for privileged students, but targeted interventions for those most at risk.

Yet as Lindsay Mitchell’s research and recent reports make abundantly clear (Child Poverty & Family Structure: What is the evidence telling us? and Child Abuse & Family Structure: What is the evidence telling us?), things are not quite this simple. The most telling correlation for many bad outcomes* is growing up in a family on welfare.

    • The risk of abuse for children whose parent/caregiver had spent more than 80% of the last five years on a benefit was 38 times greater than those with no benefit history.
    • In 2013 around 215,000 children were dependent on benefit recipients, and 76% of sole mothers were receiving a main benefit.
    • By the end of their birth year, a significant share of babies – averaging one in five between 2005 and 2014 – would be reliant on welfare. Of this group … more than two thirds [would rely] on a sole parent benefit.

It is unclear whether factors like these were even considered by the Dunedin researchers – and they assuredly won’t be by the politicians who (let’s face it) just need a new idea to trumpet, and may well end up simply choosing between bad outcomes and less-bad outcomes.

But we should be clear here: nothing about this report says that government must or can do anything about the language, motor skills and social behaviour of three-year-olds – which is what the Dunedin results offer as the highest correlation. All government can do from zero to three is provide welfare – and welfare has the strongest correlation with many negative outcomes.

So the trumpeters of bland conclusions should at the very least tread warily.

Remember, data itself is silent as to causality, so none of these things are necessary outcomes either. Everybody has free will. Just as not every Dunedinite who performed poorly as a three-year-old went on to a life of crime and catastrophe, so too not everyone who spends time on a benefit goes on to abuse their kids.

I for one would like to hear more about those who didn’t, and don’t – and what choices, if any, led them on their better path. For after all, it is those choices that in the end made them the better people that they are.

Reifying the failure of a “cohort” is more likely to obscure these important choices than reveal them. It helps make us overlook the few, and important, “vegetarians” in the data set.

UPDATE 1: Toby Manhire at The Spinoff “spoke to Dunedin Study Director Professor Richie Poulton, to ask about the findings, how it fits into Bill English’s “social investment” approach, and the risk of stigmatising children.”

The Spinoff: One of the major news headlines reporting on the findings read, “Future criminals revealed at age three”, which has a bit of a Minority Report edge to it. Is that headline true?

Richie Poulton: No. It’s a headline that doesn’t reflect what’s in the paper accurately. There were unfortunate headlines.

Read: ‘Future criminals revealed at age three’? Not so fast, says Dunedin Study head

UPDATE 2: No, future criminals are not being revealed. Thomas Lumley at Stats Chat says:

I was going to write about the Herald’s headline “Future criminals revealed at 3, says study”, but Toby Manhire has a good interview with someone from the study, explaining that no, it doesn’t.

Richie Poulton: No. It’s a headline that doesn’t reflect what’s in the paper accurately. There were unfortunate headlines.

What then are the major findings of this study?

The idea, which is intuitively appealing, is that there is a small group that account for a lot of service use…

* Different outcomes, to be fair, but still in the very-negative column.

1 comment:

  1. That old chestnut, the 80/20 rule: it's bollocks. This was thrown around by my managers in the bank constantly (along with the saying that if 10% of your book is not falling over, you are being too conservative with your lending). Whenever I asked to see documentary evidence of either claim, nobody could ever produce it.


