8 Comments
Mark Twain would surely have said something nasty about this, but alas he perished before “damned statistics” were invented.
The most fundamental error was multiplying probabilities without counting failures. It’s as if you flipped a coin 1,000 times, got 493 heads, and then said “Wow! What are the odds of one person getting 493 heads?!”
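For anyone who wants to check the coin analogy, here is a minimal sketch using only the numbers in the comment (1,000 fair flips, 493 heads). The function and variable names are made up for illustration. It compares the chance of that exact count with the chance of a result at least that far from 500.

```python
from math import comb

FLIPS = 1000
HEADS = 493


def binom_pmf(k: int, n: int) -> float:
    """Probability of exactly k heads in n fair coin flips."""
    return comb(n, k) / 2**n


# Chance of this exact count: small, like the chance of any single exact count.
p_exact = binom_pmf(HEADS, FLIPS)

# Chance of landing at least this far from the expected 500, in either direction.
distance = abs(HEADS - FLIPS // 2)
p_at_least_this_unusual = sum(
    binom_pmf(k, FLIPS)
    for k in range(FLIPS + 1)
    if abs(k - FLIPS // 2) >= distance
)

print(f"P(exactly {HEADS} heads) = {p_exact:.4f}")
print(f"P(at least {distance} away from 500) = {p_at_least_this_unusual:.4f}")
```

The exact count is improbable on its own, yet a result this close to 500 is what you should expect most of the time. Quoting only the first number is the mistake.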
The second most fundamental error was probability bootstrapping. First, you consider a death suspicious because she was working when the patient died suddenly of no apparent cause. Then you wonder why so many of the suspicious deaths occurred on her shifts.
1 in 342 million, 1 in 9. What’s the difference? It’s just numbers.
The false prosecution and imprisonment of Lucia de Berk is a human rights atrocity. Two Observations:
1) How on earth was the prosecution allowed to drag in such a shameless mathematical fraud as the “expert witness”? Sanctions are needed against prosecution, the “expert,” and the judge.
2) EuroJustice needs Daubert!
If the math is done well then that does take some of the burden off the jury. Just pick the level of certainty you wish and if the probability is greater then done, convicted.
If it is better to let 100 guilty go free than imprison one innocent then let Bayes' theorem do the work for you at whatever level of positive predictive value is needed for the jury to be comfortable.
But for this to work we will need an independent review by judicial statisticians in the same way scientific journals employ their own expert bullshit detectors.
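A rough sketch of what that could look like, in odds form. Everything here is an assumption for illustration: the function names, the prior of 1 in 1,000, and the two evidence probabilities are invented, not figures from any case. Reading "100 guilty go free per innocent imprisoned" as posterior odds of at least 100 to 1 is also just one way to turn that ratio into a threshold.

```python
def posterior_probability(
    prior: float,
    p_evidence_if_guilty: float,
    p_evidence_if_innocent: float,
) -> float:
    """Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_evidence_if_guilty / p_evidence_if_innocent
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


def conviction_threshold(guilty_freed_per_innocent_jailed: float) -> float:
    """Assumed rule: convict only if posterior odds of guilt reach this ratio."""
    r = guilty_freed_per_innocent_jailed
    return r / (r + 1)


threshold = conviction_threshold(100)  # 100 / 101

# Hypothetical inputs: same prior, very different strength of evidence.
prior = 0.001
strong = posterior_probability(prior, 1.0, 1e-6)   # evidence 1-in-a-million if innocent
weak = posterior_probability(prior, 1.0, 1 / 9)    # evidence 1-in-9 if innocent

print(f"threshold = {threshold:.4f}")
print(f"strong evidence: {strong:.4f}, convict = {strong >= threshold}")
print(f"weak evidence:   {weak:.4f}, convict = {weak >= threshold}")
```

With the same prior, the answer flips entirely depending on how likely the evidence is for an innocent person, which is why the inputs need independent review before a jury sees the output.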
I wish that more was said about the error of the calculation. The “Simple Justice” Blog seems skeptical of all statistics, even DNA calculations.
Love Canal was a big deal: “there are just too many cases not to be concerned.” The State of New York did a study that carefully measured exposure and found the cases of various cancers at Love Canal to be within one standard deviation of expected.
Mayor Koch was worried about sudden acceleration from his Toyota. I tried to reassure him that the exposure for the cases causing concern was 100 million driving years, so that the risk of an event in a year was about the same as the risk of driving to a dealer for the recall. I doubt if my advice worked for him. He has trouble with logic at times.
My sense of it is that good statistical analysis does more good than harm. It is wonderful when it exonerates an innocent accused of rape by an erroneous identification. It happens fairly often.
The most common nonsensical DNA statistics abuse is this one: You have a crime where there’s DNA evidence, and no idea what the race of the perpetrator is. The accused is, say, Hispanic.
The DNA evidence is explained with a statement like “the accused’s DNA profile matches that found at the crime scene better than one out of a million Hispanics”.
However, this impressive sounding statistic is totally meaningless unless there is some independent reason to think that the DNA was left by a Hispanic. The confidence level of the match with reference to any sample population other than the population of people who might have left the DNA is completely meaningless.
“1 in 342 million, 1 in 9. What’s the difference?”
One difference is that lotteries would be much more popular if the odds of winning a million dollars were 1 in 9. Another would be that one would have to watch the home team play baseball for somewhat more than 5,000 seasons to see a second run scored.
Mr. Schwartz above has it backwards. If a person is Hispanic, then you would get smaller odds from comparing him to other Hispanics than from comparing him to all people.
DNA statistics are far from nonsensical, as can be attested to by those erroneously identified as rapists by the victims of rapes.
William Nuesslein: Say the DNA profile contains a gene that most Caucasians have but that almost no Hispanics have. If the suspect is one of the few Hispanics who have that particular gene, he may match the sample better than one out of a million Hispanics. But no better than the majority of Caucasians. So the evidence really suggests more that the DNA came from a Caucasian.
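To put toy numbers on that example: the sketch below assumes a pool of possible sources that is half Hispanic and half Caucasian, and a profile carried by one in a million Hispanics but 60% of Caucasians. All of these figures and names are invented for illustration, not real population genetics.

```python
# Hypothetical makeup of the people who could have left the sample.
population_share = {"Hispanic": 0.5, "Caucasian": 0.5}

# Hypothetical frequency of the observed profile within each group.
profile_frequency = {"Hispanic": 1e-6, "Caucasian": 0.6}


def match_probability(share: dict, freq: dict) -> float:
    """Chance that a random person from the whole pool matches the sample."""
    return sum(share[group] * freq[group] for group in share)


def source_posterior(share: dict, freq: dict) -> dict:
    """Probability the sample came from each group, given only the profile."""
    total = match_probability(share, freq)
    return {group: share[group] * freq[group] / total for group in share}


p_match = match_probability(population_share, profile_frequency)
posterior = source_posterior(population_share, profile_frequency)

print(f"P(random person in pool matches) = {p_match:.7f}")
for group, p in posterior.items():
    print(f"P(sample left by a {group} person) = {p:.6f}")
```

Under these assumptions, roughly 3 in 10 people in the pool match, not 1 in a million. The one-in-a-million figure only describes the pool if you already know the sample came from a Hispanic person.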