Prosecutor's fallacy
From Academic Kids

The prosecutor's fallacy is a fallacy of statistical reasoning that takes several forms.
 One form of the fallacy results from neglecting the a priori odds of a defendant being guilty, i.e., the chance of an individual being guilty absent any specific evidence, which is the gross incidence rate of perpetrators in the general population. When a prosecutor has collected some evidence (for instance a DNA match) and has an expert testify that the probability of finding this evidence if the accused were innocent is tiny, the fallacy occurs if it is concluded that the probability of the accused being innocent must be comparably tiny. The probability of innocence would only necessarily be comparably tiny if the a priori odds of guilt were 1:1, that is, if the probability of innocence is computed with an a priori presumption of guilt.
 Another form of the fallacy results when evidence is compared against a large database. The mere size of the database raises the likelihood of finding a match by pure chance. That is, DNA evidence is soundest when a match is found after a single directed comparison, because when the test sample is of poor quality (common for recovered evidence), a match somewhere in a large database is very likely by mere chance.
Why this is fallacious: several examples
A concrete example can make it clear why this reasoning is fallacious. Suppose there is a one-in-a-million chance of a match given that the accused is innocent. The prosecutor says this means there is only a one-in-a-million chance of innocence. But in a community of 10 million people, one expects about 10 matches by pure chance, and the accused may be one of those ten. That would indicate only about a one-in-ten chance of guilt, if no other evidence is available.
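The arithmetic of this example can be checked with a short Python sketch. The function name and its assumptions are illustrative, not part of the original example: it assumes exactly one perpetrator in the community, that the perpetrator always matches, and that every resident is equally likely a priori to be the perpetrator.

```python
def posterior_guilt_from_pool(population, match_prob_innocent):
    """Probability of guilt when a match is the only evidence.

    Assumptions (hypothetical, for illustration): exactly one
    perpetrator, who always matches, and a uniform prior over
    everyone in the population.
    """
    innocent_people = population - 1
    # Expected number of innocent people who match by pure chance.
    expected_false_matches = innocent_people * match_prob_innocent
    # The accused is one matching person among the perpetrator
    # plus the expected chance matches.
    return 1.0 / (1.0 + expected_false_matches)


p_guilt = posterior_guilt_from_pool(10_000_000, 1e-6)
print(f"probability of guilt: {p_guilt:.3f}")
```

Under these assumptions the result is close to 1 in 11, because the perpetrator is counted alongside the roughly ten chance matches. That is the same order of magnitude as the "about one in ten" figure above, and nowhere near the prosecutor's one in a million.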
In another scenario, assume a rape has been committed and that a sample is compared against 20,000 men who have their DNA on record in a database. A match is found, and at the trial of the matching man it is testified that the probability that two DNA profiles match by chance is only 1 in 10,000. This does not mean the probability that the suspect is innocent is 1 in 10,000. Since 20,000 men were tested, there were 20,000 opportunities to find a match by chance; the probability that there was at least one DNA match is
 <math>1 - \left(1-\frac{1}{10000}\right)^{20000} \approx 86\%</math>
which is considerably more than 1 in 10,000. (The probability that exactly one of the 20,000 men has a match is about 27%, which is still rather high.)
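Both figures can be reproduced with a few lines of Python. This is a minimal sketch that assumes the 20,000 comparisons are independent and that none of the men in the database is the true source of the sample; the function names are illustrative.

```python
def prob_at_least_one_match(n_profiles, p_match):
    """Chance of one or more coincidental matches in n independent comparisons."""
    return 1.0 - (1.0 - p_match) ** n_profiles


def prob_exactly_one_match(n_profiles, p_match):
    """Chance of exactly one coincidental match (binomial with k = 1)."""
    return n_profiles * p_match * (1.0 - p_match) ** (n_profiles - 1)


n_profiles = 20_000
p_match = 1 / 10_000

at_least_one = prob_at_least_one_match(n_profiles, p_match)
exactly_one = prob_exactly_one_match(n_profiles, p_match)

print(f"at least one chance match: {at_least_one:.0%}")
print(f"exactly one chance match:  {exactly_one:.0%}")
```

The two printed values round to the 86% and 27% quoted above.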
Now consider this case: you win the lottery jackpot. You are then charged with having cheated, for instance with having bribed lottery officials. At the trial, the prosecutor points out that winning the lottery without cheating is extremely unlikely, and that therefore your being innocent must be comparably unlikely. This reasoning is clearly faulty: the prosecutor failed to mention that cheating lottery winners are much more rare than honest winners.
Mathematical analysis
We can view finding a person innocent or guilty in mathematical terms as a form of binary classification.
We start with a thought experiment. I have a big bowl with one thousand balls, some of them made of wood, some of them made of plastic. I know that 100% of the wooden balls are white, and only 1% of the plastic balls are white, the others being red. Now I pull a ball out at random, and observe that it is actually white. Given this information, how likely is it that the ball I pulled out is made of wood? Is it 99%? No! Maybe the bowl contains only 10 wooden and 990 plastic balls. Without that information (the a priori probability), we cannot make any statement. In this thought experiment, you should think of the wooden balls as "accused is guilty", the plastic balls as "accused is innocent", and the white balls as "the evidence is observed".
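The dependence on the bowl's composition can be made concrete with a small Python sketch. The three compositions tried below are illustrative choices, not values from the thought experiment itself, apart from the 10 wooden and 990 plastic balls mentioned above.

```python
def prob_wood_given_white(n_wood, n_plastic,
                          p_white_given_wood=1.0,
                          p_white_given_plastic=0.01):
    """Probability that a white ball drawn at random is wooden."""
    # Expected numbers of white balls of each material in the bowl.
    white_wood = n_wood * p_white_given_wood
    white_plastic = n_plastic * p_white_given_plastic
    return white_wood / (white_wood + white_plastic)


# Hypothetical compositions of a bowl holding 1000 balls in total.
for n_wood in (10, 500, 990):
    n_plastic = 1000 - n_wood
    p = prob_wood_given_white(n_wood, n_plastic)
    print(f"{n_wood:4d} wooden, {n_plastic:4d} plastic -> P(wood | white) = {p:.4f}")
```

With 10 wooden balls the answer is close to 50%; the intuitive 99% only comes out when wooden and plastic balls are equally numerous, which corresponds to a priori odds of 1:1.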
The fallacy can be analyzed using conditional probability: Suppose E is the observed evidence, and I stands for "accused is innocent". We know that P(E|I) (the probability that the evidence would be observed if the accused were innocent) is tiny. The prosecutor wrongly concludes that P(I|E) (the probability that the accused is innocent, given the evidence E) is comparably tiny. However, P(E|I) and P(I|E) are quite different; using Bayes' theorem we see
 P(I|E) = P(E|I) · P(I) / P(E)
So the a priori probability of innocence P(I) and the overall probability of the observed evidence P(E) need to be taken into account. If P(I) is much larger than P(E), then P(I|E) can be large as well.
We can also formulate Bayes' theorem with odds:
 Odds(I|E) = Odds(I) · P(E|I)/P(E|~I)
Without knowledge of the a priori odds of I, the small value of P(E|I) does not necessarily imply that Odds(I|E) is small. (P(E|~I), the probability that the evidence is observed given the accused is guilty, is assumed to be high.)
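The probability form and the odds form of Bayes' theorem give the same answer, which the following Python sketch checks numerically. The function names are illustrative, and the inputs (a prior of guilt of 1 in 10 million, a one-in-a-million chance match, and a guilty person who always matches) are assumptions borrowed from the earlier city example.

```python
def posterior_innocence(prior_innocence, p_e_given_innocent, p_e_given_guilty):
    """P(I|E) from Bayes' theorem, expanding P(E) by total probability."""
    prior_guilt = 1.0 - prior_innocence
    p_e = p_e_given_innocent * prior_innocence + p_e_given_guilty * prior_guilt
    return p_e_given_innocent * prior_innocence / p_e


def posterior_odds_innocence(prior_odds, p_e_given_innocent, p_e_given_guilty):
    """Odds(I|E) = Odds(I) * P(E|I) / P(E|~I)."""
    return prior_odds * p_e_given_innocent / p_e_given_guilty


def odds_to_probability(odds):
    return odds / (1.0 + odds)


# Assumed inputs: prior of guilt 1 in 10 million, P(E|I) = 1e-6, P(E|~I) = 1.
prior_innocence = 1.0 - 1e-7
prior_odds = prior_innocence / (1.0 - prior_innocence)

via_probability = posterior_innocence(prior_innocence, 1e-6, 1.0)
via_odds = odds_to_probability(posterior_odds_innocence(prior_odds, 1e-6, 1.0))
even_prior = posterior_innocence(0.5, 1e-6, 1.0)

print(f"P(I|E) via probabilities: {via_probability:.4f}")
print(f"P(I|E) via odds:          {via_odds:.4f}")
print(f"P(I|E) with 1:1 prior:    {even_prior:.2e}")
```

With the small prior of guilt, P(I|E) stays large even though P(E|I) is tiny. Only with even prior odds does P(I|E) come out close to P(E|I).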
The fallacy lies in the fact that the a priori probability of guilt is not taken into account. If this probability is small, then the only effect of the presented evidence is to increase that probability somewhat, but not necessarily dramatically. (In the earlier example of a city of 10 million people, the presented evidence raises the a priori probability of guilt of 1 in 10 million to an a posteriori probability of guilt of about 1 in 10.)
The prosecutor's fallacy is therefore no fallacy if the a priori odds of guilt are assumed to be 1:1. In a Bayesian approach to personal probabilities, where probabilities represent degrees of belief of reasonable persons, this assumption can be justified as follows: a completely unbiased person, without having been shown any evidence and without any prior knowledge, will estimate the a priori odds of guilt as 1:1.
In this picture then, the fallacy consists in the fact that the prosecutor claims an absolutely low probability of innocence, without mentioning that the information he conveniently omitted would have led to a different estimate.
In legal terms, the prosecutor is operating in terms of a presumption of guilt, something which is contrary to the normal presumption of innocence where a person is assumed to be innocent unless found guilty. A more reasonable value for the prior odds of guilt might be a value estimated from the overall frequency of the given crime in the general population.
Defendant's fallacy
The defendant's fallacy (taking the earlier example) would be to say, "We would expect 10 matches in this city of 10 million people, so this particular piece of evidence suggests there is 90% chance that the accused is innocent. So this evidence cannot be used to point to a conclusion of guilt, and should be excluded."
The problem with the defendant's argument is that there may be other available evidence which on its own is also not conclusive. For example, if CCTV cameras surrounding the scene of the crime spotted one hundred people there at the relevant time, one of whom was the accused, then the defendant could claim: "The video suggests a 99% chance that the defendant is innocent. The match suggested a 90% chance of innocence. So the conclusion should be a finding of innocence."
When the photographic evidence is combined with the match, the two together point strongly towards guilt, since (assuming the chance of being in the photograph and having the match are independent) the chance that the accused is innocent falls to about 0.01%. But similarly it should be noted that this low probability of innocence is not proof of guilt.
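The combined figure can be reproduced by updating the odds of innocence one piece of evidence at a time. The Python sketch below assumes, for illustration only, that the perpetrator is certainly among the one hundred people on the video, that the perpetrator certainly matches, that innocent residents are equally likely to appear on the video, and that the two pieces of evidence are independent.

```python
population = 10_000_000      # city size from the earlier example
people_on_video = 100        # people seen near the scene, accused included
p_match_innocent = 1e-6      # chance that an innocent person matches

# Prior odds of innocence for a randomly chosen resident.
prior_odds_innocent = (population - 1) / 1

# Likelihood ratios P(evidence | innocent) / P(evidence | guilty),
# with P(evidence | guilty) assumed to be 1 for both items.
lr_video = ((people_on_video - 1) / (population - 1)) / 1.0
lr_match = p_match_innocent / 1.0

odds_after_video = prior_odds_innocent * lr_video    # about 99 to 1
odds_after_both = odds_after_video * lr_match

p_innocent_video_only = odds_after_video / (1.0 + odds_after_video)
p_innocent = odds_after_both / (1.0 + odds_after_both)

print(f"P(innocent | video only):      {p_innocent_video_only:.0%}")
print(f"P(innocent | video and match): {p_innocent:.4%}")
```

Each item alone leaves innocence very likely, yet multiplying the two likelihood ratios drives the probability of innocence down to roughly 0.01%. This is why weak items of evidence should be combined rather than dismissed one by one.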
The Sally Clark case
Consider for instance the case of Sally Clark, who was accused in 1998 of having killed her first child at 11 weeks of age, then conceived another child and allegedly killed it at 8 weeks of age. The defense claimed that these were two cases of sudden infant death syndrome; neither prosecution nor defense offered any other explanations for the deaths. The prosecution had expert witness Sir Roy Meadow testify that the probability of two children in the same family dying from sudden infant death syndrome is about 1 in 73 million. But based on this alone, it is likely that there would be at least one person in the country to whom this has occurred. To provide proper context for this number, the probability of a mother killing one child, conceiving another and killing that one too, should have been estimated and compared to the 1 in 73 million figure, but it was not. Ms. Clark was convicted in 1999, resulting in a press release by the Royal Statistical Society which pointed out the mistake. (See link at end of article.) A higher court later quashed Sally Clark's conviction, on other grounds, on 29 January 2003.
External links
 Press release by the Royal Statistical Society about the Sally Clark case: http://www.rss.org.uk/docs/Royal%20Statistical%20Society.doc
 http://www.colchsfc.ac.uk/maths/dna/discuss.htm
 http://dnaview.com/profile.htm