“A Prank, A Cigarette and a Gun”
An article about the murder of Meredith Kercher. I’ve agreed not to say much but if you’re interested in this case this is the most important read of all. The truth of what happened is in here.
by Sigrun M. Van Houten
What is Best Fit Analysis, a quick intro
Students of statistics and stochastic analysis will recognize the term “Best Fit” as a statistical construct used to determine the most probable line through a scatter plot. In the same manner, when the clandestine services seek to resolve the most probable narrative of events and information about that narrative is limited or inaccessible, they can assess the information that is available to construct a probabilistic scatter plot. Once done, it is then possible to graph a “line” that represents a best fit to those data points. In this case, the data points making up the scatter plot are individual facts or pieces of evidence, each with the probability it carries as an inference to a larger narrative. And the “line” drawn as a best fit is the most likely narrative of events.
This procedure has parallels in normative notions well understood in western law for some time. Namely, it deals with probative value which, though perhaps not used in the strict sense used in law, is used here as a catch-all to describe each data point. Each data point reflects the probability of a given piece of evidence. But what do we mean by “probability of a given piece of evidence”? In Best Fit analysis (BFA) we begin by constructing a hypothesized narrative. When applied to criminology, the hypothesized narrative usually presents itself fairly easily since it is almost always the “guilt narrative” for a given suspect or suspects in a crime. In this short introduction to BFA, I will show how it can be used in criminology. The advantage in criminology is that rather than having to sort through innumerable hypotheses as is common in the clandestine services, here we have the advantage that we usually have a hypothesis presented to us on account of an accusation or charge. We can then use BFA to test the narrative to see if it is the most likely narrative. With perturbations of the same, we can likely identify alternative narratives more likely to be correct.
Norms of western law are dated in some cases, and some have not been updated for a long, long time. One of those areas apparently is the area of probative value. Typically in Courts of western nations it is presumed that a piece of evidence has “probative value” if it points to a desired inference (which may be a guilt narrative or some more specific component thereof). I’m not an attorney so I can’t categorically state that the concept of a “desired inference” really refers to an overall guilt narrative or simply the odds that the evidence points to a guilt narrative. But what I can say is that in practice it almost always is used in the sense of an overarching narrative or reality.
A case in point is a famous case in which a man was accused of murdering his wife during a divorce. It turned out that his brother had actually committed the crime. But once his brother was convicted, an attempt was made to convict the husband of the crime on the accusation that he “contracted” with the brother to commit the crime and end his divorce case favorably. In the second trial of the husband the evidence was almost entirely circumstantial, and the jury relied heavily on an increase in phone activity between the husband and his brother leading up to the murder. Normally, the brothers had not spoken on the phone often, and there was a clear and obvious sudden increase in the frequency of calls. The jury interpreted this as collusion and convicted the husband of murder. Thus, when brought to Court, the desired inference of testimony and records of phone calls was that collusion existed. This is a piece of evidence being used to point to a guilt narrative. The problem, however, was that it was never shown why collusion should be a more likely inference than simple distress over a divorce. It is not unusual for parties in a divorce to reach out to family and suddenly increase their level of communication at such a time. In other words, and on the face of it, one inference was just as likely as the other.
What legal scholars would say is that this is a reductionist argument and fails because it does not take into account the larger “body of evidence”. Unfortunately, this is mathematically illiterate and inconsistent with the proper application of probability. This is because it takes a black and white view of “reduction” and applies it incorrectly, resulting in a circularity condition. The correct answer is that
… One takes a body of evidence and reduces it to a degree sufficient to eliminate circularity and no further.
In other words, it is not all or nothing. In fact, this kind of absolutist understanding of “reductionist argumentation” is precisely what led to the results of the Salem Witch Trials. In those cases, probative value was ascribed based on a pre-existing hypothesis or collection of assumptions; essentially a cooked recipe for enabling confirmation bias either for or against guilt.
To explain what we mean, in the case of the phone calls between brothers, one cannot use a hypothesized narrative (the inference itself) to support the desired inference. This is circularity. But one also cannot reduce the evidence to such a degree that the body of evidence in toto is not weighed upon the merit of its parts. From the perspective of probability theory, this means that we must first determine, as an isolated construct, whether the probability that the phone calls between brothers were for the purpose of collusion is greater than the probability that the calls were due to emotional distress. And it must be something we can reasonably well know and measure. While we can never apply numerical values to these things, it must at least be an accessible concept. Once we’ve looked at the odds of each of the two possible inferences, we can then ask which is more likely. Unless the odds that the calls were for the purpose of collusion are greater than the odds that the calls were for the purpose of emotional support, there can be no probative value (in the sense we are using that term here).
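The isolated test described above can be sketched in a few lines of Python. This is an illustration only: as noted, real numerical values can never be assigned to these things, so the probabilities below are hypothetical placeholders, and the function names `net_odds` and `has_probative_value` are invented for this sketch.

```python
def net_odds(p_desired: float, p_alternative: float) -> float:
    """Difference between the odds of the desired inference and its rival,
    each judged in isolation from the larger guilt narrative."""
    return p_desired - p_alternative


def has_probative_value(p_desired: float, p_alternative: float) -> bool:
    """Evidence counts only if the desired inference is strictly more likely."""
    return net_odds(p_desired, p_alternative) > 0


# Hypothetical figures for the phone calls: collusion vs. emotional distress.
even_split = has_probative_value(0.5, 0.5)        # neither inference favored
slight_edge = has_probative_value(0.525, 0.475)   # collusion slightly favored
print(even_split, slight_edge)
```

With an even split the calls carry no probative value in the sense used here; with even a slight edge toward collusion they do.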
The reason for the “isolation” is that we cannot determine the aforesaid odds by using the inference, or the larger narrative, to support those odds, because the narrative is the hypothesis itself. Having said that, once we have done this, if we can show that the odds are greater that the calls between brothers were for the purpose of collusion, even if the difference of probability between the two inferences is very small, the phone calls can then be used to assess the likelihood of the guilt narrative by considering them in the context of the body of knowledge. If we could associate numbers with this analysis as a convenience for illustration, and we had 10 pieces of evidence each bearing perhaps only a 5% probability difference favoring the guilt narrative, it might nonetheless be possible to show that the guilt narrative is the most likely narrative. In other words, we consider all evidence, each piece with its own net odds, in order to frame the odds of the guilt narrative. We are therefore using reduction only to the extent that it excludes circularity, and no more. Both the number of evidentiary items and the odds of each matter. If we had 3 pieces of evidence each bearing a net probability of 90% favoring a guilt narrative, it might be just as strong as 10 pieces bearing a net probability of only 5%. And it is these odds that must be left to the jury, as it is not a mathematical or strictly legal exercise but an exercise in conscience and odds.
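To make the arithmetic concrete, here is a minimal sketch of one way net odds might be combined across items. It rests on assumptions that go beyond the text: that each item has exactly two rival inferences whose probabilities sum to 1, that “net odds” means the difference between them, that the items are independent of one another, and that the starting odds are even. The function names are hypothetical.

```python
import math


def likelihood_ratio(net: float) -> float:
    """Convert a net probability difference into a ratio favoring the narrative.

    Assumes two rival inferences summing to 1, so a net of 0.05 means
    0.525 versus 0.475. A net of exactly 1.0 is not supported here.
    """
    p_for = (1 + net) / 2
    p_against = (1 - net) / 2
    return p_for / p_against


def combined_probability(nets: list[float], prior: float = 0.5) -> float:
    """Combine independent items by adding their log-ratios to the prior log-odds."""
    log_odds = math.log(prior / (1 - prior))
    for net in nets:
        log_odds += math.log(likelihood_ratio(net))
    return 1 / (1 + math.exp(-log_odds))


one_weak = combined_probability([0.05])
ten_weak = combined_probability([0.05] * 10)
three_strong = combined_probability([0.90] * 3)
print(f"one item at 5%:     {one_weak:.3f}")
print(f"ten items at 5%:    {ten_weak:.3f}")
print(f"three items at 90%: {three_strong:.4f}")
```

Under these assumptions, ten items at 5% lift the narrative from roughly 0.53 for a single item to roughly 0.73, which shows how individually weak items accumulate. Three items at 90% come out stronger still in this particular model, so how the two cases compare depends on the model chosen; the figures are illustrative and not a claim about how a jury should weigh them.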
Sadly, it is routine practice in western Courts to employ probative value in such a manner as to establish in the jury’s thinking a circularity condition whereby the larger narrative of guilt or innocence is used to substantiate the probative value of individual pieces of evidence. The way to control this is for the understanding of probative value to change and modernize, and to require Judges to reject (rule inadmissible) evidence that either does not point in any direction (net odds of 0%) or points in a different direction than the desired inference. This is a judgment call that can only be left to the Judge, since to leave it in the hands of the jury effects prejudice by its very existence. While there seems to be lip service to treating probative value as we’ve described, it appears to almost never be followed in practice, and most laws and Court regulations permit Judges to use their “discretion” in this matter (which, in practice, amounts to accepting evidence with zero probative value). Standards are needed to constrain the degree of discretion seen in today’s Courts and to render the judgment of Judges in matters of probative value more consistent and reliable. One way to do this is to treat evidence as it is treated under BFA.
While many groups that lobby and advocate against wrongful conviction cite all sorts of reasons for wrongful convictions, tragically they seem to be missing the larger point which is that these underlying, structural and systemic issues surrounding probative value are the true, fundamental cause of wrongful conviction. For without proper filtering of evidence, things like prosecutorial misconduct, bad lab work, etc. find their ways to the jury. It is inevitable. But the minute you mention “structural” or “systemic” problems everyone runs like scared chickens. No one wants to address the need for major overhauls. But any real improvement in justice won’t come until that happens.
Thus, with BFA, in the clandestine context, we take a large data dump of everything we have. Teams go through the evidence to eliminate that which can be shown on its face to be false. Then we examine each piece of evidence for provenance and authenticity, again, only on what can be shown on its face. I’m condensing this process considerably, but that is the essence of the first stage. We then examine each piece in relation to all advanced hypotheses and assign odds to each. Once done, we look at the entire body of evidence in the last stage to determine which of the narratives (hypotheses) requires the fewest assumptions to make it logically consistent. The one with the fewest assumptions is the Best Fit. If we were to graph it we would see a line running through a scatter plot of probabilistic evidence. That line represents the most likely narrative. On that graph assumptions appear as “naked fact” and are “dots” to be avoided.
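The last stage, choosing the narrative that needs the fewest assumptions, might be sketched as follows. The narratives and their assumption lists are hypothetical placeholders loosely based on the phone-call example, the class and function names are invented for illustration, and the sketch leaves out the earlier filtering and odds-assignment stages entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Narrative:
    name: str
    # Assumptions needed to make the narrative logically consistent with
    # the body of evidence ("naked fact" with nothing behind it).
    assumptions: list[str] = field(default_factory=list)


def best_fit(narratives: list[Narrative]) -> Narrative:
    """Return the narrative requiring the fewest assumptions.

    Ties go to whichever narrative appears first in the list.
    """
    return min(narratives, key=lambda n: len(n.assumptions))


# Hypothetical candidates for illustration only.
candidates = [
    Narrative("husband contracted the murder",
              ["calls were about the crime", "payment was arranged off the record"]),
    Narrative("brother acted alone",
              ["calls were about the divorce"]),
]
print(best_fit(candidates).name)
```

Counting assumptions is the crudest possible scoring rule; a fuller treatment would presumably also weigh the odds assigned to each piece of evidence in the preceding stage.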
To see a good example of how BFA is employed, you can see my work on the Lizzie Andrew Borden, JonBenét Ramsey, Darlie Routier and Meredith Kercher cases. That this method is remarkably more effective than what we see in police investigations and Courts is well known by those who have used this technique for at least three decades now. But it has been somewhat outside the radar of the general public because of its origins. My hope is that through public awareness this method can be applied to criminology and jurisprudence, resulting in a far greater accuracy rate in the determination of what actually occurs during criminal acts, especially in matters of Capital crimes where the added overhead is well worth it.
Notice: I am told that Mozilla users can experience a copy of this report that has sections of text missing. I recommend that Mozilla users download the pdf and view it on their desktop. – kk