Can Bad Science Be Good Evidence: Neuroscience, Lie-Detection, and Beyond

Frederick Schauer - University of Virginia Law School


The possibility of using functional magnetic resonance imaging—fMRI, or “brain scans”—to detect deception in legal settings has generated great controversy and, indeed, widespread resistance.  Although neuroscience-based lie-detection appears to hold out the promise of improvements to existing methods of detecting deception, most academic neuroscientists have balked, insisting that the research is flawed science, containing weaknesses of (1) reliability (the degree of accuracy), (2) external validity (do laboratory results predict real-world outcomes?), and (3) construct validity (do studies test what they purport to test?).  These flaws are real, but although using neuroscience-based lie-detection in non-experimental legal settings may well be premature, the critics are mistaken in believing that scientific standards should determine when this method—or any other science, for that matter—is appropriate for legal use.  Law and science have different goals, and the legal suitability of neuroscience-based lie-detection must, in the final analysis, depend on legal standards and criteria and not on the standards and criteria that determine what is (or what is not) good science.

The Existing Debate

Because courts in making factual determinations rely on witness accounts rather than on the direct investigation common in so many other fields, witness credibility has long been a central concern of the law.  These concerns are often about the misperception, misrecollection, hedging, fudging, bending, and slanting rampant in a system dominated by testimony from self-interested parties, but the law also worries about flat-out lying.  Because we have lost much of our faith in the oath to assure veracity, and because cross-examination is more effective in exposing liars on television than in real courtrooms, the legal system is constantly seeking better ways of determining who is lying and who is not.  Legal debates about lie-detection technology have existed since the 1920s, and, indeed, the long-enduring Frye test1 for determining the admissibility of expert testimony arose in the context of a rudimentary lie-detection device.  Although lie-detection technology has improved substantially over what it was when the D.C. Circuit decided Frye in 1923, with few exceptions the law still prohibits the use of polygraphs, electroencephalography, periorbital spectrography, analysis of facial micro-expressions, and various other technologies, relying instead on the traditionally alleged ability of the technologically unaided judge and jury to determine when witnesses are lying and when they are telling the truth.

The terrain has changed considerably as a result of relatively recent claims that the techniques of modern neuroscience—especially fMRI—can identify deception more accurately than previous techniques.  Those who have made these claims support them with reference to a considerable quantum of experimental studies and published research.  But numerous neuroscientists have challenged these claims, arguing that the differences between experimental subjects and those offering actual evidence in court are so great as to create fatal problems of external validity.  Even more troubling, it is said, is that major problems of construct validity arise because an instructed lie in the laboratory is simply not a real lie at all.  In addition, much of the research has not been published in peer-reviewed journals, has been produced by financially interested scientists connected with commercial “truth verification” companies such as CEPHOS or No Lie MRI, and obscures significantly lower rates of reliability than the proponents of the technology have alleged.

For such reasons, the overwhelming scientific opinion is that fMRI lie-detection is a “research topic and not a legal tool,”2 a conclusion emphasized by worries about how the research might be used.  It would be wrong, some neuroscientists have argued, to use brain scans to “send [someone] to prison.”4  And thus one widely discussed article has urged a moratorium on the law’s use of neuroscience-based lie-detection until sponsors of the technology can establish its reliability to the satisfaction of a federal regulatory agency.5

Critiquing the Critics

At one level, these criticisms appear sound.  External and construct validity really are problematic, the claims of reliability are genuinely exaggerated, the lack of peer review is a fact, and some of the most prominent researchers are indeed connected with interested commercial entities.  But at another level the criticisms are based on three flawed premises: the first is that most legal decisions are about sending people to prison; the second is that the standards of science should be the standards of law; and the third is that the legal system’s current approach to determining veracity is acceptable.  I will discuss each of these in turn.

Although the legal system does send people to prison, it can do so only when prosecutors have proved the case against the defendant beyond a reasonable doubt.  So even if we were to imagine that prosecutors, in the face of existing Fifth Amendment self-incrimination law, could compel an involuntary brain scan of a defendant, there is no question that the reliability of fMRI-based lie-detection is nowhere near high enough to support a conviction under the beyond-a-reasonable-doubt standard.  But the converse of the prosecution’s heavy burden is the defendant’s ability to defeat conviction by showing only a reasonable possibility—not even a probability—of innocence.  Even if neuroimaging lie-detection reliability were only, say, .60, a level grossly insufficient to support conviction, that level, if used to support a defendant’s alibi or to undercut an arresting officer or eyewitness’s testimony, could easily raise a reasonable doubt as to guilt.
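The arithmetic behind this point can be sketched as a simple Bayesian update.  The figures below are hypothetical, chosen only for illustration: a fact-finder who begins at .96 confidence in guilt, a test that classifies both truth-tellers and liars correctly 60% of the time (the .60 reliability mentioned above), and a conventional reading of “beyond a reasonable doubt” as something on the order of .95 confidence.

```python
# A minimal sketch, with hypothetical numbers, of how a modestly reliable
# exculpatory test result can move a near-certain case below a plausible
# "beyond a reasonable doubt" threshold.

def posterior_guilt(prior_guilt, p_pass_if_truthful, p_pass_if_lying):
    """Bayesian update after a defendant 'passes' a lie-detection test
    that supports an exculpatory statement (e.g., an alibi)."""
    prior_odds = prior_guilt / (1 - prior_guilt)
    # Likelihood ratio favoring innocence: how much more likely a
    # truthful witness is to pass the test than a lying one.
    lr_innocence = p_pass_if_truthful / p_pass_if_lying
    posterior_odds = prior_odds / lr_innocence
    return posterior_odds / (1 + posterior_odds)

# A test right 60% of the time either way: passes 60% of truth-tellers
# and only 40% of liars.
p = posterior_guilt(prior_guilt=0.96, p_pass_if_truthful=0.60, p_pass_if_lying=0.40)
print(round(p, 3))  # 0.941 -- below a ~.95 reasonable-doubt threshold
```

On these assumed numbers, confidence in guilt falls from .96 to roughly .94: far too little movement to matter for a prosecutor who must build a case, but exactly the kind of movement that can create a reasonable doubt for a defendant.  Note also that any likelihood ratio different from 1, that is, any positive correlation between test results and truthfulness, makes the exculpatory proposition more probable with the evidence than without it.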

That slight, but somewhat plausible, confidence levels in the evidence indicating innocence are sufficient to acquit, even if nowhere near sufficient to convict, is just one example of legal decisions made on the basis of confidence levels far below those necessary for scientific publication or other scientific uses within the scientific community.  Judgments in civil cases, for example, demand only a “preponderance of the evidence,” and other decisions require “reasonable suspicion,” “probable cause,” or occasionally merely a “scintilla of evidence.”  Most importantly, the confidence necessary to admit an individual item of evidence is much lower than that necessary to justify a verdict—under Rule 401 of the Federal Rules of Evidence, for example, admission into evidence requires only that a proposition be more likely with the evidence than without, a standard dramatically lower than proof beyond a reasonable doubt and even than a preponderance of the evidence.  To hold that evidence must in some way be very highly reliable just to be admitted is to urge a conclusion dramatically at odds with fundamental premises of the trial process and the law of evidence.

The foregoing is an explanation of why the levels of reliability commonly (and properly) demanded by scientists for scientific purposes are not and should not be the levels for determining the admissibility of single items of evidence, but the same considerations apply to degrees of external validity.  Often-criticized psychological experiments on college undergraduates, for example, can justify conclusions about the behavior of people in non-laboratory settings when and because we know from other research that the behavior of subjects correlates with that of non-subjects.  Similarly, although deception in the laboratory differs from deception in the non-laboratory real world, the existence of any positive correlation between laboratory and non-laboratory deception would mean that the laboratory results provide some evidence justifying non-laboratory conclusions, and determining whether that “some” evidence was sufficient requires the use of legal and not scientific standards.

So too for questions of construct validity.   With one prominent exception, 6 the research on neuroscience-based lie-detection has been conducted largely with instructed lies, which are not lies at all, but simply examples of subjects following the instructions of the researchers.  But if the ease of telling an instructed lie in the laboratory correlates with the ease of telling a real lie outside the laboratory, then the research on instructed lies is no longer irrelevant to detecting real lies.  With any positive correlation between instructed and real lies, experiments on the former will tell us something about the latter, and whether that “something” is enough again depends on the uses for which the research is employed.  That which is inadequate for scientific publication or criminal prosecution may be sufficient, for example, for a defendant seeking only to raise the possibility of a reasonable doubt.

In law as well as in science, “compared to what?” is often the important question. Accordingly, evaluating fMRI lie-detection requires knowing what it would supplement or replace.  As it turns out, the methods that are now used to determine witness veracity are substantially worse than the lie-detection science that remains routinely excluded.  Currently, the jury or judge (when there is no jury and the judge is serving as trier-of-fact) is charged with determining if witnesses are telling the truth.  And, thus, when cross-examination provides little assistance, as is often the case, courts instruct juries to evaluate the “demeanor” of a witness to determine veracity.  In doing this, however, jurors (and, we suspect, judges as well) rely on numerous myths, urban legends, and pop psychology with little reliability.  They distrust witnesses who perspire, fidget, and fail to make eye contact, and trust those who speak confidently while looking directly at them.  Research has shown that ordinary people’s ability to distinguish truth from lies rarely rises much above random, and juries and judges are unlikely to do better.  The question is thus not whether neuroscience-based lie-detection is good in the abstract, but instead whether it should be rejected in favor of methods—methods that go to the heart of the jury system, have been in place for centuries, and are extremely unlikely to be changed substantially—that are demonstrably unreliable.  The answer to this question will depend on a comparison for which there is as yet little evidence, but the point is only that the admissibility of neural lie-detection evidence must be based on an evaluation of the realistic alternatives within the legal system and not on a non-comparative assessment of whether neural lie-detection meets the standards that scientists use for scientific purposes.

But might jurors and judges interpret a brain image as having more evidentiary value than it actually possesses?  If a brain scan supporting a defendant’s alibi, for example, had a reliability rate of .72, might a juror, seeing a “picture” of a brain in vivid color, ignore the .28 chance of error and assume absolute accuracy?  Might a juror take a scan supporting only one aspect of a defendant’s story as “proving” that the defendant was innocent?  These risks are real, but the little research that exists on over-valuation of brain scans generally fails to distinguish the effect of seeing a brain image from the effect of seeing an equally realistic color photograph of something other than a brain, or of hearing a pseudoscientific explanation unrelated to brain imaging.  One part of one study, for example, compared perception of a brain image with straight text, with a two-color bar graph, and with a complex color diagram, but such comparisons alone are insufficient to establish that the brain image produced more unjustified reliance than the color photographs that are a routine component of the typical trial.  Studies that do not control for the non-brain attributes of a brain image thus cannot tell us whether unwarranted attribution of content comes from the image being of a brain, or simply from the multi-color or photographic character of images routinely used for forensic purposes.  Although overvaluation is a legitimate worry, the degree of legitimacy is a function of how much, if at all, a brain image produces perceptual distortions beyond those that are already endemic to litigation.

The Standards for Legal Use of Science Cannot be Derived from Science Itself

Once we comprehend the range of standards the legal system now uses, the scalar rather than binary character of reliability and validity, and the legal system’s venerable reliance on techniques for identifying deception that are worse than even the most modest possibilities for neural lie-detection, the case against fMRI becomes substantially less compelling.  Still, the use of neural lie-detection now is probably unwarranted.  But, whether it is, and if not now then when, is a determination that the legal system cannot make solely on the basis of scientific standards.  The goals of the legal system differ from those of science, and law and science make different kinds of decisions for different purposes.  Consequently, what is good enough for science may sometimes not be good enough for law.  Conversely, and more directly relevant here, what is not good enough for science may sometimes be good enough for law.  Science must inform the legal system about reliability rates and degrees of validity, but whether some rate or degree is good enough for some legal purpose is, in the final analysis, a question of law and not of science.

Is Daubert the Problem?

Some of the foregoing conclusions may well be in some tension with the Supreme Court’s conclusions in Daubert v. Merrell Dow,7 Kumho Tire v. Carmichael,8 and General Electric v. Joiner.9  To the extent that is so, however, perhaps certain aspects of the Daubert revolution need to be rethought.  Daubert’s concern about junk science is legitimate, but more vigorous application of summary judgment and related forms of pre-trial dismissal may better address that concern than does distorting longstanding evidentiary principles.  And, even if Daubert is correct in using the law of evidence to deal with the junk science problem, Daubert’s often-criticized acceptance of scientific criteria for determining validity and reliability may have started us on a path that is in need of correction.  If my conclusions about neuroscience-based lie-detection are in some tension with Daubert, and it is not entirely clear that they are, then it may be that it is Daubert and not those conclusions that is in need of adjustment.


Copyright © 2010 Cornell Law Review.

Frederick Schauer is David and Mary Harrison Distinguished Professor of Law at the University of Virginia Law School.

  1. See Frye v. United States, 293 F. 1013, 1013–14 (D.C. Cir. 1923).
  2. Elizabeth A. Phelps, Lying Outside the Laboratory: The Impact of Imagery and Emotion on the Neural Circuitry of Lie Detection, in USING IMAGING TO IDENTIFY DECEIT: SCIENTIFIC AND ETHICAL QUESTIONS 14, 20 (American Academy of Arts and Sciences, 2009).
  3. Deborah Halber, Scientists: A Good Lie Detector is Hard to Find, MIT News (quoting Nancy Kanwisher).
  4. Id. (speaker unidentified).
  5. Henry T. Greely & Judy Illes, Neuroscience-Based Lie Detection: The Urgent Need for Regulation, 33 AM. J.L. & MED. 377, 405–15 (2007).
  6. Joshua D. Greene & Joseph M. Paxton, Patterns of Neural Activity Associated with Honest and Dishonest Moral Decisions, 106 PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES USA 12506, 12509–10 (2009).
  7. 509 U.S. 579 (1993).
  8. 526 U.S. 137 (1999).
  9. 522 U.S. 136 (1997).
