Nos. WR-75,015-01, WR-75,015-02

IN THE COURT OF CRIMINAL APPEALS OF TEXAS
______________________________________________

In re THE STATE OF TEXAS EX REL. PATRICIA R. LYKOS, Relator,

v.

HON. KEVIN FINE, PRESIDING JUDGE, 177TH DISTRICT COURT OF TEXAS, Respondent.
______________________________________________

REAL PARTY IN INTEREST JOHN EDWARD GREEN’S SECOND BRIEF IN OPPOSITION TO MOTION FOR LEAVE TO FILE PETITION FOR WRIT OF PROHIBITION AND PETITION FOR WRIT OF MANDAMUS

Richard Burr
SBN 24001005
PO Box 525
Leggett, TX 77350
713-628-3391
713-893-2500 (fax)

John P. Keirnan
SBN 11184700
917 Franklin St., Ste 550
Houston, TX 77002
713-236-9700
713-236-1802 (fax)

Robert K. Loper
SBN 124562300
111 W. 15th Street
Houston, TX 77008
713-880-9000
713-869-9912 (fax)

Counsel for Real Party in Interest, John Edward Green

Table of Contents

Introduction and Statement of the Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Argument . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

I. MR. GREEN’S CLAIM IS THE KIND OF CLAIM THAT THIS COURT HAS HELD CAN BE A VIABLE CLAIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

II. THE EVIDENCE GREEN HAS BEGUN TO PRESENT AND WILL CONTINUE TO PRESENT IF HE IS ALLOWED TO RESUME THE HEARING BEFORE THE TRIAL COURT IS RELEVANT TO HIS CLAIM THAT THE DEATH PENALTY STATUTE AS APPLIED TO HIS CASE CREATES AN UNACCEPTABLE RISK OF WRONGFUL CONVICTION . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

	A. The evidence presented thus far . . . . . . . . . . . . . . . . . . . . . . . . . . 10

		1. The prosecution’s case against Mr. Green . . . . . . . . . . . . . . . . . . 10

		2. Testimony from two different data collections about wrongful convictions . . . 12

		3. Testimony concerning eyewitness identification, the lack of discovery, and a Texas legislative response to the concern about wrongful convictions . . . . . . . 19

		4. Testimony concerning the use of informants as prosecution witnesses . . . . . 22

	B. Evidence yet to be presented . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

III. THE QUESTION PRESENTED BY MR. GREEN REQUIRES A WIDE-RANGING EXPLORATION OF EVIDENCE TO ASSIST THE COURT IN DECIDING THE ISSUE PRESENTED BY THE PARTICULAR CIRCUMSTANCES OF MR. GREEN’S CASE, AND THE EXPLORATION OF THIS EVIDENCE IS SQUARELY WITHIN THE COURT’S POWER AND DUTY . . . . . . . . . . . . . . . . . . . . . . . 31

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Certificate of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35


Table of Authorities

Cases

Atkins v. Virginia, 536 U.S. 304 (2002) . . . . . . . . . . . . . . . . . . . . . . . . . 32
Buntion v. Harmon, 827 S.W.2d 945 (Tex. Crim. App. 1992) . . . . . . . . . . . . . . . . . 1
Curry v. Wilson, 853 S.W.2d 40 (Tex. Crim. App. 1993) . . . . . . . . . . . . . . . . . . 1
Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993) . . . . . . . . . . . 21
Morrow v. Corbin, 62 S.W.2d 641 (Tex. 1933) . . . . . . . . . . . . . . . . . . . . . . 34
Paredes v. State, 129 S.W.3d 530 (Tex. Crim. App. 2004) . . . . . . . . . . . . . . 2, 3, 4
Roper v. Simmons, 543 U.S. 551 (2005) . . . . . . . . . . . . . . . . . . . . . . . . . 32
Scheanette v. State, 144 S.W.3d 503 (Tex. Crim. App. 2004) . . . . . . . . . . . . . . 2, 4
State ex rel. Hill v. Fifth Court of Appeals, 34 S.W.3d 924 (Tex. Crim. App. 2001) . . . 1
State v. Patrick, 86 S.W.3d 592 (Tex. Crim. App. 2002) . . . . . . . . . . . . . . . . . 1
Trop v. Dulles, 356 U.S. 86 (1958) . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
White v. Reiter, 640 S.W.2d 586 (Tex. Crim. App. 1982) . . . . . . . . . . . . . . . . . 34
Woodson v. North Carolina, 428 U.S. 280 (1976) . . . . . . . . . . . . . . . . . . . 32-33
Wright v. West, 505 U.S. 277 (1992) . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Treatises and Articles

Alexandra Natapoff, Snitching (New York University Press 2009) . . . . . . . . . . . . . 22
Biklé, Judicial Determination of Questions of Fact Affecting the Constitutional Validity of Legislative Action, 38 Harv. L. Rev. 6 (1924) . . . . . . . . . . . . . . . . . . . . . 31
Brandon Garrett, Convicting the Innocent: Where Criminal Prosecutions Go Wrong (Harvard University Press 2011) (in press) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Chang Su, et al., Evaluation of Rarity of Fingerprints in Forensics, Proceedings of Neural Information Processing Systems, Vancouver, Canada, December 6-9, 2010 . . . . . . . . . . 28
I.E. Dror, et al., Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a ‘target’ comparison, Forensic Sci. Int. (2010) (in press) . . . . . . 28
National Academy of Sciences, Strengthening Forensic Science in the United States: A Path Forward (National Academies Press 2009) . . . . . . . . . . . . . . . . . . . . . . . . . 27


Introduction and Statement of the Case

The Court is reconsidering its denial of Harris County District Attorney Patricia Lykos’ motion for leave to file Petitions for Writ of Mandamus and Prohibition seeking to preclude the trial court from continuing the evidentiary hearing on, and further entertaining, John Edward Green’s Amended Motion to Declare Article 37.071, § 2 of the Texas Code of Criminal Procedure Unconstitutional as Applied. To prevail, Lykos must meet two requirements: First, she must have a clear and indisputable right to the relief sought. Second, she must have no other available legal remedy to complain about the action the court at issue is taking. State v. Patrick, 86 S.W.3d 592, 594 (Tex. Crim. App. 2002) (mandamus); State ex rel. Hill v. Fifth Court of Appeals, 34 S.W.3d 924, 927 (Tex. Crim. App. 2001) (mandamus); Curry v. Wilson, 853 S.W.2d 40, 43-44 (Tex. Crim. App. 1993) (prohibition); Buntion v. Harmon, 827 S.W.2d 945, 947 (Tex. Crim. App. 1992) (prohibition).

We argued in Green’s Brief in Opposition to Motion for Leave to File Petition for Writ of Prohibition and Petition for Writ of Mandamus, filed November 23, 2010, that Lykos meets neither of these requirements. In response, the Court denied leave to file the petitions. The Court’s order did not mention whether, in its view, Lykos met the second requirement for mandamus and prohibition – whether she had any other available legal remedy – but instead focused entirely on whether she had a clear and indisputable right to relief. As to this matter, the Court wrote the following:

According to relator, during the hearing, “[r]espondent is to preside over the litigation of the actual innocence of a Texas capital murder defendant [who] has been executed. The substance of the evidentiary hearing is to provide the basis for the [r]espondent’s ruling on the defense motion to hold the Texas death penalty unconstitutional.” According to the title of his amended motion, the defendant is challenging the

constitutionality of Article 11.071 as it applies to him in his situation. In Paredes v. State, 129 S.W.3d 530, 540 (Tex. Crim. App. 2004), Paredes made a similar challenge to the constitutionality of Texas Code of Criminal Procedure Article 37.071. In our opinion, we recognized that while the “execution of an innocent person would violate due process, the risk that another person who may be innocent will be executed does not violate appellant’s due process rights.” Thus, it appears that much of the “evidence” respondent seems to want presented at this hearing is not relevant to the question at issue. However, because we cannot know whether relevant evidence will be presented, we find that relator's request that this Court order respondent to withdraw his order setting a hearing is premature and currently without a basis. Order, at 2, State of Texas ex rel. Lykos v. Fine, Nos. WR-75,015-01, WR-75,015-02 (November 29, 2010) [hereafter, “November 29 Order”]. The evidentiary hearing on Green’s Amended Motion commenced on December 6, and continued through December 7, 2010. On December 7, Lykos asked the Court to reconsider on its own motion the November 29 Order. At the end of the day on December 7, the Court agreed to reconsider its order and stayed further proceedings in the District Court pending its decision in this matter. As we demonstrate in this brief, the pending motion and ongoing hearing before the District Court do not present the issue Lykos says they present. Rather they present the very kind of issue that this Court suggested in Paredes v. State, 129 S.W.3d 530, 540 (Tex. Crim. App. 2004), and later acknowledged in Scheanette v. State, 144 S.W.3d 503, 505-06 (Tex.Crim.App. 2004), could state a claim under the Due Process Clause of the Fourteenth Amendment and the Eighth Amendment to the United States Constitution. The argument and evidence thus far presented before the District Court establish this very point. 
Together, they demonstrate indisputably that Green claims he is innocent and that, because of the nature of the evidence against him and the procedures by which his case will be tried and subsequently reviewed, he

faces a substantial risk of being convicted and executed despite his innocence. Green does not rest his claim on the execution of one or more innocent people in Texas. He does intend to show that two innocent people, who between them faced the same kind of evidence he faces – fraught with the risk of unreliability – have been executed in Texas. However, he will present this evidence solely for the purpose of demonstrating that the risk to him of wrongful conviction and execution – accruing solely because of the nature of the evidence against him and the procedures governing his case – is starkly real and not merely theoretical.

For these reasons, the Court should deny Lykos leave to file her mandamus and prohibition petitions and allow the District Court to proceed to hear the evidence that is both relevant and necessary to a full and fair decision on Green’s Amended Motion.

Argument

I. MR. GREEN’S CLAIM IS THE KIND OF CLAIM THAT THIS COURT HAS HELD CAN BE A VIABLE CLAIM

The claim that the Court considered in Paredes was this:

[A]ppellant claims that the death-penalty statute is unconstitutional because it violates the Due Process Clause of the Fifth and Fourteenth Amendments to the United States Constitution. Appellant argues that the risk of executing innocent persons and the long delays in uncovering evidence of innocence, often only possible with the benefit of newly developed scientific techniques such as DNA testing, compels a conclusion that our death-penalty statute violates due process. Appellant refers to reports, case studies, and court cases documenting the exoneration of actually innocent death row inmates.

129 S.W.3d at 540. The Court decided that this claim was without merit for the following reason:

While execution of an innocent person would violate due process, the risk that another person who may be innocent will be executed does not violate appellant's due process rights. Appellant does not claim that he is innocent, and therefore

fails to demonstrate that his rights under the Due Process Clause have been violated by application of our death-penalty statute. See Cantu, 939 S.W.2d at 639 (challenge to constitutionality of Article 37.071 which did not state how operation of statute was unconstitutional as applied to defendant in his particular situation was without merit).

Id. (emphasis in original). Shortly thereafter, the Court faced a similar claim in Scheanette v. State, 144 S.W.3d at 505-06:

Appellant claims in his fifth point of error that the Texas death-penalty scheme is unconstitutional under the Fifth and Eighth Amendments “because it leads the State to execute an unacceptable number of innocent defendants.” He further asserts that, under the cruel and unusual punishment clause of the Eighth Amendment, “the constitutionality of the death penalty must be determined and redetermined by the courts in keeping with evolving standards of decency and current knowledge about its operation.”

Id. Again, because the appellant was not claiming that he was innocent and was not asserting his right not to be subject to wrongful conviction and execution, the Court rejected the claim:

While the execution of an innocent person might violate federal due process and be considered cruel and unusual punishment, appellant does not claim that he is innocent. He therefore fails to demonstrate that his due process rights or his right to be free from cruel and unusual punishment have been violated by application of our death-penalty statute. Herrera v. Collins, 506 U.S. 390, 113 S.Ct. 853, 122 L.Ed.2d 203 (1993); Paredes v. State, 129 S.W.3d 530, 540 (Tex. Crim. App. 2004).

Id. at 506 (emphasis in original). Mr. Green’s claim is critically different from the claims rejected in Paredes and Scheanette.
Unlike these two appellants, Green claims that he is innocent and that his constitutional rights – his Eighth Amendment right not to be subjected to an unacceptably high risk of wrongful conviction and execution – will be violated by being put to trial in a death penalty case. He is not complaining about the rights of others, but about his own. He is

complaining, precisely, about “how [the] operation of [the death penalty] statute [i]s unconstitutional as applied to [him] in his particular situation.” Paredes, 129 S.W.3d at 540 (parenthetical concerning Cantu v. State). As undersigned counsel explained at the opening of the hearing before Judge Fine, Your Honor, none of the claims described by [Assistant District Attorney] Curry are the claim we present. The claim that we present has not been decided by the U.S. Supreme Court. It is based entirely on the Court’s jurisprudence since 1972 which has a tremendous concern about the risk of unreliable decision making. Much of that risk has been focused on the penalty decision[]. But in Beck versus Alabama, the Court made it very clear that the risk of an unreliable guilt phase determination is just as much a concern under the Eighth Amendment. It’s a heightened risk. It’s different from the risk in any noncapital case because of the consequences of the capital conviction and sentence. That’s the starting point. Mr. Green is at risk because he has pled not guilty. He has maintained his innocence, and he is innocent. He is at risk for a wrongful conviction. The Eighth Amendment is concerned about that in a capital case more than it is in any other context. The Supreme Court has made that clear time and time again. That’s why this is an Eighth Amendment claim. The fact that other people have been convicted, condemned and executed and in subsequent light of new evidence appear to be wrongfully convicted is a part of [the] relevant evidence in this case, but it’s not Mr. Green's claim. His claim [is] that as he sits here today presumed innocent and actually innocent and facing evidence that is fraught with the possibility of mistake, he is at risk for a wrongful conviction and a wrongful sentence and a wrongful execution. Reporter’s Record, Volume 2 [hereafter, RR 2], at 17-18, State v. Green, No. 1170853 (177th District Court). 
Accordingly, Green’s claim is viable as a matter of law under Paredes and Scheanette.1

____________________
1 As we demonstrated in Mr. Green’s Brief in Opposition to Motion for Leave to File Petition for Writ of Prohibition and Petition for Writ of Mandamus, filed November 23, 2010, his claim is also viable under relevant Supreme Court precedent. Id. at 8-15 (explaining that the claim is rooted in the Supreme Court’s jurisprudence safeguarding capital defendants against the risk of unreliable determinations by juries in both the guilt-innocence and penalty phases of capital cases, and demonstrating that the Supreme Court has not held, as argued by Lykos, that the risk of wrongful conviction in a capital case can never be so great as to violate the Eighth Amendment).

II. THE EVIDENCE GREEN HAS BEGUN TO PRESENT AND WILL CONTINUE TO PRESENT IF HE IS ALLOWED TO RESUME THE HEARING BEFORE THE TRIAL COURT IS RELEVANT TO HIS CLAIM THAT THE DEATH PENALTY STATUTE AS APPLIED TO HIS CASE CREATES AN UNACCEPTABLE RISK OF WRONGFUL CONVICTION

This Court’s November 29 Order observed that if the hearing below were to focus

entirely or primarily on what the District Attorney claimed it would – “the actual innocence of a Texas capital murder defendant [who] has been executed” – then this evidence would “not [be] relevant to the question at issue,” because Green “is challenging the constitutionality of Article 11.071 as it applies to him in his situation.” November 29 Order, at 2 (emphasis supplied). Mr. Green agrees with this, and the record of the hearing demonstrates his agreement, because the evidence Green has presented and will continue to present “applies to him in his situation.” Defense counsel’s preview of the evidence for the trial court demonstrated this explicitly: Let me – let me talk about why this case is about John Green and what he – the charges he faces and the procedure that is coming for him unless the Court declares the death penalty unavailable in this case. Because that is the heart of the case. That’s where the risk arises and it’s how the risk has to be judged. If we don’t – if we don’t persuade you that there is a substantial risk right now today of the wrongful conviction of this young man, we lose. Plain and simple, we lose. Our job is to present to you the case that he stands here today in substantial risk of being wrongfully convicted, wrongfully sentenced to death and later put to death. That’s our burden and we gladly take it on. I’m sorry that the State doesn’t see this as a serious enough issue to stand in here and try get to the bottom of this. I think that’s unfortunate. But let me talk about the substantial risk for this young man. The case against him rests, as we understand it today, rests on three prongs. There is an eyewitness identification that we submit is a misidentification. There is a palm print, a partial palm print from the door of the vehicle on the side of the vehicle where the injured but surviving victim of the crime was, and there is informant – there are informant statements and presumably will be informant testimony. 
By informants, that is[,] people who are providing information in an exchange for

some benefit to themselves. That’s the case against Mr. Green. He made no inculpatory statement. There is no other evidence that we know of so far against him. We will present witnesses over the course of the next few days who will help the Court understand and see and find, we believe, that each of these areas of evidence as they are particularly constituted in Mr. Green's case present a substantial risk of wrongful conviction. Each of these kinds of evidence [is] highly correlated with wrongful convictions that have been studied across the United States and in Texas. They are – they are tremendous risk factors in and of themselves. Nestled into the factual context of this case, they are extremely relevant. We will demonstrate that to you. That's the heart of this case. We will demonstrate[,] again[,] to show that this is a deadly risk that Mr. Green faces[,] that in the cases of two men who have been executed, Todd Willingham and Claude Jones, together these same three risk factors were in their cases. There was, in Mr. Willingham’s case, informant testimony. There was forensic science which has turned out to be flawed, as we think the fingerprint evidence here, palm print evidence[,] will turn out to be flawed here. And in Claude Jones’ case, it was an eyewitness misidentification. We have all three of those factors that this young man faces. Those two factors in Willingham and one factor in Claude Jones’ case, we believe we can show led to their wrongful conviction and subsequent executions. We’re not putting forward that evidence as evidence that by itself is why Mr. Green should get relief. He shouldn’t. We’re putting that forward to simply show the Court that the risk we’re talking about that accrues from these evidentiary factors is tremendous and is of a nightmare proportion. You know, in the – in the law we talk about prejudice and harm. 
The showing about Willingham and Jones will be about the ultimate kind of prejudice and harm that can come from a wrongful conviction. What the Eighth Amendment is concerned about is the risk of a wrongful conviction so great that something should be done to ameliorate that in a particular case, and that’s our claim.

RR 2, at 25-28. In addition to the risk of wrongful conviction created by the prosecution’s evidence against Mr. Green, counsel explained,

There – there are some other procedures that put Mr. Green at risk. I’ve mentioned the three evidentiary factors that he faces in the State's case against

him. There are other – there are procedural matters which are not unique to his case but which are procedures that are followed in capital cases here in Harris County that [also] put him at risk or add to the risk.

RR 2, at 37. The procedural factors noted by counsel for Green “include the lack of complete disclosure by the District Attorney of their files.” Id. The defense will present “evidence about ... inadequate discovery procedures generally contributing to wrongful convictions.” Id. Because of this, “there’s no full assessment of the weaknesses in the evidence, there's no full assessment by everybody of the case against Mr. Green.” Id.

These procedural factors also include the consequences of the process of death-qualifying the jury in a capital case, “which allows [prospective] jurors to be taken off for cause ... if they cannot give fair and adequate consideration to death as a possible punishment.” RR 2, at 38. Counsel explained that the evidence will demonstrate that there are two relevant consequences of the death qualification process. Id. “One is[,] those who are left and sit on the jury are people who tend to be biased toward conviction.” Id. The other is that “death qualification communicates to prospective jurors that the person[] [is] guilty, that there’s going to be a penalty phase and that’s the thing that everybody is concerned about so this guy must be guilty.... [This] process ... distorts the fairness of the fact finding in guilt/innocence.” RR 2, at 38-39.

Counsel explained that a third procedural factor adding to the risk of wrongful conviction “comes from the long history in the District Attorney’s office of making ... racially based peremptory challenges to prospective jurors.” RR 2, at 39. The concern is that the Harris County District Attorney has become adept at exercising racially-motivated peremptory strikes without that motivation being exposed and precluded under Batson v. Kentucky, 476 U.S. 79 (1986), and

that this creates a risk in and of itself of wrongful conviction, because “if you end up with an all white jury anywhere in the United States, empirical study after empirical study has shown and interviews with actual jurors have shown ... [that such] juries [are] not as accurate in fact-finding about guilt/innocence as a diverse jury.” The consequence of the prosecutor’s striking minority jurors is “not just that it demeans people of color who are trying to do their duty as jurors.... It is that the resulting jury is not as an effective jury as a jury that’s diverse[,] that reflects all of the experiences of a community rather than just those of a single group. And that has an effect on how a death case comes out at the guilt/innocence phase of the trial.” RR 2, at 39-40. Finally, counsel for Mr. Green explained that they would present evidence about two other procedural factors that contribute to the risk that Green will be executed if he is wrongfully convicted. These are the procedures by which convictions are examined through the lens of the whole case, including evidence not considered at trial – procedure[s] that the Supreme Court has time and again referred to as the safety net[,] as that thing that will catch people who have been wrongfully convicted and are actually innocent and are on their way to being put to death[,] and that is state habeas procedure, habeas corpus proceedings[,] and clemency proceedings. And we’ll demonstrate again through expert witnesses that those kinds of proceedings in this state provide no safety net at all. [It] is ... as if nothing is there when the artist falls off the trapeze. There is no safety net. People who are innocent are no more likely to be caught and saved by state habeas proceedings in the state than in clemency proceedings. And in clemency proceedings, the current governor [in] ten years has granted one clemency and has presided over the execution of hundreds of people. There is no safety net. 
So what happens here in the 177th Court to Mr. John Green is probably what’s going to happen to him throughout his days. If he’s convicted and he’s sentenced to death, he is going to be put to death, no matter what.

RR 2, at 40-41.

A. The evidence presented thus far

1. The prosecution’s case against Mr. Green

In the two days of evidence following the defense opening statement, counsel for Mr. Green began to put on the case we told the court we would present. We called six witnesses. Two, Brian Benken and Jim Willis, were Green’s investigators and were called to establish the facts of the offense and the nature of the prosecution case against Green.

Benken established that the crime with which Green is charged involved the armed robbery and shooting of two Vietnamese women, Houng Thien Nguyen and her sister, My Houng Nguyen, as they were sitting in their car in the driveway of Thien’s home with her two young children at 1:00 or 1:30 a.m. on June 8, 2008, after returning from a family reunion. RR 3, at 53-54. The assailant came up to the passenger side of the car and demanded My’s purse and other valuables. RR 3, at 53. Before she could comply, the assailant shot her twice with a pistol. Id. The assailant then went to the driver’s side of the car. RR 3, at 53-54. Thien opened the door and threw her purse out, then closed the door. RR 3, at 54. The assailant then shot Thien through the door, ran away, got into a minivan, and drove away. Id. Thien died, but My survived. RR 3, at 53-54. The children, sitting in the back seat, were not shot or otherwise assaulted.

Benken testified that two descriptions were given of the assailant: one by Thien’s husband, who witnessed the incident from an upstairs window of their home, and one by My. RR 3, at 56, 57-58. Thien’s husband described the assailant as 5'9" and approximately 170 pounds. RR 3, at 56. My described him as a black male, 5'4", thin, and in his mid-30s, and apparently also helped make a sketch of the assailant. RR 3, at 57-58.

Benken also testified that My Houng Nguyen was shown a six-person photo spread,

Defendant’s Exhibit 11, on June 12, 2008, while she was still in the hospital recovering from her wounds. RR 3, at 58. The spread included a photograph of Mr. Green when he was younger; Ms. Nguyen identified Mr. Green as the assailant. RR 3, at 61.

Finally, Benken testified that five latent fingerprints were lifted from Thien’s car. RR 3, at 62. The Houston Police Department fingerprint examiner determined that there were insufficient discernible characteristics in these prints to allow comparison. RR 3, at 63. The District Attorney then engaged the Mississippi firm that had examined fingerprints in place of the Houston Police Department crime lab while the crime lab was being audited for unreliable procedures. That firm determined that one of the latent prints could be compared and that it matched Green’s left thumb and palm. RR 3, at 63-65.

Jim Willis testified that two or three informants had also provided information to the police and prosecution. One, Kirk Felton, told the police, as he and his brother Kelly were being arrested for an unrelated crime, that he had information that tied Green to the shooting of the Nguyens. RR 3, at 75-76. The police then allowed Kirk Felton to go to the place where he said the gun used in the crime was located, and he produced the gun. RR 3, at 76-77. Willis testified that it appeared that both brothers received compensation in exchange for this information, because both were released from jail soon after the gun was found. Id. Willis also testified that a jailhouse informant, Brandon Zenon, told the police that Green made incriminating statements to him. RR 3, at 72. Zenon was in jail for a felony probation violation on a family violence charge and expects some form of benefit for his assistance to the prosecution. RR 3, at 74-75.


2. Testimony from two different data collections about wrongful convictions

In addition to setting out the facts of the prosecution’s case against Mr. Green, counsel for Green began to present evidence concerning the causes of wrongful conviction and how that information related to Green’s risk of wrongful conviction. Two witnesses, Richard Dieter and Brandon Garrett, provided perspective on what is known about wrongful convictions on the basis of two different collections of data concerning cases of wrongful conviction. Two other witnesses, Sandra Thompson and Alexandra Natapoff, provided more in-depth information about two of the evidentiary risk factors in Mr. Green’s case, eyewitness identifications and informant testimony. Because of this Court’s stay order, no other witnesses were presented on behalf of Mr. Green. A review of these four witnesses’ testimony demonstrates that they provided evidence relevant to the risk of wrongful conviction in the circumstances of Mr. Green’s case. Richard Dieter is the director of the Death Penalty Information Center in Washington, DC. The Death Penalty Information Center, or DPIC, “is a nonprofit organization that does research and reports on issues related to capital punishment.” RR 2, at 54. It “make[s] information available to the public through reports, through contacts with the media, through the Internet and any other way to help inform the public, educate the public about some of the problems with the death penalty.” Id. at 54-55. In 1993, a Congressman on the House Judiciary Committee asked DPIC for a report on the risks of innocent people being executed in capital cases. Id. at 58. DPIC came to the conclusion that the best way to examine this question “would be to look at mistakes, wrongful convictions, people who actually were eventually exonerated and freed from Death Row as examples of the kind of cases that might exist....” Id. DPIC


identified 48 cases of death row exonerations and reported that information to Congress. Id. at 59. Because of the continuing public interest in this question, DPIC continued to collect data concerning death row exonerations. The number of such cases has now grown from 48 in 1993 to 138 in 2010. Id.2

While DPIC does not make a systematic effort to discern the causes of wrongful conviction in the exonerees’ cases, Mr. Dieter is familiar with the various national studies of exonerees in both capital and non-capital cases. The research shows the following about the causes of wrongful conviction:

[T]he one which gets the highest percentage is mistaken eyewitness identification. Then there are other examples of scientific evidence that was faulty. There’s informant testimony that was unreliable. There is the new information that hadn’t been handed over in the first place and there’s even false confessions by defendants in cases.

Id. at 71. When asked, on the basis of his knowledge of the data on exonerations, to assess the continuing risk of wrongful conviction in cases in which death sentences are imposed, Mr. Dieter responded that he believed the risk still existed. Explaining, he noted that since the beginning of his collection of data there has been a fairly constant ratio between the number of executions and exonerations:

138 of these cases that have been found is a large number. One thing to compare it with is the number [o]f executions that have occurred during this same time. There’s been something, like, 1,233 executions in the time that there’s been 138 exonerations. And so being careful not to say that it’s from those cases [where there has been an execution] we find the mistakes, but just as a ratio for every nine

2 DPIC lists a case as an exoneration if “a person [i]s convicted and then through a court that conviction is overturned and then the charges against the person are dismissed either through an acquittal at a retrial or through the State dismissing all charges.” Id.

executions that occur, there has been one exoneration found[,]... it does give ... a sense of the gravity of the problem and that ratio, nine executions, and then another exoneration has remained steady throughout our ... research.... So I would say the risk is still there. Id. at 80-81. He also noted that fortuity has played a major role in many exonerations: I observe the system as closely as I can and it causes me concern that these mistakes, these exonerations so often are the result of unusual or fortuitous circumstances. It’s ... a concern that as far as the system working, clearly the trial system did not work in any of these 138 cases. Then we get to the appellate process where ... the burden shifts and it takes extraordinary effort to be able to get a reversal and to get an ultimate exoneration in a case and sometimes that's the result of a number of attorneys in a high-powered law firm donating thousands of hours equivalent to millions of dollars just to reinvestigate the case and they were able to win the exoneration, but that certainly is not applied to the 3,300 people on Death Row today. In some cases, it’s been the work of outside people like journalism students who happen to have time on their hands and were assigned the case merely because the defendant got a stay to look at some mental health issues.... So what I would conclude is that a lot of these exonerations were due to fortuitous circumstances.... Id. at 77-78. The other person who has testified thus far about the causes of wrongful conviction based on the study of cases of wrongful conviction is Brandon Garrett, a professor of law at the University of Virginia School of Law. RR 4, at 23. Before he became a law professor, Mr. Garrett represented several people who had been exonerated by DNA analysis, and he began to wonder “whether the cases that I had worked on were representative of a larger problem or whether they were unusual accidents.” Id. at 25. 
As he conducted his initial research, he “began to realize there wasn’t enough data on DNA exonerations and wrongful convictions for me to answer some of the basic questions that I had. And so, I started a project of trying to assemble as much data as I could about these known DNA exoneration cases.” Id. at 25-26. The result was
an in-depth study of the first 250 DNA exoneration cases,3 a series of law journal articles based on this study, and a book forthcoming from Harvard University Press about the study, Convicting the Innocent: Where Criminal Prosecutions Go Wrong, based on data from these exonerations, from 1989 through February, 2010.4 Id. at 38-40. To conduct his study, Professor Garrett attempted to collect the trial transcripts from all the cases that had gone to trial (as opposed to a plea of guilty). Id. at 42. He was able to get transcripts in 207 of the 234 trial cases. For all 250 cases, he collected all the available judicial opinions. Id. The most frequent crime of conviction was rape, 68% of the cases, while rape-murders comprised 21%, and murder 9%. Defendant’s Exhibit 16 [hereafter DX 16], at slide 8. The evidence supporting the convictions in these cases was eyewitness identification, in 76% of the cases, forensic evidence, in 74% of the cases, informant testimony, in 21% of the cases, and confessions, in 16% of the cases. RR 4, at 46-47; DX 16, at slide 14.5 Professor Garrett’s methodology in analyzing the evidence in the cases was the following:

For each type of evidence, I wanted to go back and examine, to the extent possible, from the trial transcripts what went wrong. And I wanted to see whether the trial transcripts concerning eyewitnesses reflected the use of sound identification procedures, or on the other hand the use of suggestive identification

3 Professor Garrett described his criteria for counting a case as a DNA exoneration in terms similar to Mr. Dieter’s: “An[] exoneration[] occurs if the judge, after hearing new evidence of innocence, vacates the conviction and there is no retrial or there is an acquittal at a new trial or if the governor grants a pardon, all on the basis of that new evidence of innocence. A DNA exoneration, as opposed to an exoneration not necessarily involving DNA, is one substantially based on post-conviction DNA testing. That’s the definition I have used in my research.” RR 4, at 31-32.

4 Professor Garrett noted that, since February, 2010, there have been ten more DNA-based exonerations, raising the total to 260. RR 4, at 28.

5 40 of the 250 cases Professor Garrett studied are from Texas. RR 4, at 30. The Timothy Cole Advisory Panel on Wrongful Convictions, established by the Legislature in 2009, examined the first 39 of these cases and found that 85% of the Texas cases had eyewitness identifications, 46% had faulty or unreliable forensic evidence, and 13% had informant testimony and false confessions. Defendant’s Exhibit 10, at 3.

procedures. I wanted to see whether other aspects of eyewitness identifications should have suggested problems at the time or whether they did not suggest problems at the time. Similarly, I wanted to know whether the forensic testimony at the time suggested problems or whether it was just that the technology at the time wasn’t as good as the technology we have now. With the jailhouse informants and the confessions, I was interested in whether these innocent people had supposedly admitted details about the crime that only the perpetrator could have known since now that we know they’re innocent, we know they couldn’t have independently had knowledge of those inside crime scene details. So, I wondered whether – these cases seemed powerful at the time, but whether we might now suspect that the evidence had been contaminated. RR 4, at 48-49. As to each of the three kinds of evidence involved in the case against Mr. Green – eyewitness identification, forensic evidence, and informant testimony – Professor Garrett found substantial reasons for the evidence leading to wrongful convictions. With respect to eyewitness testimony, the vast majority of these eyewitness[es][6] made identifications based on procedures that are now known to be suggestive. And these were also eyewitness who although certain at trial had expressed early [un]certainty or they had identified other people, or fillers for example, or they had admitted they couldn’t even see their attacker’s face. And so, there was stark evidence at the time of trial of both suggestion and uncertain[t]y or unreliability, nevertheless, these identifications were admitted and supported these erroneous convictions. RR 4, at 50. Another well-established source of unreliability in eyewitness identifications arises when the perpetrator and the eyewitness are different races. Id. at 36-37, 71. 
Professor Garrett found that the unreliability associated with cross-racial identifications helps explain why African Americans make up a greater proportion of the DNA exonerees, 62%, than of the population of persons convicted of rape, 40%. Id.; DX 16, slide 9.

6 In 88% of the cases involving eyewitness testimony, the testimony revealed that the identification was unreliable or based on suggestive procedures. DX 16, at slide 33.

With respect to forensic evidence, Professor Garrett found that in half of all the exonerees’ cases, the forensic evidence was invalid, unreliable, vague or erroneous. DX 16, slide 49. Professor Garrett explained that unreliability infects all the forensic sciences except for DNA typing – including, as in Mr. Green’s case, fingerprint comparison: [A]s the National Academy of Sciences concluded in its landmark report from last year, there is no method aside from DNA typing that can reliably individualize evidence and point to a particular individual person with any reliability. And so, for most of the techniques used in these cases, ... we don't know what frequency of the population shares certain hair characteristics or bite characteristics or what percent has shoes that wear in a certain way, or even fingerprints with particular patterns. The underlying empirical research hasn’t been done. And the limited evidence that we have suggest that there are serious error rates using those kinds of subjective comparisons where analysts can and do disagree about conclusions since their analysis is ultimately subjective and based on their experience and looking and comparing such objects. Id. at 74-75. With respect to informant testimony, Professor Garrett found as follows: These informants claim that the defendants knew things about the crime and had admitted to things about the crime that only ... the true perpetrator could have known. So, you have the situation where the prosecutor would say: Look, this is a liar, this is a jailhouse informant, this is a person with a terrible record, but we know this person is telling the truth in this case because the story that he says that this defendant told matches the crime scene evidence. We now know that these innocent people couldn’t have known how the crime happened. And so, somehow these informant statements must have been contaminated. 
All but two of the jailhouse informants said that the defendant said – admitted their guilt in some detail, and so making their testimony seem reliable and corroborated at the time, whereas we now know either they found out about those details from police or prosecutors or through some kind of a jailhouse network or some other source. We don't know exactly how it happened. All we know is that at trial they claim that the defendants admitted in detail. We now know that these innocent people couldn’t have known these details. Often police and prosecutors were quite clear that these details about how the crime happened were carefully blacked out and not made public precisely to avoid contamination, which we now know is likely to have occurred.

Id. at 83-84. Relevant to Mr. Green’s argument that the post-conviction review process does not catch wrongful convictions, Professor Garrett also examined “what happened during the appeals and post-conviction process before the DNA testing.” Id. at 89. He found that the post-conviction process did not effectively screen innocence in these cases. And, in fact, fairly consistently courts would deny relief, saying there’s evidence that we think these people are guilty.... Very few of the people who attacked the evidence that we now know to have been flawed had any success. Id. at 90-91. Finally, counsel asked Professor Garrett whether the flawed evidence that contributed to the wrongful convictions in the DNA exoneration cases – which were mostly rape cases – could be expected to play a similar role in capital cases, where the Eighth Amendment requires greater reliability. Professor Garrett responded, [U]nfortunately, in capital cases, courts do not conduct an inquiry that is any different when they evaluate forensics or eyewitness testimony or informant testimony. The standards for admitting that evidence are the same in all criminal cases. So, there is no heightened reliability inquiry.... Nor is the harmless error test or prejudice test for Brady and Strickland claims any different in capital cases as opposed to other cases. So, there’s no reason to think that in death penalty cases post-conviction courts would do a better job of evaluating trial evidence, and certainly police departments don’t use sound ... eyewitness identification procedures at the outset in capital cases, nor do they pay closer attention to the validity of the conclusions that they reach when they conduct forensic analysis in a capital case. 
So, these same problems would be expected to occur in any case where invalid forensics or unreliable forensics are presented, or in any case where there isn’t careful documentation of informant testimony and scrutiny of their reliability, or any case where suggestive eyewitness identification procedures are used. Id. at 93-94.


3. Testimony concerning eyewitness identification, the lack of discovery, and a Texas legislative response to the concern about wrongful convictions

Sandra Guerra Thompson is a chaired faculty member of the University of Houston Law Center and the director of the Criminal Justice Institute. RR 3, at 8. She is a nationally recognized legal scholar on eyewitness identifications, and has published four articles on eyewitness identification and is working on a fifth. Id. at 9. Because of her expertise in eyewitness identification, she was nominated by the deans of the public law schools in Texas to serve on, and was appointed to, the Timothy Cole Advisory Panel on Wrongful Convictions. Id. at 9-10.7 Professor Thompson testified about the unreliability of eyewitness identifications, the reforms that can reduce their unreliability, the need to require fuller discovery by prosecutors statewide, and the recommendations made by the Cole Advisory Panel. Professor Thompson was asked whether, on the basis of her research, eyewitness identifications were reliable or unreliable. Id. at 20. She responded, “[A] lot of my research focuses in particular on violent crimes, that are stranger-on-stranger crimes. In those cases, I would say they are very unreliable.” Id. at 20-21. She explained that there are two sources of

7 The Cole Advisory Panel was established by the Legislature with the following mission:

(d) The Task Force on Indigent Defense, with the advice and assistance of the advisory panel, shall conduct a study regarding: (1) the causes of wrongful convictions; (2) procedures and programs that may be implemented to prevent future wrongful convictions; (3) the effects of state law on wrongful convictions as determined based on state statutes regarding eyewitness identification procedures, the recording of custodial interrogations, post-conviction DNA testing, and writs of habeas corpus based on relevant scientific evidence; and (4) whether the creation of an innocence commission to investigate wrongful convictions would be appropriate.

Defendant’s Exhibit 9, section 1(d).

error in eyewitness identification – “estimator variables” and “system variables.” Id. at 21-23. Estimator variables are “inherent to the witness as well as ... the situation under which the witness observed the person in question.” Id. at 21-22. Estimator variables include the age of the witness, the witness’s mental state, factors affecting the witness’s ability to see the perpetrator (such as lighting, distance, and time available to view the person), the race of the witness and the perpetrator, and the presence of a weapon. Id. System variables refer to the factors that “come into play when law enforcement become involved.” Id. at 23. These include the method of questioning the witness (whether the questioning is suggestive), the use of photographs in a photo array (whether the suspect’s photograph somehow stands out), and whether confirmatory feedback is given to the witness after an identification. Id.

Professor Thompson explained that there is much agreement concerning the reforms needed to reduce the risk of unreliable eyewitness identification. These include:

• full documentation of police officers’ contact with eyewitnesses;

• the elimination of any form of suggestion;

• the conduct of procedures in a blind fashion, in which the witness is told that the investigator presenting a photo array or lineup does not know the identity of the suspect, and in fact the investigator does not know the identity of the suspect;8 and

• the presentation of photographs (or live lineup participants) sequentially, one at a time, rather than in the traditional “six-pack,” or groups of six.9 RR 3, at 25-32.

Professor Thompson noted that the Cole Advisory Panel recommended that the Legislature require a model policy to be developed addressing these safeguards, providing for “the use of cautionary instructions, filler selections, double-blind procedures, and documentation, and any other best practices.” Id. at 38; Defendant’s Exhibit 10, at 5-6.

Professor Thompson also testified that two additional safeguards are needed. The first is that there must be corroborating evidence independent of the identification. Id. at 31. The reason is to reduce unreliability due to estimator variables:

[I]f you have a case where you have a lot of estimator variables, which suggest that the eyewitness was just not in a position to make an accurate identification, and especially if you don’t have any corroborating evidence, then ... that’s the sort of case that, per se, raises reasonable doubt and where the Court should screen out the unreliable evidence.

Id. at 31-32. The second is that trial courts be required to hold reliability hearings before eyewitness identifications can be admitted, parallel to the reliability hearings held with respect to scientific evidence under Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). Id. at 32-33. Since “social science research is really very solid in identifying factors that reduce[] the reliability of eyewitness testimony,” id. at 33, “criminal courts should be able to develop some tests for reliability and some competence on the scientific literature such that they should ... develop pretty readily a competence to do a reliability screening.” Id. at 34.

8 Blind procedures are necessary “to eliminate the probability that the witness will look to the investigator for clues. Because witnesses want to help the police. And so, they want to – they view it as a test where they're supposed to find the right person. And they will want to identify someone. So, if they don’t know, there is a tendency to look to the investigator for help, which they may sometimes actually get intentionally, or unintentionally, or they may just look to the investigator who doesn’t mean to give clues, but somehow they may perceive that they're being helped.” Id. at 27.

9 Sequential presentation addresses the problem “social scientists ... have ... called relative judgment.” Id. at 29. “[R]ather than identifying a suspect based on actual recall, witnesses will tend to compare one suspect to another and through a process of elimination find the person who most closely looks like the person they recall. And the problem with that is that if the suspect ... is not in the lineup, if the police have mistakenly arrested the wrong person, then the chances of those same witnesses picking one of those are – one of those innocent people is actually very high. And, again, because they will pick the person who most closely resembles the true culprit. And once they have made that erroneous selection, that person's face becomes their memory, their actual memory of the event. They will remember the innocent person's face in their minds and be convinced that’s the right person.” Id. at 29.

Finally, Professor Thompson also addressed the work of the Cole Advisory Panel with respect to discovery.10 The problem is that “Texas does not have any state – any state discovery requirements other than, of course, constitutional requirements under Brady. And Texas is in the minority in this regard.” Id. at 42. The Panel’s “recommendation is for much broader discovery and that it be made mandatory. So, things like police reports and witness statements would as a matter of course in all jurisdictions in the state be mandatorily required to be produced for the defense.” Id.

4. Testimony concerning the use of informants as prosecution witnesses

The last witness who was able to testify for the defense was Alexandra Natapoff, the person described by Professor Thompson as the leading scholar in the nation with respect to informant testimony. RR 3, at 32-33. Ms. Natapoff is a law professor at Loyola Law School in Los Angeles. RR 4, at 97. Her recent book, Snitching, published by New York University Press in 2009, received an award from the American Bar Association. Id. at 97-98.

10 The Panel found that inadequate discovery plays a significant role in wrongful convictions:

Discovery as a component of effective counsel is especially important in helping to guard against wrongful convictions. A relationship between discovery and wrongful conviction is sometimes difficult to ascertain at first glance, but “[t]he record of wrongful convictions has demonstrated that exculpatory evidence can be withheld for years, even decades, while an innocent person sits in prison.” In fact, seven of Texas’ first thirty-nine DNA exonerations involved suppression of exculpatory evidence or other prosecutorial misconduct. This statistic includes the case of Timothy Cole, whose defense counsel was never informed that only one victim chose Cole out of a photo lineup as the perpetrator of a rape on the Texas Tech campus. Although the Supreme Court’s decision in Brady v. Maryland provides defendants with a constitutional right of access to exculpatory information held by the State and in the possession of law enforcement, it is an insufficient tool to prevent wrongful convictions because Brady complaints are made post-conviction. Since a wrongful conviction cannot be retroactively prevented once it has already occurred, other means of prevention must be explored. One way to reduce the potential for errors is to increase the scope of discovery, the process of pre-trial information exchange between prosecution and defense.

Defendant’s Exhibit 10, at 23-24 (footnotes omitted).

Professor Natapoff described an “informant” as “any criminal suspect or offender or defendant who gives information to the government in exchange for benefit or the anticipation of a benefit....” Id. at 103. “Typically, the benefit is leniency for their own offenses, the dropping of charges, the reduction of charges, the reduction in a sentence; but other benefits, informants are sometimes paid, they may received improvements in their conditions of confinement. There are a wide range of benefits that a criminal informant can provide information in the hopes of getting.” Id. Informants are “the only witnesses that are testifying, in effect, on their own behalf. They are there to obtain a benefit for themselves, to obtain leniency for their own offenses, their own liberty may be on the line.” Id. at 115. Informants have been “heavily demonstrated” to provide unreliable testimony, in “the sense they lead to wrongful convictions, that their information is often false, fabricated, or otherwise unreliable.” Id. at 105.

[T]he first comprehensive analysis of all the wrongful conviction cases of which we know, not limited to DNA exonerations, ... in 2004 [by Northwestern University Law School][,] ... concluded that of all the wrongful convictions that we know about, in over 46 percent of them the wrongful convictions flowed in whole or in part from the testimony of a lying criminal informant. That mak[es] criminal informants the largest single source of unreliability and wrongful convictions in the U.S....

Id. at 113-114. The Los Angeles County Grand Jury in 1989 conducted a far-ranging inquiry into the use of jailhouse informants. The Grand Jury found that informants were often unreliable, because they would tailor their information to what they thought the government wanted to hear, knowing that the only way that they would ever get benefits is if they provided information that was useful to the government.
So, in effect, the deferral of benefits was an incentive not to tell the truth, but to tell information that would be – would be beneficial to the government. And the Grand Jury expressed
its concern that this was a recipe for fabrication of precisely the kind that they saw in Los Angeles in the 1980s. Id. at 109. Despite its pervasive unreliability, informant testimony is extremely difficult to challenge, because “the use and creation and reward of informants is so secretive[,] [i]t’s very difficult after the fact, even in our most regulated settings, which is, of course, the trial setting, to go back and figure out what actually happened, what the informant knew, what he said to the government, what the government said to him, what kind of deals were offered or promised or implied..” Id. at 107. Informant testimony has devastating effects on the ability of criminal prosecutions to focus on the real culprits. The first consequence is the diversion of police officers and prosecutors from their duty to prosecute the right people: [T]he use of criminal informants can change the very nature of the case. It can change the direction of an investigation. It can mean that police will not investigate additional suspects because a criminal informant will make a particular suspect look more attractive as a suspect. Professor Ellen Yaroshefsky, who is a law professor at Cardozo Law School, has interviewed prosecutors as to their relationships and the use of criminal informants. And the prosecutors themselves have relayed how their cases are changed by reliance on a criminal informant, that they themselves start to adopt the stories of the informants because their cases start to become reliant on those [stories]. Id. at 126-127. The second consequence is that informants strengthen otherwise weak cases. [W]e have seen in the capital context[,] because jailhouse informants in particular have this entrepreneurial culture of coming forward sua sponte on high-profile cases or homicide cases, one of the great dangers of the use of criminal informants is that they bolster what would otherwise be weak cases. 
And we have seen now a number of examples where jailhouse informant information has come to light after the fact and becomes a kind of a filler to make cases that would otherwise
look weak or otherwise might not look as strong to the government, to make those cases look much stronger. Id. at 127.

The ability of an informant to bolster a weak prosecution case often rests on the ability to tailor information so that it connects to the otherwise independent evidence associated with the case – providing the much-needed link between the informant’s incriminating account and apparently independent evidence that “corroborates” it. As Professor Natapoff explained,

We know from the Los Angeles Grand Jury report[,] from the Kaufman Commission, from the Northwestern University Law School report, and from many other reports and journalistic accounts of wrongful convictions that informants, jailhouse informants in particular, but also other kinds of informants, have developed tactics by which evidence is fabricated and used by the system. And often these tactics involve getting information about high-profile cases from the media, obtaining information from outside sources about cases, asking friends and family to obtain court records, stealing information. In other words, informants have developed quite sophisticated methods of getting information about suspects to bolster their own informant credi[]bility. And so, because these kinds of fabricated evidence, particularly fabricated confessions, by definition are piggy-backing off of evidence that may be in the public record, they are in some sense always going to be corroborated. They're built to be corroborated.

Id. at 121. Professor Natapoff explained that a new set of safeguards is needed to reduce the risk of wrongful conviction in informant cases, because the existing safeguards have proven inadequate to the task. In particular,

• the prosecution cannot discern whether informants are truth-tellers because they
are relying on the informants to make their cases, id. at 129;

• the lack of discovery hampers defense counsel’s ability to get access to information that would facilitate effective investigation, id. at 130-131;


• “criminal informants are deeply invested in the consistency of their own story because it is on the basis of that story that they anticipate being rewarded[,] [s]o, cross-examination has ended up being a weaker tool than it's supposed to be to actually discern whether informants are lying or not,” id. at 131; and

• “the great bulwark against a lying witness, of course, is the jury[;] [a]nd we've seen time and time that jurors are not very good at telling the difference between the lying informants and a truth-telling informant.” Id. at 131-132.

Since “our great procedural mechanisms for protecting the trial process, prosecutorial screening and ethics, the adversarial system, defense discovery, and cross-examination, and the jury itself, have shown to be poor safeguards against the very powerful phenomenon of the motivated compensated criminal witness,” systematic reforms are needed. These include:

(1) “make more transparent the use and creation of informants,” id. at 134;

(2) “curtail[] the ability of informants to get informal deals long after they’ve testified,” id.;

(3) write down “all benefits that informants were to get..., in effect making discoverable those deals,” id.;

(4) “heighten discovery requirements,” id.;

(5) hold “[r]eliability hearings,” id., “in which the court screens the reliability of the criminal informant to ensure that unreliable evidence does not go before the jury,” id. at 116;

(6) have “corroboration requirements,” id. at 134;

(7) allow juries “to hear from experts at trial about the kinds of tactics that different sorts of informants use,” id.; and

(8) give “cautionary jury instructions ... that remind the jury to be particularly suspicious of compensated criminal informant witnesses.” Id.

Save for the corroboration requirement, which Texas has enacted, id. at 119, Texas has adopted none of these reforms. However, the corroboration requirement, without the other reforms, is ineffectual. See RR 4, at 121, supra.

B. Evidence yet to be presented

The evidentiary hearing was cut short before Mr. Green was able to present a claim unique to his case and one that has never been presented before in a capital case: the special problem that arises when a capital prosecution turns on disputed fingerprint evidence in light of the findings of the landmark 2009 report by the National Academy of Sciences, Strengthening Forensic Science in the United States: A Path Forward (hereinafter NAS Report). The NAS Report made clear that more generalized and applied research must be done in order to get a scientifically and statistically sound assessment of fingerprint evidence: “Given the general lack of validity testing for fingerprinting; the relative dearth of difficult proficiency tests; the lack of a statistically valid model of fingerprinting; and the lack of validated standards for declaring a match, ... claims of absolute, certain confidence in identification are unjustified.... Therefore, in order to pass scrutiny under Daubert, fingerprint identification experts should exhibit a greater degree of epistemological humility. Claims of ‘absolute’ and ‘positive’ identification should be replaced by more modest claims about the meaning and significance of a ‘match.’” NAS Report at 142. If the hearing is allowed to proceed, Dr. Simon Cole will testify that in light of the NAS report there is a consensus in the scientific community that more generalized and applied research must be conducted before a scientifically valid statistical statement can be made about
the evidentiary weight of the fingerprint evidence in such cases as Mr. Green’s. Two recent publications, one by a member of the NAS Report panel, attached to this application as Appendices 1 and 2,11 provide a roadmap to the kind of research that is ongoing and the way fingerprint evidence will be presented statistically in the future. Indeed, what can be stated with certainty today is that the evidentiary weight of the fingerprint evidence in the Green case will be significantly different ten years from now than it is today. This is true not just because the general and applied research called for by the NAS, and being undertaken today, will totally change the way the evidentiary weight of fingerprint evidence will be conveyed to juries, but also because the fingerprint evidence in this case is in dispute based on the analyses of the prosecution’s own experts. In a case where the prosecution’s own experts disagree as to whether there is enough fingerprint information (referred to as “minutiae”) to make a determination that the partial latent prints “match” Mr. Green, it is entirely conceivable that ten years from now, using scientifically valid methods, fingerprint analysts might provide a statistical analysis of the print evidence here that concludes a “match” to Mr. Green is unlikely, much less that the print is uniquely his. The fact that we know the evidentiary weight of fingerprint evidence will change dramatically in the next decade has special implications in a capital case in Texas, as will be illustrated by Mr. Green’s proof with respect to Cameron Todd Willingham’s case. In Willingham, the fire marshal’s testimony that the fire at Willingham’s home was incendiary was

11 I.E. Dror, et al., Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a ‘target’ comparison, Forensic Sci. Int. (2010) (in press) [Appendix 1]; Chang Su, et al., Evaluation of Rarity of Fingerprints in Forensics, Proceedings of Neural Information Processing Systems, Vancouver, Canada, December 6-9, 2010 [Appendix 2].

offered just after the publication of National Fire Protection Association 921, a seminal report that changed the way arson evidence was analyzed among fire scientists. Mr. Green will show that despite submission of an affidavit from a leading arson expert, Dr. Gerald Hurst, just prior to Mr. Willingham’s execution, demonstrating that the arson evidence used to convict Willingham had been discredited as unreliable for a decade, the Texas courts, the Board of Pardons and Paroles, and the Governor’s office all failed to investigate, much less acknowledge the truth of, Dr. Hurst’s analysis. Mr. Green will, in fact, prove that given the weaknesses in the postconviction capital “safety net” in Texas, he is in danger of being convicted and executed based upon unreliable fingerprint evidence in the same way Willingham was convicted based upon unreliable arson evidence.

In addition to the fingerprint evidence, counsel for Mr. Green intend to bring forth additional information the trial court must have to fairly assess the various factors that contribute to the risk of Mr. Green being wrongfully convicted. If allowed to proceed, counsel will be able to produce additional evidence relevant to the following:

(a) the factors identified in the social science research that make eyewitness identifications unreliable (through Jennifer Dysart, Ph.D.);

(b) the evidence that death-qualified juries are conviction prone and less able to scrutinize guilt-innocence evidence critically (Wanda Foglia, Ph.D.);

(c) the Harris County District Attorney’s historic pattern of exercising peremptory strikes against minority jurors in an effort to secure all-white juries, the diminished ability of the resulting all-white juries to resolve factual issues accurately, and the measures which must be taken by an office such as the Harris County District Attorney’s Office to change office culture and practice so that it avoids the racial exercise of peremptory strikes (Christina Swarns, director of the Race and Criminal Justice Project of the NAACP Legal Defense and Educational Fund; Samuel Sommers, Ph.D., social scientist and jury researcher; Bryan Stevenson, director of the Equal Justice Initiative);

(d) the ineffectiveness of Texas’ habeas corpus and clemency procedures in discerning and remedying wrongful convictions (University of Texas School of Law professors Jim Marcus and Maurie Levin);

(e) the lessons learned from the most comprehensive database of DNA and non-DNA wrongful conviction cases in the United States, together with evidence of the public and legislative response to these lessons (University of Michigan School of Law professor Samuel Gross);

(f) the evidence concerning the risk of wrongful conviction where the prosecution’s case rested, as it does in Mr. Green’s case, on eyewitness identification, informant testimony, and faulty forensic evidence, in two cases in which people have been executed in Texas, the cases of Cameron Todd Willingham and Claude Jones (various law enforcement witnesses, arson experts, and DNA evidence);

(g) the case history of Ernest Willis, which rested on evidence nearly identical to the evidence in Mr. Willingham’s case, focusing particularly on how Mr. Willis came to be exonerated – to show that the difference between exoneration and wrongful conviction and execution is not due to the safeguards in the criminal justice process but to luck (University of Texas School of Law professor Rob Owen, counsel for Mr. Willis); and

(h) former Texas Governor Mark White, who will testify about the need for certainty about guilt before anyone sentenced to death is executed.

III. THE QUESTION PRESENTED BY MR. GREEN REQUIRES A WIDE-RANGING EXPLORATION OF EVIDENCE TO ASSIST THE COURT IN DECIDING THE ISSUE PRESENTED BY THE PARTICULAR CIRCUMSTANCES OF MR. GREEN’S CASE, AND THE EXPLORATION OF THIS EVIDENCE IS SQUARELY WITHIN THE COURT’S POWER AND DUTY

The question presented by Mr. Green – whether in the circumstances of his case there is a sufficiently high risk of wrongful conviction to violate the Eighth Amendment – rests on the Eighth Amendment’s concern about the need for heightened reliability in the jury’s determinations in a capital case. This rule is a rule of “general application, a rule designed for the specific purpose of evaluating a myriad of factual contexts,” Wright v. West, 505 U.S. 277, 308-09 (1992) (Kennedy, J., concurring). When interpreting and applying such a rule, a court cannot proceed in the routine manner that is suited, for example, to deciding under Federal Rule of Civil Procedure 12(b)(6) whether the allegations of a complaint state a claim under a federal statute – first spelling out the rules of legal liability enacted by the statute in the abstract without reference to the facts, and then determining whether the alleged facts bring the case within those rules. Rather, when the application of a rule such as the Eighth Amendment’s concern for greater reliability is at issue, the analysis and exposition of the constitutional rule are themselves informed by the factual circumstances to which they are being applied, in the manner classically described by Henry Wolf Biklé, who said about courts’ making constitutional rulings that “if this requires, as a condition precedent, the resolution of some issue of fact, this also the Court must undertake.” Biklé, Judicial Determination of Questions of Fact Affecting the Constitutional Validity of Legislative Action, 38 Harv. L. Rev. 6, 23 (1924).
The trial court here is required to determine the meaning that must be given to the Eighth

Amendment’s requirement for greater reliability in the context of the risk of wrongful conviction to John Green. Both the need for heightened reliability and the concern about the risk of error under the Eighth Amendment are informed in part by the “evolving standards of decency that mark the progress of a maturing society.” Trop v. Dulles, 356 U.S. 86, 101 (1958). The “evolving standards of decency” doctrine of Trop v. Dulles applies not only to substantive limitations on the use of capital punishment, as in Atkins v. Virginia, 536 U.S. 304 (2002) (capital defendants with mental retardation), and Roper v. Simmons, 543 U.S. 551 (2005) (capital defendants under age 18), but also to limitations on the procedures that are acceptable in capital prosecutions. In Woodson v. North Carolina, 428 U.S. 280, 293-301 (1976), the Court explained that the history of progressive, nationwide repudiation of mandatory capital sentencing required an invalidation of that procedure as a Cruel and Unusual Punishment even in the case of crimes that could be capitally punished through procedures providing for individualized consideration of aggravating and mitigating circumstances: The history of mandatory death penalty statutes in the United States thus reveals that the practice of sentencing to death all persons convicted of a particular offense has been rejected as unduly harsh and unworkably rigid. The two crucial indicators of evolving standards of decency respecting the imposition of punishment in our society – jury determinations and legislative enactments – both point conclusively to the repudiation of automatic death sentences. At least since the Revolution, American jurors have, with some regularity, disregarded their oaths and refused to convict defendants where a death sentence was the automatic consequence of a guilty verdict.
As we have seen, the initial movement to reduce the number of capital offenses and to separate murder into degrees was prompted in part by the reaction of jurors as well as by reformers who objected to the imposition of death as the penalty for any crime. Nineteenth century journalists, statesmen, and jurists repeatedly observed that jurors were often deterred from convicting palpably guilty men of first-degree murder under mandatory statutes. Thereafter, continuing evidence of jury reluctance to convict persons of capital offenses in mandatory death penalty jurisdictions resulted in legislative authorization of discretionary jury sentencing by Congress for federal crimes in

1897, by North Carolina in 1949, and by Congress for the District of Columbia in 1962. As we have noted today in Gregg v. Georgia, ante, 428 U.S. [153], 179, 181 [(1976)], legislative measures adopted by the people’s chosen representatives weigh heavily in ascertaining contemporary standards of decency. The consistent course charted by the state legislatures and by Congress since the middle of the past century demonstrates that the aversion of jurors to mandatory death penalty statutes is shared by society at large.... It is now well established that the Eighth Amendment draws much of its meaning from “the evolving standards of decency that mark the progress of a maturing society.” Trop v. Dulles, 356 U.S., at 101 (plurality opinion). As the above discussion makes clear, one of the most significant developments in our society’s treatment of capital punishment has been the rejection of the common-law practice of inexorably imposing a death sentence upon every person convicted of a specified offense. North Carolina’s mandatory death penalty statute for first-degree murder departs markedly from contemporary standards respecting the imposition of the punishment of death and thus cannot be applied consistently with the Eighth and Fourteenth Amendments’ requirement that the State’s power to punish “be exercised within the limits of civilized standards.” 428 U.S. at 293-295, 301. In determining whether capital trial procedures which create a risk of wrongful conviction are violative of the Eighth Amendment, therefore, the trial court must examine, at least in part, what degree of risk of wrongful conviction in a capital case is acceptable under the evolving standards of decency, as measured by current legislative reforms, public opinion, and trends in death sentencing. To be able to assess this matter, the trial court must have as much evidence as possible about wrongful convictions and public concern about them and what is being done to minimize the risk of such errors.
Viewed from this perspective, as well as from the narrower confines of Mr. Green’s case, none of the evidence that has been presented by Mr. Green or that will be presented on his behalf is irrelevant to the issue he presents. The 177th District Court is plainly acting within its power

and complying with its duty in hearing this evidence. See Morrow v. Corbin, 62 S.W.2d 641, 650 (Tex. 1933) (“the jurisdiction of our trial courts embraces not only the power and duty to hear causes, but the power and duty to pass upon the facts and law and enter final decrees in accordance therewith; and then to execute their judgments, without interference by any other tribunal, except and until the appellate power of a revisory court is invoked”). The mandamus or prohibition powers of this Court cannot, and should not, be employed to interfere with the 177th Court’s quintessential exercise of its inherent responsibilities. See White v. Reiter, 640 S.W.2d 586, 593-594 (Tex.Crim.App. 1982) (“[i]t is ... well settled that mandamus will not issue to compel a particular result in what is manifestly a discretionary decision”).

Conclusion

For these reasons, the Court should deny Ms. Lykos leave to file the petitions for writs of prohibition and mandamus.

Respectfully submitted,

Richard Burr
SBN 24001005
PO Box 525
Leggett, TX 77350
713-628-3391
713-893-2500 (fax)

John P. Keirnan
SBN 11184700
917 Franklin St., Ste 550
Houston, TX 77002
713-236-9700
713-236-1802 (fax)

Robert K. Loper
SBN 12562300
111 W. 15th Street
Houston, TX 77008
713-880-9000
713-869-9912 (fax)

By Counsel for Real Party in Interest, John Edward Green


Certificate of Service I hereby certify that the foregoing pleading was served by mail on counsel for Relator, Allen Curry, Assistant District Attorney, 1201 Franklin Street, Ste 600, Houston, TX 77002; and on Respondent, Honorable Kevin Fine, Presiding Judge, 177th District Court, 1201 Franklin Street, Ste 1900, Houston, TX 77002; Greg Abbott, Office of the Attorney General, PO Box 12548, Austin, TX 78711; and Jeffrey Van Horn, State Prosecuting Attorney, PO Box 12405, Austin, TX 78711, this 22nd day of December 2010.

Counsel for Real Party in Interest John Edward Green


Appendix 1

G Model

FSI-6238; No. of Pages 8
Forensic Science International xxx (2010) xxx–xxx

Contents lists available at ScienceDirect

Forensic Science International
journal homepage: www.elsevier.com/locate/forsciint

Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a ‘target’ comparison
Itiel E. Dror a,b,*, Christophe Champod c, Glenn Langenburg c,d, David Charlton e,f, Heloise Hunt a, Robert Rosenthal g
a Institute of Cognitive Neuroscience, University College London, London, United Kingdom
b Cognitive Consultants International Ltd., United Kingdom
c Ecole des sciences criminelles, Institut de police scientifique, University of Lausanne, Lausanne, Switzerland
d Minnesota Bureau of Criminal Apprehension Forensic Science Services, St. Paul, MN, United States
e School of Applied Sciences, Bournemouth University, United Kingdom
f Fingerprint Bureau, Sussex Police, United Kingdom
g Department of Psychology, University of California Riverside, United States

A R T I C L E I N F O

A B S T R A C T

Article history:
Received 24 June 2010
Received in revised form 4 October 2010
Accepted 9 October 2010
Available online xxx

Keywords:
Latent fingerprinting
Human cognition
Fingerprint analysis

Deciding whether two fingerprint marks originate from the same source requires examination and comparison of their features. Many cognitive factors play a major role in such information processing. In this paper we examined the consistency (both between- and within-experts) in the analysis of latent marks, and whether the presence of a ‘target’ comparison print affects this analysis. Our findings showed that the context of a comparison print affected analysis of the latent mark, possibly influencing allocation of attention, visual search, and threshold for determining a ‘signal’. We also found that even without the context of the comparison print there was still a lack of consistency in analysing latent marks. Not only was this reflected by inconsistency between different experts, but the same experts at different times were inconsistent with their own analysis. However, the characterization of these inconsistencies depends on the standard and definition of what constitutes inconsistent. Furthermore, these effects were not uniform; the lack of consistency varied across fingerprints and experts. We propose solutions to mediate variability in the analysis of friction ridge skin. © 2010 Elsevier Ireland Ltd. All rights reserved.

Cognitive processes underpin much of the work carried out in many forensic disciplines which require examination of visual images. Fingerprints, bite and shoe marks, tire tracks, firearms, hair, handwriting and other forensic domains all hinge on comparative examination involving visual recognition. Although human experts are the ‘instrument’ in judging whether two patterns originate from the same source, understanding the factors that shape such judgements in forensic science has been relatively neglected. In the past it has been misconceived that ‘‘fingerprint identification is an exact science’’ ([1] p. 8); and this perception goes across all forensic disciplines [2]. The recent National Academy of Sciences report further highlights that ‘‘the findings of cognitive psychology... the extent to which practitioners in a particular forensic discipline rely on human interpretation... are significant’’ and that ‘‘...Unfortunately, at least to date, there is no

* Corresponding author at: Institute of Cognitive Neuroscience, Department of Psychology, University College London, London, United Kingdom. E-mail address: i.dror@ucl.ac.uk (I.E. Dror).
0379-0738/$ – see front matter © 2010 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.forsciint.2010.10.013

good evidence to indicate that the forensic science community has made a sufficient effort to address the bias issue’’ ([3] p. 8–9). The task demands imposed on the examiners require them to search through a rich stimulus, filter out noise, and determine characteristics and ‘signals’ for comparison (see [4,5] for discussion of signal detection theory (SDT) applied to fingerprint evidence). This initial analysis and determination of ‘signals’ can take place before the actual comparison between stimuli (e.g., the latent mark left at a crime scene and the comparison print of a known suspect). Scientists have long accepted that observations, including those in their own scientific research, encompass errors. A study examining 140,000 scientific observations reported in published research not only revealed that erroneous observations were made, but that they were systematically biased in favour of the hypothesis being researched [6]. Many different forms of contextual and cognitive influences affect our perception and bias it in a variety of ways [7]. Previous research on fingerprinting specifically examined potential cognitive contextual influences on comparing prints and decision making as to whether or not they originated from the same source [8–20].

Please cite this article in press as: I.E. Dror, et al., Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a ‘target’ comparison, Forensic Sci. Int. (2010), doi:10.1016/j.forsciint.2010.10.013


Bias in different aspects of forensic decision making has been examined in a number of studies (see review articles [21,22]). Specifically focusing on the initial analysis phase of the mark, before being actually compared to any prints, Langenburg [23] found that examiners generally reported more minutiae than novice controls. Furthermore, although the examiners varied in how many minutiae they observed in the initial analysis, they were more consistent than the novice control group (in 8 out of the 10 latent marks used in this study). These results were in agreement with those of Evett and Williams [1]. Following Langenburg’s study, Schiffer and Champod [24] found that training and experience increased the number of characteristics reported, and at the same time reduced the variability among observers. Schiffer and Champod also reported that the number of characteristics observed during the analysis phase was not affected by contextual information about the case or by the presence of a comparison print. Consequently, they concluded that the initial analysis stage (pre-comparison) is relatively robust and relatively free from the risk of contamination through contextualisation of the process. Although Langenburg [23] and Schiffer and Champod [24] show that these inconsistencies decrease with training and experience,1 they also make the point that ‘‘quite important variations do subsist between examiners’’ ([24] p. 119). All the studies consistently show that there is variability in the number of minutiae observed in the analysis stage, and that these inconsistencies are attenuated, but not eliminated, during the initial training and experience in fingerprint examination. Furthermore, as reported by Schiffer and Champod [24], even in the relatively robust stage of analysis ‘‘a clear subjective element persists’’.
A further study [25] suggests that the combined presence of contextual pressure and availability of the target comparison print influences the evaluation stage (following the analysis and comparison), but this effect varies among different marks. Dror et al. [11] suggested that the more difficult the finger marks are (bottom-up), the more influence external factors (top-down) have on the observations. Bottom-up refers to the incoming data, whereas top-down relies on pre-existing knowledge [26]. Top-down has many forms and manifestations, which include the context in which the data are presented, past experiences and knowledge, expectations, and so forth. Expertise is top-down, and as such experts rely more on top-down information. This allows efficient and effective processing of the bottom-up data, but also means it can distort and bias how the data are processed [27]. Variations in observation among different observers (‘‘inter-observer’’ differences) and variations in observation for the same observer for the same task, taken at different times (‘‘intra-observer’’ differences) are a well-known phenomenon in other fields involving expert decisions, such as radiologists or other medical technicians [28,29]. In the research reported here we examined three main issues: 1. The potential effect that a ‘target’ comparison fingerprint may have on the analysis of the latent mark. 2. The consistency in analysis among different examiners. 3. The consistency in analysis within the same examiner.

This paper further investigates and contributes to the studies on the analysis of fingerprints in the following ways:

1. Using actual latent fingerprint examiners, rather than forensic science or psychology students (such as in [11,25]).

2. Applying a within-subject (intra-observer) experimental design. This allows us to measure consistency in analysis, as we compare examiners to themselves. Such intra-observer measurements are extremely accurate and informative because they are not only statistically more powerful than inter-observer measures, but they allow us to confidently draw conclusions because the data cannot be attributed to individual differences, such as visual acuity, experience, strategy, cognitive style, and training.

3. Subjecting the experimental data to statistical procedures and standards (e.g., retest reliability) that quantify the consistency of latent fingerprint examiners in the analysis of latent marks.

4. Statistically differentiating between factors that contribute to inconsistencies in latent mark analysis; thus determining what portion of the variance is attributed to the examiners’ performance and what portion is attributed to the latent marks themselves (using statistical effect sizes).

5. Suggesting a number of recommendations for dealing with issues surrounding latent mark analysis.

1. Effects of a ‘target’ comparison

The human cognitive system is limited in its capacity to process information. The information available far exceeds available brain power and cognitive resources, and therefore we can only process a fraction of the information presented to us. This mismatch between computational demands and available cognitive resources caused the development of cognitive mechanisms that underpin intelligence. For example, we prioritize what information to process according to our expectations (e.g., [30]). Expectations are derived from experience, motivation, context, and other top-down cognitive processes that guide visual search, allocation of attention, filtering of information, and what (and how) information is processed. These mechanisms are vital for cognitive processes to be successful.

Expertise is characterised by further development and enhancement of such mechanisms [26,27,31,32]. Therefore, there are good scientific data showing that the presence of any contextual information may affect cognitive information processing. Various factors and specific parameters define the context, whom it may affect, how, and to what extent. Understanding these factors and parameters will help develop science-based training and best practices that will enhance objectivity in fingerprint analyses, as well as in other forensic comparative examinations involving visual recognition. In the first experiment reported in this paper we used 20 experienced latent fingerprint examiners, to investigate whether the presence of a comparison ‘target’ print would affect the characteristics they observe in the latent mark. Each of the 20 experts received ten stimuli: five latent marks by themselves (solo condition) and five latent marks with the matching target print (pair condition). All the participants were instructed identically, requiring them to examine the latent marks and to count all the minutiae present in the image. The experimental conditions were counterbalanced across participants using a Latin Square design to minimize any effects due to the order of presenting the experimental trials [33]. We found that the presence of the accompanying comparison print affected how many minutiae were perceived by the expert latent print examiners. These differences were statistically significant (t(9) = 2.38, p = .021; with an effect size, r = .62). Interestingly, as evident in Table 1, the presence of the accompanying matching comparison print mainly reduced the number of minutiae perceived. This is consistent with attention-guided visual search, whereby our cognitive system operates within the contextual expectation. It is important to note that the reduced number of minutiae was perhaps due to the comparison print being from the same source (a match); if it had been a non-match,

1 This is particularly noticeable at the earlier stage of professional development, when a trainee has some experience and training. It is by no means a continuous linear change; there is a strong initial effect, but then it levels off and may even decline.

Table 1
The mean number of minutiae observed when the latent mark was presented by itself (‘solo’), within the context of a comparison print (‘pair’), and the differences between these two conditions.

Latent mark    Solo    Pair    Difference
A              20.6    14.1    -6.5
B              13.4     9.9    -3.5
C              20.1    10.8    -9.3
D               9.8     9.7    -0.1
E              10.7    11.1     0.4
F               8.4     8.8     0.4
G              12.1    10.7    -1.4
H              15.6    10.5    -5.1
I               7.1     8.5     1.4
J               9.1     6.6    -2.5
MEAN           12.7    10.1    -2.6
SD              4.7     2.0     3.5

then it may have directed the perceptual cognitive system differently, possibly observing more minutiae. The importance of the finding is not whether the presence of the comparison print reduced or increased the number of minutiae perceived in the latent mark, but that the presence of a target comparison print had an effect on the perception and judgment of the latent mark. This finding emphasises the importance of examining the latent mark in isolation, prior to being exposed to any potential comparison print. This is to maximize the ‘clean’ bottom-up and more objective analysis, driven by the actual latent mark, and to minimize external influences that may bias the process of analysing the latent mark itself. This is especially important when the latent mark is of low quality. Such recommendations are also appropriate for other forensic domains (e.g., DNA, see sequential unmasking [34]), as well as for scientific investigations in general: ‘‘Keep the processes of data collection and analysis as blind as possible for as long as possible’’ (Rosenthal [6] p. 1007). However, Dror points out that the comparison print can play an important role in helping examiners optimize their analysis by correctly guiding their cognitive resources and interpretation [35]. Therefore it seems reasonable to balance the vulnerabilities and cues presented by making the comparison print available to the examiner. A solution may be to first examine the latent mark in

isolation, clearly documenting this more objective and uninfluenced analysis, but at the same time also allowing further analysis to be conducted later after exposure to the context of the target comparison print. Hence, the ACE approach needs to be initially applied linearly, making sure that the initial Analysis of the latent mark is done in isolation and documented, prior to moving to Comparison and Evaluation; yet still allowing flexibility, with transparency of when and why this took place, as well as procedures that control and limit the circumstances and extent for such retroactive changes so as to maximize performance but avoid (or at least minimize) circular reasoning and bias (for details, see [35]). It is interesting and important to note that some latent marks were more susceptible to this effect than others. For example, Table 1 shows that latent mark D was basically unaffected by the presence of the comparison print, whereas latent mark B was quite dramatically affected (see Fig. 1, below, for the actual latent marks). It is clear from all the studies on latent mark analysis that findings are highly dependent on the specific fingerprints used. This suggests that we can (and probably should) tailor procedures and best practices to specific types of prints, rather than inflexibly applying identical procedures prescribed to all prints [35]. Such knowledge-based procedures will allow for higher quality work without requiring more resources, because it wisely and appropriately allocates resources to where they are needed. The large variability in the effects of the presence of the comparison print on the latent mark analysis may explain why Schiffer and Champod [24] did not find such an effect: The latent marks they used may have been prints that are less (or not at all) affected by the presence of the comparison print, such as latent mark D in this study. 
An alternative (not mutually exclusive) explanation of why Schiffer and Champod did not find this effect is that they used students, and these effects may occur as examiners are more experienced and knowledgeable, and thus have expertise in how to utilise the information from the comparison print more effectively. The study reported here used experienced experts in latent print examination. It is also possible that experienced examiners tend to be more risk prone at calling minutiae, as opposed to students who will be more conservative. It is also interesting to note that the largest differences were observed with the latent marks that had the highest number of

Fig. 1. Some latent marks were more affected by the presence of a target comparison print than other latent marks. For example, latent mark B (left panel) was more affected than latent mark D (right panel).


Fig. 2. The high correlation (0.9) between the mean number of minutiae observed when the latent mark was analysed by itself (‘solo’ condition) and the differences in analysis between the solo and pair conditions (absolute values, Table 1, right column).

minutiae observed in the solo condition (see, e.g., A, C, & H in Table 1). Overall, the correlation between the number of minutiae observed in the solo condition and the difference (absolute value) from those observed when shown in the pair condition was 0.9 (see Fig. 2). This may be due just to a ceiling effect, i.e., an artifact reflecting that when more minutiae are marked in the solo condition there is more scope to reduce that number in the pair condition, whereas when fewer minutiae are marked in the solo condition there is much less scope for a drop when they are presented in the pair condition. An alternative, not mutually exclusive, explanation is the effect of motivational factors.2 In the solo condition, examiners may be motivated to mark as many minutiae as they can, as they are not sure which ones may be useful and informative when they have a target print at the comparison stage. However, when the latent mark is analyzed while the target comparison print is available (as in the pair condition), examiners’ motivation may drop when they get to a critical mass of minutiae they need for comparison purposes. Once they get to that threshold, they may be less likely to detect more minutiae. This effect further strengthens our suggestion that the initial analysis of latent marks should be done in isolation from a comparison exemplar print (especially when the latent mark is judged to be low quality, distorted, or has limited information available).

2. Inter-observer consistency

As we have shown, the presence of a ‘target’ comparison print can affect the perception and judgment of the latent mark in a number of ways. The next issue investigated was the consistency in the perception and judgment of minutiae in a latent mark across participants, even without the presence of a target comparison.
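The ceiling-effect discussion above turns on the reported correlation of 0.9 between the solo-condition minutiae counts and the absolute solo-versus-pair differences in Table 1. As a sanity check, that figure can be recomputed directly from the Table 1 means. The sketch below is the editors' illustration, not part of the study; the `pearson_r` helper is a hypothetical name for a plain Pearson product-moment calculation.

```python
from math import sqrt

# Table 1 means: minutiae counts per latent mark (A-J), solo vs. pair condition
solo = [20.6, 13.4, 20.1, 9.8, 10.7, 8.4, 12.1, 15.6, 7.1, 9.1]
pair = [14.1, 9.9, 10.8, 9.7, 11.1, 8.8, 10.7, 10.5, 8.5, 6.6]

# Absolute solo-vs-pair differences (Table 1 right column, absolute values)
abs_diff = [abs(p - s) for s, p in zip(solo, pair)]

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson_r(solo, abs_diff)
print(round(r, 2))  # close to the 0.9 value reported alongside Fig. 2
```

Running this against the Table 1 data reproduces a correlation of approximately 0.9, consistent with the figure the authors report.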
The ‘solo’ condition data contain the answer to this question; they allow us to examine and compare the minutiae observed by different experts, and hence to report the variability in how latent print examiners may perceive and judge minutiae. Table 2 presents the relevant data, with the range of values for each latent mark (bottom row). The apparent lack of consistency may reflect the absence of objective and quantifiable measures as to what constitutes a minutia, especially with latent marks of varying quality. However, these differences may also reflect individual differences between the examiners (arising from variations in eyesight, training, feature selection strategy, cognitive style, threshold criteria, etc.). It is important to understand the cognitive issues in latent mark analysis, and the variability in the analysis among and within examiners provides insights into the underlying cognitive processing. Evett and Williams [1], Langenburg [13,23], and Schiffer and Champod [24] all found inconsistencies among examiners regarding the number of minutiae observed. Evett and Williams suggest that this "confirms the subjective nature of points of comparison" (p. 7), and Langenburg [23] and Schiffer and Champod report that these variations are larger with novices. As fingerprint examination advances, more objective measures and standards should ensure greater consistency among examiners. The potential influence introduced by a ‘target’ comparison print was addressed earlier. Another issue is the calibration of the threshold for determining whether a minutia is a ‘signal’. Different examiners may be using different threshold criteria, hence the large variance in how many minutiae different latent fingerprint examiners report on the same latent mark (similar problems occur in other forensic domains; see, for example, the lack of agreement on the colour descriptions used to determine the age of a bruise [35]). A simple training tool could help deal with this problem.3 A set of latent marks can be made available for examiners to analyse. After analysis, personal feedback would be provided to the examiner as to how consistent they are with other examiners. For example, it may state that ‘your analysis resulted in similar minutiae to most examiners’ (and hence there is no need to calibrate thresholds), or it may state that ‘your analysis resulted in a larger (or much larger, or smaller, as the case may be) number of minutiae relative to most examiners’ (and hence the examiner may consider changing their thresholds). The idea is that this would be a private measure, with results and feedback confidentially available only to the individual examiner.
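A minimal sketch of how such a feedback tool might work, assuming a simple z-score rule for the ‘similar’ / ‘larger’ / ‘much larger’ feedback (the thresholds and wording are our own illustrative choices, not part of the FACT proposal); the peer counts are taken from latent mark A in Table 2:

```python
# Illustrative sketch of a consistency-feedback (calibration) tool.
# The z-score thresholds below are assumptions made for this example.
from statistics import mean, stdev

def calibration_feedback(own_count: int, peer_counts: list) -> str:
    """Compare an examiner's minutiae count with peers' counts on the same mark."""
    mu, sigma = mean(peer_counts), stdev(peer_counts)
    z = (own_count - mu) / sigma
    if abs(z) <= 1.0:
        return "similar number of minutiae to most examiners"
    direction = "larger" if z > 0 else "smaller"
    qualifier = "much " if abs(z) > 2.0 else ""
    return f"{qualifier}{direction} number of minutiae than most examiners; consider recalibrating"

# Peer counts for latent mark A (Table 2, examiners 1-10):
peers_mark_a = [22, 21, 19, 21, 17, 20, 22, 9, 30, 25]
print(calibration_feedback(21, peers_mark_a))  # within one SD of the peer mean
print(calibration_feedback(33, peers_mark_a))  # well above the peer distribution
```

In practice the feedback would be private to the examiner, as stipulated above, and the reference marks and thresholds would have to be chosen on a scientific basis.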
The full technical details of such a training calibration tool and its implementation are beyond the scope of this paper, but they are straightforward. Some more conceptual issues that need to be addressed are which latent marks should be used for this purpose, and how to make sure the feedback is taken on board and examiners do indeed calibrate their judgements. These must be scientifically based decisions. Furthermore, a fundamental issue that needs to be addressed is ensuring that the calibration is done to the ‘correct’ threshold: ensuring that different examiners use the same criterion does not mean they are using the ‘correct’ one.

3. Intra-observer consistency

Judgment and subjectivity affect the number of minutiae reported, resulting in inconsistency among experts on how many minutiae are present within a specific latent mark. This study and the other studies [1,23,24] all consistently show that these variations further depend on the actual latent mark and exemplar prints in question (i.e., some produce higher inconsistency than others). Furthermore, Dror and Charlton [8,9] report that some examiners are more affected by context than others. To ascertain the role of individual differences (such as experience, motivation, training, feature selection strategy, thresholds, cognitive style, personality) vs. the contribution of the lack of objective quantifiable measures for determining characteristics in the analysis of latent marks, we used an intra-observer (within-expert) experimental design.

2 We mean a cognitive motivation, not an intentional motivation, or lack thereof, to conduct proper analysis. That is, how motivated and driven is the cognitive system to spend resources on processing and evaluating additional information. Generally, our cognitive system is efficient and economical in the sense that it does the minimal amount of processing needed to get the job done. This enables it to best utilize the brain’s limited cognitive resources. 3 This idea was first presented by Arie Zeelenberg, and referred to as Fingerprint Analyses Consistency Tester FACT (finder).

Please cite this article in press as: I.E. Dror, et al., Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a ‘target’ comparison, Forensic Sci. Int. (2010), doi:10.1016/j.forsciint.2010.10.013


Table 2
The number of minutiae observed by each examiner for each latent mark (inter-observer), with the minimum (‘Min’), maximum (‘Max’), mean, standard deviation (‘SD’) and range of minutiae observed for each latent mark (bottom rows).

Examiner     A     B     C     D     E     F     G     H     I     J
1           22     9    15     8     9     3     8    11     7    10
2           21    11    25     7    10     9     9    10     6     5
3           19     9    18    10     7     9    15    19     6     6
4           21    21    29    14    12     9     8     9     4     8
5           17    16    15    11    16     9     7    12     5     5
6           20    14    22     9    10     7    13    18     7     9
7           22    17    15    10    10     8    11    24     8    11
8            9     9    19     6     9     8    18    16     9    10
9           30    15    25    10    12    12    19    22    12    17
10          25    13    18    13    12    10    13    15     7    10
Min          9     9    15     6     7     3     7     9     4     5
Max         30    21    29    14    16    12    19    24    12    17
Mean      20.1  13.4  20.1   9.8  10.7   8.4  12.1  15.6   7.1   9.1
SD        5.49  4.01  4.93  2.49  2.45  2.32  4.25  5.15  2.23  3.54
Range       21    12    14     8     9     9    12    15     8    12
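The per-mark summary statistics in Table 2 can be recomputed from the examiners' counts; the sketch below does so for latent mark B (the SD in the table corresponds to the sample standard deviation, n − 1 denominator):

```python
from statistics import mean, stdev

# Minutiae counts for latent mark B, examiners 1-10 (Table 2):
mark_b = [9, 11, 9, 21, 16, 14, 17, 9, 15, 13]

print(min(mark_b), max(mark_b))      # 9 21
print(mean(mark_b))                  # 13.4
print(round(stdev(mark_b), 2))       # 4.01 (sample SD)
print(max(mark_b) - min(mark_b))     # 12
```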

A within-expert experimental design examines intra-observer effects, comparing an examiner's responses at one time to their own responses at another time, thus controlling for individual differences (see Dror and Charlton [8,9]). The study reported here examined the consistency of analysis of latent marks within the same expert examiner. A new set of expert examiners was used. They were asked to report all the minutiae present on ten latent marks. A few months later, they were asked to do the same exercise, receiving identical instructions at time 1 and at time 2. Overall the experts carried out 200 analyses: each of the 100 examiner–mark combinations was analysed twice. Table 3 presents the actual data: 10 latent print examiners, each making 20 analyses in total, analysing 10 latent marks (A–J) at time 1 and at time 2. In contrast to Table 2, where we examined the overall range and consistency obtained across examiners, here we focus on comparing the results of each examiner to him- or herself, specifically looking at the degree to which the experts were consistent with themselves. Analysis of variance (ANOVA) of the data from Table 3 showed that examiners differed significantly from each other in the number of minutiae reported: F(9,81) = 8.28, p < 0.001, effect size correlation eta = .69. This analysis also showed that the number of minutiae observed differed significantly depending on the latent mark: F(9,81) = 57.30, p < 0.001, effect size correlation eta = .93. Note the larger effect size for the contribution of the latent marks compared to the effect size for the contribution of the examiners. Most important is the retest reliability reported in Table 3 (right column), which is a statistical measure for quantifying consistency; see also the stem-and-leaf plot and the five-point summaries of retest reliabilities in Fig. 3. It was interesting to see whether the inconsistencies occurred over the typical range of thresholds for potential decisions (e.g., 8 vs. 17, see examiner 3, latent mark G), or in ranges that do not typically matter for identification (examiner 6, latent mark A, 25 vs. 34). Both cases have a difference of 9 minutiae, but the former variability is more likely to cross a decision threshold for identification, while the latter's range of values is more likely to lie entirely above a decision threshold (of course, this cannot be determined with certainty from the data in the present study, as this analysis is on the latent mark alone, prior to comparison to a print). Do examiners even consider identification thresholds when conducting the initial analysis? Evett and Williams [1] reported
4 The Evett and Williams study was conducted when a 16-point standard was in place in the UK.

that the number of minutiae participants observed was influenced by decision thresholds, e.g., "participants tended to avoid returning 15 points" (p. 7).4 Categorical perception makes people perceive information according to psychological categories rather than by its actual physical appearance [36]. To further investigate and understand the inconsistency we calculated the absolute differences in the analysis between time 1 and time 2, for each examiner (1–10) and each latent mark (A–J); see Table 4 (see also the stem-and-leaf plot and the five-point summaries of the absolute difference scores in Fig. 4). A score of ‘0’ reflects a potentially perfectly consistent analysis.5 As evident in Table 4, only 16% of the analyses were consistent in this sense (this is a conservative value, a best-case scenario; the actual variability may be higher, see footnote 5). If we ‘relax’ our criterion for consistency to a difference of 0 or 1, then 40% of the analyses were consistent; if we relax it further to include differences of 0, 1, or 2, then 55% were consistent (or, stated differently, 45% of the analyses differed by more than two minutiae between the two analyses conducted by the same examiner – footnote 5 notwithstanding). These data raise questions about objective assessment even at the analysis stage (which seems to be more robust to influences and context than the other stages of fingerprint examination and decision making). The data reported here are conservative, as the variability may be much higher. However, analyses of variance (ANOVA) of the data of Table 3 showed that although examiners differed significantly from each other in their degree of consistency in judging fingerprints (eta = .44) and in the number of minutiae observed (eta = .69), they still showed a high degree of inter-observer reliability with each other (intraclass r = .85) and with themselves (retest reliability r = .86). The examiners who showed the highest retest reliabilities also tended to show the smallest discrepancies between their two evaluations of the same fingerprints at time 1 and at time 2, r = −.84.6 The differences between time 1 and time 2 (see Table 4) show that some examiners are more consistent than others (see, e.g., examiner 10, who is relatively highly consistent, vs. examiner 3). Indeed, analysis of variance (ANOVA) of the difference scores in
5 It is important to note that even if the examiner reported exactly the same number of minutiae, this does not necessarily reflect consistency: although they may have observed the same number of minutiae at time 1 and at time 2, these may have been a different set of minutiae. Hence, reporting the number of minutiae (rather than their overlap) provides a best-case scenario.
6 When we computed the same statistics on the relative difference (i.e., the difference as a function of the total number of minutiae), we obtained essentially the same statistical results.
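Footnote 5's caveat — that identical counts need not mean identical minutiae — can be made concrete with a set-based overlap measure; the minutia labels below are hypothetical:

```python
# Two analyses of the same mark with equal counts but different selections.
# A set-based measure (Jaccard overlap) exposes the difference that a raw
# count hides; the labels m1..m7 are hypothetical.
def jaccard(set_a: set, set_b: set) -> float:
    """Proportion of distinct minutiae shared between two analyses."""
    return len(set_a & set_b) / len(set_a | set_b)

time1 = {"m1", "m2", "m3", "m4", "m5"}
time2 = {"m1", "m2", "m3", "m6", "m7"}  # same count (5), different selection

print(len(time1) == len(time2))  # True  -> 'consistent' by count alone
print(jaccard(time1, time2))     # 3/7 -> only 3 of 7 distinct minutiae shared
```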


Table 3
The number of minutiae observed by each examiner (1–10), for each latent mark (A–J), at time 1 and at time 2 (intra-observer). The last column shows the retest reliability statistic (r12) for each of the 10 examiners.

Examiner  Time    A    B    C    D    E    F    G    H    I    J   r12
1         1      27   15   17    9    9    7   16   13    7   13   .95
          2      26   14   21   10    8    5   13   15    7   12
2         1      31   16   14    9   10    7   12   13    6    9   .85
          2      23   13   19   10    9    9   10    8    8   11
3         1      19   11   13    5    9    5    8   12    6   10   .65
          2      18    8   16    8   15    9   17   21    7   12
4         1      20   12   17    6   10    8    7    8    6    7   .92
          2      22    9   19   11   10    9    8    8    6    8
5         1      19   11   19    6   10   13    9   14    8   12   .84
          2      25   13   21    9   14   12   12   11    8    9
6         1      34   16   21   12   13   13   12   11    8   12   .80
          2      25   12   23   11   17    7   12   16    9   13
7         1      21    9   19    9   12    9   10   18    6   10   .80
          2      21   13   14    7    8    6    7   11    6   10
8         1      19   14   14   10    9    6   12   13    7   11   .87
          2      22   13   18   10   15    8   13   17    5   11
9         1      19   11   11    7    9    4    8   15    5    2   .88
          2      23   14   20    7   13    8   11   14    4    5
10        1      19   10    9    8    4    2   10    8    6    5   .91
          2      20   10    9    7    8    3    6    7    6    5
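The retest reliability column of Table 3 is the Pearson correlation between an examiner's counts at time 1 and time 2 across the ten marks; recomputing it for examiner 1 reproduces the tabled value:

```python
# Retest reliability as the Pearson correlation between time-1 and time-2
# minutiae counts for the same examiner (counts read from Table 3).
from statistics import mean

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

examiner1_t1 = [27, 15, 17, 9, 9, 7, 16, 13, 7, 13]   # marks A-J, time 1
examiner1_t2 = [26, 14, 21, 10, 8, 5, 13, 15, 7, 12]  # marks A-J, time 2

print(round(pearson(examiner1_t1, examiner1_t2), 2))  # 0.95, as in Table 3
```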

Table 4 showed that examiners differed significantly from each other in the consistency with which they judged the 10 latent marks: F(9,81) = 2.17, p = .032, effect size correlation eta = .44. Are the more consistent examiners characterized by personality type and cognitive aptitudes? If so, we need to know how to select candidates with such cognitive profiles during recruitment. Or perhaps these examiners receive a certain type of training, or maybe they adopted more objective definitions? All these are important questions that may help pave the way to understanding how such variations can be minimized. However, the inconsistencies did not only vary between examiners; they were also dependent on the latent mark itself. The analysis of variance also showed that latent marks differed significantly from each other in the consistency with which they were judged: F(9,81) = 2.82, p = .006, effect size correlation eta = .49. This means that some latent marks are simply more susceptible to issues of consistency than others. However, understanding and characterizing what constitutes such latent marks is not a simple matter, and we must be careful not to be hasty in determining how to know a priori which prints are susceptible to inconsistent analysis. With careful further research and converging studies, we should be able to learn and predict which latent marks are likely to be problematic.

[Figure data for Figs. 3 and 4 (stem-and-leaf plots and five-point summaries) omitted. Retest reliabilities (Fig. 3): experts min .65, median .86, max .95, mean .85, SD .085; latent marks min .16, median .43, max .87, mean .46. Absolute difference scores (Fig. 4): experts min 1.2, median 2.75, max 4.1, mean 2.58; latent marks min 0.7, median 2.75, max 3.7, mean 2.58.]
Fig. 3. Stem-and-leaf plot of retest reliabilities of 10 fingerprint experts and 10 latent marks (top panel) and summary statistics of retest reliabilities of 10 fingerprint experts and 10 latent marks.

Fig. 4. Stem-and-leaf plot of absolute difference scores of 10 fingerprint experts and 10 latent marks (top panel) and summary statistics of absolute difference scores of 10 fingerprint experts and 10 latent marks.


Table 4
The differences in the number of minutiae observed by the same examiner at different times. The bottom row is the mean difference per latent mark (A–J), and the rightmost column is the mean difference per examiner (1–10).

Examiner     A     B     C     D     E     F     G     H     I     J   Mean
1            1     1     4     1     1     2     3     2     0     1   1.6
2            8     3     5     1     1     2     2     5     2     2   3.1
3            1     3     3     3     6     4     9     9     1     2   4.1
4            2     3     2     5     0     1     1     0     0     1   1.5
5            6     2     2     3     4     1     3     3     0     3   2.7
6            9     4     2     1     4     6     0     5     1     1   3.3
7            0     4     5     2     4     3     3     7     0     0   2.8
8            3     1     4     0     6     2     1     4     2     0   2.3
9            4     3     9     0     4     4     3     1     1     3   3.2
10           1     0     0     1     4     1     4     1     0     0   1.2
Mean       3.5   2.4   3.6   1.7   3.4   2.6   2.9   3.7   0.7   1.3   2.58
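The 16%, 40%, and 55% consistency figures quoted in the text follow directly from the 100 absolute differences in Table 4:

```python
# Recomputing the consistency percentages from the absolute time1-time2
# differences in Table 4 (10 examiners x 10 marks = 100 analyses).
diffs = [
    1, 8, 1, 2, 6, 9, 0, 3, 4, 1,   # mark A, examiners 1-10
    1, 3, 3, 3, 2, 4, 4, 1, 3, 0,   # mark B
    4, 5, 3, 2, 2, 2, 5, 4, 9, 0,   # mark C
    1, 1, 3, 5, 3, 1, 2, 0, 0, 1,   # mark D
    1, 1, 6, 0, 4, 4, 4, 6, 4, 4,   # mark E
    2, 2, 4, 1, 1, 6, 3, 2, 4, 1,   # mark F
    3, 2, 9, 1, 3, 0, 3, 1, 3, 4,   # mark G
    2, 5, 9, 0, 3, 5, 7, 4, 1, 1,   # mark H
    0, 2, 1, 0, 0, 1, 0, 2, 1, 0,   # mark I
    1, 2, 2, 1, 3, 1, 0, 0, 3, 0,   # mark J
]

for tolerance in (0, 1, 2):
    pct = 100 * sum(d <= tolerance for d in diffs) / len(diffs)
    print(f"within {tolerance} minutiae: {pct:.0f}%")  # 16%, 40%, 55%
```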

This is an important step towards remedying the problem. Once we know which latent marks are likely to cause consistency issues, we can recommend appropriate, scientifically based procedures that attenuate the problem. For example, for latent marks of low quality, a number of examiners could be instructed to mark only minutiae in which they have high confidence, and then only those minutiae that have been marked across different examiners would be used – thus using high-confidence consensus to determine the reliable features in such marks. Another approach is to map quality and clarity across a latent mark, so as to identify high-, medium-, and low-quality regions. Variability of feature selection may be lower if examiners are required to select only from the higher-quality regions, but that may entail losing information. In this study we have identified a common phenomenon found in many expert domains, invited debate on the topic and its significance, and suggested recommendations to deal with it. Given that this study has identified significant inter- and intra-observer variations during minutiae selection, it is relevant to ask: what impact can this have on the overall comparison decision-making outcome? Is the lack of consistency a practical concern or an academic issue? The answer to these questions appears to be complex and depends on a number of factors. For example, in Evett and Williams [1] the variations in reported minutiae did not fully predict the variations in overall decision outcome. In their study, Trials B, E, and F (which varied considerably in the minutiae reported by some examiners) had 99%, 92%, and 100% consensus (N = 130) that the latent mark and the print originated from the same source. In other words, the variations (e.g., Trial F varied by up to 42 minutiae) did not necessarily prevent experts from reaching the same final conclusion.
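The consensus procedure suggested above can be sketched as follows; the minutia labels and the ‘at least k examiners’ agreement threshold are illustrative assumptions, not part of the original study:

```python
# Sketch of consensus-based feature selection for low-quality marks: keep
# only minutiae marked with high confidence by at least k independent
# examiners. Labels m1..m5 and k=3 are hypothetical.
from collections import Counter

def consensus_minutiae(high_conf_sets: list, k: int) -> set:
    """Minutiae marked with high confidence by at least k examiners."""
    counts = Counter(m for s in high_conf_sets for m in s)
    return {m for m, n in counts.items() if n >= k}

examiner_marks = [
    {"m1", "m2", "m3"},
    {"m1", "m3", "m4"},
    {"m1", "m2", "m3", "m5"},
]
print(sorted(consensus_minutiae(examiner_marks, k=3)))  # ['m1', 'm3']
```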
In contrast, other trials (such as Trial H) which had smaller variations had less consensus on the final overall decision (in Trial H, e.g., 54% concluded the mark and print were likely from the same source, 38% reported insufficient detail to make a decision, and 8% reported they were not from the same source). Here the variation in feature selection appeared to be critical. In Langenburg et al. [14] a similar trend was observed. In their Fig. 12, participants reported ranges (maximum differences) of 21, 17, and 12 minutiae for the Q1, Q4, and Q5 trials, respectively. However, trials Q1 and Q5 resulted in 100% consensus (N = 43) for the reported decision. Q4, on the other hand, resulted in three errors, and the remaining participants were nearly split between reporting "identification" and "inconclusive". Those who reported "identification" had a statistically significantly higher likelihood of also reporting more minutiae. In this trial, it appeared that the number of minutiae observed correlated directly with the decision reported and was a critical part of the decision-making process. Therefore, this is clearly a critical issue, and the variation needs to be researched and understood better. As a general trend, it appears that the reduction of available minutiae in a finger mark, especially to a point where the counts may hover around categorical decision thresholds (i.e., "identification" vs. "inconclusive"), can lead to different decisions. Therefore, a possible best practice would be to identify a priori which marks are likely to produce such decision variations and apply special procedures, such as those previously discussed (use of consensus high-confidence minutiae, quality mapping, conservative selection procedures, etc.). Further research is recommended here, particularly to determine which of the suggested variation-reduction techniques is appropriate and effective.

4. Summary and conclusions

Feature selection during the analysis stage of a latent mark is important because it sets the stage and the parameters for comparison and decision making. Although this stage is relatively robust, it is still susceptible to observer effects. In this study we found that the presence of a comparison ‘target’ print may affect the analysis stage. Furthermore, there is a lack of consistency in the analysis not only among different examiners (e.g., reliability among examiners r = .85), but also within the same examiners analysing identical latent marks at different times (retest reliability r = .86). The characterization of experts' consistency depends on the standard applied. If we examine the purest test of consistency, i.e., how consistent examiners are with themselves, then the retest reliability of r = .86, though far from perfect, is respectably high; but using another standard, we find that at best (see footnote 5) only 16% of the analyses yielded the exact same number of minutiae when the same latent mark was analysed twice (40% of the analyses were within a difference of one minutia, and 55% within a difference of two minutiae). Our study goes beyond establishing that analysis of latent marks by experienced latent print examiners is inconsistent. First, it demonstrates that the presence of a comparison print can affect the analysis of the latent mark.
Second, it shows that examiners are inconsistent among themselves, i.e., different examiners vary in their analysis. Third, it reveals that the consistency of examiners with themselves varies; some examiners are relatively consistent with themselves and others are not. Fourth, we found that the lack of consistency does not only depend on the examiner in question, but also depends strongly on the nature of the latent mark itself. For each of these findings we suggest potential recommendations to mitigate the problems. First, given the effects of the comparison print, we suggest that the initial analysis of a latent mark should be done in isolation from the comparison print. Furthermore, we do not rule out reconsideration of the analysis after exposure to the comparison print, but stipulate that this process, should it occur, must be clearly and transparently

documented, and justified. Further research needs to consider other ways to deal with variation in the analysis stage. One suggestion may be, for example, that examiners should mark confidence levels in minutia detection; thereafter they can reconsider only low-confidence judgements, but cannot change those that were initially analyzed with high confidence (see Dror [35] for details). Second, given that examiners vary among themselves in their analysis, we support the development of a simple calibration tool that enables examiners to adjust their thresholds so as to meet the standards in the field. Third, given that some examiners are more consistent with themselves than others, we are confident that proper selection of examiners with the right cognitive profiles – specifying the exact skills needed for latent fingerprint examination – together with proper training, can reduce the examiners' contribution to inconsistencies in finger mark analysis. Fourth, given that the latent marks themselves play a major contributing role in the inconsistencies, and that these contributions vary with different marks, we suggest that such marks be subject to a different analysis procedure, namely one using only the higher-confidence consensus minutiae that a number of independent examiners agree on. Determining characteristics in finger mark analysis is critical, and measures must be taken to minimize inconsistency and increase objectivity. These issues are not limited to fingerprint examination; there are similar issues across the forensic disciplines, including DNA. We do note that the potential problems with inconsistent analysis may be acute only when the comparison and latent mark are near the threshold for identification (and thus one analysis may result in identification whereas another analysis does not; problems may also arise around judgments of ‘inconclusive’ when another analysis may be sufficient for identification). When the decision is considerably beyond the threshold of determination, these issues may not have important practical implications (as both analyses, although inconsistent, will still result in the same overall decision). Understanding the cognitive issues involved in pattern matching and decision making, and researching them within the realm of fingerprinting, is a promising way to decrease expert variation, improve the reliability of fingerprinting, and gain insights into the human mind and cognitive processes.

Acknowledgments

We would like first to thank all the latent print examiners who took part in our studies. Without such cooperation this research would not have been possible. We also want to thank Joseph Almog, Camille Bourque, Rebbeca Bucht, Thomas Busey, Gerald Clough, Ralph and Lyn Haber, Anthony Laird, Danielle Mannion, Wayne Plumtree, Norah Rudin, and Arie Zeelenberg for valuable comments on an earlier version of this paper. However, any opinions, findings, and conclusions or recommendations expressed in this paper are the sole responsibility of the authors. Correspondence concerning this article should be addressed to Itiel Dror, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, England (e-mail: i.dror@ucl.ac.uk); further information is available at www.cci-hq.com.

References

[1] I.W. Evett, R.L. Williams, A review of the sixteen points fingerprint standard in England and Wales, The Print 12 (1) (1996) 1–13 (also published in Fingerprint Whorld 21 (82) (1995) 125–143; and in J. Forensic Ident. 46 (1), 49–73).
[2] I.W. Evett, Expert evidence and forensic misconceptions of the nature of exact science, Sci. Justice 36 (1996) 118–122.
[3] NAS, Strengthening Forensic Science in the United States: A Path Forward, National Academy of Sciences, Washington, DC, 2009.
[4] V.L. Phillips, M.J. Saks, J.L. Peterson, The application of signal detection theory to decision-making in forensic science, J. Forensic Sci. 46 (2) (2001) 294–308.
[5] J. Vokey, J. Tangen, S. Cole, On the preliminary psychophysics of fingerprint identification, Quart. J. Exp. Psychol. 62 (5) (2009) 1023–1040.
[6] R. Rosenthal, How often are our numbers wrong? Am. Psychol. 33 (11) (1978) 1005–1008.
[7] R.S. Nickerson, Confirmation bias: a ubiquitous phenomenon in many guises, Rev. Gen. Psychol. 2 (2) (1998) 175–220.
[8] I.E. Dror, D. Charlton, Why experts make errors, J. Forensic Ident. 56 (4) (2006) 600–616.
[9] I.E. Dror, D. Charlton, A. Péron, Contextual information renders experts vulnerable to make erroneous identifications, Forensic Sci. Intern. 156 (1) (2006) 74–78.
[10] I.E. Dror, J.L. Mnookin, The use of technology in human expert domains: challenges and risks arising from the use of automated fingerprint identification systems in forensics, Law Prob. Risk 9 (1) (2010) 47–67.
[11] I.E. Dror, A. Péron, S. Hind, D. Charlton, When emotions get the better of us: the effect of contextual top-down processing on matching fingerprints, Appl. Cogn. Psychol. 19 (6) (2005) 799–809.
[12] I.E. Dror, B. Rosenthal, Meta-analytically quantifying the reliability and biasability of fingerprint experts' decision making, J. Forensic Sci. 53 (4) (2008) 900–903.
[13] G. Langenburg, A method performance pilot study: testing the accuracy, precision, repeatability, reproducibility, and biasability of the ACE-V process, J. Forensic Ident. 59 (2) (2009) 219–257.
[14] G. Langenburg, C. Champod, P. Wertheim, Testing for potential contextual bias effects during the verification stage of the ACE-V methodology when conducting fingerprint comparisons, J. Forensic Sci. 54 (3) (2009) 571–582.
[15] D.M. Risinger, M.J. Saks, W.C. Thompson, R. Rosenthal, The Daubert/Kumho implications of observer effects in forensic science: hidden problems of expectation and suggestion, Calif. Law Rev. 90 (1) (2002) 1–56.
[16] L.J. Hall, E. Player, Will the instruction of an emotional context affect fingerprint analysis and decision making? Forensic Sci. Intern. 181 (2008) 36–39.
[17] M. Saks, Concerning L.J. Hall, E. Player 'Will the introduction of an emotional context affect fingerprint analysis and decision-making?', Forensic Sci. Intern. 191 (2009) e19.
[18] I.E. Dror, On proper research and understanding of the interplay between bias and decision outcomes, Forensic Sci. Intern. 191 (2009) e17–e18.
[19] R.B. Stacey, Report on the erroneous fingerprint identification bombing case, J. Forensic Ident. 54 (6) (2004) 706–718.
[20] K. Wertheim, G. Langenburg, A. Moenssens, A report of latent print examiner accuracy during comparison training exercises, J. Forensic Ident. 56 (1) (2006) 55–93.
[21] I.E. Dror, S. Cole, The vision in 'blind' justice: expert perception, judgment and visual cognition in forensic pattern recognition, Psychol. Bull. Rev. 17 (2) (2010) 161–167.
[22] W.C. Thompson, Interpretation: observer effects, in: A. Moenssens, A. Jamieson (Eds.), Encyclopaedia of Forensic Sciences, John Wiley & Sons, London, 2009, pp. 1575–1579.
[23] G. Langenburg, Pilot study: a statistical analysis of the ACE-V methodology – analysis stage, J. Forensic Ident. 54 (1) (2004) 64–79.
[24] B. Schiffer, C. Champod, The potential (negative) influence of observational biases at the analysis stage of finger mark individualization, Forensic Sci. Intern. 167 (2007) 116–120.
[25] B. Schiffer, The relationship between forensic science and judicial error: a study covering error sources, bias and remedies, Ph.D. Thesis, Université de Lausanne, Lausanne, 2009.
[26] T. Busey, I.E. Dror, Special Abilities and Vulnerabilities in Forensic Expertise, in: Fingerprint Sourcebook, NIJ Press, Washington, DC, USA, 2010 (Chapter 15).
[27] I.E. Dror, The paradox of human expertise: why experts can get it wrong, in: N. Kapur (Ed.), The Paradoxical Brain, Cambridge University Press, Cambridge, UK, in press (Chapter 9).
[28] E.J. Potchen, T.G. Cooper, A.E. Sierra, G.R. Aben, M.J. Potchen, M.G. Potter, J.E. Siebert, Measuring performance in chest radiography, Radiology 217 (2000) 456–459.
[29] S. Bektas, B. Bahadir, N.O. Kandemir, F. Barut, A.E. Gul, S.O. Ozdamar, Intraobserver and interobserver variability of Fuhrman and modified Fuhrman grading systems for conventional renal cell carcinoma, Kaohsiung J. Med. Sci. 25 (2009) 596–600.
[30] C. Summerfield, T. Egner, Expectation (and attention) in visual cognition, Trends Cogn. Sci. 13 (9) (2009) 403–409.
[31] K.A. Ericsson, N. Charness, P.J. Feltovich, R.R. Hoffman (Eds.), The Cambridge Handbook of Expertise and Expert Performance, Cambridge University Press, New York, 2006.
[32] K.A. Ericsson (Ed.), Development of Professional Expertise, Cambridge University Press, New York, 2009.
[33] R. Rosenthal, R.L. Rosnow, Essentials of Behavioral Research: Methods and Data Analysis, third ed., McGraw-Hill Press, 2007.
[34] D.E. Krane, et al., Sequential unmasking: a means of minimizing observer effects in forensic DNA interpretation, J. Forensic Sci. 56 (2008) 1006.
[35] I.E. Dror, How can Francis Bacon help forensic science? The four idols of human biases, Jurimetrics: J. Law Sci. Techn. 50 (2009) 93–110.
[36] S. Harnad (Ed.), Categorical Perception: The Groundwork of Cognition, Cambridge University Press, New York, 1987.


Appendix 2

Proceedings of Neural Information Processing Systems, Vancouver, Canada, December 6-9, 2010

Evaluation of Rarity of Fingerprints in Forensics

Chang Su and Sargur Srihari Department of Computer Science and Engineering University at Buffalo Amherst, NY 14260 {changsu,srihari}@buffalo.edu

Abstract
A method for computing the rarity of latent fingerprints represented by minutiae is given. It allows determining the probability of finding a match for an evidence print in a database of n known prints. The probability of random correspondence between evidence and database is determined in three procedural steps. In the registration step, the latent print is aligned by finding its core point, which is done using a machine learning procedure based on Gaussian processes. In the evidence probability evaluation step, a generative model based on Bayesian networks is used to determine the probability of the evidence; it takes into account both the dependency of each minutia on nearby minutiae and the confidence of their presence in the evidence. In the specific probability of random correspondence step, the evidence probability is used to determine the probability of a match among n prints for a given tolerance; this last evaluation is similar to the birthday correspondence probability for a specific birthday. The generative model is validated using a goodness-of-fit test evaluated on a standard database of fingerprints. The probability of random correspondence is evaluated for several latent fingerprints with varying numbers of minutiae.

1 Introduction

In many forensic domains it is necessary to characterize the degree to which a given piece of evidence is unique. For instance, in the case of DNA, after a match has been confirmed between the evidence and the known, a probability statement is made that the chance that a randomly selected person would have the same DNA pattern is 1 in 24,000,000; this is a description of the rarity of the evidence/known [1]. In the case of fingerprint evidence there is uncertainty at two levels: the similarity between the evidence and the known, and the rarity of the known. This paper explores the evaluation of the rarity of a fingerprint as characterized by a given set of features. Recent court challenges have highlighted the need for statistical research on this problem, especially when it is stated that a high degree of similarity is present between the evidence and the known [2]. A statistical measure of the weight of evidence in forensics is the likelihood ratio (LR) [3]: the ratio between the joint probability that the evidence and known come from the same source and the joint probability that they come from two different sources. If the underlying distributions are Gaussian, the LR can be simplified as the product of two exponential factors: the first is a significance test of the null hypothesis of identity, and the second measures rarity. Since evaluation of the joint probability is difficult for fingerprints, which are characterized by variable sets of minutia points, each expressed as a 3-tuple of spatial coordinates and an angle, the LR computation is usually replaced by one in which a similarity (or kernel) function is introduced between the evidence and the known and the likelihood ratio is computed for the similarity [4, 5]. While such efforts concern the significance of the null hypothesis of identity, fingerprint rarity remains a difficult problem that has never been solved.
This paper describes a systematic approach to computing the rarity of fingerprints in a robust and reliable manner.

The process involves several individual steps. Due to the varying quality of fingerprints collected from crime scenes, called latent prints, a registration process is needed to determine which area of finger skin the print comes from; Section 2 describes the use of Gaussian processes to predict core points by which prints can be aligned. In Section 3 a generative model based on Bayesian networks is proposed to model the distribution of minutiae as well as the dependencies between them. To measure rarity, a metric for assessing the probability of random correspondence of a specific print against n samples is defined in Section 4. The model is validated using a goodness-of-fit test in Section 5, and examples of rarity evaluation are given in Section 6.

2 Fingerprint Registration

The fingerprint collected from a crime scene is usually only a small portion of the complete fingerprint, so the feature set extracted from the print contains only relative spatial relationships. Feature sets with the same relative spatial relationships can lead to different rarity values if they come from different areas of the fingertip. To solve this problem, we first predict the core points and then align the fingerprints by overlapping their core points. In biometrics and fingerprint analysis, the core point refers to the center area of a fingerprint. In practice, the core point corresponds to the center of the northernmost loop-type singularity. For fingerprints that do not contain loop or whorl singularities, the core is usually associated with the point of maximum ridge-line curvature [6]. The most popular approach to core point detection is the Poincare Index (PI), developed in [7, 8, 9]. Another commonly used method [10] is a sine-map-based method realized by multi-resolution analysis. Methods based on Fourier expansion [11], fingerprint structures [12], and multi-scale analysis [13] have also been proposed. All of these methods require that the fingerprint be complete and the core point visible in the print. This is not the case for all fingerprints: latent prints are usually small partial prints and often do not contain core points, so they cannot be detected by the computational-vision-based approaches above. We propose a core point prediction approach that turns this problem into a regression problem. Because ridge flow directions reveal the intrinsic features of ridge topologies, they have a critical impact on core point prediction, and orientation maps are therefore used to predict the core points. A fingerprint field orientation map is defined as a collection of two-dimensional direction fields; it represents the directions of ridge flows on a regularly spaced grid.
The gradients of gray intensity of enhanced fingerprints are estimated to obtain reliable ridge orientations [9]. Given an orientation map of a fingerprint, the core point is predicted using Gaussian processes. Gaussian processes dispense with a parametric model and instead define a probability distribution over functions directly, which provides more flexibility and better prediction. A further advantage of the Gaussian process model comes from its probabilistic formulation [14]: instead of a single value, the prediction of the core point takes the form of a full predictive distribution. Suppose we have a training set D of N fingerprints, D = {(g_i, y_i) | i = 1, ..., N}, where g denotes the orientation map of a fingerprint and y denotes the output, which is the core point. A Gaussian process model with a squared exponential covariance function is applied. The regression model with Gaussian noise is given by

y = f(g) + ε,   (1)

where f(g) is the value of the function f at g and ε is a random noise variable whose value is chosen independently for each observation. We consider noise processes with a Gaussian distribution, so that the Gaussian likelihood for the core point is given by

p(y | f(g)) = N(f, σ²I),   (2)

where σ² is the variance of the noise. From the definition of a Gaussian process, the Gaussian process prior is given by a Gaussian whose mean is zero and whose covariance is defined by a covariance function k(g, g′), so that

f(g) ~ GP(0, k(g, g′)).   (3)

The squared exponential covariance function, parameterized by θ1 and θ2, is used here to specify the covariance between pairs of variables:

k(g, g′) = θ1 exp(−(θ2/2) |g − g′|²).   (4)

where the hyperparameters θ1 and θ2 are optimized by maximizing the log likelihood p(y | θ1, θ2). Suppose the orientation map of an input fingerprint is given by g*. The Gaussian predictive distribution of the core point y* can be evaluated by conditioning the joint Gaussian prior distribution on the observations (G, y), where G = (g_1, ..., g_N)ᵀ and y = (y_1, ..., y_N)ᵀ. The predictive distribution is given by

p(y* | g*, G, y) = N(m(y*), cov(y*)),   (5)

where

m(y*) = k(g*, G)[K + σ²I]⁻¹ y,   (6)

cov(y*) = k(g*, g*) + σ² − k(g*, G)ᵀ[K + σ²I]⁻¹ k(G, g*),   (7)

and K is the Gram matrix whose elements are given by k(g_i, g_j).
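As an illustration, the GP predictive equations reduce to a few lines of linear algebra. The sketch below is not the authors' implementation: orientation maps are flattened to plain feature vectors, and the kernel hyperparameters and noise variance are placeholder values.

```python
import numpy as np

def sq_exp_kernel(A, B, theta1=1.0, theta2=0.5):
    # Eq. (4): k(g, g') = theta1 * exp(-(theta2 / 2) * |g - g'|^2)
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return theta1 * np.exp(-0.5 * theta2 * d2)

def gp_predict(G, y, g_star, sigma2=1e-4, theta1=1.0, theta2=0.5):
    """Predictive mean and variance of the core point, Eqs. (5)-(7)."""
    n = len(G)
    K = sq_exp_kernel(G, G, theta1, theta2)                  # Gram matrix K_ij = k(g_i, g_j)
    k_star = sq_exp_kernel(G, g_star[None, :], theta1, theta2)[:, 0]
    mean = k_star @ np.linalg.solve(K + sigma2 * np.eye(n), y)                        # Eq. (6)
    var = theta1 + sigma2 - k_star @ np.linalg.solve(K + sigma2 * np.eye(n), k_star)  # Eq. (7)
    return mean, var
```

At a training input with a small noise variance, the predictive mean reproduces the training output almost exactly, which is a quick sanity check on the algebra.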

Note that for some fingerprints, such as latent fingerprints collected from crime scenes, the location within the complete print is unknown, so any g* represents the orientation map of the print in only one possible location. In order to predict the core point in the correct location, we list all the possible print locations corresponding to the different translations and rotations. Their orientation maps are defined as G* = {g*_i | i = 1, ..., m}. Using (5), we obtain the predictive distributions p(y* | g*_i, G, y) for all the g*_i. The core point ŷ* should maximize p(y* | g*_i, G, y) with respect to g*_i. Thus the core point of the fingerprint is given by

ŷ* = k(g*_MAX, G)[K + σ²I]⁻¹ y,   (8)

where g*_MAX is the orientation map for which the maximum predictive probability of the core point is obtained:

g*_MAX = argmax_{g*} p(m(y*) | g*, G, y).   (9)
After the core points are determined, the fingerprints can be aligned by overlapping their core points. This is done by representing the features in Cartesian coordinates with the origin at the core point. Note that the minutia features mentioned in the following sections are assumed to have been aligned in this way.
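A minimal sketch of this alignment step (an illustration, not the authors' code): once a core point is predicted, minutia locations are simply translated so that the core becomes the origin.

```python
import numpy as np

def align_to_core(locations, core):
    """Express minutia locations in Cartesian coordinates whose origin is the
    core point, so that prints aligned this way can be compared directly."""
    return np.asarray(locations, dtype=float) - np.asarray(core, dtype=float)
```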

3 A Generative Model for Fingerprints

In order to estimate rarity, statistical models need to be developed to represent the distribution of fingerprint features. Previous generative models for fingerprints involve different assumptions: uniform distribution of minutia locations and directions [15], or minutiae that are independent of each other [16, 17]. However, minutiae that are spatially close tend to have similar directions [18]. Moreover, fingerprint ridges flow smoothly, with very slow orientation change, and the variance of the minutia directions in different regions of the fingerprint depends on both their locations and the location variance [19, 20]. These observed dependencies between minutiae need to be accounted for in eliciting reliable statistical models. The proposed model incorporates the distribution of minutiae and the dependency relationships between them. Minutiae are the most commonly used features for representing fingerprints; they correspond to ridge endings and ridge bifurcations. Each minutia is represented by its location and direction, where the direction is determined by the ridge at the location. Automatic fingerprint matching algorithms use minutiae as the salient features [21], since they are stable and reliably extracted. Each minutia is represented as x = (s, θ), where s = (x1, x2) is its location and θ its direction. In order to capture the distribution of minutiae as well as the dependencies between them, we first propose a method to define a unique sequence for a given set of minutiae. Suppose that a fingerprint contains N minutiae. The sequence starts with the minutia x_1 whose location is closest to the core point. Each remaining minutia x_n is the one spatially closest to the centroid defined by the arithmetic mean of the location coordinates of all the previous minutiae x_1, ..., x_{n−1}. Given this sequence, the fingerprint can be represented by a minutia sequence X = (x_1, ..., x_N). The sequence is robust to the variance of the minutiae because the next minutia is decided by all the previous minutiae. Given the observation that spatially closer minutiae are more strongly related, we model only the dependence between x_n and its nearest minutia among {x_1, ..., x_{n−1}}. Although not all dependence is taken into account, this is a good trade-off between model accuracy and computational complexity. Figure 1(a) presents an example where x_5 is determined because its distance to the centroid of {x_1, ..., x_4} is minimal. Figure 1(b) shows the minutia sequence and the minutia


Figure 1: Minutia dependency modeling: (a) given minutiae {x1 , . . . , x4 } with centroid c, the next minutia x5 is the one closest to c, and (b) following this procedure dependency between seven minutiae are represented by arrows.

dependencies (arrows) for the same configuration of minutiae.

Figure 2: Bayesian network representing the conditional dependencies shown in Figure 1, where x_i = (s_i, θ_i). Note that there is a link between x_1 and x_2 while there is none between x_2 and x_3.

Based on the characteristics of fingerprint minutiae studied in [18, 19, 20], we know that the minutia direction is related to its location and to the neighboring minutiae, and that the minutia location is conditionally independent of the locations of the neighboring minutiae given their directions. To address the probabilistic relationships of the minutiae, Bayesian networks are used to represent the distributions of the minutia features in fingerprints. Figure 2 shows the Bayesian network for the distribution of the minutia set given in Figure 1. The nodes s_n and θ_n represent the location and direction of minutia x_n. For each conditional distribution, a directed link is added to the graph from the nodes corresponding to the variables on which the distribution is conditioned. In general, for a given fingerprint, the joint distribution over its minutia set X is given by
p(X) = p(s_1) p(θ_1 | s_1) ∏_{n=2}^{N} p(s_n) p(θ_n | s_n, s_ψ(n), θ_ψ(n)),   (10)

where s_ψ(n) and θ_ψ(n) are the location and direction of the minutia x_i that has the minimal spatial distance to the minutia x_n, so that

ψ(n) = argmin_{i∈[1, n−1]} ‖x_n − x_i‖.   (11)
To compute the above joint probability, three probability density functions need to be estimated: the distribution of minutia location f(s); the joint distribution of minutia location and direction f(s, θ); and the conditional distribution of minutia direction given its own location and the location and direction of the nearest minutia, f(θ_n | s_n, s_ψ(n), θ_ψ(n)). It is known that minutiae tend to form clusters [18], and minutiae in different regions of the fingerprint are observed to be associated with different region-specific minutia directions. A mixture of Gaussians is a natural model for the minutia location, given by (12). Since minutia orientation is a periodic variable, it is modeled by the von Mises distribution, which itself is derived from the Gaussian. A minutia represented by its location and direction is modeled by a mixture of joint Gaussian and von Mises distributions [22], given by (13). Given its location and the nearest minutia, the minutia direction has a mixture of von Mises densities, given by (14).
f(s) = Σ_{k1=1}^{K1} π_{k1} N(s | µ_{k1}, Σ_{k1}),   (12)

f(s, θ) = Σ_{k2=1}^{K2} π_{k2} N(s | µ_{k2}, Σ_{k2}) V(θ | ν_{k2}, κ_{k2}),   (13)

f(θ_n | s_n, s_ψ(n), θ_ψ(n)) = Σ_{k3=1}^{K3} π_{k3} V(θ_n | ν_{k3}, κ_{k3}),   (14)

where K_i is the number of mixture components, π_{ki} are non-negative component weights that sum to one, N(s | µ_k, Σ_k) is the bivariate Gaussian probability density function of minutia location with mean µ_k and covariance matrix Σ_k, and V(θ | ν_k, κ_k) is the von Mises probability density function of minutia orientation with mean angle ν_k and precision (inverse variance) κ_k. The Bayesian information criterion is used to estimate K_i, and the other parameters are learned by the EM algorithm.
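The von Mises component used in (13) and (14) is easy to evaluate numerically. The sketch below is an illustration, not the authors' code; it uses numpy's modified Bessel function `np.i0` for the normalizing constant.

```python
import numpy as np

def von_mises_pdf(theta, nu, kappa):
    """V(theta | nu, kappa): mean angle nu, concentration (inverse variance) kappa."""
    return np.exp(kappa * np.cos(theta - nu)) / (2.0 * np.pi * np.i0(kappa))

def von_mises_mixture(theta, weights, nus, kappas):
    """Eq. (14): mixture of von Mises densities; weights must sum to one."""
    return sum(w * von_mises_pdf(theta, nu, k)
               for w, nu, k in zip(weights, nus, kappas))
```

Integrating the mixture density over [0, 2π) returns a total mass of one, which is a convenient check that the normalization is right.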

4 Evaluation of Rarity of a Fingerprint

The general probability of random correspondence (PRC) can be modified to give the probability of matching the specific evidence within a database of n items, where the match is within some tolerance in feature space [23]. The metric of rarity is the specific nPRC: the probability that data with value x coincides with an element in a set of n samples, within a specified tolerance. Since we are trying to match a specific value x, this probability depends on the probability of x. Let Y = [y_1, ..., y_n] represent a set of n random variables. A binary-valued random variable z indicates whether a sample y_i exists in the set of n random samples such that the value of y_i is the same as x within a tolerance ε. Noting the independence of x and y_i, the specific nPRC is given by the marginal probability

p(z = 1 | x) = Σ_Y p(z = 1 | x, Y) p(Y),   (15)

where p(Y) is the joint probability of the n individuals. To compute the specific nPRC, we first define correspondence, or match, between two minutiae as follows. Let x_a = (s_a, θ_a) and x_b = (s_b, θ_b) be a pair of minutiae. The minutiae are said to correspond if, for tolerance ε = [ε_s, ε_θ],

‖s_a − s_b‖ ≤ ε_s ∧ |θ_a − θ_b| ≤ ε_θ,   (16)

where ‖s_a − s_b‖ is the Euclidean distance between the minutia locations. A match between two fingerprints is then defined as the existence of at least m̂ pairs of matched minutiae between them. The tolerances ε and m̂ depend on the practical application. To deal with the largely varying quality of latent fingerprints, it is also important to consider minutia confidence in the specific nPRC measurement. The confidence of minutia x_n is defined as (d_sn, d_θn), where d_sn is the confidence of the location and d_θn is the confidence of the direction. Given the minutia x_n = (s_n, θ_n) and its confidences, the probability density functions of location s′ and direction θ′ can be modeled using Gaussian and von Mises distributions:

c(s′ | s_n, d_sn) = N(s′ | s_n, d_sn⁻¹),   (17)

c(θ′ | θ_n, d_θn) = V(θ′ | θ_n, d_θn),   (18)

where the variance of the location distribution (Gaussian) is the inverse of the location confidence, and the concentration parameter of the direction distribution (von Mises) is the direction confidence. Let f be a randomly sampled fingerprint with minutia set X′ = {x′_1, ..., x′_M}. Let X̂ and X̂′ be sets of m̂ minutiae randomly picked from X and X′, where m̂ ≤ N and m̂ ≤ M. Using (10), the probability that there is a one-to-one correspondence between X̂ and X̂′ is given by

p_ε(X̂) = p_ε(s_1, θ_1) ∏_{n=2}^{m̂} p_ε(s_n) p_ε(θ_n | s_n, s_ψ(n), θ_ψ(n)),   (19)

where

p_ε(s_n, θ_n) = ∫_{s′} ∫_{θ′} ∫∫_{|x−x′|≤ε} c(s′ | s_n, d_sn) c(θ′ | θ_n, d_θn) f(s, θ) ds′ dθ′ ds dθ,   (20)

p_ε(s_n) = ∫_{s′} ∫_{|s−s′|≤ε_s} c(s′ | s_n, d_sn) f(s) ds′ ds,   (21)

p_ε(θ_n | s_n, s_ψ(n), θ_ψ(n)) = ∫_{θ′} ∫_{|θ−θ′|≤ε_θ} c(θ′ | θ_n, d_θn) f(θ | s_n, s_ψ(n), θ_ψ(n)) dθ′ dθ.   (22)

Finally, the specific nPRCs can be computed by

p_ε(X, m̂, n) = 1 − (1 − p_ε(X, m̂))^{n−1},   (23)

where X represents the minutia set of the given fingerprint and p_ε(X, m̂) is the probability that m̂ pairs of minutiae are matched between the given fingerprint and a fingerprint chosen randomly from the n fingerprints:

p_ε(X, m̂) = Σ_{m′∈M} p(m′) C(m′, m̂) Σ_{i=1}^{N(m̂)} p_ε(X̂_i),   (24)

where M contains all possible numbers of minutiae in one fingerprint among the n fingerprints, p(m′) is the probability of a random fingerprint having m′ minutiae, C(m′, m̂) is the binomial coefficient, the minutia set X̂_i = (x_i1, x_i2, ..., x_im̂) is a subset of X, and p_ε(X̂_i) is the joint probability of minutia set X̂_i given by (19). Gibbs sampling is used to approximate the integrals involved in the probability calculation.
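Two pieces of the computation above are simple enough to sketch directly: the minutia correspondence test (16) and the birthday-style step (23) that converts a per-comparison match probability into the specific nPRC. This is an illustration, not the authors' code; the angular wrap-around in the correspondence test is an added assumption, since the paper writes simply |θa − θb|.

```python
import math

def minutiae_correspond(xa, xb, eps_s=10.0, eps_theta=math.pi / 8):
    """Eq. (16): minutiae (s, theta) correspond if their locations are within
    eps_s (Euclidean) and their directions within eps_theta."""
    (sa, ta), (sb, tb) = xa, xb
    d_loc = math.hypot(sa[0] - sb[0], sa[1] - sb[1])
    d_ang = abs(ta - tb) % (2.0 * math.pi)
    d_ang = min(d_ang, 2.0 * math.pi - d_ang)  # wrap to [0, pi] (assumption)
    return d_loc <= eps_s and d_ang <= eps_theta

def specific_nprc(p_match, n):
    """Eq. (23): probability that the evidence print matches at least one of
    the other n - 1 prints, given the per-comparison probability p_match."""
    return 1.0 - (1.0 - p_match) ** (n - 1)
```

For example, with a per-comparison probability of 10⁻⁶ and n = 100,000, the specific nPRC is about 0.095, showing how quickly even rare configurations accumulate matching chances in a large database.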

5 Model Validation

In order to validate the proposed methods, core point prediction was first tested, and goodness-of-fit tests were performed on the proposed generative models. Two databases were used: NIST4 and NIST27. NIST4 contains 8-bit gray scale images of randomly selected fingerprints, each 512 × 512 pixels; the database contains fingerprints taken from 2000 different fingers, with two impressions of each finger. NIST27 contains latent fingerprints from crime scenes and their matching rolled fingerprint mates; there are 258 latent cases separated into three quality categories: good, bad, and ugly.

5.1 Core Point Prediction

The Gaussian process models for core point prediction are trained on NIST4 and tested on NIST27. The orientation maps are extracted by a conventional gradient-based approach. The fingerprint images are first divided into equal-sized blocks of N × N pixels, where N is the average width of a ridge-valley pair; N is 8 in NIST4 and varies in NIST27. The gradient vectors are calculated by taking the partial derivatives of image intensity at each pixel in Cartesian coordinates; the ridge orientation is perpendicular to the dominant gradient angle in the local block. The training set consists of the orientation maps of the fingerprints and the corresponding core points, which are marked manually. Core point prediction was applied to three groups of latent prints of differing quality. Figure 3 shows the results of core point prediction and subsequent latent print localization for two latent fingerprints from NIST27. Table 1 compares the prediction precisions of the Gaussian Processes (GP) based approach and the widely used Poincare Index (PI) [8]. The test latent prints are extracted and enhanced manually, and the true core points of the latent prints are picked from the matching ten-prints. A prediction is counted as correct if the location and direction distances between the predicted and true core points are within the thresholds Ts = 16 pixels and Tθ = π/6. The good quality set contains 88 images that mostly contain the core points; the bad and ugly quality sets each contain 85 images of small size that usually do not include core points. On the good quality latent prints the two approaches are close in precision. On the bad and ugly quality sets the precisions differ markedly, indicating that the GP-based method provides core point prediction even when the core points cannot be seen in the latent prints. The GP-based method also achieves higher overall prediction precision.

5.2 Goodness-of-fit

The validation of the proposed generative model is by means of a goodness-of-fit test, which determines how well a sample of data agrees with the proposed model distribution. The chi-square

(a) Latent print localization of case "g90". (b) Latent print localization of case "g69".

Figure 3: Latent print localization: the left-side images are the latent fingerprints (rectangles) collected from crime scenes; the right-side images contain the predicted core points (crosses) and true core points (circles) with the orientation maps of the latent prints.

Table 1: Comparison of prediction precisions of the PI and GP based approaches.

                      Good    Bad     Ugly    Overall
  Poincare Index      90.6%   68.2%   46.6%   68.6%
  Gaussian Processes  93.1%   87.1%   72.7%   84.5%

statistical hypothesis test was applied [24]. Three different tests were conducted for: (i) the distribution of minutia location (12), (ii) the joint distribution of minutia location and orientation (13), and (iii) the distribution of minutia dependency (14). For minutia location, we partitioned the minutia location space into 16 non-overlapping blocks. For minutia location and orientation, we partitioned the feature space into 16 × 4 non-overlapping blocks. For minutia dependency, the orientation space is divided into 9 non-overlapping blocks. Blocks are combined with adjacent blocks until both the observed and expected numbers of minutiae in each block are at least 5. The test statistic is the chi-square random variable

χ² = Σ_i (O_i − E_i)² / E_i,   (25)

where O_i is the observed minutia count for the i-th block and E_i is the expected minutia count for the i-th block. The p-value, the probability of observing a sample statistic as extreme as the test statistic, is calculated from the chi-square distribution and compared to the significance level. For the NIST4 dataset, we chose a significance level of 0.01. 4000 fingerprints were used to train the generative models proposed in Section 3. To test the models for minutia location, and for minutia location and orientation, the numbers of fingerprints with p-values above the significance level (model accepted) and below it (model rejected) were computed. Of the 4000 fingerprints, 3387 are accepted and 613 rejected for the minutia location model, and 3216 accepted and 784 rejected for the minutia location and orientation model. To test the minutia dependency model, we first collect all linked minutia pairs in the minutia sequences produced from the 4000 fingerprints. These pairs are then separated by the binned locations of both minutiae (32 × 32) and the orientation of the leading minutia (4). Finally, the minutia dependency models are tested on the corresponding minutia pair sets: of the 4096 data sets, 3558 are accepted and 538 rejected. The results imply that the proposed generative models offer a reasonably accurate fit to fingerprints.
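The test statistic (25) is straightforward to compute once the binned counts are in hand. The sketch below is an illustration; bins with small expected counts are assumed to have been merged beforehand, as described above.

```python
def chi_square_stat(observed, expected):
    """Eq. (25): chi-square statistic over binned minutia counts."""
    assert len(observed) == len(expected)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

The resulting statistic is compared against the chi-square distribution with the appropriate degrees of freedom to obtain the p-value.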

Table 2: Results from the chi-square tests of the goodness of fit of the three generative models.

                   f(s)    f(s, θ)   f(θ_n | s_n, s_ψ(n), θ_ψ(n))
  Dataset size     4000    4000      4096
  Model accepted   3387    3216      3558
  Model rejected   613     784       538


(a) Latent case "b115". (b) Latent case "g73".

Figure 4: Two latent cases: the left images are the crime scene photographs containing the latent fingerprints and minutiae; the right images are the preprocessed latent prints with aligned minutiae and predicted core points.

Table 3: Specific nPRCs for the latent fingerprints "b115" and "g73", where n = 100,000.

  Latent print "b115" (N = 16)      Latent print "g73" (N = 39)
  m̂     p_ε(X, m̂)                   m̂     p_ε(X, m̂)
  2     0.73                        4     1
  4     9.04 × 10⁻⁶                 8     3.11 × 10⁻¹⁴
  8     2.46 × 10⁻¹⁹                12    2.56 × 10⁻²⁵
  12    6.13 × 10⁻³¹                24    3.10 × 10⁻⁵²
  16    1.82 × 10⁻⁴⁶                39    7.51 × 10⁻⁷⁹

6 Fingerprint Rarity Measurement on Latent Prints

The method for assessing fingerprint rarity using the validated model is demonstrated here. Figure 4 shows two latent fingerprints randomly picked from NIST27. The first latent print, "b115", contains 16 minutiae and the second, "g73", contains 39 minutiae. The confidences of the minutiae are manually assigned by visual inspection. The specific nPRCs of the two latent prints are given in Table 3, calculated for varying numbers of matching minutia pairs (m̂), assuming that the number of fingerprints n is 100,000. The tolerance is set at ε_s = 10 pixels and ε_θ = π/8. The experiment shows that the values of specific nPRC depend strongly on the given latent fingerprint: for a latent print that contains more minutiae, or whose minutiae are more common in the minutia population, the probability that the latent print shares m̂ minutiae with a random fingerprint is higher. As expected, when m̂ decreases, the probability of random correspondence increases. The values of specific nPRC thus provide a quantitative basis for assessing the evidential value of latent fingerprints.

7 Summary

This work is the first attempt to offer a systematic method for measuring the rarity of fingerprints. In order to align the prints, a Gaussian process based approach is proposed to predict core points; it is shown that this approach can predict core points whether or not the prints actually contain them. Furthermore, a generative model is proposed to model the distribution of minutiae as well as the dependency between them, with Bayesian networks used to represent the model structure and to perform inference and learning. Finally, the rarity of a fingerprint can be calculated. To further improve accuracy, minutia confidences are taken into account in the specific nPRC calculation. Goodness-of-fit tests show that the proposed generative model offers an accurate fingerprint representation. We perform the specific nPRC computation on the NIST27 dataset, showing that the proposed method is capable of estimating the rarity of real-life latent fingerprints.

Acknowledgments

This work was supported by the United States Department of Justice award NIJ: 2009-DN-BXK208. The opinions expressed are those of the authors and not of the DOJ.

References
[1] R. Chakraborty. Statistical interpretation of DNA typing data. American Journal of Human Genetics, 49(4):895–897, 1991.
[2] United States Court of Appeals for the Third Circuit: USA v. Byron Mitchell, 2003. No. 02-2859.
[3] D.V. Lindley. A problem in forensic science. Biometrika, 64(2):207–213, 1977.
[4] C. Neumann, C. Champod, R. Puch-Solis, N. Egli, A. Anthonioz, and A. Bromage-Griffiths. Computation of likelihood ratios in fingerprint identification for configurations of any number of minutiae. Journal of Forensic Sciences, 51:1255–1266, 2007.
[5] S.N. Srihari and H. Srinivasan. Comparison of ROC and likelihood decision methods in automatic fingerprint verification. International Journal of Pattern Recognition and Artificial Intelligence, 22(1):535–553, 2008.
[6] A.K. Jain and D. Maltoni. Handbook of Fingerprint Recognition. Springer-Verlag, New York, 2003.
[7] M. Kawagoe and A. Tojo. Fingerprint pattern classification. Pattern Recognition, 17(3):295–303, 1984.
[8] A.M. Bazen and S.H. Gerez. Systematic methods for the computation of the directional fields and singular points of fingerprints. IEEE Trans. Pattern Anal. Mach. Intell., 24(7):905–919, 2002.
[9] A.K. Jain, S. Prabhakar, and L. Hong. A multichannel approach to fingerprint classification. IEEE Trans. Pattern Anal. Mach. Intell., 21(4):348–359, 1999.
[10] A.K. Jain, S. Prabhakar, L. Hong, and S. Pankanti. Filterbank-based fingerprint matching. IEEE Transactions on Image Processing, 9:846–859, 2000.
[11] D. Phillips. A fingerprint orientation model based on 2D Fourier expansion (FOMFE) and its application to singular-point detection and fingerprint indexing. IEEE Trans. Pattern Anal. Mach. Intell., 29(4):573–585, 2007.
[12] X. Wang, J. Li, and Y. Niu. Definition and extraction of stable points from fingerprint images. Pattern Recognition, 40(6):1804–1815, 2007.
[13] M. Liu, X. Jiang, and A.C. Kot. Fingerprint reference-point detection. EURASIP J. Appl. Signal Process., 2005:498–509, 2005.
[14] C.E. Rasmussen and C.K.I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
[15] S. Pankanti, S. Prabhakar, and A.K. Jain. On the individuality of fingerprints. IEEE Trans. Pattern Anal. Mach. Intell., 24(8):1010–1025, 2002.
[16] Y. Zhu, S.C. Dass, and A.K. Jain. Statistical models for assessing the individuality of fingerprints. IEEE Transactions on Information Forensics and Security, 2(3-1):391–401, 2007.
[17] Y. Chen and A.K. Jain. Beyond minutiae: A fingerprint individuality model with pattern, ridge and pore features. In ICB '09 Proceedings, pages 523–533, Berlin, Heidelberg, 2009. Springer-Verlag.
[18] S.C. Sclove. The occurrence of fingerprint characteristics as a two-dimensional process. Journal of the American Statistical Association, 74(367):588–595, 1979.
[19] D.A. Stoney. Distribution of epidermal ridge minutiae. American Journal of Physical Anthropology, 77:367–376, 1988.
[20] J. Chen and Y. Moon. A statistical study on the fingerprint minutiae distribution. In ICASSP 2006 Proceedings, volume 2, pages II–II, 2006.
[21] C. Watson, M. Garris, E. Tabassi, C. Wilson, R. McCabe, and S. Janet. User's Guide to NIST Fingerprint Image Software 2 (NFIS2). NIST, 2004.
[22] C. Bishop. Pattern Recognition and Machine Learning. Springer, New York, 2006.
[23] C. Su and S.N. Srihari. Probability of random correspondence for fingerprints. In IWCF '09 Proceedings, pages 55–66, Berlin, Heidelberg, 2009. Springer-Verlag.
[24] R.B. D'Agostino and M.A. Stephens. Goodness-of-fit Techniques. CRC Press, 1986.

