ASSIGNING UNKNOWNS TO HIGHER TAXA USING DNA BARCODES: Effect of library completeness and assignment criteria

John J. Wilson, Rodolphe Rougerie, Justin Schonfeld, Daniel H. Janzen, Winnie Hallwachs, Mehrdad Hajibabaei, Ian J. Kitching, Jean Haxaire & Paul D. N. Hebert

Introduction

Methods

Results

Discussion

QUESTIONS
(i) Is the barcode of a sphingid species not present in the library assigned correctly to genus, tribe or subfamily?

BCHax4451

Query: Xylophanes..? epaphus..?

Xylophanes epaphus

Query: Chaerocina..? Choerocampina..? Macroglossinae..?

Chaerocina dohertyi

QUESTIONS
(i) Is the barcode of a sphingid species not present in the library assigned correctly to genus, tribe or subfamily?
(ii) Does assignment accuracy increase with increased SPECIES COMPLETENESS of the LIBRARY?
(iii) To what extent does assignment accuracy depend on the ASSIGNMENT CRITERIA applied?

Inconsistent Methods
Poorly explained

-frequency of best hits
-level of sequence similarity
-BLAST scores

Impossible to confirm independently

QUERY DATASET: 118 species

REFERENCE LIBRARY: 1095 species (86% spp.)

Query dataset

Reference library

ASSIGNMENTS according to 3 criteria

TRUE or FALSE (morphology)

SIMULATING SPECIES COMPLETENESS IN THE LIBRARY

Library completeness levels: 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%*

*For GENUS TESTS we removed the species of the query
*For TRIBE and SUBFAMILY TESTS we removed all representatives of the genus of the query
-Single barcode per species
-Randomly sampled 30 times at each completeness level
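
The subsampling scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the study: the function names, the dictionary layout, the fixed seed, and the toy library (including the placeholder species "sp. A", "sp. B", "sp. C" and their short sequences) are all hypothetical.

```python
import random

def subsample_library(library, query_species, query_genus, rank, fraction, rng):
    """Return a reduced library for one replicate.

    library maps species name -> (genus, barcode), i.e. one barcode per species.
    For genus tests the query species is excluded; for tribe and subfamily
    tests every representative of the query's genus is excluded.
    """
    if rank == "genus":
        eligible = sorted(sp for sp in library if sp != query_species)
    else:
        eligible = sorted(sp for sp, (genus, _) in library.items()
                          if genus != query_genus)
    k = round(fraction * len(eligible))  # number of species kept at this level
    return {sp: library[sp] for sp in rng.sample(eligible, k)}

def replicate_libraries(library, query_species, query_genus, rank,
                        levels, n_reps=30, seed=0):
    """Draw n_reps random libraries at each completeness level."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    return {
        level: [subsample_library(library, query_species, query_genus,
                                  rank, level, rng)
                for _ in range(n_reps)]
        for level in levels
    }

# Toy data: placeholder species and sequences, for illustration only.
toy_library = {
    "Xylophanes epaphus": ("Xylophanes", "ACGTACGT"),
    "Xylophanes sp. A": ("Xylophanes", "ACGTACGA"),
    "Chaerocina dohertyi": ("Chaerocina", "ACGAACGA"),
    "Genus1 sp. B": ("Genus1", "TCGAACGA"),
    "Genus2 sp. C": ("Genus2", "TCGATCGA"),
}
levels = [i / 10 for i in range(1, 11)]  # 10% to 100%
genus_reps = replicate_libraries(toy_library, "Xylophanes epaphus",
                                 "Xylophanes", "genus", levels)
tribe_reps = replicate_libraries(toy_library, "Xylophanes epaphus",
                                 "Xylophanes", "tribe", levels)
```

In this sketch "100%" means every species that is still eligible after the exclusion rule, which matches the asterisked footnotes above.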

ASSIGNMENT CRITERIA 1. STRICT

[Figure: rooted example trees showing a POSITIVE assignment and an AMBIGUOUS assignment]


ASSIGNMENT CRITERIA 2. LIBERAL

[Figure: rooted example trees showing a POSITIVE assignment and an AMBIGUOUS assignment]

ASSIGNMENT CRITERIA 3. DISTANCE
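
The slide does not spell out how the distance criterion works, so the following is a rough sketch under assumptions rather than the study's method: it assumes aligned barcodes of equal length, an uncorrected p-distance, assignment to the taxon of the nearest library sequence, and an AMBIGUOUS outcome when different taxa tie for nearest. The function names and toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a non-ACGT character are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)

def assign_by_distance(query, library):
    """Assign the query to the taxon of its nearest library barcode.

    library is a list of (taxon, barcode) pairs. If the nearest barcodes
    belong to more than one taxon, return "AMBIGUOUS" (an assumption here).
    """
    distances = [(p_distance(query, seq), taxon) for taxon, seq in library]
    best = min(d for d, _ in distances)
    nearest_taxa = {taxon for d, taxon in distances if d == best}
    return nearest_taxa.pop() if len(nearest_taxa) == 1 else "AMBIGUOUS"

# Toy sequences, for illustration only.
clear_case = assign_by_distance(
    "ACGTACGT", [("Xylophanes", "ACGTACGA"), ("Chaerocina", "ACGAACGA")])
tied_case = assign_by_distance(
    "ACGTACGT", [("Xylophanes", "ACGTACGA"), ("Chaerocina", "TCGTACGT")])
```

Because a nearest neighbour always exists, a rule like this returns an assignment for almost every query, whether or not the true higher taxon is represented in the library.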

Assignments judged TRUE or FALSE based on congruence with morphological identification and current classification

Kitching & Cadiou (2000)
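
Scoring assignments against the morphology-based classification could look roughly like the sketch below. It is a minimal illustration: the function names, the use of `None` for an ambiguous result, and the toy assignment pairs are assumptions, not taken from the study.

```python
from collections import Counter

def judge(assigned, expected):
    """Classify one assignment against the morphology-based taxon."""
    if assigned is None:  # None stands for an ambiguous result (assumption)
        return "AMBIGUOUS"
    return "TRUE" if assigned == expected else "FALSE POSITIVE"

def summarise(pairs):
    """Percentage of each outcome over (assigned, expected) pairs."""
    counts = Counter(judge(assigned, expected) for assigned, expected in pairs)
    total = len(pairs)
    return {outcome: 100.0 * counts[outcome] / total
            for outcome in ("TRUE", "FALSE POSITIVE", "AMBIGUOUS")}

# Toy pairs of (barcode assignment, morphological identification).
toy_pairs = [
    ("Macroglossina", "Macroglossina"),
    ("Dilophonotina", "Dilophonotina"),
    ("Macroglossina", "Dilophonotina"),
    (None, "Choerocampina"),
]
toy_summary = summarise(toy_pairs)
```

Here a FALSE POSITIVE is any definite assignment that disagrees with the reference classification, so an out-of-date classification would inflate that count.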

QUESTIONS
(i) Is the barcode of a sphingid species not present in the library assigned correctly to genus, tribe or subfamily?

100% library and liberal criterion:

GENUS      TRIBE      SUBFAMILY
83.1%      74.4%      89.9%

Genus assignments: 100% library and liberal criterion


False Positive (12)

Tribe assignments: 100% library and liberal criterion

False Positives: Cautethia, Aleuron, Enyo, Eumorpha, Pachygonidia

[Figure: Sphingidae phylogeny of Kawahara et al. 2009, with Dilophonotina and Macroglossina labelled]

QUESTIONS
(i) Is the barcode of a sphingid species not present in the library assigned correctly to genus, tribe or subfamily?
(ii) Does assignment accuracy increase with increased SPECIES COMPLETENESS of the LIBRARY?

[Figure: True assignments (%) against library completeness, from zero spp. to 100% spp.*, plotted for GENUS, TRIBE and SUBFAMILY]

*For genus this excludes the conspecific of the query
*For tribe and subfamily this excludes congenerics of the query

QUESTIONS
(i) Is the barcode of a sphingid species not present in the library assigned correctly to genus, tribe or subfamily?
(ii) Does assignment accuracy increase with increased SPECIES COMPLETENESS of the LIBRARY?
(iii) To what extent does assignment accuracy depend on the ASSIGNMENT CRITERIA applied?

[Figure: GENUS ASSIGNMENTS, TRUE and FALSE POSITIVE assignments under the STRICT, LIBERAL and DISTANCE criteria]

[Figure: TRIBE ASSIGNMENTS, TRUE and FALSE POSITIVE assignments under the STRICT, LIBERAL and DISTANCE criteria]

[Figure: SUBFAMILY ASSIGNMENTS, TRUE and FALSE POSITIVE assignments under the STRICT, LIBERAL and DISTANCE criteria]

(i) Correct assignment to genus, tribe or subfamily


Morphological diagnostics?
http://tpittaway.tripod.com/sphinx/m_ste.htm

Out-of-date classification?
-tribe: especially unstable
-subfamily: good success

Few cases where barcode assignment was clearly misleading or conflicting with current thinking
-deserve further scrutiny

(ii) Effect of library completeness

(iii) Effect of assignment criterion


Distance
-many false positives
-threshold is impossible

Tree-based
-more accurate
-dependent on tree shape

Avise & Johns PNAS (1999)

CONCLUSIONS
(i) Barcoding successfully assigns queries to higher taxa in the absence of a species match
(ii) Assignment accuracy increases with increased species richness of the library
-but high success is seen at low richness
-subsequent increases in library produce rapidly diminishing returns
(iii) Distance is less accurate than tree-based approaches, but these are highly dependent on tree shape and the naturalness of the classification

Implications for other families and other groups

Does anyone else have a species-complete reference library?

Laboratory Área de Conservación Guanacaste, Costa Rica


ACG Parataxonomists, Cox, Lambert, Wege Foundations & NSF

Funding & Support

Questions