Data Quality Concepts and Techniques Applied to Taxonomic Databases

Published by Eduardo Dalcin
The thesis investigates the application of concepts and techniques of data quality in taxonomic databases to enhance the quality of information services and systems in taxonomy. Taxonomic data are arranged and introduced in Taxonomic Data Domains in order to establish a standard and a working framework to support the proposed Taxonomic Data Quality Dimensions, as a specialised application of conventional Data Quality Dimensions in the Taxonomic Data Quality Domains.

The thesis presents a discussion about improving data quality in taxonomic databases, considering conventional Data Cleansing techniques and applying generic data content error patterns to taxonomic data. Techniques of taxonomic error detection are explored, with special attention to scientific name spelling errors.

The spelling error problem is scrutinized through spelling error detecting techniques and algorithms. Spelling error detection algorithms are described and analysed. In order to evaluate the applicability and efficiency of different spelling error detection algorithms, a suite of experimental spelling error detection tools was developed and a set of experiments was performed, using a sample of five different taxonomic databases.
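As a rough illustration of what a spelling error detection pass over scientific names might look like, the sketch below flags pairs of distinct names that differ by a single edit. The abstract does not say which algorithms the thesis evaluates, so the use of Levenshtein edit distance here is an assumption, and the function names (`levenshtein`, `suspect_pairs`), the threshold and the sample names are hypothetical and not taken from the thesis.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suspect_pairs(names, max_distance=1):
    """Return pairs of distinct names within max_distance edits of each other.

    Assumption: two distinct names this close together are candidates for
    manual review, not confirmed errors.
    """
    unique = sorted(set(names))
    return [(x, y) for x, y in combinations(unique, 2)
            if levenshtein(x, y) <= max_distance]


# Illustrative sample only; the second entry is a deliberate misspelling.
names = ["Acacia dealbata", "Acacia dealbatta", "Acacia decurrens", "Acacia dealbata"]
pairs = suspect_pairs(names)
print(pairs)
```

The pairwise comparison is quadratic in the number of distinct names, which is acceptable for a small sample but would need blocking or indexing on a full taxonomic database.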

The results of the experiments are analysed from the algorithm and from the database point of view.

Database quality assessment procedures and metrics are discussed in the context of taxonomic databases and the previously introduced concepts of Taxonomic Data Domains and Taxonomic Data Quality Dimensions.

Four questions related to Taxonomic Database Quality are discussed, followed by conclusions and recommendations involving information system design and implementation and the processes involved in taxonomic data management and information flow.

Published by: Eduardo Dalcin on Feb 20, 2008
Copyright: Attribution Non-commercial

University of Southampton
FACULTY OF MEDICINE, HEALTH AND LIFE SCIENCES
School of Biological Sciences
Data Quality Concepts and Techniques Applied to Taxonomic Databases
by
Eduardo Couto Dalcin
Thesis for the degree of Doctor of Philosophy
February 2005
 
University of Southampton
FACULTY OF MEDICINE, HEALTH AND LIFE SCIENCES
School of Biological Sciences
Doctor of Philosophy
Abstract
Data Quality Concepts and Techniques Applied to Taxonomic Databases
by Eduardo Couto Dalcin

The thesis investigates the application of concepts and techniques of data quality in taxonomic databases to enhance the quality of information services and systems in taxonomy. Taxonomic data are arranged and introduced in Taxonomic Data Domains in order to establish a standard and a working framework to support the proposed Taxonomic Data Quality Dimensions, as a specialised application of conventional Data Quality Dimensions in the Taxonomic Data Quality Domains.

The thesis presents a discussion about improving data quality in taxonomic databases, considering conventional Data Cleansing techniques and applying generic data content error patterns to taxonomic data. Techniques of taxonomic error detection are explored, with special attention to scientific name spelling errors.

The spelling error problem is scrutinized through spelling error detecting techniques and algorithms. Spelling error detection algorithms are described and analysed. In order to evaluate the applicability and efficiency of different spelling error detection algorithms, a suite of experimental spelling error detection tools was developed and a set of experiments was performed, using a sample of five different taxonomic databases. The results of the experiments are analysed from the algorithm and from the database point of view.

Database quality assessment procedures and metrics are discussed in the context of taxonomic databases and the previously introduced concepts of Taxonomic Data Domains and Taxonomic Data Quality Dimensions.

Four questions related to Taxonomic Database Quality are discussed, followed by conclusions and recommendations involving information system design and implementation and the processes involved in taxonomic data management and information flow.
