Ref. Ares(2016)1344931 - 17/03/2016

Collection Management Systems: A Research Perspective

Fredrik Ronquist
2016-03-14

Introduction
As a systematist, you are likely to examine, sequence or analyze a number of specimens
in the course of your studies. Those specimens can be on loan from a natural history
collection, or they can be in a collection you build yourself. In either case, you will have
to manage information about the specimens and link that information to the results of
your studies, which may be presented in papers or in various online databases. What is
the best way of approaching these information management tasks?
Most systematists today keep some kind of personal collection database. It is often built
from scratch using commercial database software like FileMaker Pro or Microsoft
Access. There are also dedicated software packages like Specify and Biota, which allow
you to manage collections data, and VoSeq, which is focused on handling DNA sequence
and voucher information.
Unfortunately, keeping a personal collection database has the effect of building
information silos that are not easily connected. For instance, if you extract and sequence
DNA from some small part of a specimen from a natural history museum, you are likely
to create your own identifier for the voucher specimen, and this is the number that is
likely to end up with the sequence submission record. However, once you have returned
the specimen to the natural history museum where it belongs, it is unlikely that it will be
possible to link any additional information that might become available about the
specimen in the future through your identifier. Most natural history museums simply do
not have systems in place for routinely assigning globally unique identifiers (GUIDs) to
specimens, which make it possible to find information about them online. Many
entomology collections do not even keep specimen-level databases today, except
possibly for type specimens.

Linking data through identifiers

There is no easy solution to these problems just yet. There have been several initiatives
to create systems of unique identifiers or acronyms for the public collections around the
world. Index Herbariorum, covering the world’s herbaria, is probably the best known.
There is also a list of insect and spider collections of the world, each one
associated with a unique acronym, typically four letters long. These acronyms are
commonly used in systematic entomology papers; you might already have seen them in
a listing of repositories of studied material. The list has a long history; the current
version is maintained online by the Bishop Museum in Hawaii.
In recent years, there have been attempts to merge lists like these into a global list. The
current initiative seems to be with the Consortium for the Barcode of Life, which is
maintaining GRBio, the Global Registry of Biodiversity Repositories. Unfortunately, the
list is partly outdated and contains a number of obvious problems. It does not appear
that the list is very actively maintained.
Previously, there has been a movement to assign life science identifiers (LSIDs) as GUIDs
for biodiversity information records, such as specimen records. However, many experts
are now abandoning LSIDs in favor of plain old uniform resource identifiers (URIs), that
is, internet addresses. Some institutions are already able to provide permanent
URIs for their collection objects, often including the institutional address, a globally
recognized acronym for the collection, and a unique catalog number for the object.
However, these are still early days in the adoption of such URI schemes, and many
institutions are still working on this, or have not even started.
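To make the URI pattern described above concrete, here is a minimal Python sketch that composes a specimen URI from the three parts mentioned: an institutional address, a collection acronym, and a catalog number. The function name `specimen_uri`, the path layout, the acronym `XXXX`, and the `example.org` address are assumptions made for illustration only; each institution defines its own scheme.

```python
from urllib.parse import quote


def specimen_uri(base_url: str, acronym: str, catalog_number: str) -> str:
    """Compose a URI from an institutional address, a collection acronym,
    and a catalog number. The path layout is hypothetical."""
    if not (base_url and acronym and catalog_number):
        raise ValueError("all three components are required")
    # Percent-encode the variable parts so that spaces or slashes in a
    # catalog number cannot change the structure of the URI.
    parts = [
        quote(acronym.strip(), safe=""),
        quote(catalog_number.strip(), safe=""),
    ]
    return base_url.rstrip("/") + "/" + "/".join(parts)


if __name__ == "__main__":
    # example.org is a reserved placeholder domain, not a real museum address.
    print(specimen_uri("https://collections.example.org/object", "XXXX", "ENT 0012345"))
```

Whatever the layout, the important property is that the same specimen always yields the same URI, so the identifier can be cited and later resolved.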
In conclusion, there is currently no way for you to assemble information about
specimens in natural history collections and be sure you can link this information online
through a GUID to other data associated with that specimen. The best you can do is to
ask each natural history museum you are borrowing specimens from for a GUID,
preferably a URI, for each specimen they send to you for loan or allow you to study. Then
use that GUID in your own database, and cite it in any online references you make to that
specimen in sequence databases or other online repositories to which you upload
results from your studies.
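One way to follow this advice in a personal database is sketched below, using Python's built-in `sqlite3` module. The table and column names, the local identifier `FR-0042`, the accession `SEQ0001`, and the example URI are all hypothetical; the point is only that your own voucher identifier and the institutional GUID are stored side by side, so that any sequence record can be traced back to the GUID.

```python
import sqlite3

SCHEMA = """
CREATE TABLE specimen (
    local_id    TEXT PRIMARY KEY,  -- identifier used in your own records
    guid        TEXT UNIQUE,       -- GUID (preferably a URI) from the museum
    repository  TEXT NOT NULL      -- acronym of the collection housing it
);
CREATE TABLE sequence_record (
    accession   TEXT PRIMARY KEY,  -- accession in a sequence database
    local_id    TEXT NOT NULL REFERENCES specimen(local_id)
);
"""


def guid_for_accession(conn, accession):
    """Return the institutional GUID of the voucher behind a sequence, or None."""
    row = conn.execute(
        "SELECT s.guid FROM sequence_record r "
        "JOIN specimen s ON s.local_id = r.local_id "
        "WHERE r.accession = ?",
        (accession,),
    ).fetchone()
    return row[0] if row else None


def demo():
    """Build an in-memory database with one voucher and one sequence."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO specimen VALUES (?, ?, ?)",
        ("FR-0042", "https://collections.example.org/object/XXXX/12345", "XXXX"),
    )
    conn.execute("INSERT INTO sequence_record VALUES (?, ?)", ("SEQ0001", "FR-0042"))
    return guid_for_accession(conn, "SEQ0001")
```

Calling `demo()` returns the URI stored for the voucher, which is the value you would cite in a sequence submission instead of, or in addition to, your local identifier.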

Building your own collection

How do you manage the information about the specimens you have collected yourself?
Again, there is no simple answer but a suggestion is that you choose a natural history
institution as the home of your collection. Make sure that the institution will grant you a
long-term loan of your own collection if you move to another institution; you should not
take this for granted. Most aspiring systematists will end up moving among institutions
several times in their career, and usually need to be able to take their collection with
them on a long-term loan (or possibly shift the home of the specimens based on a
transfer agreement).
Once you have chosen a home collection, use their system for assigning GUIDs to your
specimens. This will minimally require some interaction with whatever system the
institution is using to maintain catalog numbers. A more radical and future-oriented
approach is to choose to maintain all of your collection-related information in the
institutional system. However, this is only a reasonable choice if they provide a modern
web-based system that is adequate for your needs. Most institutional collection systems
in use today do not meet the requirements for research use, but this may change in the
next few years thanks to projects like DINA (see below).
The institutional perspective is somewhat different. Typically, collection catalogs were
begun a long time ago by curators who assembled information about the
objects in the collection on index cards. When computers became more commonly
available, the information was transferred to databases purpose-built by each curator
for the part of the collections they were responsible for. The end result was a large
number of heterogeneous systems, built using different software packages, often
maintained by single individuals, and using different data formats. This is still the
situation in many institutions housing natural history collections today.

In recent years, this situation has started to change as natural history museums and
similar institutions have become increasingly aware of the value of digital assets and the
need for professional information management. Recent trends like the push for open
science, shared public data, the semantic web, and linked open data have accelerated
this development. This has led to a movement towards a coherent information
management strategy and a central institutional collection management system in most
places.
In the choice of a central system, an organization can opt to: (1) acquire a commercial
system (EMu being the major system used currently by large natural history museums);
(2) develop a system in-house; or (3) join other institutions in distributed open-source
development. There are many reasons suggesting that the third choice is going to be the
most flexible and cost-efficient solution in the long term (see separate PowerPoint
presentation).
The DINA consortium is currently the largest initiative for producing a web-based
collection management system through distributed open-source development. The
consortium currently includes six organizations in six different countries, four of which
contribute actively to the development. Two of the BIG 4 institutions are among the core
members of the DINA consortium: the Swedish Museum of Natural History and the
Natural History Museum of Denmark.
DINA is based on the Specify data model. A hybrid DINA-Specify system, relying on the
Specify 6 Java client for core collection management tasks, is available from the DINA
team at the Swedish Museum of Natural History (contact Markus Skyttner
[Link]@[Link]). The hybrid system has been in production at the Swedish
Museum of Natural History since 2011, when the first components were installed.
A fully web-based DINA version is not expected to be available until 2018 according to
the current DINA roadmap. Functionality specifically tailored for researchers is not
currently on the roadmap, but the consortium already provides API specifications
that you can use to develop your own research client for the DINA-Web database
backend. If you are interested in exploring this, contact Markus Skyttner (e-mail above)
for more information on how to install and run a DINA backend that you can use to
communicate with the front-end client you develop. You can run the entire DINA system
on your laptop, and you can share your front-end client with all other institutional and
individual DINA users through the DINA consortium and its GitHub repository if you
like.
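The general shape of such a research client can be sketched in a few lines of Python. This is not the DINA API: the endpoint name `specimens`, the query parameter `catalogNumber`, the class name `ResearchClient`, and the local address are hypothetical placeholders, to be replaced with whatever the consortium's API specifications actually define.

```python
import json
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get(url: str) -> str:
    """Default transport: fetch a URL and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


class ResearchClient:
    """Tiny front-end for a collection backend exposing a JSON web API.

    The endpoint name and query parameter are hypothetical placeholders;
    replace them with those given in the backend's API specification.
    """

    def __init__(self, base_url: str, fetch: Callable[[str], str] = http_get):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # injectable, so the client can be tested offline

    def url_for(self, catalog_number: str) -> str:
        """Build the request URL for a catalog-number lookup."""
        query = urlencode({"catalogNumber": catalog_number})
        return f"{self.base_url}/specimens?{query}"

    def find_specimens(self, catalog_number: str) -> list:
        """Fetch and decode the JSON list returned by the backend."""
        return json.loads(self.fetch(self.url_for(catalog_number)))
```

Because the transport is passed in as a function, you can develop and test the client against canned responses before pointing it at a backend running on your laptop, for example `ResearchClient("http://localhost:8080/api")`, where the port and path are again assumptions.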
If you like the DINA approach and your institution is not a member of the DINA
consortium, you can ask the decision makers at your institution to consider the
possibility of joining the DINA initiative. An easy way of preparing yourself for a future
transition to the DINA system is to use Specify or a Specify-compatible data model for
your own private collection database.
Separately, you will find a PowerPoint presentation that gives you an introduction to the
DINA project, with pointers to web sites where you can find more information.

Introduction to the DINA project

Fredrik Ronquist
Dept. Bioinformatics and Genetics
Swedish Museum of Natural History
Collection Management Systems
Institutional Choices:
1. Develop your own system in-house
2. Acquire a commercial system (e.g., EMu)
3. Partner with other institutions in distributed open-source development (e.g., DINA project)
The Case For Open Source
• Market considerations. Professional collection management systems are not
viable commercial products in a pluralistic market.
• Long-term stability. An open-source software solution developed by
institutions with long-term focus will be more stable than a commercial
solution.
• Flexibility. A distributed open-source system must by necessity conform to a
modular design based on open APIs. This favors flexibility and adaptability
in a way that a commercial product will not.
• Cost effectiveness. Although some overhead is associated with distributed
development, more development teams involved in the effort will result in a
lower cost to the individual institution compared to in-house or commercial
solutions.
The Case For Open Source (cont’d)
• Opt-in, opt-out scheme. Institutions can participate in the development
when they have resources to do so, and can opt out when they do not. At any
single point in time, it should be feasible to have enough institutions involved
for development to move forward at an acceptable pace.
• Community control. A distributed open-source solution means that the
community retains control over both the information standards and the
system architecture and web service/API designs.
• Egalitarian. A professional open-source collection management system
offers a better way for developing countries to catch up than any commercial
product.
• Stable marketplace for extensions and services. A community-supported
de facto standard for collection management systems architecture will ensure
that there is a stable market for various plugins, extensions and services based
on the system.
EMu: The major commercial collection management system used in natural
history museums
Axiell Group, owned by Swedish venture capitalists, recently acquired the company
behind EMu. Lack of competition = profit.
The Natural History Museum in the UK is one of several major natural history museums
currently running EMu. They have given Axiell 12 months to solve a number of
serious issues with the system; in parallel, open-source options are being reviewed.
Koha: originated in New Zealand, now has a 15% market share for library management systems.
Atlas of Living Australia: originated in Australia (284 M SEK initial investment);
the world’s most complete system for integration, analysis and visualization
of biodiversity data, now with 70+ developers around the world, running or being installed
in many countries in Europe and South America in addition to Australia.
DINA Consortium
(Digital Information system for NAtural history data)
• Core mission. Pool resources to develop an open-source web-based
collection management system for natural history collections.
• Core Member. Required contribution: 1.0 FTE to the project, of which at
least 0.5 to the development effort. Voting member of the DINA Technical
Committee (TC), which controls deliverables and deadlines for the 1.0 FTE
contribution.
• Associate Member. No contribution requirements. Non-voting member of
the Steering Group.
[Organization chart: Steering Group (all members); Technical Committee (core members); Task Force I and Task Force II; Development Teams I, II and III]
Current DINA Consortium
• Core Members
  • Agriculture and Agri-Food Canada, Ottawa
  • Estonia (University of Tartu)
  • Denmark (University of Copenhagen)
  • Sweden (Swedish Museum of Natural History)
• Associate Members
  • Museum für Naturkunde, Berlin
  • Royal Botanic Garden, Edinburgh
• Open to Additional Members
  • Memorandum of Cooperation and more information at [Link]
Challenges and Lessons Learned
• Commitment. Formalization of the collaboration and a good governance
model are essential to ensure work towards common goals.
• Patience. It may take an institution with a long-term perspective several years
from a decision to join the consortium to actively contributing to the
development.
• Respect. Different teams come with different backgrounds, different skill
sets, and different external pressures. Striking the right balance between the
cathedral (centrally controlled) and the bazaar (locally controlled) approach to
collaborative development is crucial.
• Trust. A team needs to trust the other teams in the consortium to deliver
according to agreements, so that consortium membership pays off. In the
DINA consortium, we have just reached this stage, 5 years after we started
collaborating with the Specify group in Kansas, USA.
DINA Versions
• DINA Light (“Specify”, “DINA-Specify Hybrid”)
  • Based largely on Specify 6 and the Specify data model, combined with new APIs and web
    clients (collection web portal, biological survey client, species pages, DNA barcode portal,
    loan request system)
  • Fully compatible with Specify 7
  • In production in Sweden since 2011. Currently includes many of the small Swedish collection
    databases (NRM entomology, geology; GNM entomology, SMTP) with several more on the
    way in (NRM zoology (part), GB herbarium, GNM zoology and geology).
• DINA Web
  • Modern, modular, service-oriented architecture, optimized for distributed development,
    based to a large extent on the Specify data model
  • DINA API guidelines and style guidelines adopted
  • Architectural road map, module overview and API blueprints under development
  • Core modules available in proto-DINA versions: collection web portal, species pages system,
    biological survey client, DNA barcode portal
  • Core modules under development: taxonomy module, collection manager, DNA sequence
    module, DINA data tool (batch uploading and editing)
DINA Light (DINA-Specify Hybrid) Original Version
[Architecture diagram. Components shown: web interface; Morphbank web client; central databases and file archives (Specify database, Morphbank database, Morphbank image archive); Specify 6 Java “thick client”]
DINA Web System Overview
[System diagram. Client modules, with prototype addresses: Collection Manager (not available yet), collection web portal ([Link]), biodiversity survey client ([Link]), DNA barcode portal ([Link]), species pages ([Link]). Underlying stores: collection-related databases, media metadata, media files, BLAST DB, taxon info]
Current Specify 6 client (Java stand-alone). Old technology, old-style interface.
Web-based Specify 7 client. Restricted functionality, same interface as Java client, monolithic system with a code base that is difficult to work with.
Modularization is important to facilitate distributed development.
Modern web form for the collection manager UI, developed in collaboration with collection managers in the DINA consortium.
The Estonian Pluto-F project contributes the taxonomy module to DINA-Web.
The Canadian SeqDB project contributes the DNA sequence module to DINA-Web.
More DINA Info
• DINA project wiki ([Link])
  • Project introduction
  • Steering committee and technical committee information, minutes of meetings, etc.
  • Status of the project in each of the participating institutions
• DINA GitHub repository ([Link])
  • DINA API guidelines and style guidelines
  • Module map, system overview
  • Code for DINA modules
• DINA components in production in Sweden:
  • [Link] (species pages, in Swedish)
  • [Link] (biodiversity survey client, requires login)
  • [Link] (collection web portal)
  • [Link] (DNA barcode portal)
  • [Link] (loan request)
