Professional Documents
Culture Documents
National Academies of Sciences, Engineering, and Medicine. 2020. Life Cycle Decisions for
Biomedical Data: The Challenge of Forecasting Costs. Washington, DC: The National
Academies Press. https://doi.org/10.17226/25639.
The framework presented should be considered the basis of a cost forecast rather than a
one-size-fits-all analytical tool for all applications. How it is applied in any situation depends
on the circumstances, needs, and resources available to those involved. The activities,
decisions, and cost drivers of the biomedical information resource will be situationally
dependent, and the framework will need to be modified to suit the specific purpose. Though it
is not presently required, filling in this template will walk the cost forecaster through the
necessary considerations for a biomedical information resource.
In an effort to greater facilitate the implementation of the long-term cost forecasting of data in
the biomedical community, this template provides cost forecasters with an editable version of
the cost-driver analysis presented in the report, and it will allow cost forecasters to document
responses to important decision points. Users are able to answer the prescribed questions, as
applicable, as well as add or remove questions so as to better fit their specific needs. This
document may also be shared and used as a collaborative tool for greater transparency and
communication among those responsible for managing the resource.
For more information on the listed cost drivers, please refer to Chapter 4 in the report.
A. Content (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. How many files will be in a single data submission?
Size (volume
and number of 3. Are the data sizes likely to stay stable over the life of
A.1
items) the resource?
> size = higher
costs 4. What is the total amount of data expected?
higher cost 5. How many different data types are being produced?
https://doi.org/10.17226/25639
2. Can the data be easily recreated?
> replaceability
= lower cost
https://doi.org/10.17226/25639
B. Capabilities (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. Will the repository have to provide user annotation
capabilities?
Search
Capabilities 3. How complex are the queries that will be supported?
B.4
> search
capabilities =
increased cost 4. What types of features for search will be provided?
https://doi.org/10.17226/25639
Data Linking 1. Will the data require/benefit from linkages to other
and Merging related items?
B.5
> linking and 2. Will the resource provide the ability to combine data
merging = across records based on common entities/standards?
increased cost
1. Will the resource provide the ability to track uploads,
views, and downloads?
Use Tracking> 2. If so, and if made available to users, how will this
B.6 tracking =
increased cost
information be made available?
3. Will the resource track data citations to its data?
Data Analysis 1. What types of data analyses and visualizations will the
and repository support?
B.7
Visualization
2. What types of other data operations will the repository
> services = support (e.g., file conversions, sequence comparison)?
higher cost
3. Do these services require significant computational
resources?
https://doi.org/10.17226/25639
C. Control (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
Content 1. Will all appropriate data be accepted or will there be a
Control review process?
C.1
> review 2. Will the review process be automated or will it require
processes = human oversight?
increased cost
1. What quality control process will the repository
support?
C.2
> quality control 5. Will any of the data be time sensitive and how will data
= increased cost currency be ensured?
Access Control 2. At what level are they instituted (e.g., individual users,
C.3
> controls =
individual data sets)?
increased cost
3. Does use of the data require approval by a data access
committee?
https://doi.org/10.17226/25639
D. External Context (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. Is there a requirement to replicate the information
Resource resource at multiple sites (i.e., mirroring)?
Replication
D.1
> replication =
increased cost
D.3
> distinctiveness
= increased cost
https://doi.org/10.17226/25639
E. Data Life Cycle (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. Is the repository expected to continuously grow over
its lifetime?
Useful Lifetime 2. Does the resource have a defined period of time for
E.3
limited lifetime =
which it will operate?
decreased cost 3. Does the resource have to provide a guarantee that the
data will be available for a finite period of time (e.g., 10
years)?
Offline and 1. Can the resource take advantage of offline storage for
Deep Storage data that are not heavily used?
> offline/deep
2. Does the resource have a plan for moving unused data
E.4 storage = to deep storage (i.e., State 3)?
decreased costs
> transfers =
increased cost
https://doi.org/10.17226/25639
F. Contributors and Users (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. Is the number of contributors known? If not, can it be
estimated?
2. Are all the data originating from the same source (e.g.,
a single instrument or a single organization)?
https://doi.org/10.17226/25639
3. Will a “help desk” be provided?
F.4
> outreach = 3. Will the resource have a booth at the conference for
increased costs live demos or conduct hands-on tutorials?
https://doi.org/10.17226/25639
G. Availability (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. What is the tolerance for outages of the resource?
Tolerance for
Outages 2. What measures will be taken to avoid and mitigate
G.1
< tolerance for
outages?
outages =
increased costs 3. How quickly and completely does the resource need to
recover from an outage?
https://doi.org/10.17226/25639
H. Confidentiality, Ownership, Security (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. Will any of the data require special protections?
Ownership 2. Will all the data be released by the data resource under
H.2
> ownership =
the same license or will different permissions be
assigned to different data sets?
increased costs
3. Will data submission agreements be necessary?
H.3
> security = 2. Do these measures require using protected computing,
increased cost storage, or networking platforms?
https://doi.org/10.17226/25639
I. Maintenance and Operations (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
Periodic 1. What processes will be put in place for checking the
Integrity integrity of the hardware, software, and data?
Checking
I.1
> integrity
2. How frequently will these checks be performed?
checking =
increased cost
1. Will the bandwidth available to the resource be
Data Transfer sufficient for the data sizes and rates required?
Capacity
I.2
> data transfer
upgrades =
increased cost
I.5
Billing and
Collections
https://doi.org/10.17226/25639
J. Standards, Regulatory, and Governance Issues (details)
Relative Cost
Category Cost Driver Decision Points/Issues Potential (Low,
Medium, High)
1. How many different standards will the resource have to
support?
c. If so, are the standards mature (i.e., how much are they
expected to evolve)?
Applicable
Standards 3. Are the data validators and converters available for the
J.1
> mature
standards, or do they have to be developed?
standards = 4. What is the plan for “retrofitting” data that have been
decreased costs uploaded without the standards in place?
Regulatory and 1. What laws and regulations cover the data and
Legislative operation of the resource?
J.2
Environment
2. Is the resource covered by an open-records act?
> regulation =
increased cost
1. Does the resource need to maintain an external
Governance advisory board?
https://doi.org/10.17226/25639
> consultations 2. Will external stakeholders be consulted on an ongoing
= increased time basis?
= increased
costs
https://doi.org/10.17226/25639