Professional Documents
Culture Documents
2
3
4
10
11
12
13
14
15 NIST Big Data Public Working Group
16 Standards Roadmap Subgroup
17
18
19
20
21
22 Draft Version 2
23 March 20, 2017
24 http://dx.doi.org/ZZZZ
25
26
28
29
30 NIST Special Publication 1500-7
31 Information Technology Laboratory
32
56
57
58 U. S. Department of Commerce
59 Wilbur L. Ross, Jr., Secretary
60
61 National Institute of Standards and Technology
62 Dr. Kent Rochford, Acting Under Secretary of Commerce for Standards and Technology
63 and Acting NIST Director
64
65
66
67 National Institute of Standards and Technology (NIST) Special Publication 1500-7
68 68 pages (March 20, 2017)
69
70
71
72Certain commercial entities, equipment, or materials may be identified in this document in order to describe an
73experimental procedure or concept adequately. Such identification is not intended to imply recommendation or
74endorsement by NIST, nor is it intended to imply that the entities, materials, or equipment are necessarily the best
75available for the purpose.
76There may be references in this publication to other publications currently under development by NIST in
77accordance with its assigned statutory responsibilities. The information in this publication, including concepts and
78methodologies, may be used by Federal agencies even before the completion of such companion publications. Thus,
79until each publication is completed, current requirements, guidelines, and procedures, where they exist, remain
80operative. For planning and transition purposes, Federal agencies may wish to closely follow the development of
81these new publications by NIST.
82Organizations are encouraged to review all draft publications during public comment periods and provide feedback
83to NIST. All NIST publications are available at http://www.nist.gov/publication-portal.cfm.
84
85
86
87
88 Comments on this publication may be submitted to Wo Chang
89
90 National Institute of Standards and Technology
91 Attn: Wo Chang, Information Technology Laboratory
92 100 Bureau Drive (Mail Stop 8900) Gaithersburg, MD 20899-8930
93 Email: SP1500comments@nist.gov
94
95
96 Request for Contribution
97The NIST Big Data Public Working Group (NBD-PWG) requests contributions to this draft version 2 of
98the NIST Big Data Interoperability Framework Volume 7, Standards Roadmap. All contributions are
99welcome, especially comments or additional content for the current draft.
100The NBD-PWG is actively working to complete version 2 of the set of nine (9) NBDIF documents. The
101goals of version 2 are to enhance the version 1 content, define general interfaces between the NIST Big
102Data Reference Architecture (NBDRA) components by aggregating low-level interactions into high-level
103general interfaces, and demonstrate how the NBDRA can be used.
104To contribute to this document, please follow the steps below, as soon as possible but no later than May
1051, 2017.
106 1. Register as a user of the NIST Big Data Portal (https://bigdatawg.nist.gov/newuser.php)
107 2. Record comments and/or additional content in one of the following methods:
108 a. TRACK CHANGES: make edits to and comments on the text directly into this Word
109 document using track changes
110 b. COMMENT TEMPLATE: capture specific edits using the Comment Template
111 (http://bigdatawg.nist.gov/_uploadfiles/SP1500-1-to-7_comment_template.docx), which
112 includes space for Section number, page number, comment, and text edits
113 3. Submit the edited file from either method above to SP1500comments@nist.gov with the volume
114 number in the subject line (e.g., Edits for Volume 1).
115 4. Attend the weekly virtual meetings on Tuesdays for possible presentation and discussion of your
116 submission. Virtual meeting logistics can be found at https://bigdatawg.nist.gov/program.php
117Please be as specific as possible in any comments or edits to the text. Specific edits include, but are not
118limited to, changes in the current text, additional text further explaining a topic or explaining a new topic,
119additional references, or comments about the text, topics, or document organization.
120The comments and additional content will be reviewed by the subgroup co-chair responsible for the
121volume in question. Comments and additional content may be presented and discussed by the NBD-PWG
122during the weekly virtual meetings on Tuesday.
123Three versions are planned for the NBDIF set of documents, with Versions 2 and 3 building on the first.
124Further explanation of the three planned versions and the information contained therein is included in
125Section 1.5 of each NBDIF document.
126Please contact Wo Chang (wchang@nist.gov) with any questions about the feedback submission process.
127Big Data professionals are always welcome to join the NBD-PWG to help craft the work contained in the
128volumes of the NBDIF. Additional information about the NBD-PWG can be found at
129http://bigdatawg.nist.gov. Information about the weekly virtual meetings on Tuesday can be found at
130https://bigdatawg.nist.gov/program.php.
131
132
133
134
135 Reports on Computer Systems Technology
136The Information Technology Laboratory (ITL) at NIST promotes the U.S. economy and public welfare
137by providing technical leadership for the Nation’s measurement and standards infrastructure. ITL
138develops tests, test methods, reference data, proof of concept implementations, and technical analyses to
139advance the development and productive use of information technology (IT). ITL’s responsibilities
140include the development of management, administrative, technical, and physical standards and guidelines
141for the cost-effective security and privacy of other than national security-related information in Federal
142information systems. This document reports on ITL’s research, guidance, and outreach efforts in IT and
143its collaborative activities with industry, government, and academic organizations.
144
145 Abstract
146Big Data is a term used to describe the large amount of data in the networked, digitized, sensor-laden,
147information-driven world. While opportunities exist with Big Data, the data can overwhelm traditional
148technical approaches and the growth of data is outpacing scientific and technological advances in data
149analytics. To advance progress in Big Data, the NIST Big Data Public Working Group (NBD-PWG) is
150working to develop consensus on important, fundamental concepts related to Big Data. The results are
151reported in the NIST Big Data Interoperability Framework series of volumes. This volume, Volume 7,
152contains summaries of the work presented in the other six volumes and an investigation of standards
153related to Big Data.
154
155 Keywords
156Big Data, reference architecture, System Orchestrator, Data Provider, Big Data Application Provider, Big
157Data Framework Provider, Data Consumer, Security and Privacy Fabric, Management Fabric, Big Data
158taxonomy, use cases, Big Data characteristics, Big Data standards
159
160
161 Acknowledgements
162This document reflects the contributions and discussions by the membership of the NBD-PWG, co-
163chaired by Wo Chang (NIST ITL), Nancy Grady (SAIC), Geoffrey Fox (University of Indiana), Arnab
164Roy (Fujitsu), Mark Underwood (Krypton Brothers), David Boyd (InCadence Corp), Russell Reinsch
165(Loci), and Gregor von Laszewski (University of Indiana).
166The document contains input from members of the NBD-PWG: Technology Roadmap Subgroup, led by
167Russell Reinsch (CFGIO); Definitions and Taxonomies Subgroup, led by Nancy Grady (SAIC); Use
168Cases and Requirements Subgroup, led by Geoffrey Fox (University of Indiana); Security and Privacy
169Subgroup, led by Arnab Roy (Fujitsu) and Mark Underwood (Krypton Brothers); and Reference
170Architecture Subgroup, led by David Boyd (InCadence Strategic Solutions).
171NIST SP1500-7, Version 2 has been collaboratively authored by the NBD-PWG. As of the date of this
172publication, there are _ NBD-PWG participants from industry, academia, and government. Federal
173agency participants include the National Archives and Records Administration (NARA), National
174Aeronautics and Space Administration (NASA), National Science Foundation (NSF), and the U.S.
175Departments of Agriculture, Commerce, Defense, Energy, Health and Human Services, Homeland
176Security, Transportation, Treasury, and Veterans Affairs.
177NIST acknowledges the specific contributionsa to this volume by the following NBD-PWG members:
178A list of contributors to version 2 of this volume will be added here.
179
180The editors for this document were Russell Reinsch and Wo Chang
181
182
1a “Contributors” are members of the NIST Big Data Public Working Group who dedicated great effort to prepare
2and substantial time on a regular basis to research and development in support of this document.
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
vi
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
241LIST OF FIGURES
242FIGURE 1: NIST BIG DATA REFERENCE ARCHITECTURE TAXONOMY...........................................................................................6
243FIGURE 2: NBDRA CONCEPTUAL MODEL...........................................................................................................................13
244
245LIST OF TABLES
246TABLE 1: SEVEN REQUIREMENTS CATEGORIES AND THEIR SPECIFICS...........................................................................................7
247TABLE 2: DATA SOURCE REQUIREMENTS-TO-STANDARDS MATRIX:..............................................................................................9
248TABLE 3: AN EXCERPT OF THE MASTER USE CASE-TO-STANDARDS MATRIX:...................................................................................9
249TABLE 4: M0165 BREAKOUT SUPPLEMENT.........................................................................................................................10
250TABLE 5: MAPPING USE CASE CHARACTERIZATION CATEGORIES TO REFERENCE ARCHITECTURE COMPONENTS AND FABRICS.............12
251TABLE 6: EXCERPT OF HEALTHCARE RELATED STANDARDS.......................................................................................................18
252TABLE 7: RESULTS FROM POLL ABOUT DATA SCIENCE PROJECT METHODOLOGY.........................................................................20
253TABLE B-1: BIG DATA RELATED STANDARDS......................................................................................................................B-1
254TABLE C-1: STANDARDS AND THE NBDRA........................................................................................................................C-1
255TABLE D-1: CATEGORIZED STANDARDS.............................................................................................................................D-1
256
257
258
259
260
vii
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
viii
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
266 1. INTRODUCTION
2671.1 BACKGROUND
268There is broad agreement among commercial, academic, and government leaders about the remarkable
269potential of Big Data to spark innovation, fuel commerce, and drive progress. Big Data is the common
270term used to describe the deluge of data in today’s networked, digitized, sensor-laden, and information-
271driven world. The availability of vast data resources carries the potential to answer questions previously
272out of reach, including the following:
273 How can a potential pandemic reliably be detected early enough to intervene?
274 Can new materials with advanced properties be predicted before these materials have ever been
275 synthesized?
276 How can the current advantage of the attacker over the defender in guarding against cyber-
277 security threats be reversed?
278There is also broad agreement on the ability of Big Data to overwhelm traditional approaches. The growth
279rates for data volumes, speeds, and complexity are outpacing scientific and technological advances in data
280analytics, management, transport, and data user spheres.
281Despite widespread agreement on the inherent opportunities and current limitations of Big Data, a lack of
282consensus on some important fundamental questions continues to confuse potential users and stymie
283progress. These questions include the following:
284 How is Big Data defined?
285 What attributes define Big Data solutions?
286 What is the significance of possessing Big Data?
287 How is Big Data different from traditional data environments and related applications?
288 What are the essential characteristics of Big Data environments?
289 How do these environments integrate with currently deployed architectures?
290 What are the central scientific, technological, and standardization challenges that need to be
291 addressed to accelerate the deployment of robust Big Data solutions?
292Within this context, on March 29, 2012, the White House announced the Big Data Research and
293Development Initiative.1 The initiative’s goals include helping to accelerate the pace of discovery in
294science and engineering, strengthening national security, and transforming teaching and learning by
295improving the ability to extract knowledge and insights from large and complex collections of digital
296data.
297Six federal departments and their agencies announced more than $200 million in commitments spread
298across more than 80 projects, which aim to significantly improve the tools and techniques needed to
299access, organize, and draw conclusions from huge volumes of digital data. The initiative also challenged
300industry, research universities, and nonprofits to join with the federal government to make the most of the
301opportunities created by Big Data.
302Motivated by the White House initiative and public suggestions, the National Institute of Standards and
303Technology (NIST) has accepted the challenge to stimulate collaboration among industry professionals to
304further the secure and effective adoption of Big Data. As one result of NIST’s Cloud and Big Data Forum
305held on January 15–17, 2013, there was strong encouragement for NIST to create a public working group
306for the development of a Big Data Standards Roadmap. Forum participants noted that this roadmap
1
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
307should define and prioritize Big Data requirements, including interoperability, portability, reusability,
308extensibility, data usage, analytics, and technology infrastructure. In doing so, the roadmap would
309accelerate the adoption of the most secure and effective Big Data techniques and technology.
310On June 19, 2013, the NIST Big Data Public Working Group (NBD-PWG) was launched with extensive
311participation by industry, academia, and government from across the nation. The scope of the NBD-PWG
312involves forming a community of interests from all sectors—including industry, academia, and
313government—with the goal of developing consensus on definitions, taxonomies, secure reference
314architectures, security and privacy, and—from these—a standards roadmap. Such a consensus would
315create a vendor-neutral, technology- and infrastructure-independent framework that would enable Big
316Data stakeholders to identify and use the best analytics tools for their processing and visualization
317requirements on the most suitable computing platform and cluster, while also allowing value-added from
318Big Data service providers.
319The NIST Big Data Interoperability Framework will be released in three versions, which correspond to
320the three stages of the NBD-PWG work. The three stages aim to achieve the following with respect to the
321NIST Big Data Reference Architecture (NBDRA.)
322 Stage 1: Identify the high-level Big Data reference architecture key components, which are
323 technology, infrastructure, and vendor agnostic.
324 Stage 2: Define general interfaces between the NBDRA components.
325 Stage 3: Validate the NBDRA by building Big Data general applications through the general
326 interfaces.
327On September 16, 2015, seven volumes NIST Big Data Interoperability Framework V1.0 documents
328were published (http://bigdatawg.nist.gov/V1_output_docs.php), each of which addresses a specific key
329topic, resulting from the work of the NBD-PWG. The seven volumes are as follows:
330 Volume 1, Definitions
331 Volume 2, Taxonomies
332 Volume 3, Use Cases and General Requirements
333 Volume 4, Security and Privacy
334 Volume 5, Architectures White Paper Survey
335 Volume 6, Reference Architecture
336 Volume 7, Standards Roadmap
337Currently the NBD-PWG is working on Stage 2 with the goals to enhance the version 1 content, define
338general interfaces between the NBDRA components by aggregating low-level interactions into high-level
339general interfaces, and demonstrate how the NBDRA can be used. As a result, the following two
340additional volumes have been identified.
341 Volume 8, Reference Architecture Interfaces
342 Volume 9, Adoption and Modernization
343Potential areas of future work for each volume during Stage 3 are highlighted in Section 1.5 of each
344volume. The current effort documented in this volume reflects concepts developed within the rapidly
345evolving field of Big Data.
2
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
3
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
3712.1 DEFINITIONS
372Section Scope: Point to Volume 1 definitions, highlight several relevant definitions, define new concepts
373specific to Volume 7 or previously unmarked, (when Volume 7 is done, revisit Volume 1 and decide
374where the definitions should be placed (Vol 1 or 7 or both))
375There are two fundamental concepts in the emerging discipline of Big Data that have been used to
376represent multiple concepts. These two concepts, Big Data and data science, are broken down into
377individual terms and concepts in the following subsections. As a basis for discussions of the NBDRA and
378related standards and measurement technology, associated terminology is defined in subsequent
379subsections. NIST Big Data Infrastructure Framework: Volume 1, Definitions contains additional details
380and terminology.
381Both technical and non-technical audiences need to keep abreast of the rapid changes in the big data
382landscape as those changes can affect their ability to manage information in effective ways. Consumption
383of written, audio or video information on big data is reliant on certain accepted definitions for terms. For
384non-technical audiences, a method of expressing the big data aspects in terms of volume, variety and
385velocity, known as the “Vs,” became popular for its ability to frame the somewhat complex concepts of
386big data in simpler, more digestible ways. For technical audiences however, the Vs do not closely
387correspond to the complete corpus of terminology used in the field of study.
388Tested against the corpus of use, the essential technical characteristics of big data can be clustered into
389four distinct segments:
390 heterogeneity and irregularity;
391 parallelism;
392 real time analysis; and
393 presentation or visualization. 1
394The concept of heterogeneity in big data includes the ‘variety’ of data types, commonly referred to as
395structured, semi structured or unstructured.
396Parallelism refers to distributed / horizontal processing big data and includes the ‘volume’ of data.
397The concept of real time analysis of big data includes the ‘velocity’ of data.
3b See NIST Big Data Interoperability Framework Volumes 3, 5, and 6, version 1 for additional information on the
4use cases, reference architecture information collection, and development of the NBDRA.
4
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
5
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
441Traditionally, the term analytics has been used as one of the steps in the data lifecycle of collection,
442preparation, analysis, and action.
443 Analytics is the synthesis of knowledge from information.
4442.2 TAXONOMY
445The NBD-PWG Definitions and Taxonomy Subgroup developed a hierarchy of reference architecture
446components. Additional taxonomy details are presented in the NIST Big Data Interoperability
447Framework: Volume 2, Taxonomy document.
448Figure 1 outlines potential actors for the seven roles developed by the NBD-PWG Definition and
449Taxonomy Subgroup. The dark blue boxes contain the name of the role at the top with potential actors
450listed directly below.
453
6
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
7
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
CPR-3 Needs to support legacy and advanced distributed computing clusters, co-processors,
input output processing (infrastructure).
CPR-4 Needs to support elastic data transmission (networking).
CPR-5 Needs to support legacy, large, and advanced distributed data storage (storage).
CPR-6 Needs to support legacy and advanced executable programming: applications, tools,
utilities, and libraries (software).
DATA CONSUMER REQUIREMENTS (DCR)
DCR-1 Needs to support fast searches (~0.1 seconds) from processed data with high
relevancy, accuracy, and recall.
DCR-2 Needs to support diversified output file formats for visualization, rendering, and
reporting.
DCR-3 Needs to support visual layout for results presentation.
DCR-4 Needs to support rich user interface for access using browser, visualization tools.
DCR-5 Needs to support high-resolution, multi-dimension layer of data visualization.
DCR-6 Needs to support streaming results to clients.
SECURITY AND PRIVACY REQUIREMENTS (SPR)
SPR-1 Needs to protect and preserve security and privacy of sensitive data.
SPR-2 Needs to support sandbox, access control, and multi-level, policy-driven
authentication on protected data.
LIFECYCLE MANAGEMENT REQUIREMENTS (LMR)
LMR-1 Needs to support data quality curation including pre-processing, data clustering,
classification, reduction, and format transformation.
LMR-2 Needs to support dynamic updates on data, user profiles, and links.
LMR-3 Needs to support data lifecycle and long-term preservation policy, including data
provenance.
LMR-4 Needs to support data validation.
LMR-5 Needs to support human annotation for data validation.
LMR-6 Needs to support prevention of data loss or corruption.
LMR-7 Needs to support multi-site archives.
LMR-8 Needs to support persistent identifier and data traceability.
LMR-9 Support standardizing, aggregating, and normalizing data from disparate sources.
OTHER REQUIREMENTS (OR)
OR-1 Needs to support rich user interface from mobile platforms to access processed
results.
OR-2 Needs to support performance monitoring on analytic processing from mobile
platforms.
OR-3 Needs to support rich visual content search and rendering from mobile platforms.
OR-4 Needs to support mobile device data acquisition.
OR-5 Needs to support security across mobile devices.
489
490
491Additional information about the Subgroup, use case collection, analysis of the use cases, and generation
492of the use case requirements are presented in the NIST Big Data Interoperability Framework: Volume 3,
493Use Cases and General Requirements document.
8
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
DSR-6
506
507Related: ACRL standards focus on the intellectual / interpretation competency of the consumer to
508evaluate accuracy of imagery, or critique, not the visualizing technology.
509Products note: DSR-4: IDL, AVS, OpenDX,
510
9
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
517
518Taken further, we investigate individual use cases in more detail, by dividing the overall task of the use
519case into its sub task components and activities, and map associated standards to the sub tasks.
10
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
11
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
12
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
I N F O R M AT I O N V A L U E C H A I N
System Orchestrator
Data Consumer
Data Provider
Preparation
DATA Collection / Curation Analytics Visualization Access DATA
SW SW
I T VA LU E C H A I N
DATA
SW
Big Data Framework Provider
Resource Management
Management
Platforms: Data Organization and Distribution
Indexed Storage
File Systems
Infrastructures: Networking, Computing, Storage
Virtual Resources
Physical Resources
KEY: DATA Big Data Information Service Use SW Software Tools and
Flow Algorithms Transfer
608The NBDRA is organized around two axes representing the two Big Data value chains: the information
609(horizontal axis) and the Information Technology (IT) (vertical axis). Along the information axis, the
610value is created by data collection, integration, analysis, and applying the results following the value
611chain. Along the IT axis, the value is created by providing networking, infrastructure, platforms,
612application tools, and other IT services for hosting of and operating the Big Data in support of required
613data applications. At the intersection of both axes is the Big Data Application Provider component,
614indicating that data analytics and its implementation provide the value to Big Data stakeholders in both
615value chains. The names of the Big Data Application Provider and Big Data Framework Provider
616components contain “providers” to indicate that these components provide or implement a specific
617technical function within the system.
618The five main NBDRA components, shown in Figure 2 and discussed in detail in Section 4, represent
619different technical roles that exist in every Big Data system. These functional components are as follows:
620 System Orchestrator
621 Data Provider
622 Big Data Application Provider
623 Big Data Framework Provider
624 Data Consumer
13
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
625The two fabrics shown in Figure 2 encompassing the five functional components are the following:
626 Management
627 Security and Privacy
628These two fabrics provide services and functionality to the five functional components in the areas
629specific to Big Data and are crucial to any Big Data solution.
630The “DATA” arrows in Figure 2 show the flow of data between the system’s main components. Data
631flows between the components either physically (i.e., by value) or by providing its location and the means
632to access it (i.e., by reference). The “SW” arrows show transfer of software tools for processing of Big
633Data in situ. The “Service Use” arrows represent software programmable interfaces. While the main focus
634of the NBDRA is to represent the run-time environment, all three types of communications or
635transactions can happen in the configuration phase as well. Manual agreements (e.g., service-level
636agreements [SLAs]) and human interactions that may exist throughout the system are not shown in the
637NBDRA.
638The components represent functional roles in the Big Data ecosystem. In system development, actors and
639roles have the same relationship as in the movies, but system development actors can represent
640individuals, organizations, software, or hardware. According to the Big Data taxonomy, a single actor can
641play multiple roles, and multiple actors can play the same role. The NBDRA does not specify the business
642boundaries between the participating actors or stakeholders, so the roles can either reside within the same
643business entity or can be implemented by different business entities. Therefore, the NBDRA is applicable
644to a variety of business environments, from tightly-integrated enterprise systems to loosely-coupled
645vertical industries that rely on the cooperation of independent stakeholders. As a result, the notion of
646internal versus external functional components or roles does not apply to the NBDRA. However, for a
647specific use case, once the roles are associated with specific business stakeholders, the functional
648components would be considered as internal or external—subject to the use case’s point of view.
649The NBDRA does support the representation of stacking or chaining of Big Data systems. For example, a
650Data Consumer of one system could serve as a Data Provider to the next system down the stack or chain.
651The five main components and the two fabrics of the NBDRA are discussed in the NIST Big Data
652Interoperability Framework: Volume 6, Reference Architecture and Volume 4, Security and Privacy.
653
14
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
15
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
16
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
17
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
795
796SNOMED is inconsistent within itself. FHIR looks to solve key problems by extending RESTful web
797services but is immature.
798
799
18
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
19
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
20
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
870A number of technology areas are considered to be of significant importance and are expected to have
871sizeable impacts heading into the next decade. Any list of important technologies will obviously not
872satisfy every community member; and any line of demarcation between sizeable impact and lesser than
873sizable will arguably remain fuzzy, but for purposes of highlighting key areas that are forecast to be
874especially impactful on the economy…
875Additional text needed: Reference data showing the largest problems: Ingest, preparation [~80%],
876integration. Transition to data referencing expected solutions: discovery, integration, BI [see planned-tech
877png]; The increased usage of search for query functions (Integration and search can be discussed as
878functions
879All relies on metadata.
21
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
912In an age where one web search engine maintains the mindshare of the American public, it may make
913sense to describe the significantly different challenges faced in using search for retrieval or analysis of
914data that resides within an organization’s storage repositories. In web search, “Civilian” consumers are
915familiar with the experience of web search technologies, namely, instant query expansion, ranking of
916results, and rich snippets and occasional KG container. Footnote? Civilians are also familiar with standard
917functionality in personal computer file folders for information management. For large enterprises and
918organizations needing search functionality over documents, challenges persist.
919Web search engines of 2016 provide a huge service to citizens but do apply bias [maintain control] over
920how and what objects are delivered. The surrender of control that citizens willingly trade in exchange for
921the use of the free web search services is widely accepted as a value for the citizen; however future
922technologies promise even more value for the citizens who will search across the rapidly expanding scale
923of the world wide web. The case in point is commonly referred to as the semantic web.
924Current semantic approaches to searching almost all require content indexing as a measure for controlling
925the enormous corpus of documents that reside online. In attempting to tackle this problem of enormity of
926scale, the automation of content indexes for the semantic web have proven difficult to program.
927Continuing challenges for the semantic web. Two promising approaches to for developing the semantic
928web are ontologies and linked data technologies, however neither approach has proven to be a complete
929solution. Standard Ontological alternatives OWL, and RDF, which would benefit from the addition of
930linked data, suffer from an inability to effectively use linked data technology. Reciprocally, linked data
931technologies suffer from the inability to effectively use ontologies. Not apparent how standards in these
932areas would be an asset to the concept of an all-encompassing semantic web, or improve retrieval over
933that scale of data.
934Use of search in data analysis: Steady increase in the use of [logical] search systems as the superior
935method for information retrieval on data at rest [citation needed]. Generally speaking analytics search
936indexes can be constructed more quickly than natural language processing [NLP] search systems; NLP
937requiring semi-supervision, can have 20% error rates [Thoughtspot (only an example placeholder which
938will be replaced with vendor neutral language)]. Currently, Contextual Query Language 4, declarative
939logic programming languages [Datalog (only an example placeholder which will be replaced with vendor
940neutral language)], and RDF5 query languages [Sparql (only an example placeholder which will be
941replaced with vendor neutral language)] serve as search query language / NoSQL language structure
942standards.
943Discuss strengths in data acquisition, connectors and ingest. Speed and scale. A particular product’s
944underlying tech will likely be document, metadata or numerically focused, not all three.
945Architecturally speaking, indexing is the centerpiece. Metadata provides context. ML can provide
946enrichment.
947After indexing: Query planning / functionality: the age of big data has applied a downward pressure on
948the use of standard indexes. Indexes are good for small queries but have three issues: cause slow loading;
949ad hoc queries require advance column indexing; and lastly, the constant updating that is required to
950maintain indexes quickly becomes prohibitively expensive. In an example case, one SQL engine project
951[HAWQ] dropped indexing in favor of _ when it migrated from its original [Greenplum (only an example
952placeholder which will be replaced with vendor neutral language)] DB to a distributed file system [HDFS
953(only an example placeholder which will be replaced with vendor neutral language)]. One open source
954search technology [Lucene (only an example placeholder which will be replaced with vendor neutral
955language)] provides an incremental indexing technique that solves some part of this problem.
956Generally speaking, access and IR functions will remain areas of continual work in progress. In some
957cases, silo architectures for data are a necessary condition for running an organization, legal and security
22
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
958reasons being the most obvious. Proprietary, patented access methods are a barrier to building connectors
959required for true federated search. Unified info access; UIMA. Unified indexing vs. federation.
960Incredibly valuable external data is underused in most search implementations because of the lack of an
961appropriate architecture. The ideal framework would separate content acquisition from content processing
962by putting a data buffer (a big copy of the data) between them. With this framework one could gather data
963but defer the content processing decisions until later. Documents are “pre-joined” when they are
964processed for indexing; and large, mathematically challenging [relevancy [fuzzy matching, vectors [better
965than tf-idf] [what about scoring]] algorithms and complex search security requirements [encryption] can
966be run at index time.
967With this framework search may become superior to SQL for OLAP and data warehousing. Search is
968faster, more powerful, scalable, and schema free. Records can be output in XML and JSON and then
969loaded into a search engine. Fields can be mapped as needed.
970Tensions exist between a search systems’ functional power and its ease of use. Discovery. Initially the
971facets in a sidebar, loaded when a search system returned a result set. Supplement. WAIS “relied on
972standards for content representation” but now there are hundreds of formats. Open source tech promises
973power and flexibility to customize, but comes at the price of being technically demanding or requiring
974skilled staff to setup and operate [or third party to maintain].
975Compatibility with different ETL techniques. Standards for connectors to: CMS, collaboration apps, web
976portals, social media apps, CRM, and file systems and databases.
977Standards for content processing. Compatibility with normalizing techniques, records merging formats,
978external taxonomies or semantic resources, REGEX, or use of metadata for supporting interface
979navigation functionality.
980Standards for describing relationships between different data sources; and maintaining metadata context
981relationships. Semantic platforms to enhance information discovery and data integration applications. Rdf
982and ontology mapping can provide semantic uniformity. Rdf graph for visualization, ontology for
983descriptions of elements.
23
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
9994.4.6 INTEGRATION
1000Section Scope: This section will discuss standards for integration.
1001
1002Integration use cases:
1003 Data acquisition for data warehouses and analytics applications.
1004 Supporting MDM [and sharing metadata].
1005 Supporting governance [potential interoperability with mining, profiling, quality].
1006 Data migration.
1007 Intra-organization and data consistency between apps, DWs, MDM.
1008 Inter-organizational data sharing.
1009 System integration; system consolidation; certified integration interfaces.
1010Traditional integration focused on the mechanics of moving data to or from different types of data
1011structures. Big data systems require more attention to other related systems such as MDM and data
1012quality.
1013Increased importance of metadata; increased demand for metadata interfaces that provide non-technical
1014users with functionality for working with metadata.
1015Message formats for table: EDI and SWIFT
1016Shallow integration.
1017
1018
24
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
25
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
10705.1.4 IPAAS
1071Lightweight platforms, open source technology for iPaaS delivery
26
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
1090Markets: scope of classification of: azure mktplace, bluekai, knoema, factual, datamarket
1091Azure marketplace:
1092Bluekai: acquired by oracle in 2014 [400m]. Aggregated consumer browsing behavior. Oracle integrated
1093it with marketing automation platforms.
1094Datamarket: acquired by Qlik in 2014 [13m]. Provided a search engine support for analyzing market
1095statistics; heavy focus on gov data. Qlik was acquired by thoma bravo for approx. 3 bn.
1096Factual: location data. 62m total funding, 35m series b.
1097Knoema: over 500 free DBs, visualization tool.
1098Aggregation APIs. Human readable mechanisms for sharing. Closed clearinghouses.
1099Gap #11: Open Data: Sharing improves the value of existing data. Availability and access to data through
1100APIs. Interoperability through customized metadata. Some government led initiatives in dataset
1101cataloging, include data.gov, National Laboratory of Medicine, but on average, Open Data is a very small
1102percentage of the data sourced by business and industry. Open Data is an area which will benefit from
1103additional standards work.
1104Collaborations and joint solicitations. CTTBD, NOAA, BD Hubs [may be able to apply some text from
1105chatlog here]. Pilots.
1106Context information: “Best practices for data sharing platforms today include not only [the retention] and
1107sharing [of] data assets themselves but also sharing the context of data collection, generation and analysis.
1108This process includes metadata strategies but may also include the actual gathering techniques, analytics
1109codes, processes and workflows used to collect and analyze the data.” [NITRD]
1110Gap #8: Annotation systems for workflow registries similar to business process modeling and notation
1111[BPMN]. Automated; NLP approaches for semantic context information extraction. More than NER,
1112include identification of [relationships] related relevant information; for the purposes of supporting cross-
1113domain or multi-level annotation systems. Data as a service is an application level service [in contrast to
1114IaaS].
1115Information valuation.
27
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
1130Innovation is itself a form of variation. While some sectors or industries such as construction are more
1131likely to attempt to reduce variation, which creates an environment fostering standardization, certain
1132industries / sectors innovate so frequently as to create environments that produce large amounts of
1133variance and divergence from standardization.
1134The software industry, a major part of the big data phenomenon, is one such industry that is divergent.
1135Distributed computing, cloud computing,
1136In comparison to financial services for example, [the data is on premise? The applications are on
1137premise?] The organizational structure of a software firm is decentralized; interactivity of information
1138flows in the software industry is decentralizing, adds to the challenge of developing standards big data
1139[Box]. Regulations.
1140Convergence of problem areas: parallelism; descriptive data;
1141 “[I]ncentivize big data and data science research communities to provide comprehensive documentation
1142on their analysis workflows and related data, driven by metadata standards and annotation systems. Such
1143efforts will encourage greater data reuse and provide a greater return on research investments.” [NITRD]
1144From 4.2: While it makes sense to discuss the implications of standards in relation to specific vertical
1145domains such as healthcare where various technologies are ultimately applied, Section 5.3 focuses
1146somewhat on the Big Data landscape with attention to the technologies themselves, not the applications of
1147those technologies. Technology itself is rarely a solution to any situation and is most likely to be an
1148empowering means to improve its users’ effectiveness, but for purposes of discussing standards in
1149technology, ………. From the perspective of underlying mechanisms.
1150U.S. industries have also developed technical nonproduct standards through a voluntary consensus
1151approach. [A technical standard is “a document that provides requirements, specifications, guidelines or
1152characteristics that can be used consistently to ensure that materials, products, processes and services are
1153fit for their purpose.”* * The source of this definition is the International Organization for Standardization
1154(http://www.iso.org/iso/home/standards.htm).]
1155
1156BARRIERS
1157Cloud architectures create challenges for governance.
1158
1159
28
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
A-1
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
A-2
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
A-3
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
A-4
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-1
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-2
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-3
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-4
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-5
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
W3C Simple Object Access Protocol SOAP is a protocol specification for exchanging structured information
(SOAP) 1.2 in the implementation of web services in computer networks.
SPARQL is a language specification for the query and manipulation of
W3C SPARQL 1.1
linked data in a RDF format.
W3C Web Service Description This specification describes the WSDL Version 2.0, an XML language
Language (WSDL) 2.0 for describing Web services.
This standard specifies protocols for distributing and registering public
keys, suitable for use in conjunction with the W3C Recommendations
W3C XML Key Management for XML Signature [XML-SIG] and XML Encryption [XML-Enc]. The
Specification (XKMS) 2.0 XKMS comprises two parts:
The XML Key Information Service Specification (X-KISS)
The XML Key Registration Service Specification (X-KRSS).
B-6
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-7
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-8
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-9
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-10
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
B-11
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
X.509 Public key encryption for securing email and web communication.
1292
1293
B-12
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
D-1
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
NBDRA Components
Standard Name/Number
SO DP DC BDAP BDFP S&P M
ISO/IEC 11179-* I I/U I/U U
ISO/IEC 10728-*
ISO/IEC 13249-* I I/U U I/U
ISO/IE TR 19075-* I I/U U I/U
ISO/IEC 19503 I I/U U I/U U
ISO/IEC 19773 I I/U U I/U I/U
ISO/IEC TR 20943 I I/U U I/U U U
ISO/IEC 19763-* I I/U U U
ISO/IEC 9281:1990 I U I/U I/U
ISO/IEC 10918:1994 I U I/U I/U
ISO/IEC 11172:1993 I U I/U I/U
ISO/IEC 13818:2013 I U I/U I/U
ISO/IEC 14496:2010 I U I/U I/U
ISO/IEC 15444:2011 I U I/U I/U
ISO/IEC 21000:2003 I U I/U I/U
ISO 6709:2008 I U I/U I/U
ISO 19115-* I U I/U U
ISO 19110 I U I/U
ISO 19139 I U I/U
ISO 19119 I U I/U
ISO 19157 I U I/U U
ISO 19114 I
IEEE 21451 -* I U
IEEE 2200-2012 I U I/U
ISO/IEC 15408-2009 U I
ISO/IEC 27010:2012 I U I/U
ISO/IEC 27033-1:2009 I/U I/U I/U I
ISO/IEC TR 14516:2002 U U
ISO/IEC 29100:2011 I
ISO/IEC 9798:2010 I/U U U U I/U
D-2
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
NBDRA Components
Standard Name/Number
SO DP DC BDAP BDFP S&P M
ISO/IEC 11770:2010 I/U U U U I/U
ISO/IEC 27035:2011 U I
ISO/IEC 27037:2012 U I
JSR (Java Specification Request) 221 (developed by the Java Community Process) I/U I/U I/U I/U
W3C XML I/U I/U I/U I/U I/U I/U I/U
W3C Resource Description Framework (RDF) I U I/U I/U
W3C JavaScript Object Notation (JSON)-LD 1.0 I U I/U I/U
W3C Document Object Model (DOM) Level 1 Specification I U I/U I/U
W3C XQuery 3.0 I U I/U I/U
W3C XProc I I U I/U I/U
W3C XML Encryption Syntax and Processing Version 1.1 I U I/U
W3C XML Signature Syntax and Processing Version 1.1 I U I/U
W3C XPath 3.0 I U I/U I/U
W3C XSL Transformations (XSLT) Version 2.0 I U I/U I/U
W3C Efficient XML Interchange (EXI) Format 1.0 (Second Edition) I U I/U
W3C RDF Data Cube Vocabulary I U I/U I/U
W3C Data Catalog Vocabulary (DCAT) I U I/U
W3C HTML5 A vocabulary and associated APIs for HTML and XHTML I U I/U
W3C Internationalization Tag Set (ITS) 2.0 I U I/U I/U
W3C OWL 2 Web Ontology Language I U I/U I/U
W3C Platform for Privacy Preferences (P3P) 1.0 I U I/U I/U
W3C Protocol for Web Description Resources (POWDER) I U I/U
W3C Provenance I U I/U I/U U
W3C Rule Interchange Format (RIF) I U I/U I/U
W3C Service Modeling Language (SML) 1.1 I/U I U I/U
W3C Simple Knowledge Organization System Reference (SKOS) I U I/U
W3C Simple Object Access Protocol (SOAP) 1.2 I U I/U
W3C SPARQL 1.1 I U I/U I/U
W3C Web Service Description Language (WSDL) 2.0 U I U I/U
W3C XML Key Management Specification (XKMS) 2.0 U I U I/U
D-3
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
NBDRA Components
Standard Name/Number
SO DP DC BDAP BDFP S&P M
® ®
OGC OpenGIS Catalogue Services Specification 2.0.2 - I U I/U
ISO Metadata Application Profile
OGC® OpenGIS® GeoAPI I U I/U I/U
OGC® OpenGIS® GeoSPARQL I U I/U I/U
OGC® OpenGIS® Geography Markup Language (GML) Encoding Standard I U I/U I/U
OGC® Geospatial eXtensible Access Control Markup Language (GeoXACML)
Version 1 I U I/U I/U I/U
OGC® network Common Data Form (netCDF) I U I/U
OGC® Open Modelling Interface Standard (OpenMI) I U I/U I/U
OGC® OpenSearch Geo and Time Extensions I U I/U I
OGC® Web Services Context Document (OWS Context) I U I/U I
OGC® Sensor Web Enablement (SWE) I U I/U
OGC® OpenGIS® Simple Features Access (SFA) I U I/U I/U
OGC® OpenGIS® Georeferenced Table Joining Service (TJS) Implementation
Standard I U I/U I/U
OGC® OpenGIS® Web Coverage Processing Service Interface (WCPS) Standard I U I/U I
OGC® OpenGIS® Web Coverage Service (WCS) I U I/U I
OGC® Web Feature Service (WFS) 2.0 Interface Standard I U I/U I
OGC® OpenGIS® Web Map Service (WMS) Interface Standard I U I/U I
OGC® OpenGIS® Web Processing Service (WPS) Interface Standard I U I/U I
OASIS AS4 Profile of ebMS 3.0 v1.0 I U I/U
OASIS Advanced Message Queuing Protocol (AMQP) Version 1.0 I U U I
OASIS Application Vulnerability Description Language (AVDL) v1.0 I U I U
OASIS Biometric Identity Assurance Services (BIAS) Simple Object Access Protocol
(SOAP) Profile v1.0 I U I/U U
OASIS Content Management Interoperability Services (CMIS) I U I/U I
OASIS Digital Signature Service (DSS) I U I/U
OASIS Directory Services Markup Language (DSML) v2.0 I U I/U I
OASIS ebXML Messaging Services I U I/U
OASIS ebXML RegRep I U I/U I
OASIS ebXML Registry Information Model I U I/U
D-4
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
NBDRA Components
Standard Name/Number
SO DP DC BDAP BDFP S&P M
OASIS ebXML Registry Services Specification I U I/U
OASIS eXtensible Access Control Markup Language (XACML) I U I/U I/U I/U
OASIS Message Queuing Telemetry Transport (MQTT) I U I/U
OASIS Open Data (OData) Protocol I U I/U I/U
OASIS Search Web Services (SWS) I U I/U
OASIS Security Assertion Markup Language (SAML) v2.0 I U I/U I/U I/U
OASIS SOAP-over-UDP (User Datagram Protocol) v1.1 I U I/U
OASIS Solution Deployment Descriptor Specification v1.0 U I/U
OASIS Symptoms Automation Framework (SAF) Version 1.0 I/U
OASIS Topology and Orchestration Specification for Cloud Applications Version 1.0 I/U U I I/U
OASIS Universal Business Language (UBL) v2.1 I U I/U U
OASIS Universal Description, Discovery and Integration (UDDI) v3.0.2 I U I/U U
OASIS Unstructured Information Management Architecture (UIMA) v1.0 U I
OASIS Unstructured Operation Markup Language (UOML) v1.0 I U I/U I
OASIS/W3C WebCGM v2.1 I U I/U I
OASIS Web Services Business Process Execution Language (WS-BPEL) v2.0 U I
OASIS/W3C - Web Services Distributed Management (WSDM): Management Using
Web Services (MUWS) v1.1 U I I U U
OASIS WSDM: Management of Web Services (MOWS) v1.1 U I I U U
OASIS Web Services Dynamic Discovery (WS-Discovery) v1.1 U I U I/U U
OASIS Web Services Federation Language (WS-Federation) v1.2 I U I/U U
OASIS Web Services Notification (WSN) v1.3 I U I/U
IETF Simple Network Management Protocol (SNMP) v3 I I I/U U
IETF Extensible Provisioning Protocol (EPP) U I/U
NCPDPD Script standard . . . . . . .
ASTM CCR message . . . . . . .
HITSP C32 HL7 CCD Document . . . . . . .
PMML Predictive Model Markup Language . . . . . . .
Add the standards that Russell gathered
1317
D-5
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
1318
1319
D-6
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
ISO/IEC 9075-*
ISO/IEC Technical Report (TR) 9789
ISO/IEC 11179-*
ISO/IEC 10728-*
ISO/IEC 13249-*
ISO/IE TR 19075-*
ISO/IEC 19503
ISO/IEC 19773
ISO/IEC TR 20943
ISO/IEC 19763-*
ISO/IEC 9281:1990
ISO/IEC 10918:1994
ISO/IEC 11172:1993
ISO/IEC 13818:2013
ISO/IEC 14496:2010
ISO/IEC 15444:2011
ISO/IEC 21000:2003
ISO 6709:2008
D-1
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
ISO 19115-*
ISO 19110
ISO 19139
ISO 19119
ISO 19157
ISO 19114
IEEE 21451 -*
IEEE 2200-2012
ISO/IEC 15408-2009
ISO/IEC 27010:2012
ISO/IEC 27033-1:2009
ISO/IEC TR 14516:2002
ISO/IEC 29100:2011
ISO/IEC 9798:2010
ISO/IEC 11770:2010
ISO/IEC 27035:2011
ISO/IEC 27037:2012
JSR (Java Specification Request) 221 (developed by the Java Community Process)
W3C XML
W3C Resource Description Framework (RDF)
W3C JavaScript Object Notation (JSON)-LD 1.0
W3C Document Object Model (DOM) Level 1 Specification
W3C XQuery 3.0
W3C XProc
W3C XML Encryption Syntax and Processing Version 1.1
W3C XML Signature Syntax and Processing Version 1.1
W3C XPath 3.0
W3C XSL Transformations (XSLT) Version 2.0
W3C Efficient XML Interchange (EXI) Format 1.0 (Second Edition)
D-2
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
D-3
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
OASIS Topology and Orchestration Specification for Cloud Applications Version 1.0
OASIS Universal Business Language (UBL) v2.1
OASIS Universal Description, Discovery and Integration (UDDI) v3.0.2
OASIS Unstructured Information Management Architecture (UIMA) v1.0
D-4
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
D-5
DRAFT NIST BIG DATA INTEROPERABILITY FRAMEWORK: VOLUME 7, STANDARDS ROADMAP
1335GENERAL RESOURCES
1336Institute of Electrical and Electronics Engineers (IEEE). https://www.ieee.org/index.html
1337International Committee for Information Technology Standards (INCITS). http://www.incits.org/
1338International Electrotechnical Commission (IEC). http://www.iec.ch/
1339International Organization for Standardization (ISO). http://www.iso.org/iso/home.html
1340Open Geospatial Consortium (OGC). http://www.opengeospatial.org/
1341Open Grid Forum (OGF). https://www.ogf.org/ogf/doku.php
1342Organization for the Advancement of Structured Information Standards (OASIS). https://www.oasis-
1343open.org/
1344World Wide Web Consortium (W3C). http://www.w3.org/
1345
1346Authoritative sources. OGAF, Zachman, MOD Architecture, MDM Institute, ITVAL and CoBIT, IFEAD,
1347FSAM, Carnegie Mellon Software Engineering Institute.
13481. Frank Farance, Definitions of Big Data, 2015. ( CITATION NEEDED)
13492. Dale Neef, Digital Exhaust: What Everyone Should Know About Big Data, Digitization and Digitally
1350Driven Innovation. http://searchdatamanagement.techtarget.com/feature/Three-ways-to-build-a-big-
1351data-system?utm_medium=EM&asrc=EM_ERU_53936785&utm_campaign=20160229_ERU
1352%20Transmission%20for%2002/29/2016%20(UserUniverse:%201963109)_myka-
1353reports@techtarget.com&utm_source=ERU&src=5484610
13542.3. IBM developerworks. Big data architecture and patterns, part 2.
1355http://www.ibm.com/developerworks/library/bd-archpatterns2/index.html
13563. NITRD Subcommittee on Networking and Information Technology Research and Development.
1357Federal Big Data R&D Strategic Plan, May 2016. https://www.nitrd.gov/pubs/bigdatardstrategicplan.pdf
13584. Box. The information economy: a study of five industries. 2014.
13594.4.1. Gartner. Magic quadrant for metadata management solutions.
1360
1361DOCUMENT REFERENCES
B-1
51 The White House Office of Science and Technology Policy, “Big Data is a Big Deal,” OSTP Blog, accessed February 21,
62014, http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal.
72 Cloud Security Alliance, Expanded Top Ten Big Data Security and Privacy Challenges, April 2013.
8https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Expanded_Top_Ten_Big_Data_Security_and_Privacy_Challe
9nges.pdf
103 “Big Data, Preliminary Report 2014”, ISO/IEC JTC1: Information Technology. http://www.iso.org/iso/big_data_report-
11jtc1.pdf. (Accessed March 2, 2015). Pages 21-23.
124 universal, federated or unified search in the land of information silos. KMworld. Accessed may 5, 2016,
13http://www.kmworld.com/Articles/Editorial/Features/Universal-federated-or-unified-search-in-the-land-of-information-
14silos-101165.aspx
154 Context Query Language (CQL): http://www.loc.gov/standards/sru/cql/contextSets/
165 Resource Description Framework (RDF): https://www.w3.org/RDF/