
SUSA Architecture Team – Working Draft – Version 0.1 – 19 October 2010

Open Architecture Project:


A Key National Indicator System for the
United States

Managed by

The State of the USA -- Working Draft -- Version 0.1


Please provide comments on this draft to feedback@stateoftheusa.org

KNIS Draft Architecture by the State of the USA is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Supported by

The John D. and Catherine T. MacArthur Foundation

October 19, 2010

To Whom It May Concern,

This document is a working draft (version 0.1) of an enterprise architecture for a Key
National Indicator System for the United States. It is being published solely by the State of
the USA in concert with its technical advisors for open comment. It is specifically intended
for technical audiences – in all sectors and at all levels of our society.

Its purpose is not to finalize a design but to start a specific dialogue over the next year that
will underpin important technical decision-making. (Please see www.stateoftheusa.org for
more information on the Key National Indicator System and the State of the USA.) Hence,
it is not a consensus document. There is ongoing debate and discussion amongst our
team and advisors on dozens of issues. However, it is time to open up the process and
expand involvement in the project with the publication of this initial version.

The architecture outlines key principles but does not suggest product selection. It
generates working hypotheses but does not define operational specifications. It has a five-year
planning horizon but is only a first step toward designing an official Key National
Indicator System implementation. It represents the hard work of a dozen individuals but
anticipates engaging hundreds from around the country in a dialogue based on this
document.

We are actively seeking critiques, ideas and suggestions about purpose, structure, content
and process. Out of this dialogue will come the requirements, design and specifications for
how best to start and then evolve an architecture for a Key National Indicator System.

The evolution of democratic society has always been about striving to achieve increasing
specificity about progress and higher degrees of transparency. These mutually reinforce
one another to accelerate learning and improve accountability for the use of scarce
resources as our nation grows. The State of the USA is grateful to the John D. and Catherine
T. MacArthur Foundation for their visionary support of this activity.

We have concluded that the design challenge for a Key National Indicator System can only be
met by combining an open and inclusive approach with guidance from individuals who have
histories and track records of large-scale, complex enterprise and systems development.
For this reason, the State of the USA Open Architecture Project has been especially
fortunate to be guided in this early stage of our process by a distinguished group of
technical advisors, who are listed at the conclusion of this letter.

Why is openness so important? Our intent is to make the asset that is built for the nation
accessible to the widest range of possible users, from individuals to institutions and from
the public to application developers. This is best achieved through a transparent and
collaborative process that involves representatives from diverse user and stakeholder
communities. Hence, this architectural document anticipates discussions about open
standards, open communities and open source software that would all be a vital
underpinning of a KNIS.

If you have comments, please contact us at feedback@stateoftheusa.org. During the fall
of 2010, the State of the USA will host two national webinars, one in November and one in
December. Final dates will be posted on the SUSA website at the same time this
document is published. To register for either one of these webinars, please send an
email with the subject line "ARCHITECTURE WEBINAR" to feedback@stateoftheusa.org.
Each of these discussions will be a chance to have more dynamic interactions with SUSA
technical advisors on topics raised in this document. It is also our intention to expand this
group of advisors over the coming year to increase its scope, depth and diversity.

Please join all of us on this journey to create a Key National Indicator System for the
United States. It is one that cannot help but advance our capability to answer vital
questions about how to define, measure and communicate about progress – or the lack
thereof – in entirely new ways.

Most Sincerely,

Christopher Hoenig
President and CEO

State of the USA Technical Advisors

• Bill Allman, Vice President, E-Media for Bonniercorp.com and noted enterprise
information dissemination and visualization thought leader.
• Peter Blair, Executive Director of the Division of Engineering and Physical Sciences
at the National Research Council.
• George Brucia, Experienced enterprise technologist with a long history of fielding
well-engineered, user-focused systems of high quality, at large scale at both public
and private enterprises.
• Hank Conrad, Managing Partner of CounterPoint Corporation and expert in IT
business alignment, systems integration, program management, outsource
management, process improvement, relationship management, change
management, and new technology introduction.
• David Epstein, Chief Operating Officer, MAK. A senior technology executive with
leadership experience in IBM, U.S. military, global research, and national
health-related information technology organizations on bio-surveillance, adverse drug and
quality of care events, intelligent building and city infrastructure, advanced water
management, and market analysis.
• Larry Filetti, Managing Partner, 716 Group, Inc., and proven enterprise technology
and strategy executive known for enterprise architecture, IT transformation,
technology introduction and delivering systems of high usability, including business
intelligence and enterprise IT to organizations like Argonne National Labs, First
National Bank of Chicago and McDonald's.
• Jamie Gaughran-Perez, Partner at Threespot. Creator of user-focused
web-delivered content and systems including delivery of highly scalable solutions to
clients such as Brookings, the NFL, national TV programs and the U.S. Congress.
• Scott Gilkeson, Chief Data Officer, State of the USA, and Website Development
Team.
• Bob Gourley, Chief Technology Officer, Crucial Point LLC and editor,
CTOvision.com. Project Lead, SUSA KNIS Architecture.
• Marvin (Marv) F. Langston, Principal, Langston Associates, and former Deputy
Chief Information Officer for the Department of Defense.
• Howard Parnell, Vice President, Content and Creative at the State of the USA.
• Ron Ponder, EVP and CIO, Wellpoint. Globally known business and IT executive,
former CIO at Federal Express, Sprint and AT&T, expert in large-scale operational
implementation and world-class business performance.
• Ben Shneiderman, Member of the National Academy of Engineering. Professor of
Computer Science and founding Director of the Human-Computer Interaction
Laboratory at the University of Maryland and globally known expert in creativity and
cognition, information visualization, and information technology.
• Bill Vass, Globally known IT executive, former CIO of the Office of the Secretary of
Defense, former CIO of Sun Microsystems, former President of Sun Federal.
Known for designing and building highly scalable, fast, interoperable user-focused
systems.

Table of Contents
State of the USA Technical Advisors
Introduction and Background
  About This Document
  Sections of the Draft KNIS Architecture
Foundations of the KNIS Architecture
  The KNIS Mission
  Audience for this Architecture Document
  Architecture Defined
  Terminology
  The KNIS Value Proposition
  The KNIS User Communities and Their Needs
  KNIS High Level Requirements
  Design Principles
Conceptual Architecture
Logical Architecture
  High-level Guidelines
    Reuse and Purchase Before Developing
    Open Systems and Open Standards
    Vendor Specific Extensions
    Separation of Concerns
    Decomposition
    Systemic Qualities
    Business Continuity
    Architecting for Security
    Architectural Patterns
    Architecting for Usability
  Enterprise Tier
    KNIS core processes
    KNIS Architectural Governance
    Architecture, Design Guidance, Implementation Directives
    Contributing Back to the Open Source Community
  Client Tier
    Thin Client Rule
    Client Mobility Rule
    Disconnected Client Rule
    Client Applet Rule
    Client Usability Rule
  Presentation Tier
    Localization (L10N) Rule
    Internationalization (I18N) Rule
    Accessibility Rule
    End-User Preference Configuration Rule
    End-User Role Identification Rule
    Field Validation Rule
    Presentation Tier Standards Rule
    Active Content Rule
  Business Processing Tier
    Services Overview
    Shared Services
  Data Resource Tier
    The KNIS website will require at least the following data resources:
    Estimates of Data Size and Scope
    Data Access
    Data Persistence
    Data Labels
    Data in the cloud
    Data Registry
    Content Management Systems
    Data Provenance
    Data Value Add
  Integration Tier
    Data Schema, Format and Semantics
    Batch Data Transfers
    Syndication of Value Added Content
    Syndication and Social Media
    Discovery of data sets and other content
    Interaction Models
    Direct and Indirect Integration
    Third-Party Application Integration
Technical Architecture
  High-level Guidelines
    Operating System Guidance
    Designing for Flexibility in use of new Cloud Capabilities
  Technology Architecture of Client Tier
    Browser
    Consumer device apps
  Technology Architecture of Presentation Tier
    HTML and CSS
    XML and XSLT
    Presenting data via widget:
    Application Frontends
    User and Usability Testing
  Technology Architecture of the Business Processing Tier
    Web Services
    SOAP and REST:
    Assertion of authorization in a Web Services environment
    Web Services Continued
    Application Business Web Services
    Web Service Registry
    Data Registry Choices
  Technology Architecture of the Data Resources Tier
    Relational Database Management Systems (RDBMS)
    Directory Servers
    Object-Oriented Databases (OODB)
    XML Database
    File Systems
    RDBMS or Directory Servers
    Java Database Connectivity (JDBC)
    Data Storage
    Data Source Types
    Content Management System
    Content Abstraction Layer
  Technology Architecture of the Integration Tier
    Open System Web Server
    Open System Application Server
    Open System Portal Server
    Relational Database Server
    Monitoring Products
    Network Attached Storage (NAS)
    Storage Area Network (SAN)
    Virtual Private Network (VPN)
    Monolithic Applications and Legacy Applications
    Syndication of Value Added Content
    Integration Testing
    Content Delivery Services
Technology Trends to Watch
The Current SUSA Beta Architecture
Glossary
Architecture Resources
Table of Standards
About This Architecture

Introduction and Background
This document is a working draft (version 0.1) of an enterprise architecture for a Key
National Indicator System for the United States. It is being published solely by the State of
the USA in concert with its technical advisors for open comment. It is specifically intended
for technical audiences – in all sectors and at all levels of our society. Its purpose is not to
finalize a design but to start a specific dialogue over the next year that will underpin
important technical decision-making. Hence, it is also not a consensus document. There
is ongoing debate and discussion amongst our team and advisors on dozens of issues.
However, it is time to open up the process and expand involvement in the project with the
publication of this version 0.1.

In preparation for continued support of a Key National Indicator System (KNIS) for the
United States, the State of the USA has drafted the architectural vision, principles, concepts
and plans relevant to the implementation of a KNIS. The purpose of this document is to
leverage shared assets and accelerate learning among the many participants in a national
indicator system to maximize its full potential for service to the American people.

This version outlines key principles but does not suggest product selection. It generates
working hypotheses but does not define operational specifications. It has a five-year
planning horizon but is only a first step toward designing an official Key National Indicator
System implementation. It represents the hard work of a dozen individuals but anticipates
engaging hundreds from around the country in a dialogue based on this document.

We are actively seeking critiques, ideas and suggestions about purpose, structure, content
and process. Out of this dialogue will come the requirements, design and specifications for
how best to start and then evolve an architecture for a Key National Indicator System.

The evolution of democratic society has always been about striving to achieve increasing
specificity about progress and higher degrees of transparency. These mutually reinforce
one another to accelerate learning and improve accountability for the use of scarce
resources as our nation grows. The State of the USA is grateful to the John D. and Catherine
T. MacArthur Foundation for their visionary support of this activity.

The mission of the State of the USA is to help the American people assess the progress of
the nation for themselves, using the nation’s best quality measures and data. Its vision is
to make these available to the public on the web as a free service in such an easily usable
form that they become a shared frame of reference for civic debate on whether we are, in
fact, making progress on the major issues we face.

As a non-profit, non-partisan institution, it conducts work in a collaborative and transparent
fashion, involving a diverse range of individuals and institutions. The State of the USA’s
founding in 2007 consciously built on 20 years of work by millions of Americans that had
already created a patchwork of key indicator systems at the neighborhood, city, county,
regional and state levels. (For more information, including a history of the effort, please
see www.stateoftheusa.org.)

Starting in 2010, the growing movement by Americans to assess progress with key
indicators officially reached the national level. After many years of development and
bipartisan support, a Key National Indicator System for the United States has been
mandated by law (P.L. 111-148, sec. 5605). A bipartisan Commission on Key National
Indicators is being constituted by Congressional leadership of both parties. That
Commission will then negotiate an agreement with the National Academy of Sciences to
implement a web-based KNIS in partnership with a non-profit institute, the State of the
USA.

Although these relationships are still being formalized, preparation has begun in earnest,
which is the reason for SUSA’s publication of this document. As a public/private
partnership, resources and talent from both the public and private sector must be involved
early in the process of preparation. This document has not been reviewed or approved by
the National Academy of Sciences, the National Academy of Engineering, the Institute of
Medicine or the National Research Council.

A Key National Indicator System can help millions of Americans become better informed
about the progress of the United States on a wide range of issues, from education to
innovation, from the environment to the economy, and from families and children to health.
The question this document begins to address is how best to design this system for the
country.

For clarity, this is an architecture for a "national" system, not a "governmental" system. It
must take account of and complement efforts in government. But a national system in our
society must involve the government, business, media, non-profit and academic sectors. It
must involve government at the federal, state and local levels as well as international
organizations that collect and publish data for purposes of comparing the U.S. to other
countries.

In addition to such a broad scope, the design task is made doubly challenging by a
technology environment with a dizzying rate of evolution and innovation. The design must
optimize performance and openness, innovation and continuity for the nation, as well as
balancing hundreds of other potential tradeoffs. It must support continuing, high quality
production while keeping pace with the external technical environment.

We have concluded that this design challenge can only be met by combining an open and
inclusive approach with guidance from individuals who have histories and track records
of large-scale, complex enterprise and systems development. For this
reason, the State of the USA Open Architecture Project has been especially fortunate to be
guided in this early stage of our process by a distinguished group of technical advisors.

Why is openness so important? Our intent is to make the asset that is built for the nation
accessible to the widest range of possible users, from individuals to institutions and from
the public to application developers. This is best achieved through a transparent and
collaborative process that involves representatives from diverse user and stakeholder
communities. Hence, this architectural document anticipates discussions about open
standards, open communities and open source software that would all be a vital
underpinning of a KNIS.

If you have comments, please send them to feedback@stateoftheusa.org. The goal of this
project is to continually expand participation in order to increase its scope, depth and
diversity. At the conclusion of this version, the State of the USA technical advisors
identified several areas that are high priority for inclusion in our work program for the next
iteration of this document – version 0.2. Hence, these are of special interest for those
providing feedback. Those areas are:

-- Refinements in audience priority and segmentation
-- Increased detail in user/stakeholder requirements definition
-- Increased detail of enterprise process design
-- Specificity of user experience and information architecture
-- Specificity of data services, curation and integration
-- Elaborate roadmap for implementation over time
-- Expanded references to key documents
-- Expanded references to comparable sites/installations
-- Increased treatment of security and privacy considerations

About This Document


This version 0.1 of the KNIS architecture was prepared by an interdisciplinary team of
issue experts, technologists, program managers and enterprise architects, supported with
initial input from KNIS stakeholders. In its current state, it sets aspirational goals for
openness and is intended to provoke thought and debate about how best to design and
implement national and community indicator systems. The State of the USA would like to
see this framework develop into a system that others could adopt and adapt, thus creating
synergy and shared value. The aim during the coming year is to evolve a collaborative
design which can be borrowed and built upon by the many organizations required to
support a KNIS. The architecture should eventually also provide a frame of reference for
third-party developers seeking to create independent end-user capabilities that may be
enabled by a KNIS.

Caveats and Context:
- This is a living document in continuous development while input is being solicited
from a broad community. Because it is an initial working draft, please consult SUSA
to ensure you are working from the latest version.
- This document is based on an evolving set of user requirements, of which only a
brief summary is provided here. SUSA is currently running both a public site
www.stateoftheusa.org and a more advanced private beta implementation to gain
input on requirements definition, evolve design principles, test alternative
components and assess performance. If you are interested in becoming a beta user,
please sign up on the public site.
- This is a high-level design document and intentionally does not provide the
specificity required to implement a KNIS. Architectures with progressively more
detailed specifications will flow from this one.
- This document is written with a five year forward planning horizon in mind. It
anticipates multiple specific implementations within that five year horizon as well as
continuing alteration and adaptation of the architecture based on evolving
requirements, technologies and key external factors in the market space.
- This document provides best practices, lessons learned and architectural decisions
we believe are right for enterprise capabilities of this nature, but customization is
also required prior to establishing any implementation.
- Designs and implementations that flow from this architecture are intended to
support the Key National Indicator System articulated in P.L. 111-148 sec. 5605.
However, the KNIS is still in the process of formalizing governance and
management processes. Hence, this draft has not been reviewed by or approved by
any institution established by or named in that law.

Sections of the Draft KNIS Architecture


Key sections of this draft are:
A Conceptual Architecture View: This section identifies the main components of
the architecture and provides important context.
A Logical Architecture View: The logical architecture section explains component
functions and their interrelationships in greater detail, which begins to provide
directional guidance for implementation. This guidance is reflected and embodied in
the KNIS technical architecture, but it is also repeatable in other technical
architectures.
A Technical Architecture View: The purpose of a technical architecture is to map
defined components from the logical architecture to specific implementation
technologies. These technologies are generally layered and support standard
interfaces that allow them to be used in a "plug and play" manner.

1 For more on P.L. 111-148, see the PDF at:
http://www.stateoftheusa.org/assets/Key%20National%20Indicators%20Act%20of%202008.pdf
In short, the conceptual architecture is initial context, the logical architecture provides
guidance and rules regarding logical component segmentation, and the technical
architecture provides more specific standards and technologies.
A graphical overview of the draft KNIS architecture is provided in Figure 1 below. The
conceptual architecture section returns to this graphic, the logical architecture section
modifies and extends it, and the technical architecture section expands on it in detail.

Figure 1: Graphical Overview of the Draft KNIS Architecture

Foundations of the KNIS Architecture
This section of the draft KNIS architecture provides an overview of the background,
requirements and guidance necessary to ensure the components of the KNIS are aligned
with the mission and accomplish the objectives outlined in P.L. 111-148.

The KNIS Mission:


A Key National Indicator System can help Americans better assess the position and
progress of the nation for themselves, freely and easily, with the best quality measures and
data on the most important issues facing the country.
The United States faces many systemic issues, with few systemic ways to measure and
manage the progress of our society. The architecture presented here is designed to
address this situation by enabling a source of high quality measures on the nation's
major issues, free and easily usable for millions. The impact can be seen in better
framed problems, increased understanding of what we know and of what works, more
informed choices, and improved resource allocations.
Unique KNIS attributes:
Breadth – A KNIS addresses all topics relevant to a society (e.g., economy,
innovation, families, youth and children, education, environment, health) with a
dynamic topic structure that can be extended to meet consumer demand or evolving
concepts.
Focus – A KNIS presents carefully selected issue frames, indicators and datasets
from the highest quality sources for each topic—on the order of tens of measures,
rather than hundreds or thousands. The selected indicator set will evolve, and
individual indicators may appear under multiple topics. The issue frames, indicators
and data sets will be selected in an editorial process designed by the National
Academy of Sciences, the National Academy of Engineering, the Institute of
Medicine, the National Research Council and the State of the USA.
Consistency – All of the indicators presented by the KNIS will have similar
functionality and expression. Once site visitors understand a simple and intuitive
way to interact with one measure, they will know how to interact with any of them
and explore interrelationships. The underlying data will be available for downloading,
either through the user interface or via a standard internet protocol.
Multi-dimensionality – Each indicator will be available with as much detail as
possible within quality constraints along four major dimensions: time, geography,
demographics (or other appropriate aspect) and conceptual decomposition. At
times, this will mean using data from various different sources, which may not be
strictly comparable, for the same measure. For example, obesity is measured
clinically by a small survey which can only provide data at the national level. In order
to show obesity at the state level, a less reliable but much larger survey must be
used. Data quality is a component of this in all dimensions.
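
To illustrate the multi-dimensionality attribute, the following is a minimal sketch in
Python of how a single indicator value might be recorded along the four dimensions,
together with its source and a quality label. The class name, field names, quality labels
and values are assumptions made for illustration. They are not part of the KNIS design,
and the numbers are placeholders, not real statistics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """One value of an indicator, located along the four dimensions."""
    indicator: str              # conceptual decomposition, e.g. "obesity"
    period: str                 # time, e.g. "2009"
    geography: str              # e.g. "US" or a state code
    demographic: Optional[str]  # None means "all groups"
    value: float                # placeholder number, not a real statistic
    source: str                 # which survey or dataset supplied the value
    quality: str                # hypothetical label: "high" or "lower"


def select(observations: List[Observation], *, geography: str,
           period: str) -> List[Observation]:
    """Return observations for one place and period, best quality first."""
    rank = {"high": 0, "lower": 1}
    matches = [o for o in observations
               if o.geography == geography and o.period == period]
    return sorted(matches, key=lambda o: rank[o.quality])


# Placeholder data mirroring the obesity example: a small clinical survey
# covers only the nation, while a larger, less reliable survey covers states.
DATA = [
    Observation("obesity", "2009", "US", None, 2.0, "large survey", "lower"),
    Observation("obesity", "2009", "US", None, 1.0, "clinical survey", "high"),
    Observation("obesity", "2009", "CO", None, 3.0, "large survey", "lower"),
]
```

Calling select(DATA, geography="US", period="2009") lists the clinical survey value first,
while the same call for a state returns only the value from the larger survey. Keeping the
source and quality label on every value is one way to let users see when sources that are
not strictly comparable have been combined.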

Audience for this Architecture Document
The KNIS architecture was written for designers and developers of enterprise, information
and technology systems. These are the people who will create and deliver a capability. It
was also written for those who will oversee the production of the capability, including
stakeholders and mission partners.
Architecture Defined
"Architecture serves as the blueprint for both the system and the project developing it,
defining the work assignments that must be carried out by design and implementation
teams. The architecture is the primary carrier of system qualities, such as performance,
modifiability, and security, none of which can be achieved without a unifying architectural
vision. Architecture is an artifact for early analysis to make sure that the design approach
will yield an acceptable system. Architecture holds the key to post-deployment system
understanding, maintenance, and mining efforts. In short, architecture is the conceptual
glue that holds every phase of the project together for all its many stakeholders." –
From the Carnegie Mellon Software Engineering Institute,
http://www.sei.cmu.edu/architecture
While there is no single, widely adopted definition of architecture, the many definitions
available have a great deal in common, and SUSA's approach to architecture is consistent
with that of SEI, above. Figure 2 provides a visual guide to what is included and not
included in this architecture.

Figure 2: Scope of the KNIS Architecture

Terminology
A few terms are used extensively throughout this document. To provide a working
understanding of these terms, they are defined briefly below. For more detailed definitions,
refer to the Glossary.
Principles, Strategies and Recommendations
These terms are often confused. As they apply to enterprise architectures, their definitions
are as follows:
- A principle is a high-level rule that is well established and not likely to change over
time.
- A vision is a desired future state.
- A strategy is a means for achieving a vision, often a medium-level rule or guideline
that applies to a specific area of the architecture. Strategies can evolve over time as
the technological landscape changes.
- A recommendation is a specific "best practice" that has proved to be effective and
desirable based upon past experience.
Component, Service
A component or service is a unit of software with a single purpose, which has an interface,
and interacts with other components and services. Services are distinguished from
components in that they tend to run in their own process, whereas components may simply
be software libraries. For a more detailed definition of service, refer to the W3C standard.
Interface
As defined by CMU/SEI-2002-TN-015, an interface is "a boundary across which two
independent entities [components] meet and interact or communicate with each other."
The specification of interfaces at the architectural level is extremely important to ensuring
that components can be built independently yet work correctly together.
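
As a minimal sketch of this idea in Python, the interface below is the only thing a
consumer and a provider agree on, so each can be built independently. The names
IndicatorSource, InMemorySource, ConstantSource and describe are hypothetical and
chosen only for illustration.

```python
from typing import Dict, Protocol


class IndicatorSource(Protocol):
    """Hypothetical interface: the boundary between provider and consumer."""

    def latest_value(self, indicator: str) -> float:
        ...


class InMemorySource:
    """One provider, written without knowledge of any consumer."""

    def __init__(self, values: Dict[str, float]) -> None:
        self._values = values

    def latest_value(self, indicator: str) -> float:
        return self._values[indicator]


class ConstantSource:
    """A second, unrelated provider that satisfies the same interface."""

    def __init__(self, value: float) -> None:
        self._value = value

    def latest_value(self, indicator: str) -> float:
        return self._value


def describe(source: IndicatorSource, indicator: str) -> str:
    """A consumer that relies only on the interface, not on a provider."""
    return f"{indicator}: {source.latest_value(indicator):.1f}"
```

Because describe depends only on the interface, either provider can be passed in without
changing the consumer.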
Dependency
Component A has a dependency upon another component, B, if the correct functioning of A
depends on the existence and correct functioning of B. Additionally, any change to B may
have an effect on component A.
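
The second sentence of this definition can be made concrete with a small sketch: given a
map of which components depend on which, compute every component that a change to one
of them may affect, directly or indirectly. The component names below are hypothetical
and do not describe the actual KNIS component set.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical components; each key depends on the components in its list.
DEPENDS_ON: Dict[str, List[str]] = {
    "web_site": ["data_service", "search"],
    "widgets": ["data_service"],
    "search": ["metadata_store"],
    "data_service": ["metadata_store"],
    "metadata_store": [],
}


def affected_by(changed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that depends on `changed`, directly or not."""
    # Invert the map so each component points at the components that use it.
    dependents: Dict[str, Set[str]] = {}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk outward from the changed component.
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, set()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen
```

In this sketch a change to metadata_store may affect all four other components, while a
change to web_site affects none, which is why widely shared components call for the most
carefully specified interfaces.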
Must, Should and May
In this document, wherever possible, the following terms are to be interpreted as described
in RFC 2119 (the standard is paraphrased here for the reader's convenience):
- When the word must, required or shall is used, the statement is an absolute
requirement of the specification.
- When the phrase must not or shall not is used, the statement is an absolute
prohibition of the specification.
- When the word should or recommended is used, there may exist valid reasons to
ignore the statement under certain circumstances, but the full implications of doing
so must be understood and carefully weighed.
- When the phrase should not or not recommended is used, there may exist valid
reasons the behavior is acceptable or even useful under certain circumstances, but
the full implications should be understood and the case carefully weighed before
implementing the action.
- When the word may or optional is used, the statement is not compulsory.
The KNIS Value Proposition
The KNIS is creating a single source, provided free and easily usable for millions, of the
highest quality measures and data on the nation's major issues.
Current and future versions of the KNIS site and IT capability are designed to be easy to
use and to provide tools that will enable KNIS staff and Americans to discover, understand
and share information across the Web through distributed publishing and social
networking.
In so doing, the KNIS seeks to unite nonprofits, the media, government decision makers,
business leaders, scientists, educators and citizens around a single goal: to deepen and
broaden our factual knowledge and understanding of the country's most pressing issues.
Relying on expertise and quality assurance from the National Academy of Sciences, the
National Academy of Engineering, the Institute of Medicine, the National Research Council,
the statistical community, the scientific community and individual, nationally recognized
subject-matter experts from all sectors, a KNIS will assemble the highest quality
quantitative measures and related data and develop Web presentations designed to make
it easy for interested citizens to assess whether progress is being made, where it is being
made, by whom and compared to what.
KNIS value proposition emphasizes:
- Highest quality, most current data on the issues that matter most
- "Key" measures incorporate ease of understanding, grasping what matters most
- Reliable, free and accessible data and contextual content
- Engaging and educational data and context
- User interaction, commenting and discussion opportunities
- Publishing on Web time, in constant motion with frequent updates
The KNIS User Communities and Their Needs
Delivering a valuable service to users is at the heart of a KNIS. Users can be segmented
into four broad categories: Individual Users, Institutions and Partners, Stakeholders, and
Developers.
Individual Users: This segment includes all users with an interest in quality measures and
data on the state of their nation. Additionally, a KNIS is a means to introduce new topics
into consideration for discussion. The public includes many sub-segments of the U.S.
public: policy makers, business, students, researchers, professionals, legislators,
educators, journalists and any other community requiring access to quality data. This user
segment can be served as individuals, but individuals increasingly use social media and
collaborative capabilities to work on large issues, and the KNIS intends to support and
leverage social media in providing its service to the public.
Institutions and Partners: A KNIS will work with many other institutions in the key national
indicator ecosystem, many of whom are data providers, many of whom are data users, and
a large portion of which are both. They will include both public and private institutions. The
KNIS will benefit institutions (including governments, business and non-profits) by
enhancing their access to actionable information to enable better strategies and resource
allocation choices on investments in complex issues. The KNIS will provide media partners
with new information and tools that improve productivity, depth of coverage and accuracy.
The KNIS will provide business partners better insight into broad societal patterns and
trends for planning, investment and product/service creation. Education partners will be
provided with information that enables improved quality of curricula, increased numeracy,
better understanding of public issues, and increased levels of meaningful civic
engagement.
Stakeholders: These include the American people, the U.S. Federal government, state
and local governments, the business community, civil society, KNIS leadership, KNIS
partners, statistical data providers, concerned foundations, academicians, and a wide
variety of other members of the KNIS family of stakeholders. Some of these stakeholders
will leverage the KNIS infrastructure to be a reliable source of timely and accurate data and
repeatable models. Others will use the architecture to disseminate information.
Developers: The KNIS is building wherever possible on open platforms using open
standards designed to empower developers. This community needs information on KNIS
measures and data and how to find them, as well as guidance on information quality. They
also value ways to share lessons and code.

KNIS High Level Requirements


The KNIS strategy for building a broad user base and an active audience for its content is
to employ state of the art social networking and syndication techniques to promote a
dynamic, engaging web site. This means that KNIS content will be made available through
multiple channels, in addition to a standard web site. Additional channels include widgets
that can be embedded in other web pages, direct programmatic access to indicator data
and metadata, online webinars and YouTube videos, and more.
Design of the KNIS capability will create a clear and engaging environment for the
audience to explore, learn, and take action. The interface should be clean and intuitive, and
should facilitate the easy location of indicators, data, and editorial content. The design
should also support syndication and distribution, including widgets embedded in other web
sites and services. A range of visual and interactive methods
should be used to identify, clarify, and compare issues. Deeper exploration of issues,
measures and data should be a consistent possibility.

The KNIS is working to engender trust and credibility, and the aesthetic must be intelligent,
authoritative, and professional. However, the experience should not be overly complex or
technical. The cumulative goal of design and content organization should be to offer the
audience a clearer understanding of one or more national issues, and to provide a set of
flexible and branded tools with which to take further action.
The KNIS will provide an experience that allows users to search and manipulate data in the
context of a very flexible super-issue, issue and sub-issue construct to gain a better sense
of systemic issues. The KNIS will also strive to achieve credible simplicity through carefully
selected data and measures with easy links to more complex or sophisticated sites or data
sources.
The following are key categories of high-level user requirements which are currently being
defined in our ongoing processes, which include a combination of user testing and
feedback, use case scenario building, and specification to mainstream technical standards:
- Performance and Reliability: The site must be able to meet mainstream market
standards for uptime and responsiveness.
- Availability: All KNIS functionality will be highly available and reliable, and an
appropriate security design will ensure this. Additionally, all data will be backed up so
that the system can recover in the event of a catastrophe.
- Scalability: The site must be able to grow to consumer scale, handling not only high
user demand but also diverse data types, with the resiliency of a mission-critical
system.
- Multi-platform and multi-vendor: The KNIS must be capable of supporting
multiple technology and vendor products.
- Portability: The KNIS must emphasize capabilities for sharing, syndication and
social interaction.
- Selectivity: Users should be able to understand the rationale for selection of issue
frames and limited numbers of key measures and data sets to enhance their
capacity for assessing the nation as a whole.
- Credibility: Choices of issues, measures and data sets and the presentation of
information must meet the highest standards of professionalism.
- Accessibility: The site should support a multilayered design balancing credible
simplicity and complexity, for all types of audiences, along with freely available
content and no advertising.
- Quality: Dimensions of user-centric measure and dataset quality should be
incorporated in metadata and essential dimensions exposed so that users can
assess relative information quality depending on their intended use.
- Utility: Content should be presented in a fashion that makes meaningful facts
easy to discover and then presents measures and data in a way that is practical and
useful.
- Multi-Dimensional: The KNIS should give the user the capacity to look at any
measure over time, at different levels of geography, by demographic group or at
various levels of information density/abstraction.
- Content in Context: The KNIS should offer not only the essential statistics but
also contextual information that enhances understanding and engagement without
interpretation. Issues, measures and data will always be presented in relationship to
one another.
- Multi-faceted: The KNIS should offer users with different cognitive and learning
styles a variety of ways to engage with the information, using different sensory modes.
- Multi-media: The KNIS should support the capacity to visualize and interact with
measures and data through a full range of techniques.
- Navigation and Orientation: The site should support navigation in a variety of
intuitive ways and allow users to maintain a consistent sense of orientation.
- Interactivity and Continuity: The KNIS should make it possible for users to
interact with and explore data and information in a variety of ways and then to save
their work and build understanding cumulatively over time.
- Balanced: The KNIS must present what is known and what is not known, what is
available and what is not available, where information exists as well as where gaps
in coverage need to be filled.
- Involvement and Diversity: The KNIS should give the American people many
ways to be involved in issue framing and indicator selection, constantly balanced
against expert input.
- Independence: People, processes, vendors, content and products must be
selected on needs and demonstrated capabilities and not influenced by
inappropriate bias.
- Persistence: All pages, once published, must have a persistent URI so that links
established from other web sites remain viable.
- Openness, Transparency and Extensibility: Decisions must be based on open
and transparent processes, sources, open standards, open principles, and, to the
greatest extent possible, open source software.
- Synchronization: Updates should draw from a large ecosystem of hundreds of
data providers while being continually updated with acceptable and reliable latency
times.
- Security: The KNIS must provide security in the experience, including
confidentiality of user information, assured availability of all services, and assured
integrity of all data.
- Flexibility and Adaptability: The KNIS must be able to respond to user and
stakeholder input as well as market evolution with frequent integrated updates.
- Confidentiality: The core reason for existence of the KNIS is to get the right
information to the right users, and transparency in doing that is always the default
answer. However, at times the KNIS will be entrusted with information that must
remain confidential. This includes any information associated with users, such as
logins, search strings or personally identifiable information.
- Integrity: The data used by and held by the KNIS and the information mixed,
mashed and modified by users must be secure from tampering, including secure
from tampering by sophisticated adversaries.
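
As one illustration of the Integrity requirement above, the sketch below uses a keyed hash
(HMAC with SHA-256, from the Python standard library) to detect whether a dataset has been
altered since it was published. HMAC is chosen here only as an example technique; this
draft does not select a mechanism, and the key, data and function names are placeholders.
Key management is out of scope for the sketch.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Return a hex tag that changes if either the data or the key changes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_untampered(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check data against a previously issued tag in constant time."""
    return hmac.compare_digest(sign(data, key), expected_tag)


# Placeholder key and dataset, for illustration only.
KEY = b"example-key-not-for-production"
DATASET = b"indicator,period,value\nexample,2009,1.0\n"
TAG = sign(DATASET, KEY)
```

A consumer holding the key can call is_untampered(DATASET, KEY, TAG) before using the
data; any modification to the bytes causes the check to fail.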

Design Principles
The following are KNIS architecture design principles. These principles guide the KNIS
architecture decision-making process, including actions of the governance team, the
design team, and SUSA staff:
1. The entire effort is focused on the users and their experience.
2. Web services will be used to the greatest extent possible.
3. Open source approaches are preferred.
4. We standardize on open standards.
5. KNIS designs will be OS independent.
6. KNIS designs will be client and browser independent.
7. The architecture will be broadly understandable and broadly communicated.
8. The design will empower communities.
9. We design for scalability.
10. We design for interoperability.
11. We design for flexibility, extensibility and an ability to evolve.
12. We design for universal accessibility and usability.

1. The entire effort is focused on the user and their experience. The priority driving
principle of the Key National Indicator System is that humans must be empowered for
greater understanding and better decision-making. This activity is all about people who will
be using the system and the design team will build architectures that place the American
people’s experience in the primary position it deserves. We also recognize that success
here will require far more than design. It will also require a continuous process of usability
testing and community engagement.

2. Web services will be used to the greatest extent possible. Web services enable
system-to-system communication, creating a means for reliable exchange of information
and autonomous synchronization. The standards and specifications associated with web
services have been proven to provide scalable, reusable, interoperable capabilities and will
be used in KNIS designs. Implications: The architecture will come with the many benefits of
web services, but care must be taken to ensure reliability meets expectations. With web
services, reliability must be engineered in.
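
One common way to engineer reliability into a web service client is to retry failed calls
with an increasing delay. The following is a minimal sketch of that technique under stated
assumptions: FlakyService stands in for a remote service, and the function name, retry
count and delays are illustrative choices rather than KNIS specifications.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # wait 0.1s, 0.2s, 0.4s, ...


class FlakyService:
    """Stand-in for a remote web service that fails its first few calls."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated outage")
        return {"indicator": "example", "value": 1.0}
```

For example, call_with_retries(FlakyService(2).fetch) succeeds on the third call, while a
service that keeps failing raises ConnectionError once the attempts are exhausted. The
sleep parameter is injectable so that the behavior can be tested without real delays.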

3. We will design with the open source community in mind. KNIS framework designs are
not being built to favor any single software package or suite of tools. But as a key
architecture principle, KNIS will design with the open source community in mind. The KNIS
framework should be implementable for a low cost and deliver high availability with solid
security. Commercially supported open source is supportive of these requirements. If,
during design work, the team identifies requirements that are best met by proprietary
approaches or software, these components will be well-articulated and steps will be taken
to ensure that transition away from proprietary solutions is possible in the future.
Implications: This early attention to open source solutions will help build community free
from undue influence. However, care must be taken to ensure all decisions, open source or
proprietary, are documented well.

4. We standardize on open standards. It is the intent of the design team to follow best
practices as articulated by open standards groups. These may include groups such as
The Open Group, the Organization for the Advancement of Structured Information
Standards (OASIS) and the Object Management Group (OMG). Using the standards and
implementation guidance of widely known and highly respected organizations will help
ensure standards are implemented in repeatable ways. We intend to follow SoaML
(Service oriented architecture Modeling Language) as a way of clearly articulating
implementations of standards. Implications: use of best practices provides lessons learned
from many other environments. The design team will be well-versed in these best practices
and will only deviate when there is good reason. Here too, a key implication is that design
choices must be well documented. We also expect to use open standards, where possible,
for data security (especially data integrity).

5. KNIS designs will be OS independent. The KNIS framework is being built in a way that
can be implemented by a wide range of organizations. Although we intend to engineer
for secure scalability with reliable systems (and open source operating systems will be the
first choice), we will take every step to make the framework implementable on any operating
system to ensure the widest possible adoption. Implication: Engineering for OS
independence requires attention to detail and experience with a wide range of OSs.

6. KNIS designs will be client and browser independent. The KNIS designs will support
users on any client, including traditional PCs, smartphones, cell phones and tablets. End
users will access most information from the framework via browsers, and we intend to
support all major browsers. Implication: This goal can be hard to achieve but we view it
as very important to attempt, since there is no lock-in by any one OS or other software
platform vendor for client devices. Citizens should be able to interact with the KNIS
architecture from any device.

7. The architecture will be broadly understandable and broadly communicated. The
open, collaborative vision of the KNIS requires an architecture that is available to all those
who will participate in governing, building, interacting with or overseeing it. The design
team will, to the greatest extent possible, avoid producing an architecture which can only
be understood by the design team. Implication: This architecture must be written and re-
written, with an ever-increasing circle of diverse technical input, until it has been
demonstrated to be widely understandable and useful.

8. The design will empower communities. We expect and will engineer for a high degree
of community involvement in the resulting system. The design being produced will be
implemented by the KNIS to provide a web presence for a community (i.e., virtual,
geographic, demographic). But we also expect this to be empowering in a different way. It
is also to be designed for the use and empowerment of any other community which can
benefit from it, including international communities. We intend for the design to support
communities but recognize that success here will require far more than design. It will also
require a continuous process of usability testing and community engagement.

9. We design for scalability. With a target audience of hundreds of millions, designs
must support very large "consumer-scale" use. Virtualization will be the scalability
path. Implication: If any concept or decision is introduced that has yet to be proven
scalable, it will be well researched, and its scalability risks will be mitigated and stress-tested.

10. We design for interoperability. A KNIS will require that data, results, graphics and
spreadsheets can be exported from systems for further use, including embedding in sites
and consumption by other automated tools. This interoperability will be designed in from
day one. Implications: The design will need to be tested for interoperability characteristics.
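
As one illustration of this principle, the sketch below exports the same indicator records as CSV (for spreadsheets) and as JSON (for automated tools). The record fields, the sample values and the function names are all hypothetical, chosen only for illustration; this draft does not prescribe export formats or field names.

```python
import csv
import io
import json

# Hypothetical indicator records with made-up values; field names are illustrative only.
RECORDS = [
    {"indicator": "example_indicator", "year": 2009, "value": 9.3},
    {"indicator": "example_indicator", "year": 2010, "value": 9.6},
]


def to_csv(records):
    """Serialize records to CSV text for spreadsheet use."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["indicator", "year", "value"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize records to JSON text for consumption by automated tools."""
    return json.dumps(records, sort_keys=True)
```

Producing both formats from one in-memory representation keeps the exports consistent with each other, which is one property an interoperability test could check.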

11. We design for flexibility, extensibility and an ability to evolve. Designs must be
established so that they will allow any single component to be removed and replaced. This
is important for the design's ability to evolve over time and is also an important enabler of
variation between sites in ways that do not impede interoperability. To the
greatest extent possible, the architecture will not require specific software packages.
Implications: Enhancements in functionality and the introduction of new technologies are
expected. When they are made, the design will enable smooth evolution, in small
increments that will minimally impact other systems.

12. We design for universal accessibility and usability. Designs will adhere to the
highest standards of providing access to users with visual, auditory, and motor disabilities
as specified in Section 508 of the Rehabilitation Act. In addition, our design will strive to
serve the needs of users with cognitive limitations and low literacy or numeracy skills, while
keeping in mind the needs of young/old and novice/expert users.

These principles are a key means of evaluating the architecture and will be a continual
compliance check to ensure the architecture below sets the KNIS on the right path.

Conceptual Architecture
The Conceptual Architecture provides context that is important for understanding the
entire architecture and useful in explaining the architectural intent.
The KNIS conceptual architecture can be expressed as illustrated in Figure 3.

Figure 3: KNIS Conceptual Architecture

The KNIS architecture provides data and services to end users via their client devices and
also enables syndication of data to other systems. The conceptual architecture includes:
Enterprise tier components – This is the realm of overall governance (e.g. strategy,
fiduciary, policy, requirements and leadership as well as technology and data governance).
Client tier components – Software that must reside on the client hardware (e.g., desktop
PC, PDA, cell phone, etc.). In general, these are restricted to browser software.
Presentation tier components – Software responsible for rendering information that will
be conveyed to the end-user (including system administrators). All of the screens specified
by end-users during use-case modeling reside in this tier. Often, these screens will be
encapsulated within the user interface.
Business processing tier components – Software responsible for performing business-
specific functions. All of the business operations specified by end-users during use-case
modeling will reside in this tier. These include components that implement business
transactions, validate user requests, apply and enforce business rules, as well as
components that "wrap" legacy systems and databases (although the legacy systems and
databases themselves do not reside in this tier). Although there are security components
throughout the architecture, some critically important business tier contributions to security
include user authorization, account management, and mechanisms to ensure only
authorized users can change data.
Data resource tier components – Data management components (e.g., databases,
directory servers, etc.) responsible for managing the persistent state of business data. All
of the business objects specified by end-users during domain-object modeling will be
managed by the components residing in this tier. The data resource tier also contains the
legacy systems that are "wrapped" by business processing tier components. Rules and
capabilities for ontologies and taxonomies are articulated in this layer.
Integration tier components – General-purpose or business-specific components used to
tie together business processing tier components with data resource tier components or
resources that are external to the application. The components in this tier generally are not
visible to end-users. These "messaging" components usually take the form of queued
messaging servers, publish/subscribe event servers, or a combination of the two. The
Integration tier is the interface to syndication services provided from the KNIS and the
interface to data providers. The integration tier and its syndication services provide
developers with connectors which can facilitate enhanced direct connection to social media
outlets (for example, Facebook, LinkedIn, Twitter).
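
To make the publish/subscribe style of messaging concrete, the following is a minimal in-process sketch. The `EventBus` class and the topic name in the usage note are assumptions made for illustration; a production integration tier would more likely use a dedicated messaging server, which this draft does not select.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe hub (illustrative only)."""

    def __init__(self):
        # Maps a topic name to the list of handlers subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to receive every message published on topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver message to each subscriber of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

For example, a syndication component could call `bus.subscribe("dataset.updated", handler)` and a data provider interface could call `bus.publish("dataset.updated", message)`, without either component referring to the other directly.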

Logical Architecture
The purpose of a logical architecture is to specify the work of logical components in more
detail, based on the desired functionality of the enterprise. Each component should do one
thing and do it well – accounting for interactions with other components. The logical view
describes the problem from an abstract, platform- and technology-independent perspective.
It describes the software elements that meet the system's functional requirements. It
describes the design of individual services, their interfaces, and their operations.
This section provides architectural specifications and guidelines flowing from the
conceptual architecture. It lists specific guidelines for each tier. The relevant systemic
qualities and future directions related to logical architectures are also discussed. An
overview of the Logical Architecture is presented in Figure 4.

Figure 4: Logical Architecture

High-level Guidelines
The following are KNIS guidelines for the logical architecture:
Reuse and Purchase Before Developing
The development team should seek to reuse existing infrastructure and, when none exists
to meet business requirements, make informed buy-versus-build decisions before
proceeding with new development projects.
Open Systems and Open Standards
In any system purchase or development project, open systems and open standards should
be preferred above proprietary technologies, after considering comparative functionality,
total cost of ownership and track records for adoption and innovation.
Vendor Specific Extensions
When vendor capabilities are used they are to be maintained in as open a configuration as
possible. All vendors, even those which base their capabilities on open source, provide
means to extend their capability. In most cases, extending capabilities like this introduces
future interoperability issues and can end up having the same negative interoperability and
vendor lock-in issues as purchase of closed source proprietary capabilities does. If vendor
specific extensions are used, they should be encapsulated and well
commented/documented so they can be removed or replaced with the least impact on the
surrounding systems.
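
One common way to encapsulate a vendor-specific extension is to hide it behind an open interface, as sketched below. All names here are hypothetical: `SearchProvider` is an illustrative interface, and `fuzzy_match` stands in for a proprietary vendor call that does not correspond to any real product.

```python
from abc import ABC, abstractmethod


class SearchProvider(ABC):
    """Open interface the rest of the system depends on."""

    @abstractmethod
    def search(self, term, documents):
        """Return the documents that match term."""


class PortableSearch(SearchProvider):
    """Baseline implementation using no vendor-specific features."""

    def search(self, term, documents):
        return [d for d in documents if term.lower() in d.lower()]


class VendorSearchAdapter(SearchProvider):
    """Isolates a hypothetical vendor extension behind the open interface.

    vendor_client stands in for a proprietary SDK object; its
    fuzzy_match method is an assumption made for this sketch.
    """

    def __init__(self, vendor_client):
        self._client = vendor_client

    def search(self, term, documents):
        # VENDOR EXTENSION: the only call site that touches the vendor API.
        return self._client.fuzzy_match(term, documents)
```

Because callers depend only on `SearchProvider`, the adapter can be swapped for `PortableSearch` (or another vendor's adapter) without changes to the surrounding systems.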
Separation of Concerns
The logical architecture provides focus on the separation of concerns within the application:
each tier deals with a specific logical area of the application (presentation, processing, data
management, etc.) and each component within a tier should focus on one and only one
concern.
Decomposition
Decomposition isolates specific responsibilities in individual components so that they
may be addressed independently. It is therefore important to ensure that the required
functionality can be delivered by components working in collaboration. To provide the
proper context for the development and use of each component, functional
responsibilities for each component must be assigned and documented.
Systemic Qualities
Systemic requirements (such as performance and uptime) and functional requirements are
of critical importance and are articulated where possible in this logical architecture. To
properly address systemic qualities, sets of collaborating components must be considered
together. Performance, for example, should be addressed in terms of the patterns of
interaction the design calls for, not just from the perspective of the individual parts.
Business Continuity
Recoverability, redundancy, and maintainability should be addressed during application
design, based on criticality and impact to the mission, in order to determine the required
level of continuity.

Architecting for Security
There are many non-architectural aspects of security and the designers want to take this
opportunity to alert all readers to that fact. Policy, Process, Training and Governance are
all critically important to security. However, there are also important technology
considerations for both the logical and technical architecture.
Considerations, using the previously articulated constructs of Confidentiality, Integrity and
Availability include:
Confidentiality: When users provide personal data (for configuration or saving of
settings) that information must be protected.
Integrity: Data must be provided in its pure, unchanged form, with no threat of
modification by adversary or accident.
Availability: KNIS services, including syndicated services, must be provided at a
high availability and the system must be designed to withstand both natural disaster
and computer attack.
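
As a small illustration of the integrity consideration, the sketch below compares received data against a digest published by the data provider. This is an assumption about one possible mechanism, not a mechanism specified by this draft. A plain digest detects accidental modification; protecting against an adversary would additionally require that the digest itself arrive over a trusted channel or be digitally signed.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, published_digest: str) -> bool:
    """Check received data against the digest published by the provider."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(digest(data), published_digest)
```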
Architectural Patterns
Existing architectural patterns, reference architectures, business services, etc., should be
leveraged wherever possible. This includes emerging reference architectures and patterns
for use of emerging cloud capabilities. Where possible, architecture patterns are used in
the technical architecture below.
Architecting for Usability
When design trade-offs are considered, priority weighting for decisions must be on usability
factors, since this is the overriding "meta requirement" of the KNIS architecture.

Enterprise Tier
The enterprise tier is the domain of business processes, governance and key standards
and is intentionally articulated here in the logical architecture section since it is a driver of
all other components of this tier (the KNIS is a mission-driven capability).
KNIS core processes
The KNIS institution will support four key processes:
- Content Management: Including selection of issues, sourcing of data sets,
presentation, publication, design, evolution and adaptation of information over time,
as well as data quality.
- Product Management: Focused on product/service design, development and
maintenance to performance specifications.
- Strategic Development: Includes communications, fundraising, marketing, public
relations, partnership and stakeholder management.
- Institutional Management: Includes all aspects of relationships with public and
private stakeholders, the National Academy of Sciences and the State of the USA
leadership (e.g., governance, planning, finance and accounting, legal, information
systems, human resources, audit and oversight).

All four of these core KNIS processes have complex interactions between them. But they
are bounded in the context of larger societal processes of debate, learning and change. It
is vital to understand those boundaries. A KNIS will focus on the presentation of measures
and data, in the context of issue frames and with a high enough utility that they can be
used for analysis by users. However, the boundaries of the KNIS enterprise processes do
not extend into education, choice, change or dialogue. A simplified model for societal
processes is diagrammed below in Figure 5:

Figure 5: KNIS Processes

KNIS Architectural Governance


Architectural governance is the means to ensure that processes and technologies support
excellence in the pursuit of the KNIS mission. This section articulates an initial approach to
KNIS governance. KNIS architectural governance processes are subordinate to the
overall KNIS governance processes. A summary follows:
Advisory Processes

The KNIS maintains an architectural advisory board consisting of experienced enterprise
thought leaders from a wide background selected based on their years of dedication to
community and years of demonstrated success in a variety of fields. Membership of the
KNIS technical advisory board is by invitation only with the Presidents of the National
Academy of Sciences, the National Academy of Engineering, the Institute of Medicine, the
National Research Council and the President and CEO of the State of the USA responsible
for inviting membership.

Decision Mechanisms
KNIS leadership must find balance between broad coordination with a large base of
stakeholders and agile action designed to support our mission. This balance will be found
by use of two levels of collaborative groups: an executive architectural board and
subordinate working groups.

The KNIS Executive Architecture Council is used by the SUSA CEO and senior staff to
ensure appropriate vetting of ideas and decisions with a broad range of internal and
external stakeholders. This council is an enterprise systems and technology decision-
making body.

KNIS Architecture Executive Council membership consists of representatives from:


- The Executive Officer and General Counsel of the NAS, NAE, IOM and NRC or
their designated representatives
- The SUSA Executive Staff (Chief Data Officer, Chief of Content, Chief Technology
Officer, Chief Scientist, Chief Statistician, EVP, CFO and CEO)
- The KNIS Technology Advisory Board

The KNIS Architectural Council is chaired by the SUSA CEO or, in the CEO’s absence, by
the CTO. This council has approval authority over the KNIS architecture and its principles
and will help adjudicate issues brought to the council by working groups (further described
below). This council is about ensuring the right architecture decisions and will keep user
issues at the forefront of design decisions.

KNIS Architecture Working Groups are an additional decision mechanism which will also
be used to enable the best coordinated advice from technologists. Working groups are to
be chartered as required and will be empowered with terms of reference approved by the
SUSA Architecture Executive Council.

Working groups may be chartered to work on specific issues; however, at least two are
envisioned to be of extended duration: the KNIS Architecture Design Working Group and
the KNIS Data Working Group. Additionally, although this is the governance process for
technology issues, there are key processes underway in the higher level KNIS governance
structure for other critical topics including issue frames and measurements. SUSA
leadership has noted that the complex interrelationships between issue frames,
measurements and data issues can create ambiguity for those working on them, and will
work to ensure that the leaders of these working groups are in constant communication
to reduce this ambiguity.

The KNIS Architecture Design Working Group will be assigned responsibility for
maintaining each chapter of the KNIS enterprise architecture and will work to keep the
architecture aligned with the KNIS vision, coordinated with the community and relevant for
designers.

The KNIS Data Working Group will work on technical data issues among public and private
data providers, data consumers and designers. This group will also maintain and update
the data section of the SUSA enterprise architecture.
Each working group charter will spell out the group's role in capability certification, which is a
mechanism the CEO will be able to use to grant approval for capability roll-out. SUSA
intends to use ITIL v3 as a framework for operations and maintenance, and artifacts
required by ITIL will be provided by working groups prior to capabilities being certified as
ready to run. ISO standards for business process certification will also be used where
appropriate.

Oversight/Execution/Feedback
Decisions regarding architecture of the KNIS will be promulgated by the CEO and
compliance ensured by effective communication to all involved and monitoring to ensure
execution. It is the intent of SUSA to ensure dialog and evaluation of multiple courses of
action in architectural decisions, and mechanisms will be put in place to surface competing
ideas.
Issue Areas, Culture, Standards and Data Mechanisms
The KNIS architecture governance structure and process must cover a broad range of
issue areas involving many stakeholders, information consumers and data providers. The
scope of this effort means that open collaboration, coordination and processes are
key. The governance team will ensure transparency at all levels to encourage broad input and
involvement by all who are able to contribute.
Architecture, Design Guidance, Implementation Directives
The KNIS architecture governance process relies on broad understanding, continuous
feedback and dialog. Therefore a key operating concept of KNIS architecture governance
is to provide all architectural artifacts in openly sharable formats for all stakeholders, from
users to data providers to developers to architects, to review and provide input on.
Additionally, a KNIS should provide virtualized instances of the KNIS capability for
developer use. The provisioning of these virtualized instances will be provided upon
approval of the KNIS CTO as resources allow.
Contributing Back to the Open Source Community
The KNIS architecture is designed with the meta-requirement of usability, and it is user
focused. This has driven an approach that is open in many ways, including a strong bias
towards open source software. Wherever possible we intend to share back with
the open source community. One early way to do that will be to provide use cases
showing how a KNIS is implementing open source. When possible we will also share back
suggestions for code improvements and contribute in other ways. KNIS interactions with
the open source community will be under the umbrella of the KNIS architectural
governance structure and coordinated by the KNIS CTO.

Client Tier
Client tier components – Software that must reside on the client hardware (e.g., desktop
PC, PDA, cell phone, etc.). In general, these are restricted to browser software and do not
include any of the three concepts modeled by the end-user.
The client tier consists of any client device or system that manages display and local
interaction processing. This includes the browser. KNIS IT planners default to the following
guidelines for the architecture in client-focused decisions:
Thin Client Rule
Whenever possible, end-user interaction with business applications and services should be
mediated by a browser. Mediation may include the browser's use of commonly-available
plug-in modules. The KNIS recognizes that tracking release details of browsers is also
important and that a wide range of users indicates a wide range of browsers and variants
will be in the ecosystem. KNIS implementations will document the browser release levels
they are written for. Decisions which concern censurability may only be approved by the KNIS
architecture governance board.
Justification: By standardizing access, application and service offerings are very loosely
coupled to the client tier. It will almost never be the case that deploying a new service or
application will force deployment of new components to the client tier.
Impact: Client tier deployment costs are minimized. Access to business applications and
services can be monitored more easily.
Client Mobility Rule
Whenever possible, an end-user's physical location should not affect access to KNIS
applications and services (except in cases where location gives users additional control
over their experience, and then that should be by user choice).
Justification: The KNIS exists to serve citizens, wherever they are.
Impact: Costs related to special client tier configuration will be minimized. End-user
satisfaction, performance, and effectiveness will be maximized.
Disconnected Client Rule
KNIS visualizations and applications will sometimes run on clients that cannot be
connected continuously to underlying business applications and services. In these
circumstances, the client tier may include components which would otherwise be placed in
other tiers in order to make the client useful while disconnected (for example, a database
engine). The preferred way of providing for offline use is to provide data through
syndication which can be downloaded for viewing in any common viewing engine already
resident in mobile clients (browsers) or easily consumable by common platform
applications (for example, iPhone or Android apps).
Justification: Many use-cases support end-user productivity while disconnected.
Impact: The value of enabling end-users to remain productive should not be
underestimated; however, this value should be balanced against the cost of deploying,
supporting, and securing additional client tier software components. When such a client is
disconnected but remains in use, the end-user's session state as reflected on the client
may diverge from the "stale" version reflected in the lower tiers; the application is
responsible for ensuring synchronization when a connection is reestablished. Data
associated with the security of the application, that must be stored locally, must be
encrypted using a user-specified password that is not stored on the client machine.
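
The sketch below illustrates one way a key could be derived from a user-specified password so that neither the password nor the key needs to be stored on the client; only a random salt is kept. It covers key derivation only. The encryption step itself should use a vetted authenticated cipher from a cryptographic library and is not shown. The function names and the iteration count are assumptions for illustration, not requirements of this draft.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor; tune for the target client devices


def new_salt() -> bytes:
    """Generate a random salt; the salt (unlike the password) may be stored locally."""
    return os.urandom(16)


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from the user's password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )
```

The same password and salt always yield the same key, so the client can re-derive the key each time the user enters the password rather than persisting it.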
Client Applet Rule
Whenever possible, business applications and services should avoid the use of client
applets. Applets may be appropriate when the code cannot be persisted locally.
Justification: Applets have a number of drawbacks that should be considered:
- Every time an applet is started, the entire applet needs to be provisioned.
- Applets are often large and, therefore, time-consuming to provision.
- Applets are very sensitive to the runtime environment and are prone to failure.
Client Usability Rule
Because the end-user experience is the focus of KNIS requirements and critical to the
perceived quality of applications and services, end-users must be provided with:
- Reasonable response time, under expected operating conditions, when interacting
with underlying components; this includes a consideration of low bandwidth and high
latency environments
- A comprehensible experience (sometimes called "walk up and use")
- A consistent experience across client platforms
- Familiar graphical aids
- Appropriate mechanisms for customization (e.g., allowing end-users to specify
right- or left-hand use, preferred fonts, font sizes, etc.)
Justification: The meta-requirement of the KNIS architecture is a quality end-user
experience.
Impact: Long wait times, excessive number of keystrokes to complete tasks, and excessive
confusion would result in poor user experience and run counter to all KNIS is building
towards.
Presentation Tier
Presentation tier components – Software responsible for rendering information that will be
conveyed to the end-user (including system administrators). All of the screens specified by
end-users during use-case modeling reside in this tier. These screens may be
encapsulated within the user interface on a portal.
The presentation tier is responsible for formatting all information displayed to end-users,
capturing end-user input, and performing simple field-level validations. The format of
displayed information may take many forms, but in the KNIS context will generally be via a
browser. The presentation tier is often architected using a model-view-controller pattern.
For security reasons, these components will, in general, also be responsible for checking
request parameters initiated or specified by end-users. These "sanity checks" help reduce
the possibility of buffer overflow attacks. Other data validation (e.g. date, phone number,
account number formats) may be performed by the presentation tier, but business-specific
validation is usually performed by the business processing tier.
Localization (L10N) Rule
L10N is "the process of providing language- or culture-specific information for computer
systems." When a business application or service will be used in multiple locales, its
functional requirements must state explicitly the aspects of L10N that will be important,
which aspects of L10N must be implemented and which aspects will not be localized.
Justification: We are designing for all our citizens as well as the international community
and must ensure accessibility by the widest possible cultural base. Designers must ensure
that all prospective end-users will be able to experience the application or service in a
manner that will facilitate both learning the application and using it in a highly productive
way.
Impact: The result is reduced training time, improved end-user productivity, and less
likelihood of end-user data entry error.
Internationalization (I18N) Rule
I18N is "the process of generalizing computer systems so that they can handle a variety of
linguistic and cultural conventions." All applications must facilitate I18N. When a business
application or service will be used in multiple linguistic and cultural locations, its functional
requirements must state explicitly the aspects of I18N that will be important, which aspects
must be implemented and which aspects will not be internationalized.
Justification: We are designing for all our citizens as well as the international community
and must ensure accessibility by the widest possible cultural base. Designers must ensure
that all prospective end-users will be able to experience the application or service in a
manner that will facilitate both learning the application and using it in a highly productive
way.
Impact: The result is reduced training time, improved end-user productivity, and less
likelihood of end-user data entry error.
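
The relationship between the two rules can be sketched as follows: I18N means the code looks up user-facing text by key instead of hard-coding it, and L10N means supplying a catalog for each locale. The catalogs, keys and fallback order below are hypothetical and serve only to illustrate the idea; real catalogs would be maintained outside the code.

```python
# Hypothetical message catalogs keyed by locale; real catalogs would be
# maintained outside the code (for example, in translation files).
CATALOGS = {
    "en": {"greeting": "Welcome", "indicator_count": "{count} indicators"},
    "es": {"greeting": "Bienvenido"},
}
DEFAULT_LOCALE = "en"


def translate(key, locale, **params):
    """Look up key for locale, falling back from 'es-MX' to 'es' to the default."""
    candidates = [locale, locale.split("-")[0], DEFAULT_LOCALE]
    for candidate in candidates:
        message = CATALOGS.get(candidate, {}).get(key)
        if message is not None:
            return message.format(**params)
    return key  # last resort: show the key rather than fail
```

The fallback chain makes explicit which aspects have been localized and which have not, which is the distinction the functional requirements are asked to state.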
Accessibility Rule
KNIS applications will be used by end-users with disabilities. Some accessibility aspects
may be handled in a uniform manner through application integration with a portal that
provides some accessibility features across all the application portlets or widgets it
manages. Accessibility includes sensitivity to color issues not only for people who have
color-deficient vision, but for display on various display devices, including projectors and
printers.
Justification: Ethical and legal considerations clearly indicate the importance of ensuring
that end-users with disabilities will be able to operate all aspects of the KNIS application
suite.
Impact: The result is reduced training time, improved end-user productivity, and less
likelihood of end-user data entry error.

End-User Preference Configuration Rule
When a business application or service might have its usability enhanced by allowing end-
users to set their own preferences, its functional requirements must state explicitly which
types of user preferences will be permitted and the manner in which they may be
implemented. Whenever possible, an end-user's preferences should be in effect regardless
of the user's location or client device.
Justification: End-users often expect to be able to make superficial modifications that make
using the application more pleasant or a more productive experience.
Impact: The user's ability to "customize" an application or service results in improved user
productivity. For example, if a user prefers to view his calendar by month instead of by
week, allowing this customization permits the user to complete tasks more rapidly.
Standardizing where and how user preferences will be stored is a topic under study.
End-User Role Identification Rule
When a business application or service constrains the end-user roles that will be
authorized, its functional requirements must state explicitly which roles will be authorized.
Differentiation of functionality by end-user role must also be stated explicitly.
Justification: This serves to clarify how use-cases that vary by role are differentiated by the
presentation layer (i.e., content, data capture, and field validation).
Impact: Implementation at the presentation layer will be clarified. This rule also impacts the
business processing tier and the data resources tier.
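
An explicit statement of authorized roles could take a form like the sketch below, in which each role is mapped to the operations it may perform. The role and operation names are hypothetical. The same table can inform the presentation tier (what to show) and the business processing tier (what to allow), although enforcement must not rest on the presentation tier alone.

```python
# Hypothetical roles and operations, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"view_indicator"},
    "editor": {"view_indicator", "edit_annotation"},
    "administrator": {"view_indicator", "edit_annotation", "manage_users"},
}


def is_authorized(role, operation):
    """Return True only if the role is explicitly granted the operation."""
    return operation in ROLE_PERMISSIONS.get(role, set())


def visible_operations(role):
    """Operations the presentation tier should offer to this role."""
    return sorted(ROLE_PERMISSIONS.get(role, set()))
```

Unknown roles receive no permissions, so anything not stated explicitly is denied by default.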
Field Validation Rule
When a field in a form is of an easily validated type, the application's or service's functional
requirements must state explicitly the validation rule to be applied. Field validation in the
presentation tier does not affect the requirement that all fields must be validated in the
business tier.
Justification: Generally, fewer computing resources are required to force the correction of
(relatively trivial) input errors before they are passed to the business logic tier. Conversely,
validation of fields based on business rules is not trivial, is subject to change, and should
be performed in the business tier. It is possible that a nominally valid input from the
presentation tier (e.g., a valid date) will fail a business tier validation (e.g., the date was a
blackout date).
Impact: Field validation helps ensure productivity is optimized. For example, if a user
enters a bad date as part of a form and posts the form all the way to the business logic tier
before the error is detected, the user might have to wait considerably longer before being
notified of the mistake. Performance considerations need to be balanced against security
concerns. Note also that a nominally valid input might later be rejected by the business tier.
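The split between the two tiers can be sketched as follows. This is a minimal illustration, not a KNIS specification: the function names, the date format, and the blackout date are all assumptions introduced for the example.

```python
from datetime import date, datetime

# Hypothetical blackout dates; in practice these would come from business rules.
BLACKOUT_DATES = {date(2010, 12, 25)}


def presentation_validate(raw: str) -> date:
    """Presentation-tier check: is the input a well-formed date?"""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError as exc:
        raise ValueError(f"'{raw}' is not a valid date (expected YYYY-MM-DD)") from exc


def business_validate(raw: str) -> date:
    """Business-tier check: re-validate the format, then apply business rules."""
    value = presentation_validate(raw)  # the business tier validates every field again
    if value in BLACKOUT_DATES:
        raise ValueError(f"{value.isoformat()} is a blackout date")
    return value
```

A nominally valid date such as "2010-12-25" passes the first check and fails the second, which is the case the rule describes.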
Presentation Tier Standards Rule
All information produced by the presentation tier must be formatted using widely accepted
standards (such as those expressed in the technical architecture below).

Justification: Adherence to standards ensures that information will be useable on the
widest possible range of client platforms. It also decreases the likelihood that custom client
components will need to be provided as part of an application solution.
Impact: This is critical to implementing thin client applications. Failure to adhere to
standards yields unpredictable results.
Active Content Rule
Examples of active content include active or looped GIF sequences, graphically active
applets, Flash movies, and varying color. Browser-based active content should be avoided
unless required for conveying KNIS information to users. If active content must be
employed, the end-user must be able to turn it off.
Justification: Visually active content may become annoying to the user. Additionally,
applications that employ active content might consume excessive computer resources.
Impact: On some client platforms the aggregation of active content significantly impacts
computer resources.
Consideration: Where developers need help determining which situations are
acceptable for active content, they should focus on user experience and consult the SUSA
Chief Data Officer and/or Chief Content Officer for guidance.
Business Processing Tier
Business processing tier components – Software responsible for performing business-
specific functions. All of the business operations specified by end-users during use-case
modeling will reside in this tier. These include components that implement business
transactions, validate user requests, apply and enforce business rules, as well as
components that "wrap" legacy systems and databases (although the legacy systems and
databases themselves do not reside in this tier).
This section provides guidelines for architecting the software components responsible for
performing business-specific functions. This includes implementing business transactions,
as well as applying and enforcing business rules.
Services Overview
Often, the business-specific functionality for an application is encapsulated within a service.
This means that all the code and resources that are necessary to implement the business
functionality are grouped together as a package and run in one or more stand-alone
processes that are accessible via the network.
This model allows the business functionality to be implemented and deployed once, and
then used by multiple application components. It also allows the service implementation to
scale independently of the other application components.
Other parts of the application that need to access the encapsulated business functionality
can do so by calling a published service application programming interface (API) that
interacts with the remote service. The API is implemented by proxy code that runs in the
client process and communicates with the service via standard networking protocols like
HTTP. Such proxies should be provided by the Web service for the convenience of clients,
but because some clients may not be capable of using such a proxy, all APIs should be
well documented.
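The proxy arrangement described above might look like the following sketch. The class name, the request path, and the JSON message shape are hypothetical, and the network call is replaced by an injectable transport function so the example runs without a server.

```python
import json
from typing import Callable

# A transport takes (path, JSON request body) and returns a JSON response body.
# A real proxy would issue an HTTP request here; this is an assumption-free
# stand-in so the sketch is self-contained.
Transport = Callable[[str, str], str]


class IndicatorServiceProxy:
    """Client-side proxy exposing a hypothetical service API as plain methods."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def get_indicator(self, indicator_id: str) -> dict:
        request = json.dumps({"id": indicator_id})
        response = self._transport("/indicators/get", request)
        return json.loads(response)


def fake_transport(path: str, body: str) -> str:
    """Stand-in for the remote service process (illustrative response only)."""
    requested = json.loads(body)["id"]
    return json.dumps({"id": requested, "path": path, "value": 1.0})


proxy = IndicatorServiceProxy(fake_transport)
result = proxy.get_indicator("example-indicator")
```

Application code sees only `get_indicator`; the wire format stays inside the proxy, which is why clients that cannot use the proxy need the API documented separately.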
As illustrated in Figure 6, a service will be treated as a separately deployable package,
almost like a mini-application. Each must be able to be monitored, configured, and secured
separately, independent of the other application components that depend on it. This model
allows services to be upgraded independently, provided that the service APIs they implement
are not changed by the upgrade.
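Figure 6: Services
[Diagram: within the calling process, application code calls the service API, which is
implemented by a service proxy. The proxy makes a network call to the service interface
exposed by the service implementation running in the service process.]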

A well designed service does one thing and does it really well. Combining unrelated
business functionality into a single service is confusing to users and makes it hard to
evolve related services in a consistent manner. In contrast, having a service implement just
one business operation is wasteful due to the added overhead associated with the
deployment and management of a service.
A good measure of how well a service is designed is the number of interfaces it supports in
its API. As illustrated in Figure 7, a well designed service should have only one or two
business-related interfaces, a monitoring interface, and a configuration interface for remote
administration.

Figure 7: Multi-Interface Support
[Diagram: two calling processes, one containing application code and one containing
administration code. The application code uses the business API; the administration code
uses the monitoring API and the configuration API. Each API is implemented by a service
proxy that makes a network call to the matching business, monitoring, or configuration
interface of a single service implementation running in the service process.]

The business interface is called via proxy code that is integrated with the part of the
application used by the end-user. The monitoring and configuration interfaces would be
called via proxy code that is integrated with the part of the application (or a general
monitoring and administration program) that is used by the system administrators. If the
application is running in a portal, both sets of interfaces may be made available by the portal at
different times to different users based on their current roles.
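As a rough sketch of this separation, a single implementation can satisfy three narrow interfaces, with each caller holding only the one it needs. The interface and method names below are assumptions made for illustration.

```python
from typing import Protocol


class BusinessInterface(Protocol):
    def lookup(self, key: str) -> str: ...


class MonitoringInterface(Protocol):
    def request_count(self) -> int: ...


class ConfigurationInterface(Protocol):
    def set_default(self, value: str) -> None: ...


class LookupService:
    """One implementation that satisfies all three interfaces."""

    def __init__(self) -> None:
        self._data = {"gdp": "Gross domestic product"}
        self._default = "unknown"
        self._requests = 0

    def lookup(self, key: str) -> str:
        self._requests += 1
        return self._data.get(key, self._default)

    def request_count(self) -> int:
        return self._requests

    def set_default(self, value: str) -> None:
        self._default = value


service = LookupService()
end_user_view: BusinessInterface = service        # used by application code
admin_monitor: MonitoringInterface = service      # used by administration code
admin_config: ConfigurationInterface = service    # used by administration code
```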
Shared Services
Some services provide business functionality that is general enough to be useful across
multiple applications. These are called shared services. See Figure 8.
Applications that leverage shared services have the potential to realize several business
benefits:
- Reduced time to market; integrating existing services, rather than developing
redundant code, speeds application development and enables upgrades to better
deliver consistent results
- Reduced total cost of ownership; costs for implementing and maintaining the
service are assumed by the service provider rather than the service consumer; as
new features are added to these services, consumers realize increased functionality
at little cost
The benefits of shared services come at a cost. Shared services have more dependencies
on them and require more care when upgrading implementations or changing interfaces.
Such services must ensure that upgrades are backward compatible and produce a
minimum effect on dependent applications.
Figure 8: Shared Services
[Diagram: two calling processes, one containing Application X code and one containing
Application Y code. Each uses the business API through its own service proxy, and both
proxies make network calls to the same business interface of a single service
implementation running in the service process.]

Because a service-based architecture results in "finer grained" deployment packages (and
more of them), special care must be taken with respect to version management. The
versions of each deployed service, along with the versions of the interfaces it supports and
the client components that depend on it, must be tracked carefully. Upgrades must deliver
consistent results.
- An old (obsolete) interface cannot be "end of life'd" (EOL'd) until all the clients that
depend on it have been moved to a later version of the interface.
- A policy for each component must be published stating how long each version of
the service interface will be supported and when it is planned to be EOL'd. An
example of a reasonable policy would be to support the current version plus one
previous version of each interface.
One way to deal with this complexity is to have a service implementation support multiple
versions of the same interface at the same time. This approach allows dependent
applications to migrate to the newer interface versions when ready, while dependent
applications that are not yet ready can continue to use old versions of the interface.
A policy should be made that each version of an interface will be supported for at least one
year (but not much longer) after a newer version of the interface has been deployed. This
provides a realistic upgrade path for components that continue to use the previous version.
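One way to picture a service that supports several interface versions at once, with a one-year support window, is the sketch below. The class and method names are hypothetical, and tracking of which clients still depend on an old version is deliberately left out.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Optional

SUPPORT_WINDOW = timedelta(days=365)  # the "at least one year" policy


class VersionedService:
    """Serves several versions of one interface from a single implementation."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[str], dict]] = {}
        self._superseded_on: Dict[int, date] = {}

    def deploy(self, version: int, handler: Callable[[str], dict], today: date) -> None:
        # Deploying a newer version starts the support clock on older ones.
        for old in self._handlers:
            if old < version:
                self._superseded_on.setdefault(old, today)
        self._handlers[version] = handler

    def eol_date(self, version: int) -> Optional[date]:
        superseded = self._superseded_on.get(version)
        return superseded + SUPPORT_WINDOW if superseded is not None else None

    def call(self, version: int, key: str, today: date) -> dict:
        eol = self.eol_date(version)
        if version not in self._handlers or (eol is not None and today > eol):
            raise LookupError(f"interface v{version} is not supported")
        return self._handlers[version](key)


service = VersionedService()
service.deploy(1, lambda key: {"name": key}, date(2010, 1, 1))
service.deploy(2, lambda key: {"name": key, "unit": "percent"}, date(2010, 10, 19))
```

Clients on version 1 keep working until the window closes, while clients that are ready can move to version 2 immediately.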

Data Resource Tier
Data resource tier components – Data management components (e.g., databases,
directory servers, etc.) responsible for managing the persistent state of business data. All
of the business objects specified by end-users during domain-object modeling will be
managed by the components residing in this tier. The data resource tier also contains the
legacy systems that are "wrapped" by business processing tier components.
The data resource tier provides management of data that an application acts upon. Any
data that an application needs to fulfill its purpose or behavior are part of this tier.
Primarily, the data resource tier provides access to and persistence of data. This section
focuses on selection of the most appropriate data resources.
The KNIS requirements for response time and for adding value to content will require
designers to weigh the balance between internal data managed in
the enterprise, internal data managed in the cloud, and data maintained at provider sites
and called when required. These design trade-offs will be assessed at design time, with final
design choices approved by the CTO.
The KNIS website will require at least the following data resources:
Indicator data (and metadata) – These data originate with other sources, mostly
government statistical agencies, but will probably be stored in KNIS databases, at least
until adequately responsive web services to them are available. These data are at the heart
of the site content, and will be available to users through data visualizations, online tables,
downloads in various formats, and other formats as determined by the site design. They will
also be available remotely (off the KNIS site) in visual 'widgets' or via a web service for
public data access. Any use of the data must be accompanied by metadata to at least
identify the source and provide labels and important notes.
Indicator data will in general change slowly, with updates typically on a monthly or annual
basis. Data will support disaggregation to the extent possible along four dimensions: time,
geography, demographics or other characteristics, and conceptual components. These
dimensions will vary from data set to data set, but the KNIS will strive for consistency.
Timeliness is important; data should be available within the KNIS system within hours of
when they become available from the data provider. Data will be provided in a variety of
formats, from online web service delivery, to downloadable spreadsheets, to screen-
scraping from web pages or PDF files.
The metadata associated with indicator data describe the provider, the survey or system
from which the data originate, the description of the measure used, and any notes
necessary to use the data. Notes may be at the data-set level, the data point level, or at
some point in between, such as at the collection of disaggregated data (e.g. race/ethnicity
or year). In some cases, statistical quality measures (standard error, confidence interval,
etc.) will be available and must be stored and delivered with the data.
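A possible in-memory shape for indicator data and its metadata is sketched below. The field names are assumptions drawn from the description above, not a proposed KNIS schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Metadata:
    provider: str
    source_system: str            # survey or system the data originate from
    measure_description: str
    notes: List[str] = field(default_factory=list)   # data-set-level notes


@dataclass
class DataPoint:
    value: float
    # Disaggregation along the four dimensions named in the text.
    time: str
    geography: str
    demographics: Optional[str] = None
    component: Optional[str] = None
    standard_error: Optional[float] = None            # statistical quality measure
    notes: List[str] = field(default_factory=list)   # data-point-level notes


@dataclass
class IndicatorSeries:
    metadata: Metadata
    points: List[DataPoint] = field(default_factory=list)

    def select(self, **dims: str) -> List[DataPoint]:
        """Return the points matching every requested dimension value."""
        return [p for p in self.points
                if all(getattr(p, name) == wanted for name, wanted in dims.items())]
```

Keeping the metadata on the series means any selection of points can still be delivered with its source, labels, and notes.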
Textual and media content – Textual and media content will be managed by a staff of
writers, artists, web producers, and other (largely non-technical) staff. Although such
content will almost always be mediated by a content management system (see CMS at
various places in this architecture), the content generated will be stored in and served from
a data store. The content management system will also typically store user comments,
although implementation decisions and development phasing will drive that aspect of the
architecture.
Site metrics, log and user access data – Data about user interaction with the site and the
site elements are essential for KNIS business practices. They will inform site development
and content priorities, provide a metric for site success, and contribute to capacity planning
and hardware support decisions. Data must include interactions both on the KNIS site and
with KNIS content that has been distributed to other sites, and KNIS activity on Twitter,
Facebook and other social interaction platforms.
User-specific data – It is important that users be able to interact with KNIS content and
indicator data and to share their discoveries and creations with others. To
provide a personalized experience, the KNIS will allow users to create an account and,
through that account, to customize their settings and access saved
mashups, bookmarks, or other items enabled by the site functionality. There will be
opportunities for users to register to receive RSS feeds or email alerts, to link to or from
social networking sites like Facebook, or to send references to content to others. User
authentication and customization may also apply to users accessing the site via other
means than the web site, such as web services.
Estimates of Data Size and Scope
Although the KNIS is working with hundreds of data providers, and the end-state goal is a
system able to scale to millions of users, the core data in our repository will be text and
is likely to be modest in size. For initial planning purposes, we believe KNIS-retained data
stores will be under 400 gigabytes. This figure rests on very rough assumptions and will be
revisited frequently as we continue to design and scale up from our current working
system.[2]
Data Access
One of the most important decisions an application architect must make is to choose the
mechanism by which application data will be accessed. This decision is based on
application type and data type.
- File access is sufficient for static local data and application-specific data such as
configuration and locale information.
- A distributed transaction environment is required for accessing data from multiple
sources.
- An application should cache reference data that it needs, or access such data on
an as-needed basis. Caching also depends on the overall application environment
and behavior (for example, caching reference data makes more sense in a low
bandwidth environment than it does in a high bandwidth environment).
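The caching option for reference data can be illustrated with a small time-to-live cache. This is only a sketch: the class name, the loader callback, and the idea of a fixed expiry period are assumptions, and the clock is injectable so the behavior can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Tuple


class ReferenceCache:
    """Caches reference data for a fixed time-to-live (TTL)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]           # fresh: no trip to the data source
        value = self._loader(key)      # missing or stale: reload from the source
        self._entries[key] = (now, value)
        return value
```

A longer TTL suits a low-bandwidth environment; a TTL of zero reduces this to as-needed access.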

[2] Other large, scalable systems serving users with dynamic, issue-relevant data include Wikipedia. The core Wikipedia database
consists of 163 GB of text. For more information relevant to comparisons, see
http://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#How_big_is_the_database.3F

Data Persistence
Applications require various types of data to be persistent. Examples include transactional
data, reference data, logging data, and configuration data. Data persistence means
different data life spans depending on the type of data, type of application, and type of
usage. Solutions for data persistence must take these factors into consideration. Solutions
such as caching, batching of data, and query optimization must be employed to address the
specific application environment.
- Data persistence should be tracked in terms of "logical units of work," to ensure the
state of all related data in a transaction set or query set is consistently maintained.
- Log data should employ persistence solutions that provide features for data
management, such as purging, archiving, and searching. In particular, a record of
user actions on the site must be available to provide customized recommendations
and dynamic feedback about general trends and preferences.
- Transaction data should leverage persistence solutions that support transactions,
including the ability to update multiple records or perform multiple queries within a
single transaction.
- Reference data should use data registries (enterprise directory service, company
registry, agreements registry, etc.), if available. Reference data should not be
modified by an application; as such, persistence of reference data is not the
development team's responsibility.
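A "logical unit of work" can be sketched with Python's built-in sqlite3 module, chosen here only because it needs no setup; the table names and the function are hypothetical and imply nothing about KNIS product selection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE indicator (id TEXT PRIMARY KEY, value REAL NOT NULL)")
conn.execute("CREATE TABLE audit_log (indicator_id TEXT NOT NULL, action TEXT NOT NULL)")
conn.commit()


def record_indicator(conn: sqlite3.Connection, indicator_id: str, value) -> bool:
    """Write the log row and the data row as one logical unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO audit_log VALUES (?, 'insert')", (indicator_id,))
            conn.execute("INSERT INTO indicator VALUES (?, ?)", (indicator_id, value))
        return True
    except sqlite3.IntegrityError:
        return False  # neither row was kept
```

If the second insert fails, the first is rolled back with it, so the log never describes a change that did not happen.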
The selection of a data resource is much more complex than just choosing the appropriate
technology (although selection of the appropriate technology should play a part). Many
factors other than technology play a pivotal role in the selection process. An understanding
of boundary systems is also helpful in choosing the most appropriate technology.
Application architects must:
- Identify all data sources and target systems. This enables the architect to account
for all data exchanges.
- Identify all reference data, including data feeds used. All domain (business system)
registries, where they are available, must be used for reference data.
- Identify systems of record (SORs) with which the application will interface. Whether
internal or external, all SOR data must be accessed via standard and open
interfaces.
- Account for application data volumes (transaction rates), replication requirements,
metadata management, data archiving, and data purging policies that align with data
access patterns. This will drive architectural decisions for the application. Note:
Purge and archive retention policies will also be driven by business requirements
and regulations and some might be mandated by content providers. Documentation
of decisions and implementations here must be done with a discipline enabling
broad review of decisions.
- Account for data stewardship, determining who owns and maintains the data.
Appropriate access control must be provided for data owners and administrators.

Answers to these questions and concerns provide a good foundation to work with the most
appropriate data resource options. Many of these systems have their own interfaces that
enable other applications to integrate with them. The choice of data persistence engine
depends primarily on the type of data to be managed.
Other factors play an important role in the selection process for data management, such as
performance, transactional reliability, productivity in application development, mechanisms
for integration with existing resources, compliance with standards and portability.
Data Labels
Data will be held in unlabeled table or file architectures to allow for maximum
flexibility in data storage, retrieval and display. Figure 9 provides a graphical
representation.
The labels for the data will be stored separately and dynamically associated with data prior
to user display based on unique factors such as:
- Data source
- User Role (who is querying the data)
- Query path (the intent of the query or mash-up; e.g., the user did not define a role but
stated an interest in a particular topic such as health or immigration)
- Related Data Sets (e.g. being queried at the same time or as part of a mash-up)
- Meta Data (source or user defined)
- Language
- Other variables TBD
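The separation of stored data from display labels might be sketched as follows. Only two of the listed factors (language and user role) are modeled, and every identifier, label, and value is made up for illustration.

```python
from typing import Dict, Tuple

# Unlabeled storage: rows are keyed by opaque column identifiers.
ROWS = [{"c1": "2009", "c2": 5.0}]   # illustrative values, not real statistics

# Labels are stored separately, keyed by (column id, language, user role).
LABELS: Dict[Tuple[str, str, str], str] = {
    ("c1", "en", "any"): "Year",
    ("c2", "en", "any"): "Unemployment rate (%)",
    ("c2", "en", "economist"): "U-3 unemployment rate (%)",
    ("c2", "es", "any"): "Tasa de desempleo (%)",
}


def label_for(column: str, language: str, role: str) -> str:
    """Prefer a role-specific label, then a general one, then the raw id."""
    for key in ((column, language, role), (column, language, "any")):
        if key in LABELS:
            return LABELS[key]
    return column


def display(row: dict, language: str = "en", role: str = "any") -> dict:
    """Attach labels at display time without touching the stored row."""
    return {label_for(col, language, role): val for col, val in row.items()}
```

The same stored row can then be shown under different labels to different audiences, and new label sets can be added without rewriting the data.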

Figure 9: Dynamic Display of Data

Data in the cloud


The KNIS architecture will support the potential for some data to be stored in the cloud
and/or locally. To allow for scale, the KNIS may utilize advanced cloud platforms for data
storage using a blended approach.
At a minimum, some data types will be stored locally. The term local data will be used to
describe data that will be stored locally on KNIS servers (even if those servers are cloud
server instances).
The term cloud data will be used to describe data that could reside on cloud services (e.g.
Amazon, Rackspace, etc.).

LOCAL DATA
- User profiles
- Saved user queries
- Saved user mash-ups
- Social content of queries (e.g. discussions, external links, incoming links)
- Social content of mash-ups

ALLOWED CLOUD DATA
- Original source elements
- Data content type (labels)
- Content labels
- User role labels
- Query path labels
- Context labels
- Geospatial context
- Related datasets table
- User tags
- Data Source Lists

Data Curation
Upon import of source data, the KNIS architecture will support the creation of multiple
dynamic data labels for each data element to support finding data in multiple ways, but
each data element will have a consistent name on its key label. The system and
associated processes must allow for both manual and automatic labeling. In addition, the
curation/import process must support the mapping of source data labels to existing KNIS
labels and normalization of benign data labels such as date formats.
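The import-time mapping and normalization described here could be sketched like this. The label map, the accepted date formats, and the `_needs_manual_label` key are all assumptions introduced for the example.

```python
from datetime import datetime

# Hypothetical mapping from provider labels to KNIS key labels.
LABEL_MAP = {"yr": "year", "Year": "year", "unemp_rate": "unemployment_rate"}

# Date formats we assume providers might use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(raw: str) -> str:
    """Return the date in ISO 8601 form, trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def curate(record: dict, date_fields=("released",)) -> dict:
    """Map source labels to KNIS labels; flag unmapped ones for manual labeling."""
    curated, unmapped = {}, []
    for label, value in record.items():
        key = LABEL_MAP.get(label, label)
        if label not in LABEL_MAP and label not in date_fields:
            unmapped.append(label)   # automatic mapping failed; a curator decides
        curated[key] = normalize_date(value) if label in date_fields else value
    curated["_needs_manual_label"] = unmapped
    return curated
```

Automatic mapping handles the known labels, and anything it cannot map is queued for the manual labeling path rather than dropped.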
Data Source Lists
The KNIS architecture must support the storing and processing of source data identification
tables that include:
- Data source feed location
- Data refresh rate
- Data rights (store locally, syndicate)
- Data label history (complete history of all labels applied to any given data element)
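A single entry in such a table might be modeled as below. The field names are assumptions; the one design point worth showing is that labels are appended rather than overwritten, so the complete history survives.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class DataSource:
    feed_location: str
    refresh_rate: timedelta
    may_store_locally: bool
    may_syndicate: bool
    # element id -> chronological list of (timestamp, label)
    label_history: Dict[str, List[Tuple[datetime, str]]] = field(default_factory=dict)

    def apply_label(self, element_id: str, label: str, when: datetime) -> None:
        """Append rather than overwrite, so the full history is retained."""
        self.label_history.setdefault(element_id, []).append((when, label))

    def current_label(self, element_id: str) -> str:
        return self.label_history[element_id][-1][1]
```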
Data Registry
A data registry is "a system of record that provides unique identifiers and required key
descriptors for discrete business objects." We extend this definition to require data
registries also to provide a published service API that is application domain specific. As
with other business services the API encapsulates business rules for data validation and
insulates application components from the data management system that implements the
API (such as a database). In this case, the components are those that use the business
processing tier data registry.
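A registry in this extended sense could be sketched as a small service API that issues identifiers, enforces the required descriptors, and hides its storage. The descriptor names and the in-memory dictionary are assumptions standing in for a real data management system.

```python
import uuid
from typing import Dict

REQUIRED_DESCRIPTORS = ("name", "provider")  # assumed key descriptors


class DataRegistry:
    """System of record issuing unique identifiers for business objects."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}   # stand-in for a database

    def register(self, descriptors: dict) -> str:
        # The API, not the caller, enforces the data validation rule.
        missing = [d for d in REQUIRED_DESCRIPTORS if not descriptors.get(d)]
        if missing:
            raise ValueError(f"missing required descriptors: {missing}")
        object_id = str(uuid.uuid4())
        self._records[object_id] = dict(descriptors)
        return object_id

    def lookup(self, object_id: str) -> dict:
        return dict(self._records[object_id])   # a copy, so callers cannot alter the record
```

Because callers only see `register` and `lookup`, the storage behind the registry can change without affecting application components.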
Content Management Systems
A content management system (CMS) is a specialized application for creating, editing,
storing, accessing, and distributing structured content (and user comments). Content is, in
essence, any type of digital information – it can be text, images, graphics, video, sound,
etc.
CMSs are distinct from document management systems (DMSs), where the managed
entity is a complete document with a specific name, size, and content – documents are
composed of content. A DMS is concerned with a document in its entirety and is less (or
not at all) interested in what the document contains.
A typical CMS has at least these modules: authoring, meta tagging, workflow (publishing),
and rendering or presentation. Additionally, it may also provide design templates and
access control mechanisms. Web content management systems aim to reduce the
overhead and bottlenecks of Web production by enabling "anywhere, anytime" Web
authoring and promoting content reuse by separating content from its presentation
requirements.
To use a CMS service, the application must generate data in an appropriate format,
suitable for consumption and publishing by a CMS.
Data Provenance
Data used by the KNIS and provided to end-users and/or syndicated must be trusted,
indexed, and searchable, and must give users rich means to assess the validity of
conclusions. Data provenance is key to all of these factors. KNIS systems should
track where data came from, how they were modified, what value was added to them, and who
has accessed them. This same need for data provenance extends to metadata, which may include
notes about data, confidence, confidence intervals, standard errors, etc., as well as text,
dates, and notes regarding the meaning of conclusions.
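The four things to be tracked (origin, modification, added value, and access) suggest an append-only event log per dataset. The sketch below is one possible shape; the event kinds and class names are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ProvenanceEvent:
    kind: str        # "imported", "modified", "value_added", or "accessed"
    actor: str
    detail: str
    at: datetime


@dataclass
class TrackedDataset:
    name: str
    source: str
    history: List[ProvenanceEvent] = field(default_factory=list)

    def record(self, kind: str, actor: str, detail: str) -> None:
        """Append-only log: events are never edited or removed."""
        self.history.append(
            ProvenanceEvent(kind, actor, detail, datetime.now(timezone.utc)))

    def lineage(self) -> List[str]:
        """Everything except reads, oldest first."""
        return [f"{e.kind}: {e.detail}" for e in self.history if e.kind != "accessed"]
```

A user assessing a conclusion could be shown the `lineage()` of the underlying data, while access events remain available for audit.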
Data Value Add
Business intelligence (BI) systems typically use some intentional de-normalization to
improve query performance, usually organized in "star" or "snowflake" designs, whereas
normalized data structures are usually used by online transaction processing (OLTP)
systems. A key factor regarding the types of data the KNIS is dealing with is that they are
not all normalizable in the BI sense, because in most cases the KNIS is not dealing with
the microdata, and in some mission domains (especially Health) data are collected by
different means depending on locality, which complicates normalization. In many
(most?) cases we are dealing with summary data that can't be normalized for different
populations (for example, when one survey is civilian non-institutionalized over 18 and
another is all ages). This is an important consideration for data and limits the ability
to normalize, cleanse and add value. However, adding value is in the KNIS mission scope
and as value is added provenance will be kept. The KNIS value is in bringing data from
disparate sources together in one place, with a common, well understood format, and as
much consistency in disaggregation as possible given the constraints above.
Integration Tier
Integration tier components – General-purpose or business-specific components used to
tie together business processing tier components with data resource tier components or
resources that are external to the application. The components in this tier generally are not
visible to end-users. These "messaging" components usually take the form of queued
messaging servers, publish/subscribe event servers, or a combination of the two.
This section provides guidelines for linking business processing tier components with
resource tier components, or with services and resources external to the application. Code
that is primarily intended for enabling integration with other applications or services, and
that does not perform business processing, should be isolated as integration tier
components. This insulates the business tier from changes in integration technology.
Data Schema, Format and Semantics
The KNIS consumes data from a wide range of providers and at times stores, transforms,
adds new context to, and republishes that data. The KNIS data schema enhances design
quality by ensuring data flow, storage and retrieval are optimized.
When one application is integrated with another application or service, a common data
format and semantics should be defined. Data format should not be confused with data
encoding, as the latter is dealt with in the technical architecture and relates to technology
(e.g. XML, Java Object Model, etc.). In logical architecture the focus is on identifying the
data elements involved in the integration, defining their semantics, and grouping the data
elements into business objects. This task is performed by business analysts or functional
architects who are domain experts.
Rather than defining business objects from scratch, the development team must review
KNIS and industry standards and use these as a foundation. Several such standards are
available with varying levels of maturity and acceptance. Relevant standards include, but
are not limited to:
- RosettaNet (www.RosettaNet.org), for larger-scale B2B system frameworks
- Universal Business Language
(http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ubl)
- SDMX (www.sdmx.org), an evolving standard for international statistical data
sponsored by the OECD, IMF, World Bank, UN and other organizations

Batch Data Transfers


In some situations the data exchanged between applications are generated as part of
scheduled jobs, in which the data are produced in batches. In such situations, applications
must use batch data transfer mechanisms such as file transfers. File standards for
exchange are captured in the technology architecture (and are XML based).
Data ingest and other batch transfer/update routines can become very long-running jobs.
They must be designed to execute in a reasonable time, using techniques such as
partitioning the input into shorter runs, and the design must take care that no single job
runs too long.
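Partitioning the input for shorter runs can be sketched as follows. The function names and the `load_chunk` callback are hypothetical; the point is only that each partition is loaded as its own bounded unit.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def partitions(records: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive chunks of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def ingest(records: Iterable[T], size: int,
           load_chunk: Callable[[List[T]], None]) -> int:
    """Load each partition as its own short run; return the number of runs."""
    runs = 0
    for chunk in partitions(records, size):
        load_chunk(chunk)   # one bounded unit of work per partition
        runs += 1
    return runs
```

Because the input is consumed lazily, a very large feed never has to be held in memory at once, and a failed run only affects one partition.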
Syndication of Value Added Content
The term syndication is used here to mean the ability of external users and consumers to
take automated feeds of value-added content from the KNIS system.
Syndication interactions are unregulated. This does not mean that no rules apply, but that
each interaction is isolated rather than part of a strict and complex
procedure. Unregulated syndicated content can be used by authorized clients who apply
mandatory rules and formats. The model is essentially to act on demand.
The KNIS will study existing data exchange standards to determine whether any are
appropriate for use. If no existing standards meet KNIS needs, then KNIS will publish an
open standard for data exchange and work to help other organizations implement and
adopt it.
Syndication and Social Media
Syndication will enable use of value-added KNIS content by developers fielding capabilities
on a wide range of other platforms. Social media deserves special note.
Developer guidelines will be provided that facilitate use of KNIS value-added data by
developers focusing on key social media sites like Facebook, LinkedIn and Twitter.
Representations of data will also be socially sharable by embedding in other sites (for
example, an individual's blog, Facebook page, or LinkedIn page).
Discovery of data sets and other content
KNIS architectures produce information meant to be discovered, shared and used.
Information that is sharable will be exposed to search engines and optimized for search
engine discovery via the open sitemap protocol.
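A sitemap of sharable pages can be generated with the standard library, as in the sketch below. The page URLs are placeholders, and the function name is an assumption; the namespace is the one defined by the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> str:
    """pages: iterable of (url, last-modified date as YYYY-MM-DD)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap([
    ("https://knis.example/indicators/health", "2010-10-19"),
    ("https://knis.example/indicators/education", "2010-10-01"),
])
```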
Interaction Models
The type of interaction an application uses to interface with another application or service
must be determined based on the business processing needs.
Synchronous Interaction
In synchronous interaction an application "makes a call" to another application or service
and receives a response in that same call. The calling application blocks until it receives a
response; therefore, time-outs should be used when the API supports time-outs.
This type of interaction requires the responding application or service to be available;
otherwise the call will fail. Hence synchronous interaction increases the coupling between
the calling application and the responding application or service. Synchronous interaction
must only be used where a real-time response is required.
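Where an API has no time-out parameter of its own, one possible approach is to run the blocking call in a worker thread and bound how long the caller waits. This is a sketch of that approach only; the function name is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as CallTimeout


def call_with_timeout(service, argument, timeout_seconds: float):
    """Block on a synchronous call, but give up after `timeout_seconds`."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(service, argument)
    try:
        return future.result(timeout=timeout_seconds)
    except CallTimeout:
        raise TimeoutError(f"no response within {timeout_seconds}s") from None
    finally:
        # Do not wait for the worker; note the abandoned call itself is not
        # cancelled and keeps running until the service returns.
        executor.shutdown(wait=False)
```

The caller regains control after the time-out, but the coupling remains: the responding service still has to be available for the call to succeed.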
Asynchronous Interaction
Applications should interact asynchronously with other applications and services, as well
as with other components within the same application, under the following circumstances:
- The communication is only one-way (i.e., the application sends information but
does not expect a response)
- The application makes a request but does not require a response in real time (the
response is sent as a separate one-way message)
- There is a need to provide massive scaling capability and improve performance;
the application design may use multiple instances of a service, which are used to
balance the load (the round robin scenario depicted below is one such strategy for
load balancing)
In asynchronous interaction, the calling application makes a request for information but
does not block until a response is obtained; it proceeds with its processing and the
information it requested is obtained later. This usually requires the calling application to
register a listener or event handler that the responding application or service may use for
sending the requested information. In most situations asynchronous interaction is
accomplished via middleware that decouples the sender from the receiver and guarantees
delivery of data.
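The listener arrangement described above can be sketched with an in-process queue standing in for the middleware. The names on_response and service_worker are illustrative only, and a real deployment would use a message broker, not a thread and a queue.

```python
import queue
import threading

requests = queue.Queue()   # stands in for middleware between caller and service
results = []
done = threading.Event()


def on_response(payload):
    """Listener the caller registers; invoked when the reply arrives."""
    results.append(payload)
    done.set()


def service_worker():
    """Responding service: takes one request off the queue and calls back."""
    request, listener = requests.get()
    listener(f"response to {request}")


threading.Thread(target=service_worker, daemon=True).start()

# The caller submits its request together with the listener and does not block.
requests.put(("latest unemployment rate", on_response))
other_work = sum(range(1000))  # the caller continues with its own processing

done.wait(timeout=2.0)  # later, the caller picks up the reply
print(results)
```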

Asynchronous interaction may be of the different styles listed in Table 1. The appropriate
interaction style must be determined based on the application's needs.

Interaction        Applicability

Point-to-Point     - Single consumer for any given message.
                   - In the round-robin pattern, one message may go to one recipient;
                     the next may go to another, and so on.

Publish-Subscribe  - Multiple message producers and multiple message consumers.
                   - Any given message needs to be delivered to multiple consumers.

Request-Reply      - Response is needed asynchronously.
                   - Reply needs to be associated with request.

Table 1: Asynchronous Interaction Styles
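The three styles in Table 1 differ mainly in how a message is routed. The sketch below shows only that routing logic, using a toy class, InMemoryBroker, that is an assumption made for illustration and not the API of any real messaging product. Delivery here happens in-process and immediately; real middleware would queue and deliver messages independently of the sender.

```python
from collections import defaultdict
from itertools import cycle
import uuid


class InMemoryBroker:
    """Toy stand-in for messaging middleware; not a real broker API."""

    def __init__(self):
        self.queue_workers = {}  # queue name -> round-robin iterator of consumers
        self.topic_subscribers = defaultdict(list)

    # Point-to-point: each message is handled by exactly one consumer.
    def bind_queue(self, queue_name, workers):
        self.queue_workers[queue_name] = cycle(workers)

    def send(self, queue_name, message):
        worker = next(self.queue_workers[queue_name])
        worker(message)

    # Publish-subscribe: each message is delivered to every subscriber.
    def subscribe(self, topic, handler):
        self.topic_subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.topic_subscribers[topic]:
            handler(message)


broker = InMemoryBroker()

handled_by = []
broker.bind_queue("jobs", [lambda m: handled_by.append(("A", m)),
                           lambda m: handled_by.append(("B", m))])
for job in ["j1", "j2", "j3"]:
    broker.send("jobs", job)  # round robin: A, then B, then A again

received = []
broker.subscribe("updates", lambda m: received.append(("web", m)))
broker.subscribe("updates", lambda m: received.append(("mobile", m)))
broker.publish("updates", "new data set")

# Request-reply: a correlation ID ties the later reply to its request.
pending = {}
correlation_id = str(uuid.uuid4())
pending[correlation_id] = "GET indicator 42"
reply = {"correlation_id": correlation_id, "body": "value=3.7"}
matched_request = pending.pop(reply["correlation_id"])
```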


Direct and Indirect Integration
When integrating one application with another application or service the architect may
choose direct access (see Figure 6) or indirect access that uses middleware to facilitate
interaction (see Figure 7). Most synchronous access is direct due to the need for real-time
response.

Figure 6: Direct Synchronous Interaction

Middleware, such as queues, should also be used for synchronous communication when
multiple instances of an application or service may respond to a request and load
balancing or high availability are necessary. When supported by the API, time-outs must be
employed to prevent calls from blocking forever when there is no response.
Indirect integration for synchronous interaction is also applicable when direct access is
blocked by a firewall. In such situations the synchronous access is facilitated by an
intermediary, such as a synchronous messaging server.
If the application requires guaranteed delivery of information or must broadcast or multicast
information (i.e., send information to multiple recipients at the same time) it should use
indirect integration facilitated by middleware such as a message broker. Most
asynchronous interaction should take place indirectly via middleware to leverage the
guaranteed delivery (sometimes referred to as "fire and forget") and decoupled nature of
such interaction.

Figure 7: Indirect Synchronous Integration

Third-Party Application Integration


Applications often require integration with third-party applications or components. These
components must be evaluated with the KNIS architecture in mind so integration will be
straightforward.
Often, however, third-party applications or components may not be compliant with the
guidelines set forth in this document. When this is the case, there are some important
considerations to be made. These are described in the following sections.
APIs
If the third-party application supports a published API that can be called by the integrating
application, and the third-party application will reside in the same sub network as the
integrating application or uses an acceptable communication protocol (such as HTTP),
then the integrating application should call the third-party application's API directly.
If the third-party application API is in a different programming language than the integrating
application, then an adapter must be implemented if response time is critical, or an indirect
integration mechanism may be used if response time is not critical.
Wrapping an existing API behind a new API does not solve the integration problem, as a
new wrapper would be necessary when the third-party application changes. A justification
is required before wrapping a third-party API.
Monolithic Applications
If the third-party application needs to be accessed by other applications but does not
support a programmatic interface (API) then the entire third-party application should be
wrapped behind a service interface with its own defined API. The wrapper service must
then go directly to the application database, if possible, or interact with the monolithic third-
party application via screen scraping.
Screen scraping is when the wrapper service behaves like a display device (such as a
browser) and converts the screen input and output into API requests and responses. This
approach allows the monolithic third-party application to be treated as a separate business
processing tier component or service.
Applications that access the monolithic application should not access the third-party
application database directly, even when this option is available. This prevents these
applications from breaking when the third-party vendor changes its database schema.
Application Consolidation
When third-party platform components need to be upgraded, reconfigured, or restarted for
one application, all applications that share the third-party platform instance will be affected.
Therefore, when consolidating instances of a third-party platform, special care must be
taken to ensure that all applications sharing the consolidated platform also share similar
service level agreements (SLAs).

Technical Architecture
The purpose of a technical architecture is to map defined enterprise components from the
logical architecture to specific implementation technologies. These technologies are
generally layered and support standard interfaces that allow them to be used in a "plug and
play" manner. This section provides guidance for technology choices. It complies with the
rules established in the logical architecture. It is not yet an all-inclusive enterprise technical
architecture. More comprehensive versions will flow from this one. Although many issues
are surfaced and guidance is provided, specific application solutions will require more
detailed implementation guidance. A graphical depiction of the KNIS technical architecture
is provided in Figure 8.

Figure 8: Technical Architecture

This architecture emphasizes technology independence. The concept of technology
independence means that we will design the logic of our system without introducing
technology-specific details. Then we will make an explicit mapping step to translate the
technology-independent logic to a technology-specific implementation.
The mapping process involves taking each component defined during the logical
architecture phase and selecting, from the technologies listed in the technical architecture
section, an implementation technology to either provide complete functionality itself or host
the component once it has been developed. This section provides general guidelines for
developing an effective technical architecture. Also included are explicit guidelines for
using the specified technology within each layer.
The guidelines below are meant to ensure we address the number one "meta-requirement"
of the KNIS architecture, which is usability. Guidance is also provided to optimize
operations and support. Both factors are key reasons for our focus on open source
software, open architectures and open API access. Both are also the reason we place a
strong emphasis on requiring the implementation team to document, with discipline, all
aspects of the technology decisions made.
High-level Guidelines
The following high-level guidelines capture the "big rules" that should be kept in mind for
the technical architecture for the KNIS.
- To minimize training and support costs, minimize the variety of technologies in a
particular category.
- Implementations are expected to be operated and maintained by ITIL-compliant
processes (ITIL: Information Technology Infrastructure Library), and documentation
sufficient for use by operations and maintenance staff working under ITIL is
required. This includes information for help desk support. Use cases for problem
management, change management and capacity management will be documented and
provided with every delivery of system capabilities. The SUSA CTO is the holder of
these documents.
- Regarding REST versus SOAP, the KNIS architecture must have an ability to
consume both since both exist in the ecosystem. More information on those two
approaches is provided below.
- To achieve vendor independence and improve interoperability between
components, use industry standards when available.
- To minimize the code that must be written and supported, leverage the capabilities
of the application platform and IT infrastructure services when possible.
- Before resorting to custom development projects, first identify whether purchased
applications and software can satisfy requirements, then consider reusing existing
services.
- Avoid customizing purchased software (through code changes) that will require
maintenance or invalidate support contracts. Customization using mechanisms
provided by the application, such as Oracle flex-fields, is acceptable.
- Carefully consider whether there are single points of failure within designs and
eliminate them when possible and highlight the rest.
- When adopting a new technology, always try to make sure that:
- Viable alternatives are not already in use in other applications or services
- The technology in question has not already been considered and discarded
by other development teams
- The technology has no hidden costs or dependencies
- The technology does not conflict with other technologies in terms of its use
of resources
- The technology can be monitored for access, performance, and failure
- The technology adheres to industry standards and does not commit the
application to a sole vendor. This is frequently harder to achieve than it seems:
most technologies support industry standards, but the way developers use the
technology (for example, relying on extended features) can still commit an
application to a sole vendor, and that is what we want to avoid.
Operating System Guidance
Solaris or Linux is the preferred operating system for all application components
running in the business processing tier or data resource tier. Either Solaris or Linux is
recommended for application components running in the presentation tier. Any third-party
application with components that require another operating system must have an exception
granted during the review process.
From an application architecture perspective, the end-user desktop is out of KNIS control.
Applications should, therefore, be accessible by end-users running Solaris, Linux,
Windows or Mac OS. This is most easily accomplished by making the application
browser-based and having it support the most popular browsers.
Designing for Flexibility in use of new Cloud Capabilities
Extensive lessons from the developer community are all pointing to the need to design for
the cloud. The good news is the same important constructs for enterprise architectures
apply to cloud architectures. It remains important to split application functions and couple
loosely, for example. There are some differences, however. The following are key:
Network Communication: Designs must use network-based communication
interfaces and not interprocess communication or file-based communication
paradigms. This allows scale in the cloud since each piece of the application can be
separated into distinct systems.
Design for the Cluster: Rather than scale a single system up to serve all users, the
system can be split into multiple smaller clusters, each serving a fraction of the
application load. This is often called "sharding" and many web services can be split
up along one dimension, often users or accounts. Requests can then be directed to
the appropriate cluster based on some request attribute.
Ensure asynchronous interfaces: To tolerate failure, applications must operate as
part of a group but not be too tightly coupled to their peers. Each app piece should
be able to continue to execute despite the loss of other functions. Asynchronous
interfaces are an ideal mechanism to help application components tolerate failures
or momentary unavailability of other components.
Design for data persistence: This is always important, but care must be taken in
cloud environments to ensure data are available, including data for recovery from
outages.
Engineering for monitoring in the cloud: The methods and approaches for
monitoring processes in the cloud are still emerging and changing quickly.
Implementation decisions must be made that will enable the appropriate monitoring
of processes and alerting on required performance parameters.
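The "Design for the Cluster" guideline above, directing each request to a cluster based on a request attribute, can be sketched as follows. The cluster names and the choice of account ID as the sharding dimension are assumptions made for illustration.

```python
import hashlib

CLUSTERS = ["cluster-a", "cluster-b", "cluster-c"]  # hypothetical cluster names


def cluster_for(account_id, clusters=CLUSTERS):
    """Map an account ID to a cluster using a stable hash of the ID."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[index]


# The same account always lands on the same cluster.
assignments = {acct: cluster_for(acct) for acct in ["alice", "bob", "carol"]}
print(assignments)
```

A simple modulo scheme like this one reassigns many accounts whenever the number of clusters changes. Consistent hashing is a common alternative when clusters are expected to be added or removed frequently.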
Technology Architecture of Client Tier
Browser
The KNIS's strategy is for the browser to be the standard thin-client platform for delivering
Web content to and obtaining input from end-users on large-screen devices. Our strategy
is to ensure compatibility with as many browsers as possible, but to provide focus we
intend to target HTML 5 compliant browsers first. We code to the W3C Web
Standards.
Consumer device apps
Apps on iPhone, iPad, Android and Windows Mobile devices will be a good adjunct to
accessing the KNIS through mobile browsers. This architecture enables development of
these apps by ensuring feeds and presentations are readily consumable.
Technology Architecture of Presentation Tier
This section provides guidelines for employing the set of standard APIs and protocols that
are used to insulate the application components from dependencies on specific
implementation technologies. The majority of these APIs and protocols are found in
multiple environments including J2EE, LAMP, SAMP and others.
These APIs and protocols are implemented by the implementation technologies in the
upper platform layer. Many design patterns are available which describe how to use and
combine these APIs in proven ways.
HTML and CSS
The hypertext markup language (HTML) [3] is still the most popular way to create static
content in Web-based applications. Browsers use HTML to render application content for
the end-user. HTML should therefore only be used within presentation tier components.
Cascading Style Sheets (CSS) work with HTML to separate design elements and layout
from content. Best practice involves using a combination of HTML and CSS.
HTML has a number of variants and extensions. It is outside the scope of this document to
enumerate them, but the development team should remember that different browsers may
display the same HTML and CSS code differently. This fact should be considered when
deploying any application, and it is especially important for externally facing applications.
NOTE: The capabilities of HTML 5 and the ability of fielded browsers to support HTML 5
are important to track as this is a quickly changing component of the technological
architecture.

[3] For more information on HTML, see http://www.w3.org/MarkUp/.
XML and XSLT
One of HTML's constraints is that it limits the choice of client tier devices. A better
alternative is to capture application content in extensible markup language (XML), then
apply extensible style sheet language transformations (XSLT) to produce device-specific –
and perhaps even locale-specific – content. Thus the same XML data can be used for a
wide variety of client tier devices (PDAs, cell phones, PCs, etc.).
Presenting data via widget:
Although concepts of portals are still in use and may be implemented as a presentation
means in many situations, increasingly users expect data to be provided in units
embeddable in any page, site or even desktop. The increasingly common term in computer
design is "Widget." Designing with smart separation of HTML, CSS and use of XML
provides a head start in designing for presentation via widgets. Other standards such as
JSR168 and the standard WSRP interface are also key.
Application Frontends
Application frontend components are responsible for rendering the larger pieces of
application content to the user. The frontend components should be implemented using
widely available technologies and giving consideration to performance and accessibility.
Application frontends should call upon web services and application backend business
logic components to implement business rules, perform database transactions and initiate
interfaces with other back-end services. Application front ends should not display login
pages to end users, collect credentials or authenticate the user. Authentication of users
should be externalized to the security service via use of an agent installed on the
application front end (using standard encryption/authentication standards such as WS or
SAML). Applications should also externalize access control for resources which can be
handled by the URL policy agent (URLs).
Each frontend component should accept HTTP(S) requests and return content in a format
that is acceptable to the browser on the end user's device (e.g., HTML).
User and Usability Testing
Although testing of functionality must occur at every tier, Usability Testing occurs at the
client end of capability. Its purpose is to ensure initial requirements are being met and to
solicit new requirements. Usability testing focuses on measuring the capacity of the
solution to meet its intended purpose. It gives users direct input into the system. Since the
KNIS's user base will be broad, care must be taken to sample the entire spectrum of users
from many issue areas, to ensure that no single group drives requirements for all issue
areas.
Technology Architecture of the Business Processing Tier
Web Services
Within heterogeneous deployment environments, especially environments that span
corporate boundaries, there is a need for services that can be accessed using standard
Web communication protocols such as HTTP, independent of programming language.
Shared services that provide an XML-based interface over HTTP are commonly referred to
as Web services. Web services also tend to imply a runtime mechanism for dynamically
looking up the URL of a Web service prior to using it. This is generally accomplished using
a Web service registry, which maintains interface definitions and URLs for a set of Web
services.
A more formal definition of a Web service, provided by the World Wide Web Consortium
(W3C), is as follows:
A Web service is a software system identified by a URI [uniform resource identifier],
whose public interfaces and bindings are defined and described using XML. Its
definition can be discovered by other software systems. These systems may then
interact with the Web service in a manner prescribed by its definition using XML
based messages conveyed by Internet protocols.
SOAP and REST:
The SOAP [4] protocol is the recommended approach for implementing most services that
provide a document-oriented interface. A wide experience base and familiarity among
developers mean SOAP will likely be in the ecosystem for a long time. SOAP is simple to
generate. It is also known as a more reliable protocol for large data and is regarded as the
better choice for higher availability systems. However, REST is easy to consume and work
with. For many web services, REST [5] will be the easiest choice and the capability we
desire. REST is now a common choice for simple interfaces. REST has a low barrier to
entry and enables a simple XML over HTTP approach. REST is strongly supported by a
growing community, but it is still a new way of doing things and in many cases there are no
standards for how it should be implemented.
As an example, take identity propagation in web services composition
(Client->Service->Service->Service). For SOAP, there are accepted means of implementing
WS-Security, giving SOAP a well-vetted messaging solution for propagating identity, whereas
most REST solutions either require developers to create their own means for this or use
proprietary solutions. REST can do identity propagation, but only by complex, one-off
methods that make it much harder to use than REST proponents tend to acknowledge.
Guidance for deciding when to use SOAP/WSDL or REST for services provided by the
KNIS:
- If it is a service internal to the KNIS, SOAP, which is simple to generate and consume, is
probably the answer. SOAP has a strong developer community and several features
supportive of enhanced security, availability and reliability.
- If it is a service from the KNIS to the community, publishing SOAP and WSDL is
perfectly acceptable. However, REST may also be provided. REST is lightweight,
readable by humans, easy to build and fast to field.
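To make the contrast between the two styles concrete, the sketch below builds the same request both ways. The GetIndicator operation, the http://example.org/knis namespace, and the URL layout are hypothetical; only the SOAP 1.1 envelope namespace is taken from the SOAP specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
KNIS_NS = "http://example.org/knis"  # hypothetical service namespace


def soap_request(indicator_id):
    """Build a SOAP 1.1 envelope for a hypothetical GetIndicator operation."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, f"{{{KNIS_NS}}}GetIndicator")
    ET.SubElement(operation, f"{{{KNIS_NS}}}id").text = indicator_id
    return ET.tostring(envelope, encoding="unicode")


def rest_request(indicator_id, year):
    """Build the URL for the equivalent hypothetical REST resource."""
    query = urlencode({"year": year})
    return f"https://example.org/indicators/{indicator_id}?{query}"


soap_xml = soap_request("life-expectancy")  # would be sent as an HTTP POST body
rest_url = rest_request("life-expectancy", 2010)  # would be fetched with an HTTP GET
print(soap_xml)
print(rest_url)
```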
The most important lesson gained from years of interacting with web services developers:
in either SOAP or REST, everything must be documented or use/reuse will be too hard.
Not every application or service should be Web service-based. Defining XML schemas for
the message types, and generating the SOAP proxies and adapter code, add complexity
to the application. Unless an application or service requires interoperability with non-Java
or external applications or services, the Web service approach to implementing a logical
component may be overkill.

[4] For details on the SOAP protocol, see http://www.w3.org/TR/SOAP/.
[5] See: http://www.oreillynet.com/pub/wlg/3005
Assertion of authorization in a Web Services environment
OpenID is a leading candidate for web services identity assertion and shows great
promise in large scale web services environments as a key component of the authorization
solution.
Web Services Continued
The KNIS is focused on developing a service-oriented architecture of loosely-coupled Web
services in the basic request-response style, leveraging SOAP and WSDL or REST
standards for technical interoperability and reduced time to market. The KNIS initiative has
supported governance and infrastructure build-out needs.
The KNIS must interact with a wide range of data and service providers, as well as
consumers. It should work in ways that make it easy for data providers and consumers to
interact with us, so we will enable consumption of data from any source. However, we
believe that developing shared guidelines with data providers – such as those below – can
help data providers and consumers in the ecosystem provide their information in ways that
make consumption of their data easier and its utility greater.
The KNIS provides and consumes web services. Web services are accessed through
interfaces. Those interfaces describe how capabilities are presented and the rules and
protocols for using them. Key points:
- KNIS web services conform to the WS-I Basic Profile v1.1 and Security Profile
v1.0.
- All KNIS partners and data providers are requested to provide the KNIS with
WSDL and XML samples. This will enable us to document and share with others the
detailed definitions of the content of services and data provided by others. Consult
with the KNIS architecture team for examples. This will also enable establishment of
an initial Service Inventory: A service inventory is a "responsibility map" that
captures who will be providing the service.
- WSDL is used by data and service providers to express the communication
protocols, message formats, including serialization techniques, and service
locations.
- Service invocation policies such as security requirements, required SOAP headers,
etc., are also provided by formal definition in the WSDL.
- The KNIS offers data definitions and schemas that can be reused and encourages
collaboration to capture the best means of ensuring these schemas support the
mission and needs of all stakeholders. We provide these definitions and schemas to
enhance interoperability and to help the community avoid the problems which arise
from development of independently generated, not-well-understood WSDLs.
A service specification holds all the information a user or consumer of the service would want
to know before deciding whether to use the service, as well as exactly how to
use it. It also specifies everything a service provider needs to know to implement
the service. The service specification includes:
- Service name
- Provided and required interfaces
- Rules for how the functions are to be used and in what order
- Constraints that reflect what successful use of the service accomplishes
- Qualities that service consumers should expect, such as cost, availability,
performance, etc.
- Policies for using the service.
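One way to keep the listed items together is a simple record type. The following is an illustrative sketch only; the class name, field names, and the IndicatorLookup service are assumptions, not a KNIS-defined schema.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceSpecification:
    """Illustrative record of the items a service specification includes."""
    name: str
    provided_interfaces: list
    required_interfaces: list = field(default_factory=list)
    usage_rules: list = field(default_factory=list)   # how functions are used, and in what order
    constraints: list = field(default_factory=list)   # what successful use accomplishes
    qualities: dict = field(default_factory=dict)     # cost, availability, performance, ...
    policies: list = field(default_factory=list)

    def missing_sections(self):
        """Names of sections left empty, for use as a review checklist."""
        return [key for key, value in vars(self).items() if not value]


spec = ServiceSpecification(
    name="IndicatorLookup",  # hypothetical service
    provided_interfaces=["getIndicator(id)"],
    qualities={"availability": "99.5%"},
)
print(spec.missing_sections())
```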
The interface contract specifies the data to be exchanged within the context of a business
interaction and a set of criteria that determine initial and ongoing success. The contract
does not specify how either the service consumer or the service provider will be written.
The service consumer and provider can therefore be written in any language and
deployed on any platform. A consumer or provider can be a single monolithic
program or a cluster of programs. Best practices for SOA have functional and
non-functional aspects of a service implemented separately. This separation of concerns
facilitates initial development, ongoing maintenance and reuse for both functional and non-
functional code. A further separation can be applied in implementing each non-functional
aspect – e.g., logging, security, and versioning.
Application Business Web Services
The application and web services implement the business processing functionality and
enforce business rules. They respond to requests from the application specific portlets or
application front ends to perform specific business functions. They should export a service-
specific, public application programming interface (API) that can be called by portlets and
application front ends.
Web Service Registry
The web service registry is responsible for storing information about web services, such as
descriptions and interface information. The web service registry is used by clients of web
services to dynamically discover, locate, determine the interface mechanism for and send
requests to web services which are registered.
The KNIS Service Registry provides a means for services to operate as a collective, since
consumers must have a means to discover services. Service registries provide a means to
find other services and to use them in a loosely coupled way.
The KNIS registry will be a standard UDDI server (Universal Description, Discovery and
Integration). This provides a dynamic choice of service based on the functionality required
by the consumer. Its role is similar to that of the Yellow Pages. The key use of the KNIS
UDDI is to store WSDL files which are used by a service consumer at design time.
However, we also make this server available for runtime lookup of service endpoints based
on service name and policies. Typical examples of such policies can be quality of service
requirements, security requirements, preferred communication protocols, service version
and the like.
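Runtime lookup of an endpoint by service name and policy can be sketched with a small in-memory registry. This is a simplified stand-in written for illustration; it does not use or imitate the UDDI API, and the service names, endpoints, and policy keys are hypothetical.

```python
class ServiceRegistry:
    """In-memory stand-in for a service registry; this is not the UDDI API."""

    def __init__(self):
        self._entries = []

    def register(self, name, endpoint, **policies):
        """Record a service endpoint together with its policy attributes."""
        self._entries.append({"name": name, "endpoint": endpoint, "policies": policies})

    def lookup(self, name, **required):
        """Return endpoints for `name` whose policies match every requirement."""
        return [
            entry["endpoint"]
            for entry in self._entries
            if entry["name"] == name
            and all(entry["policies"].get(k) == v for k, v in required.items())
        ]


registry = ServiceRegistry()
registry.register("IndicatorLookup", "https://a.example.org/v1", version="1", protocol="SOAP")
registry.register("IndicatorLookup", "https://b.example.org/v2", version="2", protocol="REST")

# The consumer asks for a service by name and preferred protocol at runtime.
endpoints = registry.lookup("IndicatorLookup", protocol="REST")
print(endpoints)
```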
Service descriptions and interface definitions for services are maintained in the Web
Service Registry. Before developing new Web services, the development team must check
this registry to see if an existing service will satisfy the application's business needs.
An exemplar UDDI registry is jUDDI (ws.apache.org/juddi/), an open source Java
implementation of UDDI. Key features that make this the exemplar:
- Platform independent
- Supports JDK 1.3.1 and later
- UDDI version 2.0 compliant implementation
- Usable with any relational database that supports ANSI standard SQL (MySQL, DB2,
Sybase, JDataStore, HSQLDB, etc.)
- Deployable on any application server that supports the Servlet 2.3 specification
(Jakarta Tomcat, JOnAS, WebSphere, WebLogic, Borland Enterprise Server, JRun,
etc.)
- Supports a clustered deployment configuration
- Easy integration with existing authentication systems
Data Registry Choices
A data registry should be the system of record (SOR) for a given domain's data (e.g., product,
customer, or order information) and should track those data using globally unique
identifiers. To optimize performance, data registries may sometimes store data from other
systems of record, but these duplicate data are then treated as read-only.
The published service API for the data registry should also be domain-specific. Each
parameter that is passed and returned is relevant to the business domain, for example
customer information.
The data registry should also maintain additional information about each domain entity
(e.g., the customer) including the user ID of the person who created the entity, the date the
entity was created, the user ID of the person who last updated the entity, and the date the
entity was last updated. This approach permits better data auditing and allows for the
archiving of old data that have been kept a long time but not been updated.
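The audit information described above can be attached to each domain entity as in the following sketch. The class RegistryEntity and its field names are assumptions chosen for illustration, and the user IDs are placeholders.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now():
    """Current time in UTC, used for audit timestamps."""
    return datetime.now(timezone.utc)


@dataclass
class RegistryEntity:
    """Domain entity with a globally unique ID and audit metadata."""
    payload: dict
    created_by: str
    entity_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=_now)
    updated_by: str = ""
    updated_at: Optional[datetime] = None

    def __post_init__(self):
        # A new entity counts as last updated by its creator at creation time.
        self.updated_by = self.updated_by or self.created_by
        self.updated_at = self.updated_at or self.created_at

    def update(self, user_id, **changes):
        """Apply changes and record who made them and when."""
        self.payload.update(changes)
        self.updated_by = user_id
        self.updated_at = _now()


customer = RegistryEntity({"name": "Example Org"}, created_by="jdoe")
customer.update("asmith", name="Example Organization")
print(customer.entity_id, customer.updated_by, customer.updated_at)
```

Keeping the last-updated date on every entity is what makes it possible to find records that have not changed for a long time and are therefore candidates for archiving.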
Technology Architecture of the Data Resources Tier
Several different technologies for data persistence are available, each with its own
strengths. This section briefly describes these technologies and the uses for which they are
best suited.
Data Background: physical versus domain data
Physical data: This is the data that is actually stored on disk. The details of how it
is stored are described in a database schema. The schema is optimized for the
performance characteristics and requirements of the particular data store.
Domain data: This is the data that is used in the service implementation. It is
described in a standard data model and describes all of the information that is used
in the implementation of a service. It represents the private knowledge of the data. A
subset of the data is the view of the physical data and may come from one or more
physical data stores.
The persistent storage solutions used by the application or service must be limited to a
SQL-compliant RDBMS, an LDAP-compliant directory server, or an NFS-compliant file
system.
Relational Database Management Systems (RDBMS)
RDBMS provides excellent support for OLTP when multiple changes must be applied as
one transaction. Because RDBMS is a mature technology, it is widely used in enterprise
application architectures to solve a majority of data management problems and provide
stable, mature, standardized tools for functions such as data modeling, administration,
querying, and reporting.
Vendors of RDBMS have provided a variety of "value added" features that are not always
portable. If portability is important, the development team should avoid features that
commit the application to a sole vendor. Such features include SQL extensions, data types,
stored procedures, triggers, database links, etc. Triggers within one database instance are
acceptable. Database links (or any database linking mechanism) should not be used within
a transaction. RDBMS is a possible solution for storage of transactional data with frequent
updates, and when there is no need for data replication to multiple locations.
Directory Servers
The directory server is another data store that has gained importance with the movement
to Web-centric applications and services.
Although the details of how data are stored in a directory server are not relevant to
architecture discussions, some vendors have chosen to implement directory servers on top
of relational databases. This suggests that the directory server is neither a new
(immature) technology nor one that completely conflicts with relational databases. In most
enterprises, directory servers coexist with relational databases as they cater to different
application needs. The directory server is the recommended solution for requirements with
very fast reads (lookups), few writes, and the need to distribute or publish data to multiple
locations simultaneously.
Directory services are considered loosely consistent, which means there is no guarantee
that all replicas hold the same data at any one time. In other words, not all replicas are
updated instantaneously. An advantage of loose consistency becomes apparent when
communication problems cause a number of the servers on a network to become
unavailable (or slow to respond): changes made to a directory server while those servers
were out of operation are not lost. When the network problem is resolved, replicas on the
affected servers receive the updates.
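The behavior described above can be illustrated with a small in-memory simulation. It is a sketch under stated assumptions, not an LDAP replication protocol: the classes `Replica` and `LooselyConsistentDirectory` and the per-replica queue of missed updates are hypothetical and exist only to show that writes made during an outage are replayed once a replica is reachable again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical replica: holds entries plus the updates it missed while offline. */
final class Replica {
    final Map<String, String> entries = new HashMap<>();
    final Queue<String[]> pending = new ArrayDeque<>();
    boolean online = true;
}

/** Hypothetical directory that propagates writes to replicas without blocking on outages. */
final class LooselyConsistentDirectory {
    private final List<Replica> replicas = new ArrayList<>();

    Replica addReplica() {
        Replica replica = new Replica();
        replicas.add(replica);
        return replica;
    }

    /** Applies the write to reachable replicas and queues it for unreachable ones. */
    void write(String key, String value) {
        for (Replica replica : replicas) {
            if (replica.online) {
                replica.entries.put(key, value);
            } else {
                replica.pending.add(new String[] {key, value});
            }
        }
    }

    /** Called once the network problem is resolved: replays missed updates in order. */
    void reconnect(Replica replica) {
        replica.online = true;
        while (!replica.pending.isEmpty()) {
            String[] update = replica.pending.poll();
            replica.entries.put(update[0], update[1]);
        }
    }
}
```

Between the write and the reconnect the two replicas disagree, which is exactly the window of inconsistency that makes a directory server a poor fit for data that must be instantly visible to all clients.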
Object-Oriented Databases (OODB)
OODB came into existence a few years back when object-oriented programming was
becoming popular. Although OODB is suitable for managing complex, dynamic data such
as 3D maps, engineering drawings, and scientific data, all popular RDBMS now have built-
in support for objects, reducing the viability and need for specialized OODB.

Within the KNIS, the use of OODB is not allowed unless a very specific need arises. Hence
an exception is required at the time of review. Explicit approval must be obtained for use of
OODB in a vendor product.
XML Database
XML is another recently popularized data format. XML is suitable for mapping object
representation to a "flat" format. XML processing has full support in the latest version of the
J2EE platform. Although there are situations where a native XML database [6] may be
suitable and desirable, most RDBMS have support for XML data. One of the advantages
offered by an XML database is XQuery, which allows collections of XML files to be accessed like
databases.
Because there currently are no standards broadly accepted by the majority of the industry,
XML database technology is still changing. Hence the use of an XML database is not allowed
unless explicit approval is obtained for its use in a vendor product.
File Systems
File systems are part of the foundation of operating systems. Although using a file system
is a quick and convenient way of providing data persistence, file systems should not be
used by enterprise applications for transactional or reference data because they lack
features such as transactions, replication, policy enforcement, and provisioning.
File systems may be perfectly suitable for a quick application proof-of-concept, or for
supplying mostly static application data such as configuration data. Generated application
data, such as output and error/log files, can be stored in local files with an appropriate access
control mechanism.
RDBMS or Directory Servers
While, with some exceptions, the KNIS does not allow the use of OODB and XML
databases, the following guidelines help the development team choose between RDBMS
and directory servers.
Choose an RDBMS solution if the application has many of the following characteristics:
- Access is weighted toward writes rather than reads (high W/R)
- Data change rapidly
- Multiple concurrent clients access or update the data
- Changes to data must be instantly available to all clients
- Transactional integrity, reliability, and recovery
- Strong ad hoc reporting tools and environment
- Control of data for administration
- Stringent auditing requirements
Applications that employ RDBMS should have a dedicated instance of the RDBMS.

6 A native XML database is one that stores XML documents in their native form without decomposing them.
Choose a directory server solution if the application has many of the following
characteristics:
- Relatively low write to read ratio (low W/R – lots of lookups and occasional
changes to lots of data such as user profile data, application configuration data, etc.)
- Dynamic discovery of resources
- Data published to a large number of users in many different locations while
maintained in a loosely consistent state
- Infrastructure for centralizing user, resource, and security information replication
- Little or no reporting requirements
- Runtime resource provisioning such as network bandwidth allocation
- Support for multi-valued attributes
- Flexibility for schema modifications
- Suitable for distributed infrastructure needs, due to features such as chaining and
referral
Choose a file system solution for logging.
Applications that employ directory servers must use the KNIS's enterprise directory server
setup. An application may not have its own dedicated directory server, unless required for
a vendor product and/or an exception is granted.
Java Database Connectivity (JDBC)
After identifying the system and/or technology for the data resource tier, the development
team must choose the data-access APIs.
Java applications must interact with RDBMS using the JDBC API. This requires:
1. Loading an appropriate JDBC driver
2. Creating a connection or a pool of connections
3. Creating and executing SQL statements that return result sets
4. Closing the statement
5. Closing the connection, or releasing it to the pool for reuse when pooling is used
JDBC drivers manage connectivity to the RDBMS and provide support for caching result
sets.
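The five steps above can be sketched with the standard `java.sql` API. This is an illustrative sketch, assuming Java 7 or later (for try-with-resources) and a JDBC 4.0 or later driver on the classpath, which `DriverManager` loads automatically. The class `IndicatorDao`, the table `indicator`, its columns, and any JDBC URL passed in are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data-access class; table and column names are assumptions. */
final class IndicatorDao {
    private final String jdbcUrl; // e.g. a vendor-specific URL supplied by configuration

    IndicatorDao(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    List<String> findIndicatorNames(String topic) throws SQLException {
        String sql = "SELECT name FROM indicator WHERE topic = ?";
        // Steps 1-2: DriverManager locates a registered driver and opens a connection.
        // Step 3: the statement is prepared with a bound parameter, then executed.
        try (Connection connection = DriverManager.getConnection(jdbcUrl);
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, topic);
            List<String> names = new ArrayList<>();
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
            }
            return names;
        }
        // Steps 4-5: try-with-resources closes the statement, then the connection,
        // even if an exception is thrown.
    }
}
```

Inside a J2EE container, the connection would more typically come from a container-managed `DataSource` than from `DriverManager`; the statement handling is the same either way.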
J2EE (Web and EJB) containers provide connectivity to data resources, including
connection pooling and transactional semantics; the development team does not need to
code this function.
Data Storage
Data can be stored in the cloud and locally. To allow for scale, the KNIS can utilize
advanced cloud platforms for data storage using a blended approach of storing some core
data in the cloud.

At a minimum, the following types of data will be stored locally to protect privacy,
improve access speeds, and deliver other key benefits:
- User profiles
- Saved or recent queries
- Comments associated with users or saved queries
- Geo data
- Tags
Data Source Types
Core Data
Data extracted from government and third party sources that will be stored locally to
facilitate responsive user queries and mash-ups with other core data sources (including
external).
External Core Data
Data residing on external servers that the KNIS cannot download locally but can access
through some sort of data share or API.
The KNIS will maintain custom labels for external core data sets.
Where possible, the KNIS will store local copies of query results, at least to facilitate
display to the user, and will attempt to store query results longer term (e.g., associated
with a user query) so that the original query results are maintained. This storage will also enable
analysis of what users are asking.
These local copies should be clearly marked with the date the query was conducted and offer an
option to refresh the data (e.g., by re-querying the external core data source).
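A local copy that carries its query date and can be refreshed on request might look like the following sketch. The class `CachedQueryResult` is hypothetical, and the two `Supplier` arguments stand in for a call to an external data source and for a clock; neither reflects an actual KNIS interface.

```java
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical local copy of a query result from an external core data source. */
final class CachedQueryResult {
    private final Supplier<String> externalSource; // stands in for the external API call
    private final Supplier<Instant> clock;         // supplies the current time
    private String result;
    private Instant queriedAt;

    CachedQueryResult(Supplier<String> externalSource, Supplier<Instant> clock) {
        this.externalSource = externalSource;
        this.clock = clock;
        refresh(); // the initial query populates the local copy
    }

    /** Re-queries the external source and records when the query was conducted. */
    void refresh() {
        result = externalSource.get();
        queriedAt = clock.get();
    }

    String getResult() { return result; }

    /** The date to display alongside the result so users know how current it is. */
    Instant getQueriedAt() { return queriedAt; }
}
```

Until `refresh()` is called, readers see the stored result together with the original query date, so the original query results are preserved as the text requires.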
Core Data Labeling
Associated labels or taxonomies for core data will be stored in relational reference tables
and are applied manually or dynamically depending on the data source, user roles, or other
qualifying variables. Labels are dynamically applied to core data upon display of the data or
in response to a user query.
User data
User data includes profiles, roles, stored queries, and relationships to other users,
comments/interaction history and related data. In addition, users can metatag core or
external data or queries created for use in personal pages, to create analysis and share it
with others.
Query data
Query data are stored so that searches or mash-ups can be saved, shared, or cloned by
users.
Comments associated with a query, as well as any supplemental information, are stored
with it.
GEO data
The KNIS will contain separate reference tables to enhance or augment geo-specific data.
These tables should also allow for reverse look-up based on location (e.g., "show me all the data
you have about Dallas, Texas").
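A reverse look-up of this kind amounts to an index from a normalized place name to the data sets tagged with it. The sketch below uses an in-memory map purely for illustration; in the KNIS this role would be played by the reference tables, and the class `GeoIndex`, its method names, and the data set identifiers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical reverse index from a location to the data sets that reference it. */
final class GeoIndex {
    private final Map<String, List<String>> dataSetsByPlace = new HashMap<>();

    /** Normalizes city and state so that look-ups ignore case and stray whitespace. */
    private static String normalize(String city, String state) {
        return (city.trim() + "," + state.trim()).toLowerCase(Locale.ROOT);
    }

    /** Records that a data set contains data about the given place. */
    void tag(String dataSetId, String city, String state) {
        dataSetsByPlace
                .computeIfAbsent(normalize(city, state), key -> new ArrayList<>())
                .add(dataSetId);
    }

    /** Answers "show me all the data you have about this place". */
    List<String> dataSetsFor(String city, String state) {
        return dataSetsByPlace.getOrDefault(
                normalize(city, state), Collections.<String>emptyList());
    }
}
```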
Versioning of Data
The KNIS will implement a versioning mechanism for stored data and will give users
the capability to query the archive to compare how results have changed over time.
KNIS retention of historical data will comply with the previously stated business rules,
and data provenance will be understood and retained.
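One simple way to support such archive queries is to keep every recorded value under the time it was recorded and answer "as of" questions with a floor look-up. The following is a minimal sketch of that idea; the class `VersionedStore`, the key name, and the values used with it are illustrative assumptions, not KNIS data or a committed design.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical versioned store: every recorded value is kept, none is overwritten. */
final class VersionedStore {
    // key -> (time the version was recorded -> value), sorted by time
    private final Map<String, TreeMap<Instant, Double>> history = new HashMap<>();

    /** Adds a new version of the value for the key, leaving older versions in place. */
    void record(String key, Instant recordedAt, double value) {
        history.computeIfAbsent(key, k -> new TreeMap<>()).put(recordedAt, value);
    }

    /** Returns the value that was current at the given time, if any version existed then. */
    Optional<Double> valueAsOf(String key, Instant at) {
        TreeMap<Instant, Double> versions = history.get(key);
        if (versions == null) {
            return Optional.empty();
        }
        Map.Entry<Instant, Double> entry = versions.floorEntry(at);
        if (entry == null) {
            return Optional.empty();
        }
        return Optional.of(entry.getValue());
    }
}
```

Comparing `valueAsOf` for two different dates gives the "how results have changed over time" view. A production design would also need to record provenance alongside each version.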
Automatic Labeling
KNIS data architecture should support automatic data labeling for future versions to reduce
curation overhead.
User Generated and Dynamic Tagging
Data architecture should support user and dynamic tagging of data sets, data elements,
queries or mash-ups.
Content Management System
The content management system is responsible for managing content. The content
typically includes, but is not limited to, textual content that has been tagged with XML or
other markup. The content management system supplies tools to author and administer
content as well as an API by which content can be retrieved.
Content Abstraction Layer
The content abstraction layer should provide a single, vendor-independent interface by
which portals can retrieve and search for content that resides in the various content
management systems. The content management system or content abstraction layer must
make all content and selected metadata available for search engine discovery.
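A vendor-independent interface of this kind can be sketched as one Java interface with an adapter per content management system and a facade that fans requests out to all of them. All names here (`ContentRepository`, `InMemoryCms`, `ContentAbstractionLayer`) are hypothetical; a real adapter would call the vendor's own API instead of an in-memory map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical vendor-independent contract that portals program against. */
interface ContentRepository {
    Optional<String> retrieve(String contentId);
    List<String> search(String term); // returns matching content IDs
}

/** Stand-in for one vendor's CMS; a real adapter would wrap that vendor's API. */
final class InMemoryCms implements ContentRepository {
    private final Map<String, String> documents = new LinkedHashMap<>();

    InMemoryCms put(String id, String body) {
        documents.put(id, body);
        return this;
    }

    public Optional<String> retrieve(String contentId) {
        return Optional.ofNullable(documents.get(contentId));
    }

    public List<String> search(String term) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : documents.entrySet()) {
            if (entry.getValue().contains(term)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }
}

/** Facade that presents several content management systems as a single repository. */
final class ContentAbstractionLayer implements ContentRepository {
    private final List<ContentRepository> backends;

    ContentAbstractionLayer(List<ContentRepository> backends) {
        this.backends = backends;
    }

    public Optional<String> retrieve(String contentId) {
        for (ContentRepository backend : backends) {
            Optional<String> hit = backend.retrieve(contentId);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public List<String> search(String term) {
        List<String> all = new ArrayList<>();
        for (ContentRepository backend : backends) {
            all.addAll(backend.search(term));
        }
        return all;
    }
}
```

Because the facade implements the same interface as the adapters, a portal cannot tell whether it is talking to one system or several, which is the point of the abstraction.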
Technology Architecture of the Integration Tier
This section provides guidelines for using the middleware products that implement the
standard APIs and protocols described in the client and presentation and data layers
above. This includes most of the standard relational database products. It also includes
some shared infrastructure services that are written and maintained by the KNIS.
Open System Web Server
The Web Server provides a container that supports servlets. It should be open Java
compliant. It can be used for hosting business services that are implemented as servlets
and do not require any of the special clustering and session management features of the
Application Server. Due to its smaller footprint and easier administration, the Web Server is
also recommended for servlet-based business tier components that do not require
clustering and session management.
During implementation design, the web server chosen must be an open, robust
platform for HTTP(S) request processing. It should provide support for various open
technologies such as JSP, servlets, and JDBC, as well as content technologies including
CGI, SHTML, PHP, and ASP. The Web Server is optimized to serve static content via
caching. It allows for a direct connection to databases and other resources but does not
support a complex (distributed) transactional environment involving multiple data sources,
global transaction recovery, etc. The Web Server also provides support for Web services
with servlet-only endpoints.
Open System Application Server
The Application Server provides containers that support APIs. It also allows the business
services it hosts to be run in a clustered configuration for better load balancing and session
management.
The Application server is an open application container providing a robust and optimized
platform. It provides a high-performance platform for a complex (distributed) transactional
environment. It is the recommended choice for high volume applications (greater than 100
transactions per second) whose components can reside in the same subnetwork.
Open System Portal Server
An Open System Portal Server will be used for portals.
Applications may integrate with existing portals either by providing a portal channel (portlet)
or a link. If the application provides a portlet, its presentation tier must provide channel
content by implementing the interfaces called for by an open system portal such as the
Java System Portal Server. If the application simply provides a link, no special interfaces
are required.
An application should not have its own dedicated production instances of the Open System
Portal Server.
Relational Database Server
Applications with large volumes of data, large numbers of concurrent users, and stringent
requirements for reliability, availability, and recovery should choose an RDBMS. In
keeping with the KNIS open approach, designs should by default consider open source
RDBMS products before closed source or proprietary ones.
Monitoring Products
Many current solutions for Application Performance Monitoring (APM) require no additional
work by developers. Many leading products are based on capture and analysis of TCP and
UDP packet traffic. Therefore, the KNIS currently has no specific architectural guidelines
for application monitoring. The
development team should work with the operations team to implement the appropriate
monitoring solutions (for example, Mercury Topaz for monitoring information such as
response times and user transaction correctness via synthetic transactions). Usage tracking is
also important for providing continuing feedback to the design and governance teams.
Network Attached Storage (NAS)
NAS is a storage architecture that allows multiple servers to share file systems over a
network. It is similar to NFS-mounted file systems but allows more protocols, provides
better performance and security, and is more efficient to run and manage. As with NFS,
each server can share the same set of application executables and configuration files,
so the files need only be updated once to update all the
servers. This storage architecture is recommended for read-only (or seldom-updated) files
such as binaries and configuration files.
NAS is intended for file systems (as opposed to raw data). The NAS server assumes
responsibility for maintaining the integrity of the file systems and thus offers some protection
against clients that might otherwise corrupt the file system.
Storage Area Network (SAN)
SAN is an architecture that enables remote data storage devices to be attached to
servers. While NAS uses file-based protocols to do this, SAN uses block-based protocols. In
general, the KNIS approach will be to architect for NAS, since file-based protocols translate
more easily to cloud-based data storage services.
Virtual Private Network (VPN)
A VPN is an easy and cost-efficient technology for connecting networks that share the
same security characteristics and trust levels. After establishing a shared session key, the
VPN encrypts all data that flow between the networks.
A VPN should not be used to connect networks with differing security characteristics or
trust levels, as its security will drop to the level of the least secure connected network.
The KNIS will enable connectivity via VPN, focused on the open standards and community
capabilities of OpenVPN (for secure data exchange with stakeholders and access for
management functions).
Monolithic Applications and Legacy Applications
Monolithic applications are applications which are web enabled, but with no separation of
presentation tier and business logic tier, and no business logic API which can be called to
invoke the backend business transactions. Such web-enabled applications can simply be
linked from a portal channel. If tighter integration is desired for business reasons, they
should be screen-scraped or wrapped to provide portal content. Legacy applications are
applications which are not web enabled. Legacy applications should be screen-scraped or
wrapped to provide access to their functions from the portal.
Syndication of Value Added Content
- When exchanging data between applications or services, XML files must be used.
XML insulates applications from internal details, such as database schema, of the
integrated application.
- Legacy applications and data sources which do not have XML capabilities will also
be part of our ecosystem and must be engineered for. However, objective solutions
that import legacy data into XML servers should be considered.
- Syndication can be by data feeds, RSS, flexible XML or integrated client feeds or
other appropriate widget technologies.
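To illustrate the first point in the list above, a consuming application can read an XML exchange document without any knowledge of the producer's database schema. The sketch below uses the JAXP DOM parser that ships with the JDK. The class `IndicatorFeedReader` and the `indicator` element with its `name` attribute are hypothetical, since this draft does not define an exchange format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical consumer of an XML feed; element and attribute names are assumptions. */
final class IndicatorFeedReader {

    /** Extracts the name attribute of every indicator element in the document. */
    static List<String> readIndicatorNames(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask the parser to apply its secure-processing limits to incoming documents.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document document = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = document.getElementsByTagName("indicator");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            // Parser configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalArgumentException("Malformed indicator feed", e);
        }
    }
}
```

The consumer depends only on the element and attribute names in the feed, so the producing application can change its internal tables without breaking it.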
Integration Testing
Although testing occurs at every tier, integration testing is especially important. The KNIS
expects the implementation team to provide recommendations for standards
to be met during integration testing (the SUSA CTO is the approving authority for integration
testing metrics).
Content Delivery Services
The KNIS currently intends to use the global delivery fabric of a major cloud provider
(such as Rackspace or Amazon) as the initial content delivery mechanism. However, we
may quickly scale to the size where additional content delivery capabilities (such as
Akamai) may be required to ensure usability for end users. Plans for and pacing of this
capability will be driven by the CTO and may involve standing up a working group for
collegial/community coordination on the best approach.

Technology Trends to Watch
As the KNIS architecture evolves and improves, this foundational document will as well.
Another important environmental consideration, however, is the fast-changing landscape of
technology. Over the planning period for coming KNIS solutions we expect several
changes, and we must be prepared to take advantage of
opportunities when unexpected technological changes emerge. Some key technology
considerations we must anticipate include:
Service Registry Changes: With the recent release of the ebXML Web Service Registry, the
community is on a road to explore more collaborative Web services that are further
decoupled and more readily adapted to inter-enterprise business integration.
Web service trends to watch include:
- Asynchronous message style, rather than a synchronous request-response
paradigm
- Document oriented, rather than procedure oriented
- More sophisticated use of standards for data semantics, process orchestration,
and workflow
Changing Web Standards: Currently, these standards are mature, but changes are
expected.
Run Time Dynamics: Dynamic integration is currently limited by the need for business and
service agreements and by technical interface integration at the API level. However, new
technologies that advance run-time dynamics in deployment, integration, and management
of Web services will soon be introduced. With their release, more ad hoc and dynamic
integration can occur, leading to reduced time to market and increased opportunities for
efficient, flexible business automation.
Security Standards: The W3C and OASIS are developing standards, including the WS-Security
specifications, for negotiating security constraints between service requesters and responders. As these
standards evolve and gain further adoption, inherent support by developer tools, APIs,
service containers, and identity management products will become available. In addition,
standards-based, secure service interoperability should increase.
This will be essential for secure interoperation with external customers, partners, and
suppliers. Today, without a common infrastructure to support identity and authorization
management, defining standard solutions for securing Web services can be a challenge.
Speed in development: As the IT infrastructure leverages new standards-based APIs and
products, the time to develop and integrate services should decrease. Additionally, both
internal and externally facing Web services will use common, established infrastructure
based on simple yet high-quality standards for security information exchange. This
infrastructure will allow for more granular introspection and, therefore, increased security.
Additional items to watch: We must continue to watch Flash, Silverlight, HTML5, REST,
SOAP and WSDL developments. The open source community itself is also changing fast
and needs to be watched very closely.

The Current SUSA Beta Architecture
The architecture SUSA is currently using for its private beta implementation can be thought
of in two ways. First, it is an "operational input" to this version 0.1 KNIS architecture
because of the many lessons learned from its implementation. Second, it is an
"operational commentary" on the KNIS version 0.1 architecture, as it already incorporates
many of the principles outlined in this document. The current SUSA architecture is based
on a mix of open, scalable systems (for example, the open source Drupal content management
system and MySQL) and known, proven proprietary capabilities (for
example, .NET and MS SQL as key components of the data management solution). In the
current architecture, Drupal and MySQL power all user-facing functionality. .NET and MS
SQL power only backend data processing, the output of which is then published as XML to the cloud.
There is no real-time access to the Data Management vertical in the current beta site
architecture.
- A graphical depiction of this architecture is provided at Figure 9.
- Check with the KNIS architecture team for the most current version and technology
details of the architecture.

Figure 9: KNIS's Current Architecture

Glossary
Architecture: Design and description of a system, including its functionality, its components,
and the relationships among those components. Representation of the coming system.
API: Application Programming Interface. A description of how to interact with an application
or its data.
Channel: A portion (usually a little box) of a portal page that contains content. For
example, the stock quote channel on a page.
Content Management System (CMS): Code specifically designed to store, manage, and
disseminate documents.
Credentials: Information presented to authenticate a user. For example, a static password,
a dynamic one-time password obtained from a token card, or information on a smartcard.
CSS: Cascading Style Sheets
DSXML: Design Extensible Markup Language
GUI: Graphical User Interface
HTML: Hyper Text Markup Language
HTTP: Hyper Text Transfer Protocol
ITIL: Information Technology Infrastructure Library
J2EE: Java 2 Platform, Enterprise Edition
JDBC: Java Database Connectivity
JSF: JavaServer Faces
JSP: JavaServer Pages
JSR168: A standard specification for portlets.
LAMP: Short for Linux, Apache, MySQL, PHP
NAS: Network Attached Storage
OODB: Object Oriented Database
PDA: Personal Digital Assistant
REST: Representational State Transfer. Better describes and defines HTTP-WWW
client/server/application interactions.
RDBMS: Relational Database Management System
SAMP: Short for Solaris, Apache, MySQL, PHP
SAN: Storage Area Network
Servlets: Units of Java code that run in a web server.
SOA: Service Oriented Architecture. Designing for flexibility using loosely
integrated/coupled suites of services (largely web services). From OASIS: "A paradigm for
organizing and utilizing distributed capabilities that may be under the control of different
ownership domains. It provides a uniform means to offer, discover, interact with and use
capabilities to produce desired effects."
SOAP: Simple Object Access Protocol. A specification for exchanging structured information.
SMS: Short Message Service
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
VPN: Virtual Private Network
WOA: Web Oriented Architecture, considered an extension of SOA. Maximizes browser
and server interaction by REST and POX (plain old XML).
WSDL: Web Services Description Language. Models and defines the web services
available.
WSRP: Web Services for Remote Portlets – A protocol used between different instances of
portal servers to enable one portal (a consumer portal) to obtain channel markup from a
portlet which resides on another portal (a producer portal).
XSL: eXtensible Stylesheet Language – A software language which can be used to write
code which converts data or documents from one markup format, such as XML, to another,
such as HTML.
XSLT: Extensible Stylesheet Language Transformations
XML: Extensible Markup Language. Rules for encoding documents and data in machine-
readable formats.

Architecture Resources
Bredemeyer Consulting
http://www.bredemeyer.com/
A variety of resources to help software architects deepen and expand their understanding of
software architecture and the role of the architect. It has lots of material (papers,
presentations, etc.) on software architecture, software architects, and architecting.

Institute of Electrical and Electronics Engineers (IEEE)
http://www.ieee.org
Through its members, the IEEE is a leading authority in technical areas ranging from
computer engineering, biomedical technology, and telecommunications to electric power,
aerospace, and consumer electronics, among others. Many publications and papers are
available at no charge; some require a fee.

IEEE Standard 1471: "Recommended Practice for Architectural Description of Software-Intensive Systems"
http://ieeexplore.ieee.org/xpl/tocresult.jsp?isNumber=18957

Open Group Architecture Framework (TOGAF)
http://www.opengroup.org/
Information flow without boundaries, achieved through global interoperability in a secure,
reliable and timely manner.

SEI – Software Engineering Institute
http://www.sei.cmu.edu/sei-home.html
An office of Carnegie Mellon University, the SEI's core purpose is to help others make
measured improvements in their software engineering capabilities. Most SEI material is
timely and free.

SEI Architecture Documentation page
http://www.sei.cmu.edu/ata/arch_doc.html

Worldwide Institute of Software Architects
http://www.wwisa.org/
A nonprofit corporation founded to accelerate the establishment of the profession of
software architecture and to provide information and services to software architects and
their clients. The WWISA site has lots of white papers, books, etc., on software
architecture as a profession.

Zachman Institute for Framework Advancement (ZIFA)
http://www.zachmanframework.com/
The ZIFA is a network of information professionals. Its mission is to promote the exchange
of knowledge and experience in the use, implementation, and advancement of the Zachman
Framework for Enterprise Architecture.

Table of Standards
Standards relevant to the KNIS architecture are summarized below, organized by layers of
the OSI model stack:

OSI model layer          Standards
7. Application Layer     NNTP · SIP · DNS · FTP · HTTP · NFS · NTP · SMPP · SMTP · DHCP · SNMP
6. Presentation Layer    MIME · XDR · TLS · SSL
5. Session Layer         Named Pipes · NetBIOS · SAP · SIP · L2TP · PPTP
4. Transport Layer       TCP · UDP
3. Network Layer         IP (IPv4, IPv6) · ICMP · IPsec
2. Data Link Layer       ARP · Ethernet
1. Physical Layer        802.11 · USB

About This Architecture

This document (Version 0.1) was produced by the SUSA architecture team. Your input
and ideas for improvement are strongly desired.

For questions/comments/suggestions please contact SUSA at:

The State of the USA, Inc.
1146 19th Street, Suite 300
Washington, D.C. 20036

or

General inquiries:
(202) 540-5400
feedback@stateoftheusa.org
