
Softw Syst Model (2014) 13:825–841

DOI 10.1007/s10270-012-0252-1

REGULAR PAPER

Automatic data collection for enterprise architecture models


Hannes Holm · Markus Buschle ·
Robert Lagerström · Mathias Ekstedt

Received: 5 July 2011 / Revised: 9 April 2012 / Accepted: 21 May 2012 / Published online: 23 June 2012
© Springer-Verlag 2012

Communicated by Prof. Richard Paige.

H. Holm (✉) · M. Buschle · R. Lagerström · M. Ekstedt
Industrial Information and Control Systems,
Royal Institute of Technology, 100 44 Stockholm, Sweden
e-mail: hannesh@ics.kth.se

Abstract Enterprise Architecture (EA) is an approach used to provide decision support based on organization-wide models. The creation of such models is, however, cumbersome as multiple aspects of an organization need to be considered, making manual efforts time-consuming and error-prone. Thus, the EA approach would be significantly more promising if the data used when creating the models could be collected automatically, a topic not yet properly addressed by either academia or industry. This paper proposes network scanning for automatic data collection and uses an existing software tool for generating EA models (ArchiMate is employed as an example) based on the IT infrastructure of enterprises. While some manual effort is required to make the models fully useful in many practical scenarios (e.g., to detail the actual services provided by IT components), empirical results show that the methodology is accurate and (in its default state) requires little effort to carry out.

Keywords Enterprise architecture · Automatic data collection · Network scanning

1 Introduction

In recent years, Enterprise Architecture (EA) has become an established discipline for business and software system management [1]. EA describes the fundamental artifacts of business and IT as well as their interrelationships [1–5], typically through dimensions such as business, application, technology and information [4]. Architecture models constitute the core of the approach and serve the purpose of making the complexities of the real world understandable and manageable [5]. EA ideally aids the stakeholders of the enterprise to effectively plan, design, document, and communicate IT and business related issues [6].

As these models are intended to provide reliable management support, it is imperative that they capture all the aspects of an organization which are of relevance. Thus, the models often grow very large and contain several thousand entities and an even larger number of relationships between these entities. The creation of such large models is often both costly and time consuming, as various stakeholders are involved and many different pieces of information have to be gathered. During the creation process, the EA models are also likely to become (partly) outdated [7]. Consequently, in order to provide the best possible support, it needs to be ensured that EA models are both holistic and reflect the organization's current state.

Automatic data collection for model instantiation would be preferable as this could mean a reduced modeling effort and possibly an increased quality of the collected data. In current EA initiatives, few approaches regarding data collection for the instantiation of models are proposed. In the most widespread EA frameworks, little discussion regarding data collection is available [2–4]. In the EA tools community, two approaches are in use: importing models from third party software, or allowing the usage of SQL queries (or similar) in order to load information from databases [8,9]. Both approaches, however, assume that the modeled data is already available and updated (e.g., in third party applications or databases). As the main problem generally is to collect this data in the first place, and these approaches have this as a pre-condition, significant manual effort is still required. In the research community, the focus


has been on proposing methods and principles for model creation and maintenance [10–13]. None of these academic initiatives has, however, yet resulted in an actual implementation of automatic data collection for EA modeling, and none has validated its proposed approach.

This paper proposes the integration of network scanners and EA tools in order to assist the data collection process, especially for infrastructure assets, in EA modeling and maintenance. The contribution presented in this paper is twofold. First, to show that integrating an EA modeling tool with a network scanner can provide automatic data collection for the modeling process. Second, to show that the information in these models is correct with respect to the reality it is supposed to model.

A pilot study of the proposed approach can be found in [14]. This pilot study features automatic model generation using network scanning for a more specific EA metamodel, focusing on cyber security analyses. The pilot study to some extent illustrates the usefulness of the proposed approach for an EA metamodel closely related to the metamodel of the scanner. However, it does not thoroughly explore the reliability or validity of the approach, or how useful it is for a more general EA metamodel. This work explores the usefulness of the approach for a more general EA metamodel, exemplified using ArchiMate, arguably the most well-known EA metamodel. The analysis covers multiple key areas of EA modeling: (i) the comprehensiveness of the approach in terms of covered ArchiMate concepts, (ii) how the ambiguity in ArchiMate is handled, (iii) how much effort can be saved compared to manual modeling, (iv) how generated models can be maintained, (v) how accurate gathered data are, and (vi) the validity of the approach for EA metamodels in general.

Empirical data collected through an experiment estimates the accuracy and effort required to generate EA models through the proposed approach. Aspects not statistically studied are discussed accordingly. The Enterprise Architecture Analysis Tool [15] was extended in order to enable model transformation between automated network scanning and the ArchiMate metamodel.

The remainder of this paper is structured as follows: Sect. 2 presents related work. In Sect. 3 enterprise architecture metamodels are introduced and the metamodel of the ArchiMate language is described. Section 4 presents the concept of network scanners. Next, in Sect. 5 a mapping between the ArchiMate metamodel entities and network scanner elements is proposed. The approach is studied by analyzing empirical data from an experiment, as presented in Sect. 6. This section also contains the actual instantiations made through the approach. Section 7 discusses the advantages and shortcomings of the approach, and guidelines for practical usage of it. Finally, Sect. 8 concludes the paper.

2 Related works

In the enterprise architecture community, there are few initiatives focusing on the data collection process for model instantiation and maintenance. Among the most well-known frameworks, data collection is almost completely left to the modeler to handle. Some tool vendors provide support for data collection. However, in most cases, it is required that the needed information is compiled somewhere else, implying that the data must be collected by someone at some point in time. In the academic EA community, most researchers have put their focus on deriving principles and designing methods for model creation and maintenance. No researcher claims to focus on automatic data collection. Regarding automated network scanners, there is no previous work that has estimated their accuracy in terms of finding software and computer user accounts.

2.1 EA frameworks

There are many frameworks presenting and discussing EA modeling [2–4]. However, none of these describes or discusses the data collection process used when creating the architecture models. No practical help is presented in these frameworks regarding the data collection for as-is models or for updating already existing models (maintaining the architecture).

2.2 EA tools

In current EA tools, some approaches addressing automatic data collection can be found. The most common way is to import models that are made in third party software. For example, BizzDesign Architect [8] can import data from office applications and with this data instantiate models. Thereby, the automation aspect actually means that data is reused and does not need to be manually entered if it is already available. The interpretation of data documented in the third party software can, however, be resource- and time-consuming, thus contradicting parts of the purpose of automatic data collection.

Troux [9], for example, allows the usage of SQL queries in order to load information from databases. This approach focuses on the extraction of the data model and process descriptions, and thereby on the automatic creation of the information architecture as well as the business architecture.

Both approaches assume that the data entered in the third party applications or databases is already available and updated. However, this data still needs to be manually collected in the first place before it can be used. Thus, both BizzDesign Architect and Troux can automatically instantiate models, but not automatically collect the information in the organization.


MooD Business Architect [16] is another enterprise architecture modeling tool on the market. This tool has a component called Synchronization Activation Technology that enables import and export of data between architecture models and various data sources, for example, Microsoft Excel, Microsoft SQL Server, and ODBC compliant databases. However, all connections between architecture models and data sources need to be deployed and managed manually.

Sousa et al. [10] present the Blueprint Management System (BMS), a software tool and methodology used for collecting architecture information. The tool collects data from IT project plans. This approach thus aims to provide automatic data collection for architecture models. The approach is, however, still time consuming, since the data documentation process still needs to be formalized according to their specific format.

ARIS Business Architect for SAP [17] supports the reuse and import of SAP process models out of the SAP Solution Manager. This is similar to the method proposed in the present paper as no manual interpretation of SAP data is needed to generate EA models. An issue for the approach is, however, that not all organizations use SAP Solution Manager. In addition, the SAP process model only covers certain parts of the complete enterprise architecture, while other aspects of EA, such as infrastructure, are not considered. The method proposed in the present paper can be applied to almost any organization and involves collection of, in particular, infrastructure assets. Thus, the method used by ARIS Business Architect can be seen as a potential complement to the proposed approach.

There are other IT management tools, such as Configuration Management Data Bases (CMDBs) [18,19], that like scanners can collect information in an organization. If these were to be integrated with an EA tool, the combination could be used for instantiating architectural models.

2.3 EA methods and principles

Buckl et al. [13] present a Wiki-based approach to enterprise architecture documentation and analysis. According to Buckl et al., companies that start an enterprise architecture initiative usually do not have a pre-defined information model for this. Many companies start with regular spreadsheets or similar. Instead, Buckl et al. propose a Wiki-based approach for collecting and sorting the information needed for enterprise architecture management. The main benefit of the Wiki approach is that the data collection is distributed but still managed formally. Although the proposed Wiki-based approach seems interesting, there is still a need for data collection to provide input to the Wiki.

In [12], an approach for handling change is proposed. The approach is called Living Models and it is based on theories of model based software development and enterprise architecture management. The idea is to connect IT management, System Operation, and Software Engineering. According to the author, three major research challenges have to be met in order to materialize this: (i) provide a coherent view of the quality status of the systems, (ii) keep track of the quality status as the systems evolve over time, and (iii) support the collaboration of stakeholders for achieving the necessary quality level. Enterprise architecture models can be used to address the first of the three challenges. Automatic data collection would be appropriate for the second challenge. In the paper, Breu presents ten principles that are crucial for Living Models. Principle no. 2, Close Coupling of Models and Code, states: “Models are generated out of the code (e.g. architecture models)”. This would mean automatic generation of models at some architecture levels. Furthermore, Principle no. 3, Bidirectional Information Flow between Models and Code, focuses on the idea that information from code can be used to build models, just as information in models can be used to generate code. Throughout the ten principles, patterns and metamodel elements are discussed supporting these ideas. However, there is as yet no tool that can implement and use these principles.

In [20], the focus is on the design of the enterprise architecture. The main result is a framework for engineering driven EA design and a software tool implementation of this framework. There is, however, no description of the data collection process for the instantiation of models. The design of EA in [20] essentially concerns deriving the metamodel needed for an enterprise. The software tool implementation proposed is a tool incorporating the framework and metamodel. In [21], the focus is on model maintenance and the main finding reported in this publication is a discussion of the shortcomings of existing model maintenance approaches. The authors present a federated approach to deal with these shortcomings. However, there is no discussion regarding the data collection part of model maintenance.

2.4 Accuracy of automated network scanners

There is, to the authors' knowledge, no previous work that has studied the accuracy of automated network scanners regarding the detection of software and user accounts. There are, however, two studies on the topic of detecting vulnerabilities. Holm et al. [22] analyzed how many existing true vulnerabilities are properly detected by scanners, and how many non-existing vulnerabilities are falsely reported by them. In another study, Holm [23] also analyzed how many vulnerabilities would be removed if one were to follow all guidelines provided by vulnerability scanners (oftentimes thousands of pages). None of these studies, however, evaluates the scanners' accuracy in terms of finding software and user accounts, an objective of great importance towards measuring the value of the proposed approach.


2.5 Analysis and conclusions

The EA frameworks available today, such as TOGAF and the Zachman framework, provide very little data collection support. The research initiatives have ideas and suggestions for principles and methods regarding data collection support, but none has implemented these ideas in any EA modeling tool. When it comes to available EA modeling tools, some, like Troux and BizzDesign Architect, help the instantiation of models based on data already existing in other software or databases. Thus, these do not help in collecting data in the organization; they rather make use of data already available in other tools. Comparable to network scanners, CMDBs also collect information about applications and their relations. However, we are aware of no approach that has integrated a CMDB tool with an EA modeling tool. The only comparable work found is the Blueprint Management System (BMS) that collects data from IT project plans in an organization. It is, however, unclear how accurate the retrieved information is and how the information in these IT project plans is gathered and entered in the first place. For organizations using SAP systems, ARIS Business Architect can aid in the collection and modeling on a business process level. It thus covers some of the EA layers, but not the same layers as covered by the scanner and tool presented in this paper. These two might thus be complements to each other in order to provide a more complete EA model.

3 Enterprise architecture metamodels

A main concept in enterprise architecture is the metamodel, which acts as a pattern for the instantiation of the architectural models. In other words, a metamodel is a description language used to create models [2,4,6,24].

Most EA tools have a built-in metamodel that guides the modelers on what elements to instantiate. Some tools allow the users to change and define their own metamodels. In research, there are several metamodels proposed. Lankhorst [4] has probably published the most well-known and widespread metamodel, called ArchiMate. There is also a set of metamodels proposed focusing on different aspects of architectural analysis, e.g. modifiability [25], system quality analysis [26], interoperability [27], dependency analysis [28], and business value analysis [29].

3.1 The ArchiMate metamodel

In this paper, the automatic data collection approach using network scanners is instantiated with the ArchiMate metamodel since it is an open, independent, and general modeling language for enterprise architecture. The primary focus of ArchiMate is to support stakeholders in addressing concerns regarding their business and the supporting IT systems. ArchiMate is extensively presented in [4] and is partly based on the ANSI/IEEE 1471-2000, Recommended Practice for Architecture Description of Software-Intensive Systems, also known as the IEEE 1471 standard [30]. The Open Group accepted the ArchiMate metamodel as a technical standard [31] and as a part of The Open Group Architecture Framework (TOGAF) in 2009 [2].

The ArchiMate metamodel consists of three layers: the Business layer, the Application layer and the Technology layer. The technology supports the applications, which in turn support the business. Each layer consists of a number of entities and defined entity relationships.

The entities in each layer are categorized into three aspects of enterprise architecture: (i) the passive structure, modeling informational objects; (ii) the behavioral structure, modeling the dynamic events of an enterprise; and (iii) the active structure, modeling the components in the architecture that perform the behavioral aspects.

Figure 1 presents the ArchiMate metamodel. An overview of the different types of relations in the language can be found in Fig. 2; the entities utilized in the paper are described in Sect. 5.2. For detailed descriptions, cf. [4].

4 Automated network scanning

An automated network scanner is an appliance or software system that is used for scanning the architecture of a network and reporting information regarding the network architecture through TCP and UDP [32]. Network scanning involves assessing which hosts are active on the network, what operating systems they use, what services they run, what application clients are installed and any users that have privileges on the devices. Any device connected to the scanned network can be probed; it does not matter if it is, for example, a printer or a PC. Table 1 gives an overall list of the information provided by a network scanner. Three major types of data relevant for ArchiMate modeling can be collected: (i) computer hardware and IP addresses, (ii) user accounts present on a computer system, and (iii) software present on a computer system. Such software includes operating systems or firmware, application servers (i.e., end-points), and application clients. A few example signatures include Windows XP SP2, SAP GUI 7.1 (the client for accessing SAP applications), Oracle WebLogic Server Node Manager 10.3 and Apache HTTP Server 2.2. The signatures do, unfortunately, not always provide the correct results, which in practice means that, e.g., Windows 2000 Service Pack 4 can be identified as Windows XP Service Pack 1 and Apache Web-server 2.2.3 as Apache Web-server 2.0.0. Thus, it is of importance to evaluate how accurate these estimates are.
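One simple way to quantify such misidentification is to compare the signatures a scanner reports against a manually verified inventory. The following is a minimal sketch only: the function name, the host identifiers and the sample data are hypothetical, and exact string matching is an assumed scoring rule, not necessarily the accuracy measure used in the empirical study.

```python
def identification_accuracy(reported, actual):
    """Fraction of hosts whose reported signature equals the known one.

    Both arguments map a host identifier to a signature string.
    Hosts missing from the scanner report count as misidentified.
    """
    if not actual:
        raise ValueError("ground truth must not be empty")
    correct = sum(1 for host, truth in actual.items()
                  if reported.get(host) == truth)
    return correct / len(actual)


# Hypothetical ground truth and scan report for three hosts.
actual = {
    "host-a": "Windows 2000 Service Pack 4",
    "host-b": "Windows XP SP2",
    "host-c": "Apache Web-server 2.2.3",
}
reported = {
    "host-a": "Windows XP Service Pack 1",   # misidentified operating system
    "host-b": "Windows XP SP2",
    "host-c": "Apache Web-server 2.0.0",     # misidentified version
}
accuracy = identification_accuracy(reported, actual)
```

Exact matching is deliberately strict; a looser rule (for example, matching product name but ignoring version) would give a different figure for the same scan.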


[Figure: diagram of the ArchiMate metamodel. Columns: Information aspect, Behavior aspect, Structure aspect. Rows (layers) and the entities shown in each:
Business: Product, Value, Representation, Contract, Meaning, Business Service, Business Interface, Business Collaboration, Event, Business object, Business process/function, Business role, Business Actor.
Application: Application Service, Application Interface, Application Collaboration, Data Object, Application Function/Interaction, Application Component.
Technology: Infrastructure Service, Infrastructure Interface, Artifact, Node, Communication path, System Software, Device, Network.]

Fig. 1 The ArchiMate metamodel in the notation that was suggested in [4]

[Figure: relation types shown: Specialization, Assignment, Used by, Association, Composition, Realization, Flow, Junction, Aggregation, Triggering, Access.]

Fig. 2 An overview of relations available in the ArchiMate metamodel [4]

A network scan can be either authenticated or unauthenticated. During an authenticated scan the scanner is given authentication parameters (i.e., credentials) of systems to enable more detailed and presumably also more accurate scans. Authenticated scans are typically less disruptive on deployed services as they do not need to probe as much and thus are less intense. However, it is not always the case that credentials are readily available for the individual(s) performing a scan, and it can be seen as intrusive as the systems' local files are probed. This study assesses


Table 1 The output of automated scans

Attribute                                                               Example
System information
  MAC address                                                           000C290326CC
  IP address                                                            173.18.3.1
User information
  User accounts on system                                               John Doe
Software
  Operating system type and version                                     Windows XP SP2
  Application server (i.e. end-point) port, protocol, type and version  80, HTTP, Apache HTTP Server 2.2.1
  Application client type and version                                   Adobe Reader 9.0
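The attributes in Table 1 can be pictured as one record per scanned system. The sketch below is illustrative only: the class and field names are assumptions chosen for readability and are not taken from any scanner's actual report format; the sample values are the examples from Table 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    """An application server listening on a port (Table 1, 'Application server')."""
    port: int
    protocol: str
    product: str
    version: str


@dataclass
class HostRecord:
    """One scanned system, holding the three kinds of data listed in Table 1."""
    mac_address: str
    ip_address: str
    user_accounts: List[str] = field(default_factory=list)
    operating_system: str = ""
    endpoints: List[Endpoint] = field(default_factory=list)
    application_clients: List[str] = field(default_factory=list)


# Values taken from the example column of Table 1.
host = HostRecord(
    mac_address="000C290326CC",
    ip_address="173.18.3.1",
    user_accounts=["John Doe"],
    operating_system="Windows XP SP2",
    endpoints=[Endpoint(80, "HTTP", "Apache HTTP Server", "2.2.1")],
    application_clients=["Adobe Reader 9.0"],
)
```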

the capability of both authenticated and unauthenticated scans.

5 A framework for automatic EA model generation

In order to enable generation of EA models there is a need to map, or relate, the metamodel of the data collection tool (automated network scanning) to the chosen EA metamodel (in this case ArchiMate). This section describes the methodology for relating ArchiMate to automated network scanning and the aspects of the ArchiMate metamodel that are possible to instantiate. The data quality of an automated network scanner is studied in Sect. 6.3. Other aspects of importance towards the reliability and validity of the proposed approach are discussed in Sect. 7.

5.1 Methodology for integrating ArchiMate and automated scanning

The methodology for mapping the output of an automated network scan and the ArchiMate metamodel involves two parts: (i) a means of relating the metamodels, and (ii) a tool for conducting the chosen translation.

The metamodels were related during this project through manual effort by the researchers. The guidelines of the ArchiMate metamodel [4] and the NeXpose scanner [33] were utilized for this purpose (cf. Sect. 6). However, as automated scanners have the same general content, the translation process would be next to identical for any scanner.

An existing software tool [15] (see footnote 1) was extended to enable a means of model transformation: automatically generating models based on a chosen mapping. This tool consists of two parts to be used in succession. The first component allows the definition of metamodels (1 in Fig. 3). These metamodels can either describe a certain system quality of interest or be a general-purpose enterprise architecture metamodel such as the one described in Sect. 3. EA models representing specific scenarios are created utilizing the constraints defined in the chosen metamodel (e.g., ArchiMate, 2 in Fig. 3).

The tool allows the definition of viewpoints as suggested by [4]. Thereby subparts of the model can be formatted and processed in so-called views. They help to focus on the information that is relevant for a certain domain or stakeholder and hide aspects that are not important. As the metamodel of ArchiMate is used, viewpoints are created according to the layers and aspects that are specified in [4].

To use the results gained from an automated scan, an extension of the tool was necessary. Automated network scanning allows results to be exported into XML files (4 in Fig. 3). The XML files are structured according to a schema definition file (XSD) (3 in Fig. 3). The possibility to create mappings between XSD files and metamodels (5 in Fig. 3) was added to the tool in order to automatically instantiate the metamodel based on a scanner's XML files (6 in Fig. 3). Figure 3 visualizes the discussed extension.

Fig. 3 The extension of the EA tool to enable the mapping presented in Sect. 5

Footnote 1: The software tool can be downloaded at http://www.ics.kth.se/eat.


Fig. 4 Definition of the mapping within the tool; the ArchiMate metamodel can be found to the left, the XSD describing NeXpose reports to the
right, and the specified mapping in the center

Each scanner has a slightly different way of denominating concepts in the XSD; however, the main characteristics of an XSD stay the same. Part of the XSD utilized by the scanner NeXpose can be seen in Fig. 5. The model transformation is carried out by specifying which aspects of the XSD should be mapped to which concepts in ArchiMate. Figure 4 illustrates how the mapping specification is done within the tool. Once a mapping has been specified, the scanner output is parsed using the Document Object Model and thereafter queried using a self-developed algorithm. Based on the result of the querying, the instantiated EA model is created. An example translation, utilized in the present study, is given in Sect. 6.1.

5.2 Translated concepts

The key characteristic in terms of translating contents of the XSD and XML to the ArchiMate metamodel is that of ambiguity. That is, the concepts of ArchiMate are, like those of other enterprise architecture metamodels, possible to interpret in many different ways. As a consequence, one type of mapping can be useful for one purpose and useless for another. For example, some enterprises might not be interested in modeling software such as Adobe Reader and Apache Web-server, or computer system accounts such as “John Doe”. Such an enterprise might only need data about larger applications such as SAP Solution Manager and Oracle Enterprise Manager, and departments rather than computer user accounts. Manual effort is needed to fulfill this type of scenario. Naturally, more complex requirements need more manual work for specifying the ruleset of the model transformation.

The rest of this section describes the concepts of the ArchiMate metamodel that can be mapped to the output of an automated network scanner. Given the purpose of this study, the translation is carried out from the viewpoint of ArchiMate. ArchiMate relations can be derived to connect all entities modeled except business actors; the scanner's metamodel relates business actors to devices and system software, and as ArchiMate prescribes relating business actors to services this study does not consider it an adequate representation of ArchiMate's metamodel.

A Business Actor is an organizational entity capable of (actively) performing behavior [4]. A scanner collects all user accounts of computer systems. It is possible to relate these different actors to, for example, departments. However, such a mapping would naturally require additional effort from the modeler performing the translation.

An Application Component is a modular, deployable, and replaceable part of a system that encapsulates its contents and exposes its functionality through a set of interfaces [4]. A scanner collects data on various application components such as different ERP system modules and application clients such as Adobe Reader.


<xsd:complexType name="nodesType">
  <xsd:sequence>
    <xsd:element name="node" type="nodeType" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="nodeType">
  <xsd:sequence>
    <xsd:element name="names" type="namesType" minOccurs="0"/>
    <xsd:element name="fingerprints" type="fingerprintsType" minOccurs="0"/>
    <xsd:element name="software" type="softwareType" minOccurs="0"/>
    <xsd:element name="tests" type="testsType"/>
    <xsd:element name="endpoints" type="endpointsType" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="address" type="xsd:string"/>
  <xsd:attribute name="status" type="nodeType_statusType"/>
  <xsd:attribute name="hardware-address" type="xsd:string" use="optional"/>
</xsd:complexType>
<xsd:complexType name="namesType">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="fingerprintsType">
  <xsd:sequence>
    <xsd:element name="os" type="osType" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="osType">
  <xsd:attribute name="certainty" type="xsd:string"/>
  <xsd:attribute name="device-class" type="xsd:string" use="optional"/>
  <xsd:attribute name="vendor" type="xsd:string"/>
  <xsd:attribute name="family" type="xsd:string"/>
  <xsd:attribute name="product" type="xsd:string"/>
  <xsd:attribute name="version" type="xsd:string" use="optional"/>
  <xsd:attribute name="arch" type="xsd:string" use="optional"/>
</xsd:complexType>

Fig. 5 An excerpt of the NeXpose XSD
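Section 5.1 states that the scanner output is parsed using the Document Object Model and then queried. The following is a minimal sketch of that step using Python's standard DOM implementation; it is not the tool's self-developed algorithm. The sample report is hypothetical: its element and attribute names follow the XSD excerpt in Fig. 5, while the root element name, the `status` value and all data values are assumptions.

```python
from xml.dom.minidom import parseString

# Hypothetical report fragment shaped after the XSD excerpt in Fig. 5;
# the root element name "nodes" and all values are assumptions.
SAMPLE_REPORT = """
<nodes>
  <node address="172.18.1.3" status="alive" hardware-address="000C290326CC">
    <names><name>host-a</name></names>
    <fingerprints>
      <os certainty="0.85" vendor="Microsoft" family="Windows"
          product="Windows XP" version="SP2"/>
    </fingerprints>
    <tests/>
  </node>
</nodes>
""".strip()


def extract_hosts(xml_text):
    """Return one dict per <node>, with its address data and OS guesses."""
    document = parseString(xml_text)
    hosts = []
    for node in document.getElementsByTagName("node"):
        os_guesses = []
        for os_el in node.getElementsByTagName("os"):
            os_guesses.append({
                "product": os_el.getAttribute("product"),
                "version": os_el.getAttribute("version"),
                "certainty": os_el.getAttribute("certainty"),
            })
        hosts.append({
            "ip": node.getAttribute("address"),
            "mac": node.getAttribute("hardware-address"),
            "os": os_guesses,
        })
    return hosts


hosts = extract_hosts(SAMPLE_REPORT)
```

Because `os` may occur several times per node (`maxOccurs="unbounded"`), the sketch keeps every guess together with its `certainty` attribute rather than picking one.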

An Application Interface declares how a component can connect with its environment [4]. If a scanner finds an Application Component running on an end-point (i.e., a port) it will provide information about how it communicates (e.g., type of protocol).

A System Software is a software environment for specific types of application components and data objects that are deployed on it in the form of artifacts [4]. A scanner identifies several types of system software such as web servers and operating systems.

An Infrastructure Interface is a point of access where the functionality offered by a node can be accessed by other nodes and application components [4]. A network scanner details the protocol (e.g., SMTP) and port (e.g., 25) which a software end-point utilizes. It is also used to relate an operating system to the software employed on it.

A Device is a physical computational resource upon which artifacts may be deployed for execution [4]. A network scanner provides information about the hardware address and IP address of a system.

A Network is a physical communication medium between two or more devices [4]. As network scanners provide data regarding IP addresses of computer systems it is possible to detail networks.

All categories of the ArchiMate Structural aspects are thus possible to map, and one of three parts of the metamodel's Behavioral aspects. All but one of these entities (System Software) belong to the Structural aspects, which is also quite logical, considering that an automated scanner only gathers information available through remote queries on a computer network.

6 Empirical study of the proposed approach

This section describes how the proposed approach was tested through an experiment on an actual network. Due to the ambiguity of the ArchiMate metamodel (cf. Sect. 5.2) there is a need to define the actual translation that is employed. The translation which is used for the empirical study is described


Table 2 Mapping of ArchiMate entities and the output of automated scans

ArchiMate entity             Scanner output           Example scan output
Business actor               User account             John Doe
Infrastructure interface     Application protocol     HTTP, FTP, SMTP
System software (end-point)  Application server       Apache Web-server 2.2.3
Application component        Application client       Adobe Reader 9.0
System software (OS)         Operating system         Windows XP SP2
Device                       IP, MAC addresses        172.18.1.3, 000C290326CC
Network                      A range of IP addresses  172.18.1.*
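The Network row of Table 2 describes a Network as a range of IP addresses such as 172.18.1.*. A minimal sketch of how such ranges could be derived from scanned addresses is given below. It assumes that every Network corresponds to a /24 block, which the wildcard notation suggests but the paper does not state; the function names and the sample addresses are illustrative only.

```python
import ipaddress

def derive_networks(ip_addresses):
    """Group scanned IPv4 addresses into /24 networks (assumed granularity)."""
    networks = set()
    for ip in ip_addresses:
        # strict=False lets a host address stand in for its enclosing network
        networks.add(ipaddress.ip_network(f"{ip}/24", strict=False))
    return sorted(networks)

def wildcard_label(network):
    """Render a /24 network in the wildcard style of Table 2."""
    octets = str(network.network_address).split(".")
    return ".".join(octets[:3]) + ".*"

# Illustrative addresses, not scan output from the study
scanned = ["172.18.3.2", "172.18.3.5", "172.18.4.5"]
print([wildcard_label(n) for n in derive_networks(scanned)])
# ['172.18.3.*', '172.18.4.*']
```

Two addresses in the same /24 block collapse into one Network entity, so the three sample addresses yield two entities.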

in Sect. 6.1. Section 6.2 provides details regarding the experimental setup. Section 6.3 analyzes the reliability of the proposed approach. Sections 6.4 and 6.5 provide descriptive results about the models created during authenticated and unauthenticated scans of the network. The scanner NeXpose [33] was chosen as it has demonstrated good results in previous tests [22]. However, as almost all scanners have very similar network scanning methodologies and signatures (built on Nmap [34]), we believe that this study should be indicative of the accuracy of other available automated network scanners as well.

6.1 An example translation

This paper provides an example mapping that illustrates the usefulness of the proposed approach. The exemplified translation covers all ArchiMate related information that a scanner can provide, given that a minimal manual effort is utilized to perform the translation. That is, there has been no manual effort conducted to, for example, detail any services provided or to relate computer user accounts into groups. As a consequence, it can be seen as a general type of mapping which requires very little effort to perform. There was no focus placed on more enterprise-critical systems such as ERP systems simply because these were not available in the experiment network. The translation between scanner output (XML) and ArchiMate input was handled through the software tool presented in Sect. 5.1, but any model transformation tool capable of translating XML to ArchiMate concepts would suffice.

A summary of the utilized mapping can be seen in Table 2. This mapping was made from the perspective of the ArchiMate metamodel as the purpose of the study is to evaluate how enterprise architecture modeling can benefit from automated network scanning. Six different ArchiMate entities can be automatically generated based on this framework, namely: Business Actor, Infrastructure Interface, System Software (Operating System and end-point), Application Component, Device, and Network. Relations can be instantiated to connect all entities except business actors, for reasons described in Sect. 5.2.

A Business Actor is translated as a user account on a computer system. An application server is translated to a System Software (end-point). An Application Component is modeled as an application client residing on a scanned system (for example, Adobe Reader 7.0). A System Software (OS) is seen as the actual operating system employed on the probed system. An Infrastructure Interface is modeled as the application protocol of a System Software (end-point). An Infrastructure Interface is also needed to relate different Application Components to System Software (OS). That is, a scanner does not detail the specific Infrastructure Interfaces between different Application Components and System Softwares (OS), but one is needed to connect them. A Device is described through its IP and MAC address. A Network is a range of IP addresses found during the scan. For example, if the scanner finds IPs 172.18.3.2, 172.18.3.5 and 172.18.4.5 then two Network entities would be instantiated: 172.18.3.* and 172.18.4.*.

A more detailed version of the mapping between the NeXpose XSD and the ArchiMate implementation in the software tool used for the model transformation (cf. Sect. 5.1) can be seen in Table 3. For example, a Device is instantiated by the XSD concept “node (nodeType)”, denominated through two of the attributes of this concept: “address” and “hardware-address”. An overview of all relations instantiated can be seen in Table 4. The directions of the relations are described through their source entities and target entities, using both ArchiMate terminology and NeXpose XSD terminology. The reader is referred to Sect. 3 for information about these types of relations.

6.2 The experimental setup

The main experimental setup was designed by the Swedish Defence Research Agency with the support of the Swedish National Defence College. The environment was set to describe a simplified critical information infrastructure at a small electrical power utility and was composed of 20 physical computer servers running a total of 28 virtual machines, divided into four network segments. Various operating systems and versions thereof were used in the network, e.g.,


Table 3 The implemented mapping between entities

ArchiMate                   NeXpose XSD
Business Actor              test (testType)
  Name                      id
  Free Text                 key
  Name                      status
  Required                  cifs-acct-password-never-expires
Infrastructure Interface    endpoint (endpointType)
  Name                      protocol
  Name                      port
System Software             fingerprint (service_fingerprint_Type)
  Name                      product
  Name From Parent          service->name
  Free Text                 –
  Name                      version
Application Component       fingerprint (fingerprintType)
  Name                      product
  Free Text                 –
  Name                      version
System Software             os (osType)
  Name                      product
  Free Text                 –
  Name                      version
Device                      node (nodeType)
  Name                      address
  Free Text                 –
  Name                      hardware-address
Infrastructure Interface    fingerprint (fingerprintType)
  Name                      product
  Free Text                 –
  Name                      version
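The entity mapping of Table 3 can be illustrated with a small sketch that reads a scan fragment and emits named ArchiMate entities. Several things here are assumptions rather than facts from the paper: the XML fragment is a hypothetical, simplified stand-in for real NeXpose output, the element nesting is invented for illustration, and the reading of Table 3 (each "Name" row contributes an attribute value and each "Free Text" row a literal separator) is an interpretation. The function names are illustrative, and the "Name From Parent" row is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified scan fragment; real NeXpose output is richer
SAMPLE_XML = """
<node address="172.18.1.3" hardware-address="000C290326CC">
  <os product="Windows XP" version="SP2"/>
  <endpoint protocol="tcp" port="80">
    <fingerprint product="Apache Web-server" version="2.2.3"/>
  </endpoint>
</node>
"""

def entity_name(element, first, second, separator=" - "):
    """Join two attribute values into one entity name (assumed naming rule)."""
    values = [element.get(first), element.get(second)]
    return separator.join(v for v in values if v)

def map_entities(xml_text):
    """Return (ArchiMate entity type, name) pairs for one scan fragment."""
    root = ET.fromstring(xml_text)
    entities = []
    for node in root.iter("node"):
        entities.append(("Device", entity_name(node, "address", "hardware-address")))
        for os_el in node.iter("os"):
            entities.append(("System Software (OS)", entity_name(os_el, "product", "version")))
        for endpoint in node.iter("endpoint"):
            entities.append(("Infrastructure Interface", entity_name(endpoint, "protocol", "port")))
            for fp in endpoint.iter("fingerprint"):
                entities.append(("System Software (end-point)", entity_name(fp, "product", "version")))
    return entities

for kind, name in map_entities(SAMPLE_XML):
    print(f"{kind}: {name}")
```

Keeping the mapping in one function makes it easy to swap in a different metamodel, which is the kind of flexibility the paper attributes to any XML-capable model transformation tool.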

Table 4 The implemented mapping describing relationships

ArchiMate relation  Source                         Target
Assignment          Device                         System software
                    node (nodeType)                fingerprint (service_fingerprint_Type)
Assignment          Device                         System software
                    node (nodeType)                os (osType)
Used by             Application component          Infrastructure interface
                    fingerprint (fingerprintType)  fingerprint (fingerprintType)
Used by             Infrastructure interface       System software
                    fingerprint (fingerprintType)  os (osType)
Used by             Infrastructure interface       System software
                    endpoint (endpointType)        fingerprint (service_fingerprint_Type)

The first row of each section shows the related ArchiMate concepts; the second the corresponding NeXpose XSD concepts


Windows XP SP2, Debian 5.0 and Windows Server 2003 SP1. Each host had several different network services operating, e.g. web-, mail-, media-, remote connection- and file sharing services. More information about the environment can be found in [35,36].

6.3 Data quality provided by the scanner

Accuracy is of great importance to the reliability of the proposed approach. If the output from an automated network scanner is erroneous then the resulting generated EA model(s) will not be reliable. Some aspects can however be more important than others. For instance, it is typically of great importance that the correct applications and systems are identified; but it might not be important that their actual version numbers are correct.

This study assesses the accuracy of Devices, System Softwares, Application Components and Infrastructure Interfaces. The accuracy of Business Actors was unfortunately not possible to study because the virtual images of the studied systems were mistakenly manipulated before many users had been manually logged. However, the difference between the number of generated Business Actors during authenticated (cf. Sect. 6.4) and unauthenticated (cf. Sect. 6.5) scans displays the relative difference between the two scan types. Also, the accuracy of the Network entity was not assessed as all the Devices in the network were part of the same LAN, thus only generating a single Network entity (with relations to all devices). Several important concepts not available in the current ArchiMate metamodel were also analyzed: as it can be valuable for enterprises to know the versions of their software (e.g., for patch management), the accuracy of System Software versions and Application Component versions was also assessed. All Devices, System Softwares (OS), System Software (end-point) and Infrastructure Interfaces in the experimental setup were analyzed. Studied client-side Application Components were however selected from the entire pool of available software clients through simple random sampling as there were too many to study all such components.

An accurate assessment is one that is correct, i.e., if a system has the System software Windows XP with Service Pack 2 and it is identified as Windows XP Service Pack 1 then it will be accurate in terms of System Software (OS), but inaccurate in terms of System Software (OS version). Thus, the accuracy of each variable is the mean value of an independent sequence of correct/incorrect (0/1) answers. These characteristics make it reasonable to assume binomial distributions, fulfilling the requirements for performing two-tailed hypothesis tests of the assessed results. When a p value is mentioned in this chapter it refers to results from two-tailed hypothesis tests [37]. The p value is used to explain the statistical difference between sources of data, and a p value of less than 0.05 is a commonly used reference value for claiming that there is a significant difference between the compared sources of data. A p value of less than 0.05 implies that there is less than 5% probability that the assessed differences between two or more sources of data are due to random variation.

As can be seen in Table 5, authenticated scanning is equally accurate or more accurate than unauthenticated scanning in all aspects. Application Components show extremely significant differences because client-side software (the tested type of Application Component) cannot be assessed without giving the scanner credentials. The authenticated scan is also clearly better suited for assessing version numbers of System Software, but fairly similar when it comes to the type of System Software. The statistical tests support this assessment; there were significant differences when assessing System Software (OS version) (p = 0.0089) and System Software (end-point version) (p < 0.001) but no significant differences for System Softwares (OS) (100% for both types of scans) and System Software (end-point) (p = 0.27).

6.4 Model created using an authenticated scan

An authenticated automated scan was performed on the architecture described in Sect. 6.2 using NeXpose. The resulting model consists of 110 Application Components, 335 Infrastructure Interfaces, 890 Business Actors (the large majority of these actors being on the server systems, e.g. a dns server, a web server and a mail server), 28 Devices, 195 System Softwares (of which 28 for operating systems), 1 Network, and 679 relations between them. These entities and relations were instantiated through the mapping described in Sect. 5. Figure 6 graphically illustrates the result for one of the systems (172.18.2.15). While the complete model is too large to display in the paper, the interested reader is welcome to receive it through contact with the authors.

6.5 Model created using an unauthenticated scan

The generated model from an unauthenticated automated scan can be seen in Fig. 7 (the Device 172.18.2.15). The resulting model consists of 227 Infrastructure Interfaces, 106 Business Actors, 28 Devices, 198 System Softwares (28 of which were operating systems), 1 Network, and 462 relations between them. As for the model generated by the authenticated scan, the interested reader is welcome to contact the authors. While the accuracy of Business Actors could not be evaluated, it is clear that the authenticated scan is much more potent in this aspect (890 compared to 106 generated Business Actors).


Table 5 Accuracy of unauthenticated and authenticated scans for different types of automatically assessed ArchiMate related data

Variable                             Accuracy unauth. (%)  Accuracy auth. (%)  p value  Samples
Device                               100.00                100.00              a        28
System software (OS)                 100.00                100.00              a        28
System software (OS version)         62.50                 100.00              0.0089   28
Infrastructure interface             89.66                 91.38               0.66     116
System software (end-point version)  67.33                 92.00               <0.001   110
Application component                0.00                  100.00              <0.001   48
Application component (version)      0.00                  100.00              <0.001   48

a Not possible to compute
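The paper does not state which two-tailed test produced the p values in Table 5. As a rough illustration only, the sketch below uses a pooled two-proportion z-test, one common choice for comparing two binomial accuracies; the function name is illustrative, and the counts assume that each scan type was assessed on the full number of samples listed in the table.

```python
import math

def two_proportion_p_value(correct_a, n_a, correct_b, n_b):
    """Two-tailed p value for the difference between two accuracies.

    Uses a pooled two-proportion z-test (an assumed choice of test).
    Returns None when the pooled standard error is zero, i.e. when
    both samples are entirely correct or entirely incorrect.
    """
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return None
    z = (correct_a / n_a - correct_b / n_b) / std_err
    # Two-tailed tail area of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Device: 100 % accuracy for both scan types, 28 samples
print(two_proportion_p_value(28, 28, 28, 28))          # None
# Application component: 0 % vs 100 % accuracy, 48 samples
print(two_proportion_p_value(0, 48, 48, 48) < 0.001)   # True
```

The first call shows why a test statistic cannot be computed when both scan types are perfectly accurate, which matches footnote a of the table. The second agrees with the reported p < 0.001 for Application Components, although the exact figures in the paper may stem from a different test.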

7 Discussion

This chapter critically examines the applicability of the approach proposed in this paper.

Automated network scanning can only instantiate a subset of all the entities in common EA metamodels. It is true that one can only assess a subset of the entities that are of relevance for an EA model. Most of the entities in the structural aspects in the ArchiMate metamodel are however possible to generate. Furthermore, the IT parts of an EA model are often the most labour-intensive to model due to the large multiplicities involved (e.g., all software in an office LAN) and an automated scan alleviates most of the manual work required to do this. Thus, it can be seen as a potent way to save resources in terms of modeling.

Automated network scanning can only identify software that it has signatures for. If an organization has deployed software that is not commercial off-the-shelf, the scanner will not be able to correctly identify it (as there are no signatures to identify it). At best, such software will be denoted as “unknown” by the scanner; at worst, it will be identified as the wrong product. To manage this type of scenario, there is a need to manually model these types of software.

Automated network scanning provides too detailed information. EA is typically seen as a tool for understanding the “big picture”, whereas network scanners mainly collect data on IT infrastructure components. Thus, much of the information provided by scanners would be omitted by many EA initiatives. For example, some enterprises might want to generate models describing only enterprise applications such as SAP Solution Manager and Oracle Enterprise Manager rather than all infrastructure assets. An important task for such enterprises when employing automated scanning for generation of EA models is thus to delimit the information provided by the scanner. Luckily, as software is uniquely identified in the scanner output it is very simple and requires little effort to add a filter that describes what software to model, and what software not to model. We believe that this constitutes an advantage for the proposed approach; it can satisfy a range of user needs, from those who might only want to model the most enterprise significant concepts to those who require very detailed views of their architectures (e.g., when the EA goal is to manage IT cost or IT security). Naturally, more complex architecture models also require more focus on useful model viewpoints.

Ambiguity. A challenge for the individual(s) performing the translation between scanner output and EA metamodel input is the general ambiguity of EA metamodels. That is, one concept can be interpreted in many different ways. An example of this property is the ArchiMate concept Business Actor. This entity can, for example, describe either an individual or a department. An automated scanner finds Business Actors on the level of accounts on computer systems, an abstraction level that typically implies an individual rather than a department. As noted in the previous paragraph, it is possible to customize the output of a scanner; for example, to relate different individuals to departments. However, more complex requirements need more manual effort to implement.

How much time can be saved using this approach? Närman et al. [38] present a study on data quality using ArchiMate in which 20 entities (Application Components, Application Services, Business Processes and Data Objects) and 19 relations were instantiated. One task in their study was to measure how much time it took to perform this modeling. A comparison of these results and the results from modeling through automated scanning (both authenticated and unauthenticated) can be seen in Table 6. The time necessary for deploying and tuning the scanner, and the time required to perform the mapping, are not part of this table as we did not formally capture them; however, we estimate that the deployment and tuning of the scanner took about 2 h, and that the model mapping required approximately 3 h. Notable is that the majority of the effort required to manage these aspects corresponds to their deployment: once a scanner has been deployed and tuned only minor effort is typically required to re-tune it for future scans. Similarly, future changes of a metamodel typically require minor effort compared to the first mapping.

Although a simple comparison such as this is highly informal and too simplistic (e.g. different entities and relations


Fig. 6 A node in the model generated by an authenticated scan

require varying effort to model, and effort is unlikely to scale linearly with entities/relations), it clearly highlights the applicability of automatic modeling: it is simply less time-consuming than manual modeling. A notable limitation of the current implementation of the proposed approach is however that the result from a new scan cannot be appended to a previous one; each new scan generates a completely new model. This is an issue which we aim to address in the future.

Automated network scanning only generates models representing single points in time. This problem is however present also for manual approaches, but in a much more evident way as it is more resource demanding and thus unlikely to be


Table 6 Comparison of modeling effort in hours (hh) and minutes (mm) for the proposed approach and results by Närman et al. [38]

Variable                                 Time (hh:mm)  Entities  Relations
Modeling by Närman et al. [38]           5:12          20        19
Modeling using an authenticated scan     3:08          1,558     679
Modeling using an unauthenticated scan   1:23          558       462

carried out as often. Updating a model through an automated scan could in theory, given a network such as the one in this study, be carried out every second (unauthenticated) or third (authenticated) hour.
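The durations in Table 6 can be put on a per-element basis with a short calculation. This is only a rough sketch: as the paper itself notes, effort is unlikely to scale linearly with the number of entities and relations, and the table excludes the estimated 2 h of scanner deployment and 3 h of mapping work. The dictionary labels and function names are illustrative.

```python
def to_minutes(hhmm):
    """Convert an 'hh:mm' duration string to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

# (time, entities, relations) as reported in Table 6
TABLE_6 = {
    "Manual modeling (Närman et al. [38])": ("5:12", 20, 19),
    "Authenticated scan": ("3:08", 1558, 679),
    "Unauthenticated scan": ("1:23", 558, 462),
}

def minutes_per_element(table):
    """Minutes spent per instantiated entity or relation."""
    return {
        label: to_minutes(duration) / (entities + relations)
        for label, (duration, entities, relations) in table.items()
    }

for label, value in minutes_per_element(TABLE_6).items():
    print(f"{label}: {value:.3f} min per element")
```

Under these assumptions the manual study works out to 8 minutes per element, while both scan types stay below a tenth of a minute per element.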
Data quality of an automated scan. The quality provided using automated scanning seems to be very high in general. However, the best results are provided through authenticated scanning as it is difficult to identify users and not possible to identify client-side application components (e.g., Adobe Reader) with unauthenticated scanning. Additionally, software version assessments are significantly more accurate (p < 0.01) with this type of scan. One must furthermore keep in mind that manual data collection also suffers from data quality problems.

How can “the rest” of ArchiMate’s metamodel be automatically generated? A few other practically applicable ways to automatically generate models are presented in Sect. 2. We believe that these approaches show that the EA research field is beginning to grasp the importance of resource-effective generation and maintenance of models. The current approaches are already able to instantiate (if correctly implemented) many of the EA metamodel concepts. However, there is still ample work that needs to be carried out. The proposed approaches (including ours) all have several drawbacks and instantiated models need to be interpreted with care.

Generalizing the approach to other EA metamodels. The proposed approach has been tested both against a general EA metamodel (ArchiMate) and a more specific EA metamodel for cyber security analysis [14]. Both studies show sound results, and as such there is no reason to believe that automated network scanning cannot be used to gather data also for other EA metamodels (e.g., [2,3]).

Automated network scanning can cause denial of service issues. Sometimes scanners affect the services they probe. It happened in the study presented in this paper when a Debian 5.04 firewall was configured with many unnecessary services running. In order to scan this device and its subnets without disruptions all non-critical services had to be removed. Thus, automated scanning should be used with care in environments with high availability requirements. Valuable non-redundant IT assets with high availability requirements should be delimited from scans and instead be manually modeled.

Fig. 7 A node in the model generated by an unauthenticated scan


Validity and reliability of the field test. The network architecture tested was a virtual environment. Using a virtual environment can degrade performance, for example through packet loss. This problem is however mainly evident in very large virtual environments and not small subnets as evaluated in this study, and should thus be a minor issue [39–41]. This study can only evaluate the automated scanner's ability to assess the entities in the studied network architecture, which only covers a very small share of the operating systems and external and local application services that are currently available on the market. The different products deployed in the network architecture were however of diverse nature. Thus, we believe that this study gives a good hint towards the general accuracy of an automated network scan. A shortcoming of the experiment is that it did not evaluate the effort required to specify large scale applications and application suites such as customer relationship management systems, geographical information systems, and asset management systems. This delimitation was chosen as there were no such systems in place in the experiment network. In terms of accuracy this delimitation should have a very minor influence; there is no reason to believe that there is a difference in accuracy for such components compared to what was addressed during the experiment. The required effort depends much on the requirements of the modeler. That is, it would not take more than a minute's effort to only generate ArchiMate concepts for e.g. the vendor SAP, whereas relating different user accounts to departments might be very time consuming (depending on the number of users and departments).

8 Conclusions

This paper proposes a method for automatic generation of EA models with respect to the complex IT architectures of enterprises. A previous study tested the approach on an EA metamodel for cyber security analysis [14]. The present study mapped the metamodel of ArchiMate to the output of automatic network scanners. The proposed method was empirically investigated through studying several different variables: (i) how reliable the results of the method are, (ii) how much of the metamodel context is captured and (iii) how resource efficient the proposed method is. The proposed method offers reliable results, especially when the scanner is given system credentials. The generated entities can represent different ArchiMate interpretations, depending on the stated requirements (cf. Sect. 5).

There are several implications of this study, of which the arguably most important are: This study clearly displays the need for, and applicability of, automatic data collection for EA models. Furthermore, it provides both academia and industry with a readily available model transformation tool for the purpose of automatic generation of EA models (cf. Sect. 5.1) capable of reducing large amounts of effort. As an automated network scanner is used by most organizations and some scanners are available for free (e.g. [34]) the required resources to implement the proposed approach should be small.

8.1 Future work

There are however some issues with the proposed approach that should be further researched. Different enterprises (and environments) will likely require various degrees of manual effort in terms of specifying the mapping and managing generated models. One method for decreasing the effort for the modeler could be to create a set of common default mappings that could be more easily manipulated to suit the needs of different contexts. Another option could be to have a detailed guide for usage of the model (some foundations for such a guide could be the contents of Sect. 7). We believe that the optimal solution lies with a mix of these methods, but that it depends on the general requirements of different enterprises and the manual effort needed to fulfill these requirements. That is, if most enterprises have a few key requirements that need major effort to manage, it would be beneficial to have default mappings that fulfill these needs. Similarly, if the requirements of each enterprise are unique, most effort should be spent to specify a user-friendly guide for mapping. In future work we aim to analyze these aspects through case studies at different enterprises.

In terms of future work for automatic generation of EA models in general, there are various aspects that are of importance. A number of significant topics have been discussed in this paper: (i) how comprehensive the approach is (e.g., in terms of covered ArchiMate concepts), (ii) how the translation is managed (e.g., in terms of ambiguity), (iii) how much effort is required (and can be saved), (iv) how generated models can be maintained, and (v) how accurate the gathered data are. Future work should focus on aspects that are cumbersome to model and maintain manually, yet available to collect with little effort in most enterprises. As a first step to enable such work, a collection of commonly required data that are cumbersome to model should be compiled. This collection could then be used to identify the key areas in terms of automatic data collection support. As discussed in the previous paragraph, it could also enable default mappings that could significantly improve the usability of existing solutions such as MooD Business Architect [16].

Future work would also benefit from mapping to a holistic data collection methodology. A common standard would enable comparisons of different approaches in terms of, for example, EA metamodel comprehensiveness, data sources, and actors required to involve when gathering and maintaining data. Unfortunately, there is no such accepted standard


as of yet. While a methodology such as Living Models [12,42] (cf. Sect. 2.3) could become a standard for categorization of automatic data collection methods, it (and others like it) is currently not tailored for categorizations. Thus, future research should also address categorization of data collection methods for EA models.

References

1. Ross, J.W., Weill, P., Robertson, D.: Enterprise Architecture As Strategy: Creating a Foundation for Business Execution. Harvard Business School Press, Boston (2006)
2. The Open Group: The Open Group Architecture Framework (TOGAF), version 9, The Open Group (2009)
3. Zachman, J.A.: A framework for information systems architecture. IBM Syst. J. 26, 276–292 (1987)
4. Lankhorst, M.M.: Enterprise Architecture at Work: Modelling, Communication and Analysis, 2nd edn. Springer, Berlin (2009)
5. Winter, R., Fischer, R.: Essential layers, artifacts, and dependencies of enterprise architecture. J. Enterp. Archit. 3, 7–18 (2007)
6. Kurpjuweit, S., Winter, R.: Viewpoint-based meta model engineering. In: Enterprise Modelling and Information Systems Architectures (EMISA 2007)
7. Aier, S., Buckl, S., Franke, U., Gleichauf, B., Johnson, P., Närman, P., Schweda, C., Ullberg, J.: A survival analysis of application life spans based on enterprise architecture models. In: 3rd International Workshop on Enterprise Modelling and Information Systems Architectures, Ulm, Germany, pp. 141–154 (2009)
8. BiZZdesign: BiZZdesign Architect. http://www.bizzdesign.com (2011). Accessed on March 2011
9. Troux Technologies: Metis. http://www.troux.com/products/ (2011). Accessed on March 2011
10. Sousa, P., Lima, J., Sampaio, A., Pereira, C.: An approach for creating and managing enterprise blueprints: a case for it blueprints, Advances in Enterprise Engineering III. Lecture Notes Bus. Inf. Process. 34, 70–84 (2009)
11. Hafner, M., Winter, R.: Processes for enterprise application architecture management. In: Proceedings of the 41st Hawaii International Conference on System Sciences, pp. 396–406 (2008)
12. Breu, R.: Ten principles for living models: a manifesto of change-driven software engineering. In: International Conference on Complex, Intelligent and Software Intensive Systems, pp. 1–8 (2010)
13. Buckl, S., Matthes, F., Neubert, C., Schweda, C.M.: A wiki-based approach to enterprise architecture documentation and analysis. In: 17th European Conference on Information Systems, pp. 1–13 (2009)
14. Buschle, M., Holm, H., Sommestad, T., Ekstedt, M., Shahzad, K.: A tool for automatic enterprise architecture modeling. In: Proceedings of the CAiSE Forum 2011, pp. 25–32 (2011)
15. Buschle, M., Ullberg, J., Franke, U., Lagerström, R., Sommestad, T.: A tool for enterprise architecture analysis using the PRM formalism. In: CAiSE2010 Forum PostProceedings, pp. 108–121 (2010)
16. MooD International: MooD Business Architect. http://www.moodinternational.com/ (2012). Accessed on March 2012
17. Software AG: ARIS for SAP. http://www.softwareag.com/corporate/products/aris_platform/aris_implementation/aris_sap (2011)
18. Lokomo Systems AB: OneCMDB. http://www.onecmdb.org (2011)
19. FrontRange Solutions: FrontRange CMDB. http://www.frontrange.com/cmdb.aspx (2011)
20. Aier, S., Kurpjuweit, S., Saat, J., Winter, R.: Enterprise architecture design as an engineering discipline. AIS Trans. Enterp. Syst. 1(1), 36–43 (2009)
21. Fischer, R., Aier, S., Winter, R.: A federated approach to enterprise architecture model maintenance. Enterp. Modell. Inf. Syst. Archit. 2(2), 14–22 (2007)
22. Holm, H., Sommestad, T., Almroth, J., Persson, M.: A quantitative evaluation of vulnerability scanning. Inf. Manage. Comput. Secur. 19(4), 231–247 (2011)
23. Holm, H.: Performance of automated network vulnerability scanning at remediating security issues. Comput. Secur. 31(2), 164–175 (2012)
24. Johnson, P., Ekstedt, M.: Enterprise Architecture: Models and Analyses for Information Systems Decision Making. Studentlitteratur (2007)
25. Lagerström, R., Johnson, P., Ekstedt, M.: Architecture analysis of enterprise systems modifiability: a metamodel for software change cost estimation. Softw. Qual. J. 18, 437–468 (2010)
26. Närman, P., Johnson, P., Nordström, L.: Enterprise architecture: a framework supporting system quality analysis. In: Proceedings of the International Annual Enterprise Distributed Object Computing Conference, pp. 130–142 (2007)
27. Ullberg, J., Lagerström, R., Johnson, P.: A framework for service interoperability analysis using enterprise architecture models. In: IEEE International Conference on Services Computing, pp. 99–107 (2008)
28. Franke, U., Flores, W.R., Johnson, P.: Enterprise architecture dependency analysis using fault trees and bayesian networks. In: Proceedings of 42nd Annual Simulation Symposium (ANSS), pp. 209–216 (2009). http://www.scs.org
29. Gustafsson, P., Höök, D., Ericsson, E., Lilliesköld, J.: Analyzing IT impacts on organizational structure: a case study. In: Portland International Center for Management of Engineering and Technology (PICMET) Conference Proceedings, pp. 3197–3210 (2009)
30. IEEE: 1471–2000, IEEE Recommended Practice for Architectural Description for Software-Intensive Systems (2000). http://standards.ieee.org
31. The Open Group: ArchiMate 1.0 Specification (2009). http://www.opengroup.org/archimate
32. Manzuik, S., Pfeil, K., Gold, A., Gatford, C.: Network security assessment: from vulnerability to patch, Syngress (2006)
33. Rapid7: Nexpose. http://www.rapid7.com (2011)
34. Network mapper: Nmap. http://nmap.org (2011)
35. Hammervik, M., Andersson, D., Hallberg, J.: Capturing a cyber defence exercise. In: Proceedings of the Symposium on Technology and Methodology for Security and Crisis Management, p. 36 (2010)
36. Geers, K.: Live fire exercise: preparing for cyber war. J. Homeland Secur. Emerg. Manage. 7(1), 74 (2010)
37. Warner, R.: Applied statistics: from bivariate through multivariate techniques. Sage Publications, Inc, Thousand Oaks (2008)
38. Närman, P., Holm, H., Johnson, P., König, J., Chenine, M., Ekstedt, M.: Data accuracy assessment using enterprise architecture. Enterp. Inf. Syst. 5(1), 37–58 (2011)
39. Ye, K., Jiang, X., Chen, S., Huang, D., Wang, B.: Analyzing and modeling the performance in xen-based virtual cluster environment. In: 2010 12th IEEE International Conference on High Performance Computing and Communications, IEEE, pp. 273–280 (2010)
40. McDougall, R., Anderson, J.: Virtualization performance: perspectives and challenges ahead. ACM SIGOPS Oper. Syst. Rev. 44(4), 40–56 (2010)
41. Wang, G., Ng, T.: The impact of virtualization on network performance of amazon ec2 data center. In: INFOCOM, 2010 Proceedings IEEE, IEEE, pp. 1–9 (2010)


42. Farwick, M., Agreiter, B., Breu, R., Ryll, S., Voges, K., Hanschke, T.: Automation processes for enterprise architecture management. In: 2011 15th IEEE International Enterprise Distributed Object Computing Conference Workshops, IEEE, pp. 340–349 (2011)

Author Biographies

Hannes Holm is a PhD student at the Department of Industrial Information and Control Systems at the Royal Institute of Technology (KTH) in Stockholm, Sweden. He received his MSc degree in management engineering at Luleå University of Technology. His research interests include enterprise security architecture and cyber security regarding critical infrastructure control systems.

Markus Buschle received his MSc degree in computer science at TUB, Berlin Institute of Technology, Germany. He is currently a PhD student at the Department of Industrial Information and Control Systems at the Royal Institute of Technology (KTH) in Stockholm, Sweden. His research focuses on the tool-based performance of enterprise architecture analysis.

Robert Lagerström received his PhD degree in Industrial Information and Control Systems in 2010 and his MSc degree in Computer Science in 2005, both at the Royal Institute of Technology (KTH) in Stockholm, Sweden. His topic of research as a PhD student was Enterprise Architecture and software systems maintainability. In 2010–2011, Robert was an Industrial Post-Doc at ABB Corporate Research, and he currently works as an assistant professor at KTH. At KTH Robert is responsible for the course “IT Management with Enterprise Architecture II, case studies”. In addition, he supervises master's thesis and PhD students. Robert has written a number of academic publications in the fields of enterprise architecture, maintainability, and IT management, and he is a co-author of the book Enterprise Architecture: Models and Analyses for Information Systems Decision Making. Robert is a partner and consultant at Management Doctors, an IT-management consultancy firm.

Mathias Ekstedt is Associate Professor at the Royal Institute of Technology (KTH) in Stockholm, Sweden. His research interests include systems and enterprise architecture modelling and analyses with respect to information and cyber security, in particular for the domain of power system management. He is the manager of the program IT Applications in Power System Operation and Control within the Swedish Centre of Excellence in Electric Power Engineering and technical coordinator of the EU FP7 project VIKING. He is the founder of the architecture network at the Swedish Computer Society. He received his MSc, PhD, and Docent from the Royal Institute of Technology in 1999, 2004, and 2010, respectively.
