
Maintaining Quality of Service Based on

ITIL-Based IT Service Management

Koji Ishibashi
(Manuscript received January 18, 2007)

Interest in the IT Infrastructure Library (ITIL) of system management best practices
has increased in recent years, and corporations are starting to incorporate ITIL in
their IT systems. To help with this incorporation, Fujitsu provides the Systemwalker
product group, which supports ITIL-based IT service management. ITIL contains
many kinds of management processes. In this paper, we focus on the service delivery
area, which includes capacity, availability, and service level management, and
discuss the functions provided by Systemwalker Service Quality Coordinator (SSQC)
and Systemwalker Availability View (SAView) from the ITIL perspective. An overview
of the architecture used to implement these functions is also included.

1. Introduction

The IT service management processes of the IT Infrastructure Library (ITIL)1) arise from the following two core areas:
1) Service support: Processes related to the daily operation and support of an IT service
2) Service delivery: Long-term planning and improvement processes related to IT service provision

In this paper, we mainly discuss the service delivery part of these two core areas, which contains the processes used to maintain the high quality of the services provided by an IT system.

In 2006, the Systemwalker products, which were launched in 1995 as Japan’s first integrated IT service management products, were enhanced to the V13 versions to support all of ITIL. In particular, Systemwalker Service Quality Coordinator (SSQC) and Systemwalker Availability View (SAView) are related to the service delivery area. Figure 1 shows the Systemwalker architecture.

SSQC was launched at the end of 2003. It provides capacity and performance management functions and has been widely accepted in the Japanese IT market.

SAView is a new product that was launched in 2006 and provides visualization of business service availability.

SSQC and SAView can be positioned as products that play a supporting role in implementing the following management processes that fall under the ITIL service delivery core area:
1) Capacity management
2) Availability management
3) Service level management

The functions and architecture of SSQC and SAView are described below.

2. Capacity management

The aim of capacity management is the continued provision, now and in the future, of business services that are highly cost-effective in terms of capacity and performance. To achieve this end, capacity management clarifies the business service requirements, the business service capabilities that the current IT system can

334 FUJITSU Sci. Tech. J., 43,3,p.334-344(July 2007)


K. Ishibashi: Maintaining Quality of Service Based on ITIL-Based IT Service Management

[Diagram: Systemwalker products, built on systems operations know-how and layered above the managed servers, network, and client PCs. Service Quality Coordinator and Availability View (availability and capacity management); IT Service Management (incident management); IT Process Master (process management); Centric Manager (enterprise operations management); Operation Manager (job scheduling and automatic operations); Desktop series (prevention of information leaks from PCs); Resource Coordinator (resource control).]

Figure 1
Systemwalker architecture.

provide, and the IT infrastructure required to provide business services in the future.

Some examples of the use of capacity management are:
1) Expanding the IT infrastructure in preparation for future increases in transaction throughput
2) Performance tuning so IT system resources are used effectively
3) Predicting the requirements of business services in the future

The personnel of an IT infrastructure support organization must perform various processes to implement capacity management. For example, they must measure and monitor IT system performance, predict service deployment and demand, and perform capacity planning and tuning.

To assist in these capacity management processes, SSQC provides functions for collecting and analyzing capacity and performance information from all parts of an IT system, ranging from the infrastructure to the business application software.

The capacity and performance information collected by SSQC is about the following types of resource usage in the IT system infrastructure:
1) CPU usage rate, CPU queue length
2) Disk busy rate, number of disk queue requests
3) Available memory capacity, number of swap-in/swap-out operations
4) Disk usage rate

SSQC can also collect the following types of performance information concerning the middleware of an IT system:
1) Response time at each client PC
2) Number of Web server processing requests and the response times for those requests
3) Number of application (AP) server requests, wait time, and processing time
4) Execution multiplicity for batch processing
5) SQL execution time on the DB server
6) Amount of free table space area for the DB

In addition, SSQC can collect the throughput of business applications by establishing a

data import interface with them.

The above information can be used to perform, for example, the following types of capacity management:
1) Establishment of the criteria for the resource capacity required for business based on correlation analysis of business application throughput, Web server throughput, and CPU usage rates
2) Prediction of future processing demands based on time series analysis of business application throughput
3) Prediction of CPU and disk resource capacities that will be required in the future based on predictions of future business throughput

Figure 2 shows some example SSQC reports. Figure 2 (a) shows the result of a correlation analysis, and Figure 2 (b) shows the result of a time series analysis.

SSQC provides many kinds of analysis functions, and by using these functions, the IT infrastructure support organization can easily perform the capacity management processes defined in ITIL.

3. Availability management

The purpose of availability management is to maintain a high level of availability for the services provided by an IT infrastructure with a favorable cost-effectiveness in order to achieve business goals.

For example, availability management can be used to:
1) Monitor whether IT services are being processed as planned
2) Reduce the fault occurrence frequency in an

(a) Result of correlation analysis (b) Result of time series analysis

Figure 2
Example SSQC reports.


IT infrastructure by performing preventive maintenance
3) Keep the mean time between failures (MTBF) at a high level by minimizing the downtime due to faults

The personnel of an IT infrastructure support organization must perform various processes to implement availability management. For example, they must design and implement the IT system availability and measure, monitor, report, and improve the IT system availability.

To assist in these availability management processes, SAView provides a function for monitoring business services according to their operation plans. SAView can also maintain activity logs of business services to enable the availability to be visualized. Figure 3 shows two examples of SAView screens.

SSQC also assists in availability visualization by polling to check the service availability and by providing service downtime reports.

In addition, SSQC provides the following troubleshooting functions for minimizing service interruptions caused by performance problems in the IT infrastructure:
1) Drill Down View screen
SSQC can display detailed IT infrastructure resource information and middleware performance information from the time a performance problem arises. Users can compare these values with the values obtained at times of normal operation to see at a glance the cause of the problem. Items showing large fluctuations in value can be considered related to the cause of the problem. Figure 4 shows an example of a Drill Down View screen.
2) Transaction breakdown analysis
When SSQC is used together with Fujitsu’s Interstage2) Application Server and Symfoware3) Server, it can detect the location of performance bottlenecks in online transactions. Figure 5 shows an overview of transaction breakdown analysis.
3) Response time breakdown analysis
SSQC monitors the responses of Web applications. It also measures and displays the time taken for these responses and the time taken to download the elements of the displayed HTML screen. SSQC, therefore, not only monitors availability but also provides functions for investigating the causes of problems.

To maintain IT system availability, periodic IT system reviews of failures and system weaknesses are important. Furthermore, Fujitsu regards these investigation functions as being important for maintaining availability from the viewpoint of reducing the mean time to repair

Figure 3
Example SAView screens: (a) monitoring business services; (b) activity logs of business services.
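
Monitoring a business service against its operation plan comes down to comparing planned and actual start and stop times. The sketch below shows one way to express that comparison in Python. The record layout, the status names, and the five-minute tolerance are hypothetical; the paper does not describe SAView's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Plan:
    service: str
    start: datetime   # planned start time
    end: datetime     # planned end time


@dataclass
class Activity:
    service: str
    started: Optional[datetime]  # None if no start has been logged
    stopped: Optional[datetime]  # None if no stop has been logged


def check(plan, actual, tolerance=timedelta(minutes=5)):
    """Compare one activity log entry with its plan and return a status."""
    if actual.started is None:
        return "not started"
    if actual.started - plan.start > tolerance:
        return "started late"
    if actual.stopped is None:
        return "running"
    if actual.stopped - plan.end > tolerance:
        return "finished late"
    return "as planned"


plan = Plan("billing", datetime(2007, 1, 18, 1, 0), datetime(2007, 1, 18, 3, 0))
log = Activity("billing", datetime(2007, 1, 18, 1, 2), datetime(2007, 1, 18, 2, 50))
print(check(plan, log))
```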


Figure 4
Drill Down View screen.
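
The idea behind the Drill Down View, comparing values at the time of a problem with values from normal operation and looking for large fluctuations, can be sketched as follows. This is an illustrative Python example only; the metric names, sample values, and ranking by relative change are assumptions rather than a description of how SSQC computes its display.

```python
def rank_deviations(normal, problem):
    """Rank metrics by relative change between normal operation and problem time."""
    ranked = []
    for name, base in normal.items():
        if name not in problem or base == 0:
            continue  # skip metrics that cannot be compared
        change = (problem[name] - base) / abs(base)
        ranked.append((name, change))
    # Largest fluctuation first, regardless of direction.
    return sorted(ranked, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical readings (illustrative values only).
normal = {"cpu_usage_pct": 35.0, "disk_busy_pct": 20.0, "free_memory_mb": 2048.0}
problem = {"cpu_usage_pct": 40.0, "disk_busy_pct": 95.0, "free_memory_mb": 1900.0}

for name, change in rank_deviations(normal, problem):
    print(f"{name}: {change:+.0%}")
```

With these sample values the disk busy rate shows by far the largest fluctuation, so it would be the first item to investigate.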

[Diagram: a request passes through the Web server, AP server, and DB server. Annotations: monitoring of transaction throughput and average/maximum processing times at each server; analysis of processing time breakdown, in transaction units, at each server; analysis of processing time breakdown, in transaction units, for Web/EJB applications at each server. IBAS: Interstage Business Application Server.]

Figure 5
Overview of transaction breakdown analysis.


(MTTR).

For example, in a certain data center, by using SSQC, the cause of a slowdown was detected and system operation was restarted in an hour. The problem was caused by exhaustion of the DB temporary area and had also occurred in the previous year. However, because SSQC was not in use at that time, the cause was not investigated and it took 10 hours to restart operation. In this case, SSQC reduced the IT system MTTR to just 10% of the previous value (1 hour instead of 10 hours).

These investigation functions were incorporated into the first version of SSQC and distinguish it from other similar products.

4. Service level management

The purpose of service level management is to maintain and improve the quality of an IT service. Service level management obtains a consensus between a service provider and recipient concerning the quality of an IT service and monitors, reports, and reviews the quality for a specified period.

The following are some examples of service level management:
1) The IT service provision department guarantees the maximum response time and reports the monthly response status to the IT service users.
2) The IT service provision department guarantees the upper limit for the amount of service downtime in a month and provides continuous monitoring and improvement to uphold this guarantee.

The first requirement of service level management is for the service provider and recipient to reach an agreement and establish a service level agreement (SLA). Service level management must then incorporate the SLA into the IT infrastructure and continually monitor and report the service level. The Quality of Service (QoS) provided to the recipient must then be maintained and improved.

In practice, not all SLAs are explicitly agreed on and set between the provider and recipient. Instead, some SLAs are implicitly set, especially in in-house IT systems.

4.1 Management processes

Because Fujitsu has constructed and operated many mission-critical IT systems, we have abundant experience of service level management.

The ultimate aim of service level management is to maintain and improve the QoS. To achieve this, the following management processes related to the service level must be continuously performed:
1) Monitoring
2) Reporting
3) Reviewing
4) Predicting
5) Maintaining

The above processes suggest implementation of the capacity management and availability management described above.

We have previously postulated the following as service level management processes:4), 5)
1) Determine the SLA.
2) Determine configurations in accordance with the SLA.
3) Collect information required for automation of on-going processes, regular performance information, and other information.
4) For short-term problems:
• Detect problems
• Identify potential problems
• Predict problems
• Generate alerts indicating problem occurrences
These processes relate to availability management.
5) Write regular service level reports based on the SLA. Include predictions concerning the next reporting period.
6) Predict medium-term problems.
This process also relates to availability management.


7) If required, conduct capacity planning and tuning studies.
This process relates to capacity management.
8) Submit an SLA report.
9) Review SLA-related requirements.
10) Change tools and the environment.

Figure 6 shows the relationships between these processes.

Items to note are the regular implementation of information collection, problem detection, reports, and reviews. The processes to be performed only on demand are capacity planning and tuning.

4.2 Provided solutions

SSQC supports all of the above capacity and availability related processes, for example, information collection, analysis, and reports.6)

The SLAs that are subject to service level management are not limited to performance and availability related items. Information handled by the ITIL service support components (incident management, problem management, and change management) is also used as indices. Some examples of this information are:
1) Average time required for the service desk managed by the IT service provision department to resolve incidents reported by IT service recipients
2) The number of proactive problem analyses performed by the IT service provision department

These types of indices must also be targeted as part of service level management.

[Flowchart: START; determine SLA; configure; collect/store performance and other data; detect/predict short-term problems and generate alerts; generate service level report; publish report; review SLA/requirements; change tools/environment; predict medium/long-term problems and generate alert; conduct capacity planning study and generate report; conduct tuning study and generate report. Legend: once only, continuous, on-demand.]

Figure 6
Service level management processes.
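
The two example guarantees given in Section 4, a maximum response time and an upper limit on monthly downtime, lend themselves to a simple automated check in the monitoring and reporting steps. The following Python sketch assumes a hypothetical SLA record and report layout; the field names and sample figures are illustrative and are not taken from SSQC.

```python
from dataclasses import dataclass


@dataclass
class Sla:
    max_response_s: float     # guaranteed maximum response time (seconds)
    max_downtime_min: float   # upper limit for service downtime per month (minutes)


def monthly_report(sla, response_times_s, downtime_min):
    """Summarize one reporting period against the SLA."""
    worst = max(response_times_s)
    return {
        "worst_response_s": worst,
        "response_ok": worst <= sla.max_response_s,
        "downtime_min": downtime_min,
        "downtime_ok": downtime_min <= sla.max_downtime_min,
    }


# Hypothetical SLA and measurements for one month.
sla = Sla(max_response_s=3.0, max_downtime_min=60.0)
report = monthly_report(sla, [0.8, 1.2, 3.4, 0.9], downtime_min=45.0)
print(report)
```

Here the downtime guarantee is met but one response exceeded the guaranteed maximum, which is the kind of result that would feed the review step.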


SSQC can handle these types of information as user information and supports the reporting of this information.

5. Architecture

SSQC enables capacity management, availability management, and service level management of an IT system, and SAView enables the visualization of availability. The architectures of SSQC and SAView are described below.

5.1 SSQC
5.1.1 Three-layer construction

As shown in Figure 7, SSQC is implemented using a three-layer architecture. The functions of these layers are as follows:
1) Agent layer
This layer performs data collection.
• Agent
An agent is an operation unit installed on a managed server. Agents collect resource information and performance information concerning applications, Web servers, AP servers, DB servers, and the platform operating system itself. Agents also store the collected information, without changing its format, during periods specified for problem analysis purposes.
• Browser agent

[Diagram: agents (with troubleshooting logs) and browser agents (over an HTTP path) send data over data transport paths to managers and proxy managers. Managers hold detailed data in a performance database (PDB), a distributed DB. The enterprise manager, which provides monitoring, reporting, and the report framework, holds summary data in its own PDB.]

Figure 7
SSQC architecture.


A browser agent is an operation unit installed on an end-user PC that measures the end-user response time.
2) Manager layer
This layer collects and stores information.
• Manager
A manager is an operation unit installed on an admin server in the department. Managers gather the information collected by the agent layer and store detailed information. They also send summary information to the enterprise manager described below. Managers also perform polling to collect service activity status information.
• Proxy manager
A proxy manager is an operation unit that operates on behalf of a manager to collect service activity information and information from agents. Proxy managers are used for two reasons. One reason is to distribute the processing load by collecting information on behalf of overloaded managers. The other is to reduce the number of data transport paths. In particular, security policies generally recommend reducing the number of internal-external paths that pass through a firewall server, such as data transport paths connecting external agents with internal managers protected by the firewall. For example, if a proxy manager is set outside the firewall, it can collect information from external agents and send it to internal managers through a single data transport path.
3) Enterprise manager layer
This layer shows information about the entire IT system.
• Enterprise manager
The enterprise manager is an operation unit on an enterprise admin server. The enterprise manager stores the information sent from the managers in each department, holds the report framework, and performs status monitoring and reporting.

Collected data is sent from the lower layers to the upper layer through data transport paths. Either of the following two information transfer modes can be selected for the data transport paths:
1) Push mode: This mode enables just-on-time information transfer when data is collected. A proprietary protocol is used to push data up from the lower layers to the upper layer.
2) Pull mode: HTTP requests are sent from the upper layer to the lower layers, and information is pulled up in response to these requests. This mode enables secure data transfer from agents or proxy managers outside the firewall to the internal manager.

In many cases, both the enterprise manager and the managers are installed together on a single server that runs as just one IT service management server.

As described above, the functions of each layer can be customized. For example, the enterprise manager and the managers can be arranged by installing the report frameworks of the enterprise manager in each department server. In this configuration, senior managers can access all systems data from the enterprise manager, and department managers can use the managers to access systems data only in their departments.

5.1.2 Distributed database and presentations

The information collected by SSQC is handled in a number of forms by agents and stored in a distributed database in the enterprise manager and the managers.

The collected data is classified by resolution. SSQC keeps data having a rather coarse resolution for long-term analysis and fine-resolution data for the trouble investigation function. This data is automatically deleted from the distributed database at the specified expiration time.

One of the distinguishing points of SSQC is that it collects several types of calculated data for different purposes in a distributed database.

Table 1 shows the data management scheme that shows the type of stored servers and


Table 1
Data management scheme.

Data form              Storage scheme
---------------------  --------------------------------------------------------------
1-minute resolution    Stored at agent system
10-minute resolution   Stored on department server and kept for 7 days
1-hour resolution      Stored on department server and kept for 6 weeks
1-day resolution       Stored on department server and kept for 53 weeks
Summary                Stored on enterprise server and kept for 3 days (renewed daily)
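
The scheme in Table 1 combines two mechanisms: fine-resolution samples are aggregated into coarser ones, and each resolution is deleted after its own retention period. The Python sketch below illustrates both under stated assumptions. The retention periods come from Table 1, but the dictionary keys, the function names, and the use of a simple average for aggregation are hypothetical, since the paper does not say how SSQC calculates the coarser data.

```python
from datetime import datetime, timedelta
from statistics import mean

# Retention periods from Table 1; the keys are illustrative names.
RETENTION = {
    "10min": timedelta(days=7),
    "1hour": timedelta(weeks=6),
    "1day": timedelta(weeks=53),
}


def floor_time(ts, width):
    """Round a timestamp down to the start of its bucket of the given width."""
    return ts - (ts - datetime.min) % width


def downsample(samples, width):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(floor_time(ts, width), []).append(value)
    return sorted((start, mean(values)) for start, values in buckets.items())


def purge(rows, form, now):
    """Keep only the rows that are still within the retention period."""
    return [(ts, v) for ts, v in rows if now - ts <= RETENTION[form]]


# Twenty hypothetical 1-minute samples rolled up to 10-minute resolution.
base = datetime(2007, 7, 1, 0, 0)
samples = [(base + timedelta(minutes=i), float(i)) for i in range(20)]
rolled = downsample(samples, timedelta(minutes=10))
print(rolled)
```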

[Diagram: the Systemwalker Availability View manager runs on the IT operation management server and provides an EJB interface, a message interface, and linkage to external applications and other systems. Business server 1 hosts a Systemwalker Availability View agent and Systemwalker Operation Manager. Systemwalker Centric Manager runs on the IT operation management server and on business servers 1 and 2.]

Figure 8
SAView architecture.

the retention period of each type of data.

The report base is the main presentation function in the enterprise manager. It accesses the above distributed database to obtain the contents to be displayed and analyzed. It extracts and analyzes the required information, implements presentations and monitoring, and provides reports for managers.

The utilization of these reports completes the series of service level management processes and enables service level reporting, short-term and long-term status analyses, and trouble investigation.

5.2 SAView architecture

Figure 8 shows the SAView architecture. Like other Systemwalker products, SAView comprises a manager and agents.

Agents are assigned to each business server and collect batch processing activity information from Systemwalker Operation Manager.7) The collected activity information is sent to the manager, where it is stored as activity logs. A comparison of this information with the business planning information defined in the manager enables batch processing availability to be visualized based on plans and actual results.

SAView also has the following interfaces for collecting activity information concerning other types of processing:
1) EJB interface
This interface is provided by the manager


of SAView. It receives business system activity information directly from applications.
2) Message interface
This interface receives event messages from Systemwalker Centric Manager on business servers and admin servers.

SAView can collect start and stop information about any business activity by defining event messages.

6. Conclusion

In this paper, we described the functions provided by SSQC and SAView with reference to ITIL and described the management processes these functions support. We also described the product architectures required to implement these processes.

Currently, SSQC and SAView are accepted by customers as effective tools for performing ITIL service delivery processes.

However, to improve their flexibility and usability, we are planning to add the following SSQC and SAView functions in the future:
1) SOA-based architecture support
Support for Service Oriented Architecture (SOA) based architectures will enable flexible access to the information held by the ICT infrastructure management functions so this information can be used for service level management. This support will also make it easy to implement service level management that is linked to the information held by ITIL service support functions.
2) Flexibility by using a federated CMDB
By using system configuration information stored in a federated configuration management database (CMDB), our products will improve the availability of IT systems by providing capabilities such as troubleshooting of problems caused by resource faults or by changing the IT system configuration. In addition, the information held by SSQC and SAView will be able to be used more flexibly when it can be provided through a federated CMDB.
3) Dashboard
A dashboard that supports all the components of service delivery (service level management, availability management, and capacity management) will provide a flexible and integrated visualization GUI.

In conjunction with the above product enhancements, we also plan to continue research into service management systems that conform to ITIL.

We hope that these activities will lead to even greater benefits for Fujitsu’s customers.

References
1) The IT Service Management Forum (itSMF) International: The knowledge Network for IT Service Management.
   http://www.itsmf.org/
2) T. Kosuge and T. Ishikawa: Interstage: Fujitsu’s Application Platform Suite. FUJITSU Sci. Tech. J., 43, 3, p.274-284 (2007).
3) T. Goto: Disaster Recovery Feature of Symfoware DBMS. FUJITSU Sci. Tech. J., 43, 3, p.301-314 (2007).
4) M. Tsykin et al.: Automated Monitoring and Reporting of Enterprise Quality of Service. Proceedings of the 7th World Multi-conference on Systemics, Cybernetics and Informatics (SCI2003), Orlando, July 2003.
5) M. Tsykin et al.: On Automated Monitoring of SLAs. CMG Journal of Capacity Management, Summer 2002, CMG, p.27-36.
6) K. Ishibashi and M. Tsykin: Management of Enterprise Quality of Service. FUJITSU Sci. Tech. J., 40, 1, p.133-140 (2004).
7) Fujitsu: Systemwalker Operation Manager.
   http://www.fujitsu.com/global/services/software/systemwalker/products/operationmgr/index.html

Koji Ishibashi, Fujitsu Ltd.
Mr. Ishibashi received the B.S. degree in Communication Engineering from Osaka University, Osaka, Japan in 1981. He joined Fujitsu Ltd., Kawasaki, Japan in 1981, where he has been engaged in research and development of system management software since 1990. He is currently responsible for developing Systemwalker Service Quality Coordinator and Systemwalker Availability View.
