
SOFTWARE TECHNOLOGY
Editor: Christof Ebert
Vector Consulting Services
christof.ebert@vector.com

IT Infrastructure-Monitoring Tools
Josune Hernantes, Gorka Gallardo, and Nicolás Serrano

Clients often ask me, what’s the cost of our IT services? How do
they map to different applications? What are the availability and
performance of our services in geographically dispersed centers? How
can we effectively reduce the total cost of ownership while improving
service quality? A good starting point is to actively use IT-monitoring
technology. It provides a quantitative starting point that—with a good
understanding of IT systems and service needs—facilitates improving
your IT performance. In this installment, Josune
Hernantes, Gorka Gallardo, and Nicolás Serrano
provide an overview of recent monitoring
technologies. I look forward to hearing from
both readers and prospective column authors
about this column and the technologies you
want to know more about. —Christof Ebert

PROVISIONING NEW INFRASTRUCTURES and applications has never been so easy. Virtualization and cloud computing have made this process routine. The result for enterprises' IT infrastructure has been an increased number of diverse elements to manage. Furthermore, nowadays systems are usually geographically dispersed or use different OSs, which complicates their management. This situation has renewed the importance of an old topic: monitoring IT infrastructures and applications.

Also, as more businesses rely on software, IT system health is critical for 24/7 customer service and support. The systems can be on-premise applications, applications on a public or private cloud, or any combination of these.

The first step in this process is monitoring the IT infrastructure at the hardware, service, and application levels. Figure 1 describes the architecture.

Here, we discuss current tools that monitor networks to detect issues, ensure the components' availability, and measure the resources those components use.

Selecting Tools
When selecting an infrastructure-monitoring suite, you need to take

88 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY 0740-7459/15/$31.00 © 2015 IEEE


into account several factors to find the perfect match for your needs. First, select a tool on the basis of the required functionalities so that it aligns with your technical and business needs. Next, evaluate the deployment and maintenance factors to match the tool with your IT team's resources and capabilities. Finally, with a proper understanding of how the tool will affect your organization, calculate the total cost of ownership.
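
To make the total-cost-of-ownership step concrete, the comparison can be worked through as simple arithmetic. The following is a minimal sketch, not a costing method from the article: the `CostModel` class and every figure in it are hypothetical, and a real calculation would itemize licenses, hardware, configuration, training, and maintenance separately.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Hypothetical cost inputs for one deployment option (all figures are invented)."""
    name: str
    upfront: float  # one-time costs: licenses, hardware, initial configuration
    yearly: float   # recurring costs: subscription, maintenance, training

    def total_cost(self, years: int) -> float:
        """Total cost of ownership over the given number of years."""
        return self.upfront + self.yearly * years


# Invented example figures: low entry cost for SaaS, high entry cost on-premise.
saas = CostModel("SaaS", upfront=2_000, yearly=12_000)
on_premise = CostModel("on-premise", upfront=30_000, yearly=4_000)

for years in (1, 3, 5):
    cheaper = min((saas, on_premise), key=lambda model: model.total_cost(years))
    print(years, saas.total_cost(years), on_premise.total_cost(years), cheaper.name)
```

With these invented numbers, the hosted option is cheaper over one and three years and the on-premise option is cheaper over five, which is why the planning horizon you assume matters as much as the price list.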

FIGURE 1. An infrastructure-monitoring architecture. The IT infrastructure must be monitored at the hardware, service, and application levels. (Components shown: alert, report, monitoring server, database, Windows agent, Linux agent, application, and service.)

Functionality
From a functionality perspective, understanding the needs of the different users (development, IT operations, and so on) is important. For example, a business decision maker might be more interested in service-level-agreement reports, whereas from a technical viewpoint the same data might be more valuable for detecting performance issues and their origins. The layers to evaluate are numerous, and the tool should support the front and back ends, letting you detect all sorts of problems from slowdowns and crashes to memory leaks.

User interfaces. Infrastructure-monitoring tools have been available for a long time. On one hand, this means you can rely on well-proven suites. On the other hand, you might find that the tools' UIs are outdated. Evaluate whether the tools suit your needs. Moreover, depending on the users' skills and profile, you might need to either find a tool with a Web interface to guarantee access from heterogeneous clients, or look for a mobile UI if you primarily use mobile devices.

Alerts, help desk integration, and automation. A goal of any monitoring system is to respond to problems as soon as possible. Consequently, you might want to act when degradation occurs. So, a customizable alert service might be your best ally. When comparing systems, you might look at

• different alert methods (short message service [SMS], email, custom scripts, and so on),
• the customization needed,
• the supported OSs, and even
• integration into your help desk system so that you can seamlessly integrate the monitoring system into your bug resolution processes.

As you gain knowledge of your infrastructure, you might automate tasks on the basis of events to keep problems under control.

Deployment and Maintenance
First, the deployment method should align with corporate policies. Next, the tool should be compatible with your languages, infrastructure, and IT department capabilities. Then, evaluate the methods that will help you collect the measures and data that represent insightful information. Toward that end, there are several ways to monitor performance, based on where this information is generated. For example, the monitoring information can be generated directly from code, logs, installed clients, and hardware devices.

Also, assess the installation and maintenance effort. Because all monitoring tools should be tailored to the business and application needs, installation and configuration are important in any implementation project. Take into account easy-deployment characteristics such as automatic discovery of application topology, and evaluate your team capabilities and resources.
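
As an illustration of monitoring information generated by an installed client and from logs, the following minimal sketch samples disk usage on the local host and counts error lines in a log file. It is not how any of the tools discussed here is implemented: the function names, the `ERROR` marker, and the demo log are assumptions made for the example, and it uses only the Python standard library.

```python
import shutil
import tempfile
from pathlib import Path


def disk_used_percent(path: str = ".") -> float:
    """Percentage of disk space used on the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def count_log_errors(log_path: Path, marker: str = "ERROR") -> int:
    """Number of lines in the log that contain the (assumed) error marker."""
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if marker in line)


def collect_sample(log_path: Path) -> dict:
    """One measurement an agent might report to the monitoring server."""
    return {
        "disk_used_percent": round(disk_used_percent(), 1),
        "log_errors": count_log_errors(log_path),
    }


# Demo with a throwaway log file; a real agent would read the application's log.
with tempfile.TemporaryDirectory() as tmp:
    demo_log = Path(tmp) / "app.log"
    demo_log.write_text("INFO start\nERROR db timeout\nERROR disk full\n", encoding="utf-8")
    sample = collect_sample(demo_log)

print(sample["log_errors"])  # 2 for the demo log above
```

A production agent would also need scheduling, transport to the server, and error handling, which is part of the installation and maintenance effort to weigh.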

IEEE SOFTWARE | JULY/AUGUST 2015


TABLE 1. Eight popular IT-monitoring tools.

| Tool | License | Support | User interface | Alerts | Web or mobile client | Help desk integration | Automation | OS support | Target business size | Strengths |
|---|---|---|---|---|---|---|---|---|---|---|
| Nagios | Open source (GPL*) | Active support community | Improved Web GUI† | Email, SMS*, custom | Web interface | Yes | Yes† | Linux, Unix, Windows via proxy agent | Small, medium, and large | Flexible and highly configurable, robust and reliable |
| Zabbix | Open source (GPL) | Active support community, email, forums, help desk, phone, wiki | Well-designed Web GUI | Email, SMS, custom | Web interface with API | Yes | Yes | Windows, Mac, Linux, Unix | Enterprise | Flexibility to organize monitoring data, configurability, scalability |
| Hyperic | Open source (GPL v2) | Support community, email, help desk | Good Web interface | Email, SMS | Web interface | Yes | Yes† | Windows, Mac, Linux, Unix | Small and medium | Native management for Unix, Linux, Windows, and Mac; scalability |
| SolarWinds | Proprietary | Active support community, email, forums, help desk, phone | Excellent GUI | Email, custom | Web interface, mobile | Yes | Yes | Windows, Mac, Linux, Unix | Small and medium | Quick and easy deployment, affordability, native support for VMware |
| ManageEngine OpManager | Proprietary | Email, forums, help desk | Unconventional UI that's hard to navigate | Email, custom | Web interface, mobile | Yes | Yes | Windows, Mac, Linux, Unix | Small and medium | Great feature set |
| HP Operations Manager | Proprietary | Forums, help desk, webinars | Good Web interface | Email, SMS, custom | Web interface, mobile | Yes | Yes | Windows, Linux, Unix | Large | Integration with other products from the same company; integration with HPIC, which can integrate with SCCM or SCOM* |
| IBM Tivoli | Proprietary | Email, forums, help desk | Good, intuitive Web interface | Email, SMS | Web interface | Yes | Yes | Windows, Linux, Unix | Enterprise | Automatic analysis and repair, efficient where many resources must be monitored |
| WhatsUp Gold | Proprietary | Phone, email, forum | Clumsy interface | Email, SMS, sound | Web interface | Yes | Yes | Windows | Small, medium, and large | Easy setup and network discovery, great feature set |
* GPL is GNU General Public License, SMS is short message service, HPIC is HP Insight Control, SCCM is System Center Configuration Manager, and SCOM is System Center Operations Manager.
† Only in the paid version.
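
One way to use Table 1 is as a first-pass filter against your own requirements. The sketch below encodes three of the table's columns (license, OS support, and target business size) as data and selects the tools that match. The field names, the `select` helper, and the simplified encoding are assumptions made for illustration. For example, Nagios's Windows support is recorded as plain "Windows" even though the table notes it works via a proxy agent.

```python
# A subset of Table 1, encoded by hand; values follow the table's columns.
TOOLS = [
    {"tool": "Nagios", "license": "open source",
     "os": {"Linux", "Unix", "Windows"},  # Windows via proxy agent
     "size": {"small", "medium", "large"}},
    {"tool": "Zabbix", "license": "open source",
     "os": {"Windows", "Mac", "Linux", "Unix"}, "size": {"enterprise"}},
    {"tool": "Hyperic", "license": "open source",
     "os": {"Windows", "Mac", "Linux", "Unix"}, "size": {"small", "medium"}},
    {"tool": "SolarWinds", "license": "proprietary",
     "os": {"Windows", "Mac", "Linux", "Unix"}, "size": {"small", "medium"}},
    {"tool": "ManageEngine OpManager", "license": "proprietary",
     "os": {"Windows", "Mac", "Linux", "Unix"}, "size": {"small", "medium"}},
    {"tool": "HP Operations Manager", "license": "proprietary",
     "os": {"Windows", "Linux", "Unix"}, "size": {"large"}},
    {"tool": "IBM Tivoli", "license": "proprietary",
     "os": {"Windows", "Linux", "Unix"}, "size": {"enterprise"}},
    {"tool": "WhatsUp Gold", "license": "proprietary",
     "os": {"Windows"}, "size": {"small", "medium", "large"}},
]


def select(tools, license_type=None, needed_os=frozenset(), size=None):
    """Return the names of tools that satisfy every requirement that was given."""
    matches = []
    for entry in tools:
        if license_type is not None and entry["license"] != license_type:
            continue
        if not set(needed_os) <= entry["os"]:  # every needed OS must be supported
            continue
        if size is not None and size not in entry["size"]:
            continue
        matches.append(entry["tool"])
    return matches


print(select(TOOLS, license_type="open source", needed_os={"Mac"}))
# ['Zabbix', 'Hyperic']
```

A shortlist produced this way only narrows the field; the qualitative columns (user interface, support, strengths) still need a hands-on evaluation.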


Cost
Cost is always important; a quick return on investment should be your goal. Consider the total cost of ownership, for example, to compare a software-as-a-service deployment with an on-premise alternative, for which licensing and hardware costs could add up quickly.

Eight Popular Tools
Here, we look at eight of the most popular IT-monitoring tools [1, 2] (see Table 1).

Nagios
Nagios is one of the best-known open source tools for monitoring IT infrastructures such as end-user stations, IT services, and active network components. It has a free open source version, Nagios Core, and a paid version, Nagios XI. Many features aren't available on Nagios Core.

Nagios XI provides an updated, easy-to-navigate Web interface that improves on Nagios Core's poor interface. This improved interface features an interactive dashboard that includes a high-level overview of hosts, services, and network devices. It provides trending and capacity-planning graphs that let organizations plan infrastructure upgrades. Installation is easy, but management of configuration files to run devices and tests has a steep learning curve.

Nagios has a large, active support community that develops additional plug-ins. These plug-ins solve some of the tool's limitations, such as difficult configuration and the lack of automatic device discovery. Another example is the support of virtual environments.

Zabbix
Zabbix is open source software that offers great performance for data gathering and can scale to large environments. It allows monitoring servers, network devices, and applications, gathering accurate statistics and performance data.

It's easy to install, but configuration can be complex, particularly to add new or custom checks. Zabbix has a well-designed Web GUI and extensive reporting and graphing capabilities as part of the standard package, which combines monitoring and trending functionalities. Zabbix can deliver email or SMS notifications informing network administrators when the acceptable limits are exceeded. Like Nagios, Zabbix has an active support community.

Hyperic
Hyperic is monitoring and management software optimized for virtual environments, as it's a VMware initiative. It has a free open source version, Hyperic HQ, and a paid version, vFabric Hyperic. It efficiently manages any OS, Web, app, or database server. vFabric Hyperic adds functionalities such as automated corrective actions.

Installation is easy and takes only minutes. Hyperic provides a well-designed, customizable UI. For example, you can save the dashboard and edit it to include frequently used graphs.

Alerts can be configured as SMS or email, and administrative actions can be configured to run in response to them. Hyperic can automatically discover, monitor, and manage software and network resources. It too has an active support community.

Hyperic's main disadvantage is the higher amount of resources used by the Java virtual machine, compared to other monitoring tools.

SolarWinds
SolarWinds is available as a self-hosted solution and as software as a service. Installation takes from several minutes to a few hours, depending on the complexity of the configuration data, such as tickets or locations. SolarWinds can be deployed using only internal staff and can easily scale to large organizations. It also provides native VMware support and has great community support, provided by Thwack (https://thwack.solarwinds.com).

Its user interface is intuitive, with customizable forms and mobile access. Detailed graphs depict network failures, availability, and performance. You can easily configure alerts and define complex rule-based workflow. SolarWinds provides preconfigured dashboards you can change to suit your needs. Furthermore, it creates customized reports that can be automated according to a schedule.

ManageEngine OpManager
OpManager's installation is fast and easy, but configuration is manual and can be complex. Administrators can automate routine maintenance and troubleshooting.

OpManager provides several dashboard views that can be customized, although navigating the UI is difficult. The tool generates many types of reports and can set threshold alarms to trigger notification through email, SMS text, and custom scripts. It has three levels of thresholds: Warning, Trouble, and Error. OpManager offers several plug-ins as separate products.

HP Operations Manager
HP Operations Manager is the central component of the HP monitoring suite. It's a client–server solution with agents required on each node. The initial setup can be complex if you want to install multiple suites.

HP Operations Manager has an excellent GUI for monitoring application, system, and network health. It provides planning features including predictive analysis and datacenter modeling. You can filter alarms by severity or node type. The tool offers proactive monitoring and automated alerting. It adds resolution information to events to advise operators on a recommended remediation approach, and it includes predefined tools and automated actions to fix processes.

IBM Tivoli
Installation of Tivoli is easy and takes just a few minutes, although configuring, updating, and refining the analytical and response features require IT expertise.

Tivoli offers an intuitive Web interface with customizable workspaces and includes an easy-to-use data warehouse and advanced reporting capabilities. It provides dynamic thresholding and performance analytics to improve incident avoidance. It features proactive monitoring and automated fault management. It also collects monitoring information for reporting, performance analysis, and trend prediction.

IBM offers free phone and email support during business hours, and extensive access to product documentation and a user knowledge base.

WhatsUp Gold
Installing WhatsUp Gold is easy, but configuration requires using both the Web console and Windows application. This tool provides more than 200 configurable reports, including historical data for trend analysis. Real-time reports are available, which are helpful for troubleshooting. Several plug-ins are available to expand WhatsUp Gold's features.

The UI can be clumsy for simple functions, such as reporting on specific elements. You can configure alerts (email, SMS, or custom scripts) for when the software detects that a device has exceeded a threshold.

Future Trends in Monitoring Tools
As the cloud's popularity grows, cloud-based solutions are becoming common for most enterprise applications [3]. Cloud-based infrastructure monitoring can ease installation and maintenance, but data privacy and control concerns will arise. As usual, selecting a deployment method will be based on corporate policies. But technical restrictions might also apply; your application deployment method (on-premises, public cloud, private cloud, hybrid cloud, and so on) might affect your selection because not all providers will be compatible.

Finally, traditional infrastructure monitoring will soon be replaced by application performance management because the performance of internal and external applications can greatly affect business profitability. Application responsiveness is vital and can affect business processes and customer retention. At the same time, increased uncertainty and the need to bring value earlier are encouraging agile development methodologies with a faster software release cycle. In this scenario, software quality can't be measured only by pure functionality (passing the tests) because continuous delivery might decrease performance. Traditional IT infrastructure management will make room for a DevOps view, in which IT infrastructure is important throughout application development and application-performance-management tools add value throughout the software-engineering life cycle.
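
The point that passing functional tests doesn't guarantee acceptable performance can be illustrated with a simple response-time check that could run alongside a release's tests. This is a minimal sketch under stated assumptions, not a substitute for an application-performance-management tool: the helper names, the choice of the 95th percentile, and the budget values are all hypothetical.

```python
import statistics
import time


def measure_latencies(operation, runs: int = 20) -> list[float]:
    """Call `operation` repeatedly and return each call's duration in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)
    return latencies


def within_budget(latencies, budget_seconds: float) -> bool:
    """True if the 95th-percentile latency stays within the budget.

    statistics.quantiles(..., n=20) returns 19 cut points; the last one
    is the 95th percentile. It needs at least two measurements.
    """
    p95 = statistics.quantiles(latencies, n=20)[-1]
    return p95 <= budget_seconds


# Stand-in for a real transaction, such as loading a page of the application.
def sample_operation() -> None:
    sum(range(10_000))


timings = measure_latencies(sample_operation)
print(len(timings), within_budget(timings, budget_seconds=1.0))
```

Running a check like this on every release turns a performance regression into a visible failure instead of something customers discover first.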


For a look at a real-world example of a company selecting and implementing IT-monitoring tools, see the sidebar.

References
1. J. Kowall and W. Cappelli, Magic Quadrant for Application Performance Monitoring, Gartner, Oct. 2014; www.gartner.com/doc/2889421/magic-quadrant-application-performance-monitoring.
2. "Vendor Landscape: Systems Management," Info-Tech Research Group, 2011; www.infotech.com/research/ss/it-vendor-landscape-systems-management.
3. K. Fatema et al., "A Survey of Cloud Monitoring Tools: Taxonomy, Capabilities and Objectives," J. Parallel and Distributed Computing, vol. 74, no. 10, 2014, pp. 2918–2933.

JOSUNE HERNANTES is a professor of computer science and software engineering at the University of Navarra. Contact her at jhernantes@tecnun.es.

GORKA GALLARDO is a professor of information systems at the University of Navarra. Contact him at ggallardo@tecnun.es.

NICOLÁS SERRANO is a professor of computer science and software engineering at the University of Navarra. Contact him at nserrano@tecnun.es.

A CASE STUDY INVOLVING IT-MONITORING TOOLS
An electronic-components e-commerce company with a presence in Europe and the US wanted to increase IT availability and performance, which directly affect its profitability. It needed to ensure availability of its internal IT resources such as email, enterprise resource planning, and its e-commerce site. Any downtime on IT resources directly affects both internal and external stakeholders. So, the quick resolution of incidents or even avoiding them with proactive IT management is an important business goal.

The company had been using three IT-monitoring tools; each department had implemented its own solution that had a limited scope and responded to an urgent need. By implementing a unified suite, the company expected to increase collaboration and reduce resolution time.

The infrastructure consisted of datacenters with more than 50 servers. The suite needed to monitor SQL, disk space, memory, whether the host was up or down (via ping), log files, email service, and transaction performance on the basis of load time. When selecting the suite, the company considered functionalities and total costs (for licenses, hardware, configuration, training, maintenance, and so on). Because the IT staff had no experience with the suite, training different user roles was a priority during implementation.

A main objective was to reduce the resolution time of any incident because downtime directly affects the bottom line and customer satisfaction. So, during implementation, the company devoted considerable effort to setting up an alert system that was integrated with its ticketing system. The alert system automatically reroutes each event to the most relevant support area and level. Moreover, the configuration includes a fair amount of automation, with automatic server reboots and automatic provisioning of virtual instances based on puppets and scripts.

With this new proactive management of the infrastructure, the company has reduced service desk tickets by more than 30 percent. However, the benefits have gone beyond reduced support costs and increased availability. Now, the IT department can easily accommodate increasing demand and objectively justify investment in new infrastructure, and management staff can easily access service-level-agreement information.
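
The case study's alert system reroutes each event to the most relevant support area and level. The article doesn't describe how the company configured this, so the following is only a minimal sketch of the idea: the `Event` fields, the thresholds, the `SUPPORT_AREAS` mapping, and the two support levels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    check: str       # kind of check, e.g. "disk_space", "sql", "ping"
    value: float     # measured value, e.g. percent used or seconds
    warning: float   # value at or above which the event is a warning
    critical: float  # value at or above which the event is critical


# Hypothetical mapping from check type to the support area that owns it.
SUPPORT_AREAS = {
    "disk_space": "systems",
    "memory": "systems",
    "ping": "network",
    "sql": "database",
    "load_time": "e-commerce",
}


def severity(event: Event) -> str:
    """Classify an event against its own thresholds."""
    if event.value >= event.critical:
        return "critical"
    if event.value >= event.warning:
        return "warning"
    return "ok"


def route(event: Event):
    """Return (support area, support level), or None if no ticket is needed."""
    level = severity(event)
    if level == "ok":
        return None
    area = SUPPORT_AREAS.get(event.check, "service-desk")  # fallback for unknown checks
    support_level = 2 if level == "critical" else 1
    return area, support_level


print(route(Event("disk_space", value=95, warning=80, critical=90)))
# ('systems', 2)
```

In an integrated setup, the returned area and level would be passed to the ticketing system's own interface, which is outside the scope of this sketch.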