
CISA EXAM

PREPARATION

1
Chapter 4
Information Systems Operations and Business Resilience

2
Overview

Information systems operations and business
resilience are important to provide assurance to
users and management that the expected level of
service will be delivered.

Service level expectations are derived from the
organization’s business objectives.

Information technology (IT) service delivery includes
information systems (IS) operations, IT services and
management of IS and the groups responsible for
supporting them.

3
Overview

Disruptions are also an often-unavoidable factor of doing
business.

Preparation is key to being able to continue business
operations while protecting people, assets and
reputation.

Employing business resiliency tactics helps organizations
address these issues and limit the impact.

This domain represents 23 percent of the CISA
examination (approximately 34 questions).

4
Domain 4
Part A: Information Systems Operations
1. Common Technology Components
2. IT Asset Management
3. Job Scheduling and Production Process Automation
4. System Interfaces
5. End-User Computing
6. Data Governance
7. Systems Performance Management
8. Problem and Incident Management
9. Change, Configuration, Release, and Patch Management
10. IT Service Level Management
11. Database Management

Part B: Business Resilience
1. Business Impact Analysis (BIA)
2. System Resiliency
3. Data Backup, Storage, and Restoration
4. Business Continuity Plan (BCP)
5. Disaster Recovery Plans (DRPs)

5
Learning Objectives/Task Statement

Within this domain, the IS auditor should be able to:


• Evaluate the organization’s ability to continue business
operations. (T13)
• Evaluate whether IT service management practices
align with business requirements. (T20)
• Conduct periodic review of information systems and
enterprise architecture. (T21)
• Evaluate IT operations to determine whether they are
controlled effectively and continue to support the
organization’s objectives. (T22)
• Evaluate IT maintenance practices to determine
whether they are controlled effectively and continue to
support the organization’s objectives. (T23)

6
Learning Objectives/Task Statement

• Evaluate database management practices. (T24)
• Evaluate data governance policies and practices. (T25)
• Evaluate problem and incident management policies
and practices. (T26)
• Evaluate change, configuration, release, and patch
management policies and practices. (T27)
• Evaluate end-user computing to determine whether the
processes are effectively controlled. (T28)
• Evaluate policies and practices related to asset life cycle
management. (T33)

7
SELF-ASSESSMENT QUESTIONS 1

Which one of the following provides the BEST
method for determining the level of performance
provided by similar information processing facility
environments?

A. User satisfaction
B. Goal accomplishment
C. Benchmarking
D. Capacity and growth planning

C. Benchmarking provides a means of determining the level of
performance offered by similar information processing
facility environments.

8
SELF-ASSESSMENT QUESTIONS 2

For mission-critical systems with a low tolerance to
interruption and a high cost of recovery, the IS
auditor, in principle, recommends the use of which of
the following recovery options?

A. Mobile site
B. Warm site
C. Cold site
D. Hot site

D. Hot sites are fully configured and ready to operate within
several hours or, in some cases, even minutes.

9
SELF-ASSESSMENT QUESTIONS 3

Which of the following is the MOST effective method
for an IS auditor to use in testing the program change
management process?

A. Trace from system-generated information to the
change management documentation
B. Examine change management documentation for
evidence of accuracy
C. Trace from the change management
documentation to a system-generated audit trail
D. Examine change management documentation for
evidence of completeness

A. When testing change management, the IS auditor should
always start with system-generated information, containing
the date and time a module was last updated, and trace from
there to the documentation authorizing the change.

10
SELF-ASSESSMENT QUESTIONS 4

Which of the following would allow an enterprise to
extend its intranet across the Internet to its business
partners?

A. Virtual private network
B. Client-server
C. Dial-up access
D. Network service provider

A. Virtual private network (VPN) technology allows external
partners to securely participate in the extranet using public
networks as a transport or shared private network. Because
of low cost, using public networks (Internet) as a transport is
the principal method. VPNs rely on tunneling/encapsulation
techniques, which allow the Internet Protocol (IP) to carry a
variety of different protocols (e.g., SNA and IPX).

11
SELF-ASSESSMENT QUESTIONS 5

The classification based on criticality of a software
application as part of an IS business continuity plan is
determined by the:

A. nature of the business and the value of the
application to the business.
B. replacement cost of the application.
C. vendor support available for the application.
D. associated threats and vulnerabilities of the
application.

A. The criticality classification is determined by the role of the
application system in supporting the strategy of the organization.

12
SELF-ASSESSMENT QUESTIONS 6

When conducting an audit of client-server database
security, the IS auditor should be MOST concerned
about the availability of:

A. system utilities.
B. application program generators.
C. systems security documentation.
D. access to stored procedures.

A. System utilities may enable unauthorized changes to be made to data on the
client-server database. In an audit of database security, the controls over such
utilities would be the primary concern of the IS auditor.

13
SELF-ASSESSMENT QUESTIONS 7

When reviewing a network used for Internet
communications, an IS auditor will FIRST examine
the:

A. validity of password change occurrences.
B. architecture of the client-server application.
C. network architecture and design.
D. firewall protection and proxy servers.

C. The first step in auditing a network is to understand the
network architecture and design. Understanding the network
architecture and design provides an overall picture of the
network and its connectivity.

14
SELF-ASSESSMENT QUESTIONS 8

An IS auditor should be involved in:

A. observing tests of the disaster recovery plan.
B. developing the disaster recovery plan.
C. maintaining the disaster recovery plan.
D. reviewing the disaster recovery requirements of
supplier contracts.

A. The IS auditor should always be present when disaster
recovery plans are tested to ensure that the tested recovery
procedures meet the required targets for restoration, that
recovery procedures are effective and efficient, and to report
on the results, as appropriate.

15
SELF-ASSESSMENT QUESTIONS 9

Data mirroring should be implemented as a recovery
strategy when:

A. recovery point objective (RPO) is low.
B. recovery point objective (RPO) is high.
C. recovery time objective (RTO) is high.
D. disaster tolerance is high.

A. Recovery point objective (RPO) is the earliest point in time at
which it is acceptable to recover the data. In other words,
RPO indicates the age of the recovered data (i.e., how long
ago the data were backed up or otherwise replicated). If RPO
is very low, such as minutes, it means that the organization
cannot afford to lose even a few minutes of data. In such
cases, data mirroring (synchronous data replication) should
be used as a recovery strategy.
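
The reasoning in this answer can be sketched as a simple comparison: the worst-case data loss of a strategy is roughly the time between two replications, and the strategy fits only if that gap does not exceed the RPO. The function name and the figures below are hypothetical, chosen only for illustration.

```python
from datetime import timedelta

def meets_rpo(replication_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the worst-case data loss (the replication
    interval) stays within the recovery point objective."""
    return replication_interval <= rpo

# Hypothetical figures for illustration only.
rpo = timedelta(minutes=5)
nightly_backup = timedelta(hours=24)   # data can be up to a day old
synchronous_mirror = timedelta(0)      # each write is replicated as it is made

print(meets_rpo(nightly_backup, rpo))      # False
print(meets_rpo(synchronous_mirror, rpo))  # True
```

With an RPO of minutes, only the mirroring option passes the check, which matches answer A.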

16
SELF-ASSESSMENT QUESTIONS 10

Which of the following components of a business
continuity plan is PRIMARILY the responsibility of an
organization’s IS department?

A. Developing the business continuity plan
B. Selecting and approving the recovery strategies
used in the business continuity plan
C. Declaring a disaster
D. Restoring the IT systems and data after a disaster

D. The correct choice is restoring the IT systems and data after a
disaster. The IT department of an organization is primarily
responsible for restoring the IT systems and data after a
disaster within the designated timeframes.

17
PART A: Information
Systems Operations

18
Introduction

IT service management practices are important to
provide assurance to users and to management that
the expected level of service will be delivered.

Service level expectations are derived from the
organization’s business objectives.

IT service delivery includes IS operations, IT services,
and management of IS and the groups responsible
for supporting them.

19
4.1 COMMON TECHNOLOGY COMPONENTS

This section introduces:


• Technology components
• Hardware platforms
• Basic concepts of, and history behind, the different types
of computers
• Advances in IT

20
4.1.2 COMMON ENTERPRISE BACK-END DEVICES

Following are some of the most common devices
encountered:
• Print servers
• File servers
• Application (program) servers
• Web servers
• Proxy servers
• Database servers
• Appliances (specialized devices), such as firewalls,
IDS/IPS, switches, routers, VPNs and load balancers

21
4.1.3 UNIVERSAL SERIAL BUS

Risk Related to USBs

Risk related to the use of USBs includes the
following:
• Viruses and other malicious software
• Data theft
• Data and media loss
• Corruption of data
• Loss of confidentiality

22
Security Controls Related to USBs

The following controls can be used to help reduce
risk associated with the use of USB devices:
• Encryption
• Granular control
• Security personnel education
• Lock desktop policy enforcement
• Antivirus policy
• Use of secure devices only
• Inclusion of return information

23
4.1.4 RADIO FREQUENCY IDENTIFICATION

Applications of RFID
Applications of RFID include the following:
• Asset management (RFID-based asset management
systems)
• Tracking
• Authenticity verification
• Matching
• Process control
• Access control
• Supply chain management (SCM)

24
Security Controls for RFID

Some security controls for RFID include:
• Management: A management control involves
oversight of the security of the RFID system.
Policies are one example.
• Operational: An operational control involves the
actions performed on a daily basis by the system’s
administrators and users. Examples are operational
controls that ensure the physical security of the systems
and their correct use.
• Technical: A technical control uses technology to
monitor or restrict the actions that can be performed
within the system, such as encrypting wireless
communications.

25
4.1.5 HARDWARE MAINTENANCE PROGRAM

Hardware Monitoring Procedures

The following are typical procedures and reports for
monitoring the effective and efficient use of
hardware:
• Availability reports
• Hardware error reports
• Asset management reports
• Utilization reports

26
4.1.6 HARDWARE REVIEWS

• Hardware acquisition plan and execution
• IT asset management
• Capacity management and monitoring
• Preventive maintenance schedule
• Hardware availability and utilization reports
• Problem logs, job accounting system reports

27
4.2 IT ASSET MANAGEMENT

The first step in IT asset management is the process of
identifying and creating an inventory of IT assets.

The inventory record of each information asset should
include:
• Owner
• Designated custodian
• Specific identification of the asset
• Relative value to the organization
• Loss implications and recovery priority
• Location
• Security/risk classification
• Asset group (where the asset forms part of a larger
information system)
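
As an illustration, the inventory fields listed above could be held in a simple record type, with a check that flags records where a mandatory field was left blank. This is a minimal sketch: the class name, field names and sample values are hypothetical, and treating the asset group as optional is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AssetRecord:
    asset_id: str              # specific identification of the asset
    owner: str
    custodian: str             # designated custodian
    relative_value: str        # relative value to the organization
    loss_implications: str
    recovery_priority: int
    location: str
    risk_classification: str   # security/risk classification
    asset_group: Optional[str] = None  # only if part of a larger system

def missing_fields(record: AssetRecord) -> List[str]:
    """Return the names of mandatory fields that were left blank."""
    return [name for name, value in vars(record).items()
            if name != "asset_group" and value in ("", None)]

# Hypothetical record with the location left blank.
server = AssetRecord("SRV-001", "Finance", "IT Operations", "high",
                     "payroll processing halted", 1, "", "confidential")
print(missing_fields(server))  # ['location']
```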

28
4.4 SYSTEM INTERFACES

System interfaces exist where data output from one
application is sent as input to another, with little or no human
interaction. Interfaces that involve humans are usually called
user interfaces.

The primary objective of maintaining security of data being
transferred through system interfaces is to ensure that the
data from the originating system are the same as the data
that were downloaded and recorded in the recipient system.

The secondary objective is to prevent unauthorized access to
the data via interception, malicious activity, error or other
means.

Unavailability of system interfaces can also affect the
reliability of data.

29
4.4.3 CONTROLS ASSOCIATED WITH SYSTEM
INTERFACES
Controls need to be implemented. For example, an
organization may use a software package that
generates controls during the extraction and
automatically reconciles the data after they are
recorded on the receiving system.

Although automated controls are generally preferred
over manual controls, another control can be manual
reconciliation by a qualified person.
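
One way to picture an automated reconciliation control is to compute control totals (a record count and a hash of the content) when the data are extracted, then recompute and compare them on the receiving system. The sketch below assumes text records and uses SHA-256 from the Python standard library; the function names are hypothetical and do not describe any particular software package.

```python
import hashlib

def control_totals(records):
    """Return (record count, SHA-256 digest) for a list of text records."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("utf-8"))
        digest.update(b"\n")  # separator so record boundaries affect the hash
    return len(records), digest.hexdigest()

def reconcile(sent, received):
    """Return True if the received records match what was extracted."""
    return control_totals(sent) == control_totals(received)

extracted = ["1001,250.00", "1002,75.50"]
print(reconcile(extracted, ["1001,250.00", "1002,75.50"]))  # True
print(reconcile(extracted, ["1001,250.00"]))                # False
```

A mismatch in either total signals that records were lost, duplicated or altered in transit and should be investigated.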

30
4.4.3 CONTROLS ASSOCIATED WITH SYSTEM
INTERFACES
IS auditors should also ascertain if the organization is
using encryption.

There also should be a control over nonrepudiation.

To ensure that an audit trail is associated with the
system interface, the organization needs to capture
important information, including who sent the data,
when they were sent, when they were received,
what data structure (e.g., xls, csv, txt or xml) was
used, how the data were sent and who received the
data.
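
The audit trail items named above map naturally onto one log entry per transfer. The following is a minimal sketch; the class name, field names and sample values (system names, SFTP as the transport) are hypothetical.

```python
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass(frozen=True)  # frozen: a log entry should not change once written
class TransferLogEntry:
    sender: str            # who sent the data
    sent_at: datetime      # when they were sent
    received_at: datetime  # when they were received
    data_format: str       # data structure used, e.g. "csv" or "xml"
    transport: str         # how the data were sent
    receiver: str          # who received the data

entry = TransferLogEntry(
    sender="billing-system",
    sent_at=datetime(2024, 1, 15, 2, 0, 0),
    received_at=datetime(2024, 1, 15, 2, 0, 7),
    data_format="csv",
    transport="SFTP",
    receiver="general-ledger",
)
print(asdict(entry))
```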

31
4.5 END-USER COMPUTING

End users are the people who access business
applications that were programmed, serviced and
installed by others.

End-user computing (EUC) refers to the ability of
end users (who typically are not programmers) to
design and implement their own application or
information system using computer software
products.

One of the benefits of EUC is that users can quickly
build and deploy applications, taking pressure off
the IT department.

32
4.5 END-USER COMPUTING

Lack of IT department involvement in EUC also brings
associated risk, because the applications may not be
subject to an independent review and, frequently, are
not created in the context of a formal development
methodology.

This lack of IT department involvement can result in
applications that:
• May contain errors and give incorrect results
• Are not subject to change management or release
management, resulting in multiple, perhaps different,
copies
• Are not secured
• Are not backed up

33
4.5 END-USER COMPUTING

The lack of IT department oversight of EUC can lead
to security risk. Examples include:
• Authorization: There may be no secure mechanism to
authorize access to the system.
• Authentication: There may be no secure mechanism to
authenticate users to the system.
• Audit logging: This is not available on standard EUC
solutions (e.g., Microsoft Excel and Access).
• Encryption: The application may contain sensitive data
which have not been encrypted or otherwise protected.

34
Discussion Question

Which of the following is a prevalent risk in the development
of end-user computing (EUC) applications?

A. Applications may not be subject to testing and IT
general controls.
B. Development and maintenance costs may be
increased.
C. Application development time may be increased.
D. Decision-making may be impaired due to diminished
responsiveness to requests for information.

A is the correct answer.

Justification:
A. End-user computing (EUC) is defined as the ability of end users to design and
implement their own information system utilizing computer software products.
End-user developed applications may not be subjected to an independent
outside review by systems analysts and frequently are not created in the
context of a formal development methodology. These applications may lack
appropriate standards, controls, quality assurance procedures, and
documentation. A risk of end-user applications is that management may rely on
them as much as traditional applications.
B. EUC systems typically result in reduced application development and
maintenance costs.
C. EUC systems typically result in a reduced development cycle time.
D. EUC systems normally increase flexibility and responsiveness to management’s
information requests because the system is being developed directly by the user
community.

35
4.6 DATA GOVERNANCE

Data governance ensures that:
• Stakeholder needs, conditions and options are evaluated
to determine balanced, mutually agreed enterprise
objectives to be achieved through the acquisition and
management of data/information resources.
• Direction is set for data/information management
capabilities through prioritization and decision making.
• Performance and compliance of data/information
resources are monitored and evaluated relative to
mutually agreed-upon (by all stakeholders) direction and
objectives.

36
4.6.1 DATA MANAGEMENT

The Data Management Body of Knowledge (DMBOK)
defines data management as:

“the planning and execution of policies,
practices, and projects that acquire, control,
protect, deliver, and enhance the value of
data and information assets.”

37
4.6.1 DATA MANAGEMENT

Data Quality
Data quality is key to data management. There are
three subdimensions of quality:
• Intrinsic
• Contextual
• Security/accessibility

Each subdimension is divided further into several
quality criteria.

38
Data Quality

39
Data Life Cycle

Data life cycle management describes the stages that
data go through in the course of existence in an
organization.

40
4.7.3 ACCESS CONTROL SOFTWARE

• System software selection procedures
• Feasibility study and selection process
• System software security
• IT asset management
• System software implementation
• Authorization documentation
• System documentation
• System software maintenance activities
• System software change controls
• System software installation change controls

41
4.7.6 SOFTWARE LICENSING ISSUES

Free software licensing types
• Open source
• Freeware
• Shareware

Paid software licensing types
• Per central processing unit (CPU)
• Per seat (unique users)
• Concurrent users
• Utilization
• Per workstation
• Enterprise

42
4.7.6 SOFTWARE LICENSING ISSUES

Options available to prevent software license
violations include:
• Ensure a good software asset management process
exists.
• Centralize control, distribution and installation of
software (includes disabling the ability of users to
install software, where possible).
• Require that all PCs be restricted workstations with
disabled or locked-down disk drives, USB ports, etc.

43
4.7.6 SOFTWARE LICENSING ISSUES

• Install metering software on the LAN and require that all
PCs access applications through the metered software.
• Regularly scan user network endpoints to ensure that
unauthorized copies of software have not been loaded
(achieved by comparing actual software loaded to the
list of software assets).
• Enforce documented policies and procedures that
require users to sign an agreement not to install
software without management authorization and a
software license agreement.
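
The endpoint scan described above comes down to comparing what is actually installed with the list of approved software assets. A minimal sketch, assuming both lists are already available as plain names (the function name and sample titles are hypothetical):

```python
def unauthorized_software(installed, approved):
    """Return installed titles that are not in the approved asset list."""
    return sorted(set(installed) - set(approved))

# Hypothetical scan result for one endpoint.
approved_assets = ["Office Suite", "PDF Reader", "VPN Client"]
found_on_pc = ["Office Suite", "Torrent Client", "VPN Client", "Game Launcher"]

print(unauthorized_software(found_on_pc, approved_assets))
# ['Game Launcher', 'Torrent Client']
```

In practice, software names and versions would need to be normalized before comparison; that step is left out here.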

44
Discussion Question

An IS auditor discovers that some users have installed
personal software on their PCs. This is not explicitly forbidden
by the security policy. Of the following, the BEST approach for
an IS auditor is to recommend that the:

A. IT department implement control mechanisms to
prevent unauthorized software installation.
B. security policy be updated to include specific language
regarding unauthorized software.
C. IT department prohibit the download of unauthorized
software.
D. users obtain approval from an IS manager before
installing nonstandard software.

B is the correct answer.

Justification:
A. An IS auditor’s obligation is to report on observations noted and make the best
recommendation, which is to address the situation through policy. The IT
department cannot implement controls in the absence of the authority provided
through policy.
B. Lack of specific language addressing unauthorized software in the acceptable
use policy is a weakness in administrative controls. The policy should be
reviewed and updated to address the issue—and provide authority for the IT
department to implement technical controls.
C. Preventing downloads of unauthorized software is not the complete solution.
Unauthorized software can be also introduced through compact discs (CDs) and
universal serial bus (USB) drives.
D. Requiring approval from the IS manager before installation of the nonstandard
software is an exception handling control. It would not be effective unless a
preventive control to prohibit user installation of unauthorized software is
established first.

45
4.7.7 SOURCE CODE MANAGEMENT

Source code should be managed using a version
control system (VCS), often called revision control
software (RCS).

These systems maintain a central repository, which allows
programmers to check out a program source to make
changes to it. Checking in the source creates a new
revision of the program.

46
4.7.7 SOURCE CODE MANAGEMENT

The advantages of VCSs include:
• Control of source code access
• Tracking of source code changes
• Allowing for concurrent development
• Allowing rollback to earlier versions
• Allowing for branching

The IS auditor should be aware of the following:
• Who has access to source code
• Who can commit the code
• Alignment of program source code to program
objects
• Alignment with change and release management
• Backups of source code including those offsite
and escrow agreements

47
4.7.8 CAPACITY MANAGEMENT

Capacity management is the planning and
monitoring of computing and network resources
to ensure that the available resources are
used efficiently and effectively.

The following information is key to the successful
completion of this task:
• CPU utilization
• Computer storage utilization
• Telecommunications
• LAN and WAN bandwidth utilization
• I/O channel utilization
• Number of users
• New technologies
• New applications
• Service level agreements
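
Monitoring the utilization figures listed above usually involves comparing each measurement with a threshold and flagging the resources that need attention. The sketch below is illustrative only: the 80 percent threshold, the function name and the sample measurements are assumptions, not values from the source.

```python
def over_threshold(utilization, threshold=0.80):
    """Return the resources whose utilization ratio exceeds the threshold."""
    return {name: used for name, used in utilization.items()
            if used > threshold}

# Hypothetical measurements, expressed as a fraction of capacity.
measurements = {
    "cpu": 0.65,
    "storage": 0.91,
    "wan_bandwidth": 0.80,
    "io_channel": 0.42,
}
print(over_threshold(measurements))  # {'storage': 0.91}
```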

48
4.7.8 CAPACITY MANAGEMENT

Larger organizations often have hundreds, if not
thousands, of servers that are arrayed in groups
referred to as server farms.

Where virtual servers are used, these may be
organized as private (also known as internal or
corporate) clouds.

If an organization has put data storage hardware in
place, the IS auditor should review the capacity
management plans, which involve data storage
utilization and storage area network (SAN) utilization.

49
Capacity Planning and Monitoring Elements

50
4.8.1 PROBLEM MANAGEMENT

Problem management aims to resolve issues
through the investigation and in-depth analysis of a
major incident or several incidents that are similar
in nature to identify the root cause.

Standard methodologies for root cause analysis
include the development of fishbone/Ishikawa
cause-and-effect diagrams, brainstorming and the
use of the 5 Whys.

51
Fishbone/Ishikawa Cause-and-effect Diagrams

52
4.8.1 PROBLEM MANAGEMENT

After a problem is identified and analysis has
identified a root cause, the condition becomes a
known error.

This problem is then added to the known error
database (KEDB).

The goal is to proactively prevent recurrence of the
error elsewhere or, at a minimum, have a
workaround that can be provided immediately
should the incident recur.
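
A KEDB can be pictured as a lookup from an error signature to its known root cause and workaround, so that support staff can respond immediately when the same incident appears again. This is a minimal in-memory sketch; the signature, the entry contents and the function name are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical known error, keyed by an error signature.
KEDB: Dict[str, Dict[str, str]] = {
    "DB-CONN-POOL-EXHAUSTED": {
        "root_cause": "Connection leak in nightly batch job",
        "workaround": "Restart the batch service to release connections",
    },
}

def find_workaround(signature: str,
                    kedb: Dict[str, Dict[str, str]] = KEDB) -> Optional[str]:
    """Return the documented workaround, or None if the error is not known."""
    entry = kedb.get(signature)
    return entry["workaround"] if entry else None

print(find_workaround("DB-CONN-POOL-EXHAUSTED"))
print(find_workaround("UNKNOWN-ERROR"))  # None
```

A result of None indicates a new condition that would go through problem analysis before being added to the database.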

53
4.8.1 PROBLEM MANAGEMENT

Objective

Problem Management
• Reduce the number and/or severity of incidents.
• Improve the quality of service of an IS
organization.

Incident Management
• React to issues as they arise.
• Return the affected process back to normal
service quickly.
• Minimize business impacts of incidents.

© Copyright 2016 ISACA. All rights reserved.

Instructor Note: Consider mentioning that problem management focuses on incident
reduction. Incident and problem management are related but have different methods
and objectives. These are compared on this slide.

Source: ISACA, CISA Review Manual 26th Edition, 4.2.5 Incident and Problem
Management

54
4.8.2 PROCESS OF INCIDENT HANDLING

Incident management focuses on providing increased
continuity of service by reducing or removing the
adverse effect of disturbances to IT services.

It is essential for any incident handling process to
prioritize items after determining the impact and
urgency.

Incident management is reactive, and its objective is
to respond to and resolve issues, restoring normal
service (as defined by the SLA) as quickly as possible.
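
Prioritizing by impact and urgency is often done with a small matrix. The sketch below uses one possible scheme, assumed here for illustration: each factor is ranked from 1 (high) to 3 (low), and the two ranks are combined into a priority from 1 (handle first) to 5 (handle last). The incidents are hypothetical.

```python
RANK = {"high": 1, "medium": 2, "low": 3}

def priority(impact: str, urgency: str) -> int:
    """Combine impact and urgency into a priority from 1 (first) to 5 (last)."""
    return RANK[impact] + RANK[urgency] - 1

# Hypothetical open incidents: (description, impact, urgency).
incidents = [
    ("Password reset for one user", "low", "medium"),
    ("Payment gateway down", "high", "high"),
    ("Report printer offline", "medium", "low"),
]

# Work the queue in priority order.
queue = sorted(incidents, key=lambda item: priority(item[1], item[2]))
for description, impact, urgency in queue:
    print(priority(impact, urgency), description)
```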

55
4.8.3 DETECTION, DOCUMENTATION, CONTROL,
RESOLUTION & REPORTING OF ABNORMAL CONDITIONS

56
4.8.4 SUPPORT/HELP DESK

The responsibility of the technical support function
is to provide specialist knowledge of production
systems to identify and assist in system
change/development and problem resolution.

In addition, it is technical support’s responsibility to
apprise management of current technologies that
may benefit overall operations.

57
Typical Support Functions

58
4.8.5 NETWORK MANAGEMENT TOOLS

Response time reports identify the time necessary
for a command entered by a user at a terminal to be
answered by the host system.

Downtime reports track the availability of
telecommunication lines and circuits.

Help desk reports are prepared by the help desk,
which is staffed or supported by IT technicians who
are trained to handle problems occurring during
normal IS usage.

59
4.8.5 NETWORK MANAGEMENT TOOLS

Online monitors check data transmission accuracy
and errors. Monitoring can be performed by echo
checking (received data are bounced back to sender
for verification) and status checking all transmissions,
ensuring that messages are not lost or transmitted
more than once.
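
Echo checking can be illustrated with a toy simulation: the message travels over a channel, the receiver bounces back what it received, and the sender compares the echo with the original. The channel is modeled here as a plain function, which is an assumption made only for this sketch; real monitors work at the link or protocol level.

```python
def echo_check(message: bytes, channel) -> bool:
    """Send a message, have it echoed back, and compare with the original."""
    received = channel(message)   # sender -> receiver
    echoed = channel(received)    # receiver bounces the data back
    return echoed == message

def clean_channel(data: bytes) -> bytes:
    """A channel that delivers data unchanged."""
    return data

def lossy_channel(data: bytes) -> bytes:
    """A faulty channel that drops the last byte."""
    return data[:-1]

print(echo_check(b"TRANSFER 100.00", clean_channel))  # True
print(echo_check(b"TRANSFER 100.00", lossy_channel))  # False
```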

Network monitors provide a real-time display of
network nodes and status.

60
4.8.5 NETWORK MANAGEMENT TOOLS

Network (protocol) analyzers are diagnostic tools
attached to a network link that use network
protocols’ intelligence for monitoring the packets
flowing along the link and produce network usage
reports.

Simple Network Management Protocol (SNMP) is a
TCP/IP-based protocol that monitors and controls
variables throughout the network, manages
configurations and collects statistics on performance
and security.

61
4.8.6 PROBLEM MANAGEMENT REPORTING
REVIEWS

62
4.9.1 PATCH MANAGEMENT

Several products are available to automate patch
management tasks.

Patches can be ineffective and can cause more
problems than they fix. To avoid problems, patch
management experts suggest that system
administrators take simple steps, such as performing
backups and testing patches on non-critical systems
prior to installations.

Patch management can be viewed as part of change
management.

63
4.9.2 RELEASE MANAGEMENT

Major release
• Normally contains a significant change or
addition of new functionality
• These usually supersede all preceding minor
upgrades

Minor release
• Upgrades, offering small enhancements and fixes
• Usually supersedes all preceding emergency fixes

Emergency release
• Normally contains corrections to a small number
of known problems
• These require implementation as quickly as
possible, limiting the execution of testing and
release management activities

64
Discussion Question

During fieldwork, an IS auditor experienced a system crash
caused by a security patch installation. To provide reasonable
assurance that this event will not recur, the IS auditor should
ensure that:

A. only systems administrators perform the patch process.
B. the client’s change management process is adequate.
C. patches are validated using parallel testing in
production.
D. an approval process of the patch, including a risk
assessment, is developed.

B is the correct answer.

Justification:
A. While system administrators would normally install patches, it is more important
that changes be made according to a formal procedure that includes testing and
implementing the change during nonproduction times.
B. The change management process, which would include procedures regarding
implementing changes during production hours, helps to ensure that this type
of event does not recur. An IS auditor should review the change management
process, including patch management procedures, to verify that the process has
adequate controls and to make suggestions accordingly.
C. While patches would normally undergo testing, it is often impossible to test all
patches thoroughly. It is more important that changes be made during
nonproduction times, and that a backout plan is in place in case of problems.
D. An approval process alone could not directly prevent this type of incident from
happening. There should be a complete change management process that
includes testing, scheduling and approval.

65
Discussion Question

Which of the following ways is the BEST for an IS auditor to
verify that critical production servers are running the latest
security updates released by the vendor?

A. Ensure that automatic updates are enabled on critical
production servers.
B. Verify manually that the patches are applied on a
sample of production servers.
C. Review the change management log for critical
production servers.
D. Run an automated tool to verify the security patches
on production servers.

D is the correct answer.

Justification:
A. Ensuring that automatic updates are enabled on production servers may be a
valid way to manage the patching process; however, this would not provide
assurance that all servers are being patched appropriately.
B. Verifying patches manually on a sample of production servers will be less effective
than automated testing and introduces a significant audit risk. Manual testing is
also difficult and time consuming.
C. The change management log may not be updated on time and may not accurately
reflect the patch update status on servers. A better testing strategy is to test the
server for patches, rather than examining the change management log.
D. An automated tool can immediately provide a report on which patches have
been applied and which are missing.

66
Discussion Question

Which of the following processes should an IS auditor
recommend to assist in the recording of baselines for
software releases?

A. Change management
B. Backup and recovery
C. Incident management
D. Configuration management

D is the correct answer.

Justification:
A. Change management is important to control changes to the configuration, but the
baseline itself refers to a standard configuration.
B. Backup and recovery of the configuration are important, but not used to create
the baseline.
C. Incident management will determine how to respond to an adverse event, but is
not related to recording baseline configurations.
D. The configuration management process may include automated tools that will
provide an automated recording of software release baselines. Should the new
release fail, the baseline will provide a point to which to return.

67
Discussion Question

In a small organization, developers may release emergency
changes directly to production. Which of the following will
BEST control the risk in this situation?

A. Approve and document the change the next business
day.
B. Limit developer access to production to a specific time
frame.
C. Obtain secondary approval before releasing to
production.
D. Disable the compiler option in the production machine.

A is the correct answer.

Justification:
A. It may be appropriate to allow programmers to make emergency changes as
long as they are documented and approved after the fact.
B. Restricting release time frame may help somewhat; however, it would not apply
to emergency changes and cannot prevent unauthorized release of the programs.
C. Obtaining secondary approval before releasing to production is not relevant in an
emergency situation.
D. Disabling the compiler option in the production machine is not relevant in an
emergency situation.

68
4.9.3 IS OPERATIONS

IS operations are processes and activities that
support and manage the entire IS infrastructure,
systems, applications and data, focusing on day-to-day
activities.

IS operations staff is responsible for the accurate and
efficient operation of the network, systems and
applications and for the delivery of high-quality IS
services to business users and customers.

69
4.9.3 IS OPERATIONS

Tasks of the IS operations staff include:


• Execute and monitor scheduled jobs.
• Facilitate timely backup.
• Monitor unauthorized access and use of sensitive data.
• Monitor and review the extent of adherence to IS
operations procedures as established by IS and business
management.
• Participate in tests of disaster recovery plans (DRPs).
• Monitor the performance, capacity, availability and
failure of information resources.
• Facilitate troubleshooting and incident handling.

70
IS Operations Reviews

• Observe IS personnel
• Review operator access
• Consider adequacy of operator manuals
• Examine access to the library
• Consider contents/location of offline storage
• Examine file handling procedures
• Examine data entry processes
• Review lights-out operations

71
4.10 IT SERVICE LEVEL MANAGEMENT

IT services can be better managed with an SLA, and
the services offered form a basis for such
agreements.

SLAs can also be supported by operational level
agreements (OLAs), which are internal agreements
covering the delivery of services that support the IT
organization in its delivery of services.

72
4.10.1 SERVICE LEVEL AGREEMENTS

An SLA is an agreement between the IT organization
and the customer. The SLA details the service(s) to
be provided.

The IT organization could be an internal IT
department or an external IT service provider, and
the customer is the business.

During the term of the agreement, it serves as the
standard for measuring and adjusting the services.

73
4.10.2 MONITORING OF SERVICE LEVELS

Defined service levels must be regularly monitored
by an appropriate level of management to ensure
that the objectives of IS operations are achieved. It is
also important to review the impact on the
customers and other stakeholders.

For example, a bank may be monitoring the
performance and availability of its automated teller
machines (ATMs). One of the metrics may be
availability of ATM services at expected levels (99.9
percent); however, it may also be appropriate to
monitor the impact on customer satisfaction due to
nonavailability.
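
To make an availability target such as 99.9 percent concrete, it can help to convert it into a downtime allowance. The following is a minimal Python sketch and is not part of the source material; the function names and the 30-day (720-hour) measurement period are assumptions chosen for illustration.

```python
def allowed_downtime_minutes(availability_pct: float, period_hours: float) -> float:
    """Downtime (in minutes) that an availability target permits over a period."""
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_hours * 60


def measured_availability_pct(downtime_minutes: float, period_hours: float) -> float:
    """Availability (in percent) actually achieved, given observed downtime."""
    return 100 * (1 - downtime_minutes / (period_hours * 60))


# Hypothetical 30-day reporting period for an ATM service.
PERIOD_HOURS = 30 * 24

budget = allowed_downtime_minutes(99.9, PERIOD_HOURS)
achieved = measured_availability_pct(60, PERIOD_HOURS)  # one hour of outage
target_met = achieved >= 99.9
```

Under these assumptions, a 99.9 percent target allows roughly 43 minutes of downtime in a 30-day period, so a single one-hour outage would miss the target. The metric alone still says nothing about customer satisfaction, which has to be monitored separately.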

74
Discussion Question

Which of the following issues should be a MAJOR concern to
an IS auditor who is reviewing a service level agreement
(SLA)?

A. A service adjustment resulting from an exception
report took a day to implement.
B. The complexity of application logs used for service
monitoring made the review difficult.
C. Performance measures were not included in the SLA.
D. The document is updated on an annual basis.

C is the correct answer.

Justification:
A. Resolving issues related to exception reports is an operational issue that should
be addressed in the service level agreement (SLA); however, a response time of
one day may be acceptable depending on the terms of the SLA.
B. The complexity of application logs is an operational issue, which is not related to
the SLA.
C. Lack of performance measures will make it difficult to gauge the efficiency and
effectiveness of the IT services being provided.
D. While it is important that the document be current, depending on the term of the
agreement, it may not be necessary to change the document more frequently
than annually.

75
Discussion Question

During a human resources (HR) audit, an IS auditor is
informed that there is a verbal agreement between the IT and
HR departments as to the level of IT services expected. In this
situation, what should the IS auditor do FIRST?

A. Postpone the audit until the agreement is
documented.
B. Report the existence of the undocumented agreement
to senior management.
C. Confirm the content of the agreement with both
departments.
D. Draft a service level agreement (SLA) for the two
departments.

C is the correct answer.

Justification:
A. There is no reason to postpone an audit because a service agreement is not
documented, unless that is all that is being audited. The agreement can be
documented after it has been established that there is an agreement in place.
B. Reporting to senior management is not necessary at this stage of the audit
because this is not a serious immediate vulnerability.
C. An IS auditor should first confirm and understand the current practice before
making any recommendations. Part of this will be to ensure that both parties
are in agreement with the terms of the agreement.
D. Drafting a service level agreement (SLA) is not the IS auditor’s responsibility.

76
Discussion Question

Which of the following is the BEST reference for an IS auditor
to determine a vendor’s ability to meet service level
agreement (SLA) requirements for a critical IT security
service?

A. Compliance with the master agreement
B. Agreed-on key performance metrics
C. Results of business continuity tests
D. Results of independent audit reports

B is the correct answer.

Justification:
A. The master agreement typically includes terms, conditions and costs but does not
typically include service levels.
B. Metrics allow for a means to measure performance. Service level agreements
(SLAs) are statements related to expected service levels. For example, an
Internet service provider (ISP) may guarantee that their service will be available
99.99 percent of the time.
C. If applicable to the service, results of business continuity tests are typically
included as part of the due diligence review.
D. Independent audits report on the financial condition of an organization or the
control environment. Reviewing audit reports is typically part of the due diligence
review. Even audits must be performed against a set of standards or metrics to
validate compliance.

77
4.11 DATABASE MANAGEMENT

Database management system (DBMS) software
offers several benefits:
• Aids in organizing, controlling and using the data needed
by application programs
• Provides the facility to create and maintain a
well-organized database
• Reduces data redundancy and access time, while
offering basic security over sensitive data

78
4.11.2 DATABASE STRUCTURE

79
4.11.3 DATABASE CONTROLS

• Enforced definition standards
• Data backup and recovery procedures
• Access control levels
• Updates by authorized personnel only
• Controls on concurrent updating of same data
• Checks on data accuracy, completeness and consistency
• Job stream checkpoints
• Database reorganization to ensure efficiency
• Database restructuring procedures
• Use of performance reporting tools
• Minimize use of non-system tools or utilities

80
4.11.4 DATABASE REVIEWS

• Logical schema
• Physical schema
• Access time reports
• Database security controls
• Interfaces with other software
• Backup and disaster recovery procedures and controls
• Database-supported IS controls
• IT asset management

81
Discussion Question

The database administrator (DBA) suggests that database
efficiency can be improved by denormalizing some tables.
This would result in:

A. loss of confidentiality.
B. increased redundancy.
C. unauthorized accesses.
D. application malfunctions.

B is the correct answer.

Justification:
A. Denormalization should not cause loss of confidentiality even though confidential
data may be involved. The database administrator (DBA) should ensure that
access controls to the databases remain effective.
B. Normalization is a design or optimization process for a relational database that
minimizes redundancy; therefore, denormalization would increase redundancy.
Redundancy, which is usually considered positive when it is a question of
resource availability, is negative in a database environment because it demands
additional and otherwise unnecessary data handling efforts. Denormalization is
sometimes advisable for functional reasons.
C. Denormalization pertains to the structure of the database, not the access
controls. It should not result in unauthorized access.
D. Denormalization may require some changes to the calls between databases and
applications, but should not cause application malfunctions.

82
Discussion Question

Segmenting a highly sensitive database results in:

A. reduced exposure.
B. reduced threat.
C. less criticality.
D. less sensitivity.

A is the correct answer.

Justification:
A. Segmenting data reduces the quantity of data exposed as a result of a particular
event.
B. The threat may remain constant, but each segment may represent a different
vector against which it must be directed.
C. Criticality (availability) of data is not affected by the manner in which it is
segmented.
D. Sensitivity of data is not affected by the manner in which it is segmented.

83
Discussion Question

An IS auditor observed that users are occasionally granted the
authority to change system data. This elevated system access
is not consistent with company policy yet is required for
smooth functioning of business operations. Which of the
following controls would the IS auditor MOST likely
recommend for long-term resolution?

A. Redesign the controls related to data authorization.
B. Implement additional segregation of duties controls.
C. Review policy to see if a formal exception process is
required.
D. Implement additional logging controls.

C is the correct answer.

Justification:
A. Data authorization controls should be driven by the policy. While there may be
some technical controls that could be adjusted, if the data changes happen
infrequently, then an exception process would be the better choice.
B. While adequate segregation of duties is important, it is simpler to fix the policy
versus adding additional controls to enforce segregation of duties.
C. If the users are granted access to change data in support of the business
requirements, but the policy forbids this, then perhaps the policy needs some
adjustment to allow for policy exceptions to occur.
D. Audit trails are needed, but this is not the best long-term solution to address this
issue. Additional resources would be required to review logs.

84
Discussion Question

During an audit of a small enterprise, the IS auditor noted that the
IS director has superuser-privilege access that allows the director to
process requests for changes to the application access roles (access
types). Which of the following should the IS auditor recommend?

A. Implement a properly documented process for application
role change requests.
B. Hire additional staff to provide a segregation of duties
(SoD) for application role changes.
C. Implement an automated process for changing application
roles.
D. Document the current procedure in detail, and make it
available on the enterprise intranet.

A is the correct answer.

Justification:
A. The IS auditor should recommend implementation of processes that could
prevent or detect improper changes from being made to the major application
roles. The application role change request process should start and be approved
by the business owner; then, the IS director can make the changes to the
application.
B. While it is preferred that a strict segregation of duties (SoD) be adhered to and
that additional staff be recruited, this practice is not always possible in small
enterprises. The IS auditor must look at recommended alternative processes.
C. An automated process for managing application roles may not be practical to
prevent improper changes being made by the IS director, who also has the most
privileged access to the application.
D. Making the existing process available on the enterprise intranet would not
provide any value to protect the system.

85
Discussion Question

Which of the following choices BEST ensures accountability
when updating data directly in a production database?

A. Before and after screen images
B. Approved implementation plans
C. Approved validation plan
D. Data file security

A is the correct answer.

Justification:
A. Creating before and after images is the best way to ensure that the appropriate
data have been updated in a direct data change. The screen shots would include
the data prior to and after the change.
B. Having approved implementation plans would verify that the change was
approved to be implemented but will not ensure that the appropriate change was
made.
C. Having an approved validation plan will ensure that the data change had a
validation plan designed prior to the data change but will not ensure that the data
change was appropriate and correct.
D. Data file security would only ensure that the user making the data change was
appropriate. It would not ensure that the data change was correct.

86
PART B: Business
Resilience

87
4.12 BUSINESS IMPACT ANALYSIS

BIA is used to evaluate the critical processes (and IT
components supporting them) and to determine
time frames, priorities, resources and
interdependencies.

There are different approaches for performing a BIA.
One popular approach is a questionnaire approach,
which involves developing a detailed questionnaire
and circulating it to key users in IT and end-user
areas.

88
4.12 BUSINESS IMPACT ANALYSIS

Another popular approach is to interview groups of
key users.

A third approach is to bring relevant IT personnel and
end users (i.e., those owning the critical processes)
together in a room to come to a conclusion regarding
the potential business impact of various levels of
disruptions.

89
4.12 BUSINESS IMPACT ANALYSIS

There are two independent cost factors to consider:
• The downtime cost of the disaster
• The cost of the alternative corrective measures

90
4.12 BUSINESS IMPACT ANALYSIS

The sum of all costs (downtime and recovery)
should be minimized.

The first group (downtime costs) increases with time,
and the second (recovery costs) decreases with time;
the sum usually is a U curve.
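
The U-shaped total can be illustrated with a short Python sketch. It is not taken from the source: both cost functions and every figure in them are hypothetical, chosen only so that one cost rises and the other falls as the allowed recovery time grows.

```python
def downtime_cost(recovery_hours: float) -> float:
    """Assumed: losses accumulate at a flat 5,000 per hour of outage."""
    return 5_000 * recovery_hours


def recovery_cost(recovery_hours: float) -> float:
    """Assumed: faster recovery needs costlier measures (200,000 / hours)."""
    return 200_000 / recovery_hours


def cheapest_recovery_time(candidate_hours):
    """Return (hours, total_cost) for the candidate with the lowest total cost."""
    totals = {h: downtime_cost(h) + recovery_cost(h) for h in candidate_hours}
    best = min(totals, key=totals.get)
    return best, totals[best]


# Evaluate recovery times from 1 to 48 hours.
best_hours, best_total = cheapest_recovery_time(range(1, 49))
```

With these assumed figures the total is high at both ends of the range and lowest at six hours, which is the bottom of the U. In practice the two curves come from the BIA and from costing the recovery alternatives, not from simple formulas.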

91
4.12.1 CLASSIFICATION OF OPERATIONS AND
CRITICALITY ANALYSIS
Many organizations use a risk of occurrence to
determine a reasonable cost of being prepared.

For example, they may determine that there is a 0.1
percent risk (or 1 in 1,000) that over the next five
years the organization will suffer a serious disruption.

If the assessed impact of a disruption is US $10
million, then the maximum reasonable cost of being
prepared might be US $10 million × 0.1 percent = US
$10,000 over five years.
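
The arithmetic in this example can be restated as a small Python sketch. The function name is an assumption made for illustration, and the per-year figure is a simple division added here; it does not appear in the source.

```python
def max_reasonable_preparedness_cost(impact: float, probability: float) -> float:
    """Expected loss: assessed impact multiplied by the probability of occurrence."""
    return impact * probability


IMPACT_USD = 10_000_000   # assessed impact of a serious disruption
PROBABILITY = 0.001       # 0.1 percent, or 1 in 1,000, over five years
HORIZON_YEARS = 5

budget_over_horizon = max_reasonable_preparedness_cost(IMPACT_USD, PROBABILITY)
budget_per_year = budget_over_horizon / HORIZON_YEARS  # illustrative only
```

The result matches the US $10,000 over five years stated above. The calculation is only as good as the probability estimate, which is usually the weakest input.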

92
4.13.1 APPLICATION RESILIENCY AND DISASTER
RECOVERY METHODS
Protecting an application against a disaster entails
providing a way to restore it as quickly as possible.
Clustering makes it possible to do so.

A cluster is a type of software (agent) that is installed
on every server (node) in which the application runs
and includes management software that permits
control of and tuning the cluster behavior.

Clustering protects against single points of failure (a
resource whose loss would result in the loss of service
or production). The main purpose of clustering is higher
availability.

93
4.13.1 APPLICATION RESILIENCY AND DISASTER
RECOVERY METHODS
There are two major types of application clusters:
active-passive and active-active.

In active-passive clusters, the application runs on
only one (active) node, while other (passive) nodes
are used only if the application fails on the active
node.

In active-active clusters, the application runs on
every node of the cluster. With this setup, cluster
agents coordinate the information processing
between all of the nodes, providing load balancing.
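
The difference between the two cluster types can be sketched in a few lines of Python. This is a simplified illustration and not how any particular cluster product works; the function name, node names and the idea of a priority-ordered node list are all assumptions.

```python
def serving_nodes(nodes, healthy, mode):
    """Return the nodes that should run the application.

    nodes   -- node names in priority order (hypothetical)
    healthy -- set of node names currently passing health checks
    mode    -- "active-passive" or "active-active"
    """
    up = [n for n in nodes if n in healthy]
    if mode == "active-passive":
        # Only one node runs the application; the next healthy node
        # takes over when the active node fails.
        return up[:1]
    if mode == "active-active":
        # Every healthy node runs the application and shares the load.
        return up
    raise ValueError(f"unknown cluster mode: {mode}")


cluster = ["node-a", "node-b", "node-c"]

# node-a has failed; node-b and node-c remain healthy.
after_failure = {"node-b", "node-c"}
passive_result = serving_nodes(cluster, after_failure, "active-passive")
active_result = serving_nodes(cluster, after_failure, "active-active")
```

In both modes the service survives the loss of a single node, which is the single-point-of-failure protection that clustering is meant to provide.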

94
4.13.2 TELECOMMUNICATION NETWORKS
RESILIENCY AND DISASTER RECOVERY METHODS

• Redundancy
• Alternative routing
• Diverse routing
• Long-haul network diversity
• Last-mile circuit protection
• Voice recovery

95
Discussion Question

When reviewing the configuration of network devices, an IS
auditor should FIRST identify:

A. the good practices for the type of network devices
deployed.
B. whether components of the network are missing.
C. the importance of the network devices in the
topology.
D. whether subcomponents of the network are being
used appropriately.

C is the correct answer.

Justification:
A. After understanding the devices in the network, a good practice for using the
device should be reviewed to ensure that there are no anomalies within the
configuration.
B. Identification of which component is missing can only be known upon reviewing
and understanding the topology and a good practice for deployment of the device
in the network.
C. The first step is to understand the importance and role of the network device
within the organization’s network topology.
D. Identification of which subcomponent is being used inappropriately can only be
known upon reviewing and understanding the topology and a good practice for
deployment of the device in the network.

96
Discussion Question

An IS auditor is evaluating network performance for an
organization that is considering increasing its Internet
bandwidth due to a performance degradation during business
hours. Which of the following is MOST likely the cause of the
performance degradation?

A. Malware on servers
B. Firewall misconfiguration
C. Increased spam received by the email server
D. Unauthorized network activities

D is the correct answer.

Justification:
A. The existence of malware on the organization’s server could contribute to
network performance issues, but the degraded performance would not likely be
restricted to business hours.
B. Firewall misconfiguration could contribute to network performance issues, but
the degraded performance would not likely be restricted to business hours.
C. The existence of spam on the organization’s email server could contribute to
network performance issues, but the degraded performance would not likely be
restricted to business hours.
D. Unauthorized network activities—such as employee use of file or music sharing
sites or online gambling or personal email containing large files or photos—
could contribute to network performance issues. Because the IS auditor found
the degraded performance during business hours, this is the most likely cause.

97
4.14.1 DATA STORAGE RESILIENCY AND DISASTER
RECOVERY METHODS
Redundant Array of Independent (or Inexpensive)
Disks (RAID) is the most common, basic way to
protect data against a single point of failure, in this
instance, a disk failure.

98
4.14.2 BACKUP AND RESTORATION

To ensure that the critical activities of an organization
(and supporting applications) are not interrupted in
the event of a disaster, secondary storage media are
used to store software application files and
associated data for backup purposes.

These secondary storage media are removable media
(tape cartridges, CDs, DVDs) or mirrored disks (local
or remote) or network storage.

99
4.14.2 BACKUP AND RESTORATION

When disaster strikes, the offsite storage library
often becomes the only remaining copy of the
organization’s data.

To ensure that these data are not lost, it is very
important to implement strict controls over the
data—both physical and logical.

100
4.14.2 BACKUP AND RESTORATION

• Secure physical access to library contents, accessible only to
authorized persons
• Encryption of backup media, especially during transit
• Ensuring that the physical construction can withstand heat,
fire and water
• Location of the library away from the data center and disasters
that may strike both together
• Maintenance of an inventory of all storage media and files for
specified retention periods
• Maintenance of library records for specified retention periods
• Maintenance and protection of a catalog of information
regarding data files

101
4.14.3 BACKUP SCHEMES

102
4.15 BUSINESS CONTINUITY PLAN

The purpose of business continuity/disaster recovery
is to enable a business to continue offering critical
services in the event of a disruption and to survive a
disastrous interruption to activities.

The BCP/DRP should be supported by a formal
executive policy that states the organization’s overall
target for recovery and empowers those people
involved in developing, testing and maintaining the
plans.

103
4.15 BUSINESS CONTINUITY PLAN

BCP is primarily the responsibility of senior
management, as they are entrusted with
safeguarding the assets and the viability of the
organization, as defined in the BCP/DRP policy.

In addition to the plan for the continuity of
operations, the BCP includes:
• The DRP that is used to recover a facility rendered
inoperable, including relocating operations into a new
location
• The restoration plan that is used to return operations to
normality whether in a restored or new facility

104
4.15.1 IT BUSINESS CONTINUITY PLANNING

The results of risk assessment and BIA are fed into the IS
business continuity strategy.

A BCP identifies what the business will do in the event
of a disaster. For example, where will employees report
to work, how will orders be taken while the computer
system is being restored, which vendors should be called
to provide needed supplies?

A subcomponent of the BCP is the IT DRP. This typically
details the process that IT personnel will use to restore
the computer systems, communications, applications and
their data.

105
4.15.2 DISASTERS AND OTHER DISRUPTIVE
EVENTS
Disasters are disruptions that cause critical
information resources to be inoperative for a period
of time, adversely impacting organizational
operations.

Types of disasters include natural disasters,
technical/technological disasters and man-made disasters.

106
Dealing With Damage to Image, Reputation or
Brand

Consequences of damaging
rumors may be devastating. One
of the worst consequences of
crises is the loss of trust.

A properly trained spokesperson
should be appointed and
prepared beforehand. Normally,
senior legal counsel or a PR
officer is the best choice.

No one, irrespective of his/her
rank in the organizational
hierarchy, except for the
spokesperson, should make any
public statement.

107
Business Continuity Planning Life Cycle

108
4.15.4 BUSINESS CONTINUITY POLICY

109
4.15.5 BUSINESS CONTINUITY PLANNING INCIDENT
MANAGEMENT

A classification system could include the following
categories:
• Negligible incidents are those causing no perceptible or
significant damage, such as very brief OS crashes with full
information recovery or momentary power outages with UPS
backup.
• Minor incidents are those that, while not negligible, produce
no negative material (of relative importance) or financial
impact.
• Major incidents cause a negative material impact on business
processes and may affect other systems, departments or
even outside clients.
• A crisis is a major incident that can have serious material (of
relative importance) impact on the continued functioning of
the business and may also adversely impact other systems or
third parties.
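
One way to read the four categories is as an ordered scale, checked from the most severe downward. The Python sketch below is an interpretation for illustration only; the three yes/no inputs and all names are assumptions, and a real classification scheme would use criteria defined by the organization.

```python
from enum import Enum


class IncidentClass(Enum):
    NEGLIGIBLE = 1
    MINOR = 2
    MAJOR = 3
    CRISIS = 4


def classify_incident(perceptible_damage: bool,
                      material_impact: bool,
                      threatens_business_continuity: bool) -> IncidentClass:
    """Map three assumed yes/no findings onto the four categories."""
    if threatens_business_continuity:
        # Serious material impact on the continued functioning of the business.
        return IncidentClass.CRISIS
    if material_impact:
        # Negative material impact on business processes.
        return IncidentClass.MAJOR
    if perceptible_damage:
        # Not negligible, but no material or financial impact.
        return IncidentClass.MINOR
    return IncidentClass.NEGLIGIBLE


# A momentary power outage covered by UPS backup: no perceptible damage.
ups_blip = classify_incident(False, False, False)
```

Checking the most severe condition first means an incident is never under-classified when several conditions hold at once.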

110
4.15.6 DEVELOPMENT OF BUSINESS CONTINUITY
PLANS
The various factors that should be considered while
developing/reviewing the plan are:
• Predisaster readiness covering incident response
management to address all relevant incidents affecting
business processes
• Evacuation procedures
• Procedures for declaring a disaster (rating and escalation
procedures)
• Circumstances under which a disaster should be declared
• Responsible parties
• Contract information
• The step-by-step explanation of the recovery process
• The clear identification of the various resources required for
recovery and continued operation of the organization

111
4.15.8 COMPONENTS OF A BUSINESS CONTINUITY
PLAN

The BCP should include:

• Continuity of operations plan
• Disaster recovery plan
• Business resumption plan

It may also include:
• IT contingency plan
• Crisis communications plan
• Incident response plan
• Transportation plan
• Occupant emergency plan
• Evacuation plan
• Emergency relocation plan

Instructor Note: One example of the components of a BCP is suggested by NIST
Special Publication 800-34 Revision 1: Contingency Planning Guide for Federal
Information Systems. Figure 2.20 in the CISA Review Manual 26th Edition illustrates
these components.

Source: ISACA, CISA Review Manual 26th Edition, 2.12.9 Components of a Business
Continuity Plan

112
Components of a BCP

113
Insurance

Specific types of coverage available are:


• IT equipment and facilities—Provides coverage for
physical damage to the IPF and owned equipment.
• Media (software) reconstruction—Covers damage to IT
media that is the property of the insured and for which
the insured may be liable. Considerations in determining
the amount of coverage needed are programming costs
to reproduce the media damaged; backup expenses; and
physical replacement of media devices, such as tapes,
cartridges and disks.
• Extra expense—Designed to cover the extra costs of
continuing operations following damage or destruction
at the IPF.
• Media transportation—Provides coverage for potential
loss or damage to media in transit to off-premises IPFs.

114
Insurance

• Business interruption—Covers the loss of profit due to
the disruption of the activity of the company caused by
any malfunction of the IT organization
• Valuable papers and records—Covers the actual cash
value of papers and records (not defined as media) on
the insured’s premises against direct physical loss or
damage
• Errors and omissions—Provides legal liability protection
if the professional practitioner commits an act, error or
omission that results in financial loss to a client.
• Fidelity coverage—Usually takes the form of bankers
blanket bonds, excess fidelity insurance and commercial
blanket bonds and covers loss from dishonest or
fraudulent acts by employees.

115
4.15.9 PLAN TESTING

The test should strive to accomplish the following tasks:


• Verify the completeness and precision of the BCP.
• Evaluate the performance of the personnel involved in the
exercise.
• Appraise the training and awareness of employees who are
not members of a BC team.
• Evaluate the coordination among the BC team and external
vendors and suppliers.
• Measure the ability and capacity of the backup site to
perform prescribed processing.
• Assess the vital records retrieval capability.
• Evaluate the state and quantity of equipment and supplies
that have been relocated to the recovery site.
• Measure the overall performance of operational and IT
processing activities related to maintaining the business
entity.

116
4.15.9 PLAN TESTING

To perform testing, each of the following test phases should be
completed:
• Pretest
• Test
• Posttest

The following types of tests may be performed:
• Desk-based evaluation/paper test
• Preparedness test
• Full operational test—This is one step away from an actual
service disruption.

117
4.15.9 PLAN TESTING

The following factors, and others, may impact business
continuity requirements and the need for the plan to be
updated:
• A strategy that is appropriate at one point in time may not be
adequate as the needs of the organization change (business
processes, new departments, changes in key personnel).
• New resources/applications may be developed or acquired.
• Changes in business strategy may alter the significance of
critical applications or deem additional applications as critical.
• Changes in the software or hardware environment may make
current provisions obsolete or inappropriate.
• New events or a change in the likelihood of events may cause
disruption.
• Changes are made to key personnel or their contact details.

118
Business Continuity Management Good Practices

Relevant entities, practices, regulations and standards
include:
• Business Continuity Institute (BCI)—Provides good practices
for business continuity management
• Disaster Recovery Institute International (DRII)—Provides
professional practices for business continuity professionals
• US Federal Emergency Management Agency (FEMA)—
Provides business and industry guidance for emergency
management
• ISACA—The COBIT framework provides guidance on IT
controls that are relevant to the business.
• US National Institute of Standards and Technology (NIST)
• US Federal Financial Institutions Examination Council (FFIEC)
• US Health and Human Services (HHS)—The Health Insurance
Portability and Accountability Act (HIPAA) describes the
requirements for managing health information.
• ISO 22301:2012: Societal security—Business continuity
management systems—Requirements

119
4.15.10 SUMMARY OF BUSINESS CONTINUITY

The process of developing and maintaining an
appropriate DRP/BCP follows:
• Conduct a risk assessment.
• Identify and prioritize the systems and other resources
required to support critical business processes in the event of
a disruption.
• Identify and prioritize threats and vulnerabilities.
• Prepare BIA of the effect of the loss of critical business
processes and their supporting components.
• Choose appropriate controls and measures for recovering IT
components to support the critical business processes.
• Develop the detailed plan for recovering IS facilities (DRP).
• Develop a detailed plan for the critical business functions to
continue to operate at an acceptable level (BCP).
• Test the plans.
• Maintain the plans as the business changes and systems
develop.

120
4.15.11 AUDITING BUSINESS CONTINUITY

The IS auditor’s tasks include:


• Understand and evaluate business continuity strategy and its
connection to business objectives.
• Review the BIA findings to ensure that they reflect current
business priorities and current controls.
• Evaluate the BCPs to determine their adequacy and currency,
by reviewing the plans and comparing them to appropriate
standards and/or government regulations including the RTO,
RPO, etc., defined by the BIA.
• Verify that the BCPs are effective, by reviewing the results
from previous tests performed by IT and end-user personnel.
• Evaluate cloud-based mechanisms.
• Evaluate offsite storage to ensure its adequacy, by inspecting
the facility and reviewing its contents and security and
environmental controls.

121
4.15.11 AUDITING BUSINESS CONTINUITY

The IS auditor’s tasks include:


• Verify the arrangements for transporting backup media to
ensure that they meet the appropriate security requirements.
• Evaluate the ability of personnel to respond effectively in
emergency situations, by reviewing emergency procedures,
employee training and results of their tests and drills.
• Ensure that the process of maintaining plans is in place and
effective and covers both periodic and unscheduled revisions.
• Evaluate whether the business continuity manuals and
procedures are written in a simple and easy to understand
manner. This can be achieved through interviews and
determining whether all the stakeholders understand their
roles and responsibilities with respect to business continuity
strategies.

122
Reviewing the Business Continuity Plan

123
Reviewing the Business Continuity Plan

• Evaluate prior test results
• Evaluate key personnel through interviews
• Evaluate offsite storage facilities, including security controls
• Evaluate the security at the offsite facility
• Evaluate the alternative processing contract
• Evaluate insurance coverage

124
4.16.1 RECOVERY POINT OBJECTIVE AND RECOVERY
TIME OBJECTIVE

The RPO is determined based on the acceptable
data loss in case of disruption of operations.

It indicates the earliest point in time in which it is
acceptable to recover the data.

For example, if the process can afford to lose up to four
hours of data before a disaster, then the latest available
backup should be no more than four hours older than the
disaster or interruption. The transactions that occurred
between that backup and the interruption need to be
re-entered after recovery (known as catch-up data).
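
A four-hour RPO can be checked against a backup schedule with a few lines of Python. This is a minimal sketch under stated assumptions: the function names and the timestamps are hypothetical and are not part of the source.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, disruption: datetime, rpo: timedelta) -> bool:
    """True if the newest backup is no older than the RPO at the time of disruption."""
    return disruption - last_backup <= rpo


def catch_up_window(last_backup: datetime, disruption: datetime) -> timedelta:
    """Span of transactions that must be re-entered after recovery (catch-up data)."""
    return disruption - last_backup


RPO = timedelta(hours=4)
disruption_time = datetime(2024, 1, 1, 12, 0)  # hypothetical disruption at noon

# A backup taken at 09:00 is three hours old and satisfies the RPO.
within_rpo = meets_rpo(datetime(2024, 1, 1, 9, 0), disruption_time, RPO)

# A backup taken at 07:00 is five hours old and does not.
outside_rpo = meets_rpo(datetime(2024, 1, 1, 7, 0), disruption_time, RPO)
```

The same comparison shows why the RPO drives backup frequency: if backups run less often than the RPO allows, some disruption times will always fall outside it.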

125
4.16.1 RECOVERY POINT OBJECTIVE AND RECOVERY
TIME OBJECTIVE

It is almost impossible to recover the data
completely. Even after entering incremental data,
some data are still lost and are referred to as orphan
data.

The RTO is determined based on the acceptable
downtime in case of a disruption of operations.

It indicates the earliest point in time at which the
business operations (and supporting IT systems)
must resume after disaster.

126
4.16.1 RECOVERY POINT OBJECTIVE AND RECOVERY
TIME OBJECTIVE

Both RPO and RTO are based on time parameters.
The nearer the time requirements are to the center
(the point of interruption), the more costly the
recovery strategy.

127
4.16.1 RECOVERY POINT OBJECTIVE AND RECOVERY
TIME OBJECTIVE

Disaster tolerance is the time gap within which the
business can accept the unavailability of critical IT
services; therefore, the lower the RTO, the lower the
disaster tolerance.

RTO affects the technology used to make
applications/IT systems available: what to use for
recovery (e.g., warm site, hot site, clusters). RPO
usually affects data protection solutions (backup and
recovery, synchronous or asynchronous data
replication).

128
4.16.1 RECOVERY POINT OBJECTIVE AND RECOVERY
TIME OBJECTIVE

In addition to RTO and RPO, there are some
additional parameters that are important in defining
the recovery strategies:
• Interruption window—The maximum period of time the
organization can wait from the point of failure to the
critical services/applications restoration.
• Service delivery objective (SDO)—Level of services to
be reached during the alternate process mode until the
normal situation is restored. This is directly related to
the business needs.
• Maximum tolerable outages (MTOs)—Maximum time
the organization can support processing in alternate
mode.

129
4.16.2 RECOVERY STRATEGIES

Recovery strategies based on the risk level identified
for recovery include developing:
• Hot sites
• Warm sites
• Cold sites
• Duplicate information processing facilities
• Mobile sites
• Reciprocal arrangements with other organizations

130
4.16.3 RECOVERY ALTERNATIVES

Hot sites
• A facility with all of the IT and communications
equipment required to support critical
applications, along with office accommodations
for personnel.

131
4.16.3 RECOVERY ALTERNATIVES

Warm sites
• A complete infrastructure, partially configured for IT,
usually with network connections and essential
peripheral equipment. Current versions of programs
and data would likely need to be installed before
operations could resume at the recovery site.
Cold sites
• A facility with the space and basic infrastructure to
support the resumption of operation but lacking any
IT or communications equipment, programs, data or
office support.

132
4.16.3 RECOVERY ALTERNATIVES

Mirrored sites
• A fully redundant site with real-time data
replication from the production site.
Mobile sites
• Modular processing facilities mounted on
transportable vehicles, ready to be
delivered and set up on an as-needed
basis.

133
4.16.3 RECOVERY ALTERNATIVES

Reciprocal arrangements
• Agreements between separate, but similar, companies
to temporarily share their IT facilities in the event that
a partner to the agreement loses processing capability.
Reciprocal arrangements with other organizations
• Agreements between two or more organizations with
unique equipment or applications. Participants promise
to assist each other during an emergency.

134
4.16.4 DEVELOPMENT OF DISASTER RECOVERY
PLANS
Typically, the IT DRP contains:
• Procedures for declaring a disaster (escalation
procedures)
• Criteria for plan activation (i.e., in which circumstances
the disaster is declared, when the IT DRP is put to
action, which scenarios are covered by the plan [loss of
the IT system, loss of the processing site, loss of the
office])
• Linkage with the overarching plans (for instance,
emergency response plan or crisis management plan or
BCPs for different lines of business)
• The person (or people) responsible for each function in
plan execution
• Recovery teams and their responsibilities

135
4.16.4 DEVELOPMENT OF DISASTER RECOVERY
PLANS
• Contact and notification lists (contact information for
recovery teams, recovery managers, stakeholders, etc.)
• The step-by-step explanation of the whole recovery
process (i.e., where and when the recovery should take
place [the same site or backup site], what has to be
recovered [IT systems, networks, etc.], the order of
recovery)
• Recovery procedures (for each IT system or component).
Note: the level of detail here greatly varies and depends
on the practices used in the organization.
• Contacts for important vendors and suppliers
• The clear identification of the various resources required
for recovery and continued operation of the
organization

136
The Recovery/Continuity/Response Teams

• Incident response team
• Emergency action team
• Information security team
• Damage assessment team
• Emergency management team
• Offsite storage team
• Software team
• Applications team
• Emergency operations team
• Network recovery team
• Communications team
• Transportation team
• User hardware team
• Data preparation and records team
• Administrative support team
• Supplies team
• Salvage team
• Relocation team
• Coordination team
• Legal affairs team
• Recovery test team
• Training team

137
4.16.5 DISASTER RECOVERY TESTING METHODS

In summary, testing should include the following steps:


• Develop test objectives.
• Execute the test.
• Evaluate the test.
• Develop recommendations to improve the effectiveness
of testing processes and recovery plans.
• Implement a follow-up process to ensure that the
recommendations are implemented.

138
Types of Tests

The types of disaster recovery tests include:


• Checklist review—This is a preliminary step to a real
test. Recovery checklists are distributed to all members
of a recovery team to review and ensure that the
checklist is current.
• Structured walk-through—Team members physically
implement the plans on paper and review each step to
assess its effectiveness, identify enhancements,
constraints and deficiencies.

139
Types of Tests

The types of disaster recovery tests include:


• Simulation test—The recovery team role-plays a
prepared disaster scenario without activating processing
at the recovery site.
• Parallel test—The recovery site is brought to a state of
operational readiness, but operations at the primary site
continue normally.
• Full interruption test—Operations are shut down at the
primary site and shifted to the recovery site in
accordance with the recovery plan; this is the most
rigorous form of testing but is expensive and potentially
disruptive.
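
The test types above can be modeled as an ordered enumeration, which makes the progression from paper-based review to full interruption explicit. This is only a sketch: the enum `DRTest`, its numeric values and the two helper functions are assumptions introduced for illustration, with the ordering taken from the sequence in which the tests are listed.

```python
from enum import IntEnum


class DRTest(IntEnum):
    """Disaster recovery test types, ordered as listed (illustrative values)."""
    CHECKLIST_REVIEW = 1
    STRUCTURED_WALKTHROUGH = 2
    SIMULATION = 3
    PARALLEL = 4
    FULL_INTERRUPTION = 5


def activates_recovery_site(test: DRTest) -> bool:
    """Parallel and full interruption tests bring the recovery site online."""
    return test >= DRTest.PARALLEL


def interrupts_primary_site(test: DRTest) -> bool:
    """Only the full interruption test shuts down the primary site."""
    return test is DRTest.FULL_INTERRUPTION


for test in DRTest:
    print(test.name, activates_recovery_site(test), interrupts_primary_site(test))
```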

140
Progression of DR Tests

Test Results
• Time
• Data
• Amount
• Percentage and/or number
• Accuracy

141
4.16.6 INVOKING DISASTER RECOVERY PLANS

The required teams (discussed earlier in this section)
should then be mobilized and the incident
evaluated to confirm which of the tested scenarios it
most closely resembles. Examples include:
• Loss of network connectivity
• Loss of a key IT system
• Loss of the processing site (server room)
• Loss of critical data
• Loss of an office, etc.
• Loss of key service provider (e.g., cloud)

142
Discussion Question

Which of the following specifically addresses how to detect
cyberattacks against an organization’s IT systems and how to
recover from an attack?

A. An incident response plan (IRP)
B. An IT contingency plan
C. A business continuity plan (BCP)
D. A continuity of operations plan (COOP)

A is the correct answer.

Justification:
A. The incident response plan (IRP) determines the information security responses
to incidents such as cyberattacks on systems and/or networks. This plan
establishes procedures to enable security personnel to identify, mitigate and
recover from malicious computer incidents such as unauthorized access to a
system or data, denial-of-service (DoS) or unauthorized changes to system
hardware or software.
B. The IT contingency plan addresses IT system disruptions and establishes
procedures for recovering from a major application or general support system
failure. The contingency plan deals with ways to recover from an unexpected
failure, but it does not address the identification or prevention of cyberattacks.
C. The business continuity plan (BCP) addresses business processes and provides
procedures for sustaining essential business operations while recovering from a
significant disruption. While a cyberattack could be severe enough to require use
of the BCP, the IRP would be used to determine which actions should be taken—
both to stop the attack as well as to resume normal operations after the attack.
D. The continuity of operations plan (COOP) addresses the subset of an
organization’s missions that are deemed most critical and contains procedures to
sustain these functions at an alternate site for a short time period.

143
Discussion Question

The PRIMARY objective of performing a postincident review is
that it presents an opportunity to:

A. improve internal control procedures.
B. harden the network to industry good practices.
C. highlight the importance of incident response
management to management.
D. improve employee awareness of the incident response
process.

A is the correct answer.

Justification:
A. A postincident review examines both the cause of and the response to an
incident. The lessons learned from the review can be used to improve internal
controls.
Understanding the purpose and structure of postincident reviews and follow-up
procedures enables the information security manager to continuously improve
the security program. Improving the incident response plan based on the
incident review is an internal (corrective) control.
B. A postincident review may result in improvements to controls, but its primary
purpose is not to harden a network.
C. The purpose of postincident review is to ensure that the opportunity is presented
to learn lessons from the incident. It is not intended as a forum to educate
management.
D. An incident may be used to emphasize the importance of incident response, but
that is not the intention of the postincident review.

144
Discussion Question

During an IS audit of the disaster recovery plan (DRP) of a global
enterprise, the IS auditor observes that some remote offices have
very limited local IT resources. Which of the following observations
would be the MOST critical for the IS auditor?

A. A test has not been made to ensure that local resources
could maintain security and service standards when
recovering from a disaster or incident.
B. The corporate business continuity plan (BCP) does not
accurately document the systems that exist at remote
offices.
C. Corporate security measures have not been incorporated
into the test plan.
D. A test has not been made to ensure that tape backups from
the remote offices are usable.

A is the correct answer.

Justification:
A. Regardless of the capability of local IT resources, the most critical risk would be
the lack of testing, which would identify quality issues in the recovery process.
B. The corporate business continuity plan (BCP) may not include disaster recovery
plan (DRP) details for remote offices. It is important to ensure that the local plans
have been tested.
C. Security is an important issue because many controls may be missing during a
disaster. However, not having a tested plan is more important.
D. The backups cannot be trusted until they have been tested. However, this should
be done as part of the overall tests of the DRP.

145
Discussion Question

Which of the following is the BEST indicator of the
effectiveness of backup and restore procedures while
restoring data after a disaster?

A. Members of the recovery team were available.
B. Recovery time objectives (RTOs) were met.
C. Inventory of backup tapes was properly maintained.
D. Backup tapes were completely restored at an alternate
site.

B is the correct answer.

Justification:
A. The availability of key personnel does not ensure that backup and restore
procedures will work effectively.
B. The effectiveness of backup and restore procedures is best ensured by recovery
time objectives (RTOs) being met because these are the requirements that are
critically defined during the business impact analysis stage, with the inputs and
involvement of all business process owners.
C. The inventory of the backup tapes is only one element of the successful recovery.
D. The restoration of backup tapes is a critical success, but only if they were able to
be restored within the time frames set by the RTO.

146
Discussion Question

An IS auditor is reviewing the most recent disaster recovery
plan (DRP) of an organization. Which approval is the MOST
important when determining the availability of system
resources required for the plan?

A. Executive management
B. IT management
C. Board of directors
D. Steering committee

B is the correct answer.

Justification:
A. Although executive management’s approval is essential, the IT department is
responsible for managing system resources and their availability as related to
disaster recovery (DR).
B. Because a disaster recovery plan (DRP) is based on the recovery and
provisioning of IT services, IT management’s approval would be most important
to verify that the system resources will be available if a disaster is
declared.
C. The board of directors may review and approve the DRP, but the IT department is
responsible for managing system resources and their availability as related to DR.
D. The steering committee would determine the requirements for disaster recovery
(recovery time objective [RTO] and recovery point objective [RPO]); however, the
IT department is responsible for managing system resources and their availability
as related to DR.

147
Discussion Question

Which of the following is the MOST efficient way to test the
design effectiveness of a change control process?

A. Test a sample population of change requests
B. Test a sample of authorized changes
C. Interview personnel in charge of the change control
process
D. Perform an end-to-end walk-through of the process

D is the correct answer.

Justification:
A. Testing a sample population of changes is a test of compliance and operating
effectiveness to ensure that users submitted the proper documentation/requests.
It does not test the effectiveness of the design.
B. Testing changes that have been authorized may not provide sufficient assurance
of the entire process because it does not test the elements of the process related
to authorization or detect changes that bypassed the controls.
C. Interviewing personnel in charge of the change control process is not as effective
as a walk-through of the change control process because people may know the
process but not follow it.
D. Observation is the best and most effective method to test changes to ensure
that the process is effectively designed.

148
Discussion Question

Which of the following is the GREATEST risk of an organization
using reciprocal agreements for disaster recovery between
two business units?

A. The documents contain legal deficiencies.
B. Both entities are vulnerable to the same incident.
C. IT systems are not identical.
D. One party has more frequent disruptions than the
other.

B is the correct answer.

Justification:
A. Inadequate agreements between two business units are a risk, but generally a
lesser one than the risk that both organizations will suffer a disaster at the same
time.
B. The use of reciprocal disaster recovery is based on the probability that both
organizations will not suffer a disaster at the same time.
C. While incompatible IT systems could create problems, it is a less significant risk
than both organizations suffering from the same disaster at the same time.
D. While one party may utilize the other’s resources more frequently, this can be
addressed by contractual provisions and is not a major risk.

149
Discussion Question

During a review of a business continuity plan, an IS auditor
noticed that the point at which a situation is declared to be a
crisis has not been defined. The MAJOR risk associated with
this is that:

A. assessment of the situation may be delayed.
B. execution of the disaster recovery plan could be
impacted.
C. notification of the teams might not occur.
D. potential crisis recognition might be delayed.

B is the correct answer.

Justification:
A. Problem and severity assessment would provide information necessary in
declaring a disaster, but the lack of a crisis declaration point would not delay the
assessment.
B. Execution of the business continuity and disaster recovery plans would be
impacted if the organization does not know when to declare a crisis.
C. After a potential crisis is recognized, the teams responsible for crisis management
need to be notified. Delaying the declaration of a disaster would impact or negate
the effect of having response teams, but this is only one part of the larger impact.
D. Potential crisis recognition is the first step in recognizing or responding to a
disaster and would occur prior to the declaration of a disaster.

150
Discussion Question

An IS auditor is reviewing an organization’s recovery from a
disaster in which not all the critical data needed to resume
business operations were retained. Which of the following
was incorrectly defined?

A. The interruption window
B. The recovery time objective (RTO)
C. The service delivery objective (SDO)
D. The recovery point objective (RPO)

D is the correct answer.

Justification:
A. The interruption window is defined as the amount of time during which the
organization is unable to maintain operations from the point of failure to the time
that the critical services/applications are restored.
B. The recovery time objective (RTO) is determined based on the acceptable
downtime in the case of a disruption of operations.
C. The service delivery objective (SDO) is directly related to the business needs. SDO
is the level of services to be reached during the alternate process mode until the
normal situation is restored.
D. The recovery point objective (RPO) is determined based on the acceptable data
loss in the case of a disruption of operations. RPO defines the point in time
from which it is necessary to recover the data and quantifies, in terms of time,
the permissible amount of data loss in the case of interruption.

151
Discussion Question

When auditing the IT governance framework and IT risk
management practices that exist within an organization, the IS
auditor identified some undefined responsibilities regarding IT
management and governance roles. Which of the following
recommendations is the MOST appropriate?

A. Review the strategic alignment of IT with the business.
B. Implement accountability rules within the organization.
C. Ensure that independent IS audits are conducted
periodically.
D. Create a chief risk officer (CRO) role in the organization.

B is the correct answer.

Justification:
A. While the strategic alignment of IT with the business is important, it is not directly
related to the gap identified in this scenario.
B. IT risk is managed by embedding accountability into the enterprise. The IS
auditor should recommend the implementation of accountability rules to
ensure that all responsibilities are defined within the organization. Note that
this question asks for the best recommendation—not about the finding itself.
C. Performing more frequent IS audits is not helpful if the accountability rules are
not clearly defined and implemented.
D. Recommending the creation of a new role (CRO) is not helpful if the
accountability rules are not clearly defined and implemented.

152
Discussion Question

To optimize an organization’s BCP, an IS auditor should
recommend a BIA to determine:

A. the business processes that generate the most
financial value for the organization and, therefore,
must be recovered first
B. the priorities and order for recovery to ensure
alignment with the organization’s business
strategy
C. the business processes that must be recovered
following a disaster to ensure the organization’s
survival
D. the priorities and order of recovery, which will
recover the greatest number of systems in the
shortest time frame

C is the correct answer.

Justification:
A. It is a common mistake to overemphasize financial value rather than urgency.
For example, while the processing of incoming mortgage loan payments is
important from a financial perspective, it could be delayed for a few days in
the event of a disaster. On the other hand, wiring funds to close on a loan,
while not generating direct revenue, is far more critical because of the
possibility of regulatory problems, customer complaints and reputation
issues.
B. The business strategy (which is often a long-term view) does not have a
direct impact at this point in time.
C. To ensure the organization’s survival following a disaster, it is important to
recover the most critical business processes first.
D. The mere number of recovered systems does not have a direct impact at this
point in time. The importance is to recover systems that would impact
business survival.

153
