
Auditing Data Centers and

Disaster Recovery
Dr. Mazen Ali
Data Center
• Is a facility to
house an
organization’s
critical systems;
– Hardware
– Operating systems
– Applications
Major Data Center risks
• Natural threats such as weather events,
flooding, earthquakes and fire
• Manmade threats such as terrorist incidents,
riots, theft and sabotage
• Loss of utilities such as electrical power and
telecommunication
Physical Security and Environmental
Controls
• Facility Access Control
Systems
– Authenticate workers prior
to providing physical entry
to facilities
– Something you Know?
– Something you Have?
– Something you Are?
Physical Security and Environmental
Controls
• Alarm Systems
– Burglar alarms (Magnetic
door, windows, motion
sensors)
– Fire alarms
– Water alarms
– Humidity alarms
– Power fluctuations alarms
– Gas alarms
System and Site Resiliency
• Power
– Clean power critical to maintain computer operations.
– Power fluctuations can damage computer
components
– To mitigate risk, data centers provide power
redundancy including the following:
• Redundant power
• Ground to earth
• Power conditioning
• Battery backups
• Generators
System and Site Resiliency
• Heating, Ventilation and Air-conditioning (HVAC)
– Extreme temperature and humidity conditions can
damage systems
• Network Connectivity
• Data Center Operations
– Operated by staff
– Governed by policies, plans and procedures such as
physical access control, systems and facility
monitoring, equipment tracking and maintenance
Physical Access Issues

• Physical access exposures may originate from natural and man-made hazards, and can result in
unauthorized access and interruptions in information availability.
• Exposures include:
Physical Access Controls
Environmental Exposures

• Environmental exposures are due primarily to naturally occurring events.


• Common environmental exposures include:
Environmental Controls

• Environmental exposures should be afforded the same level of protection as other types of
exposures. Possible controls include:
Test Steps for Auditing Data Centers
• Neighbourhood and external risk factors
• Physical access controls
• Environmental controls
• Power and electricity
• Fire suppression
• Data Center operations
• System resiliency
• Data backup and restore
• Disaster recovery planning
Neighbourhood and external risk
factors
1. Review data center exterior lighting, building orientation, signage, fences
and neighborhood characteristics to identify facility related risks.
– Data center facilities should provide a physical secure environment for
personnel and information systems
How
– Physical inspection of the data center:
• Are barriers in place?
• Do controls exist to reduce the risk of car accidents (or car bombs)?
• Where is the data center located in the building (ground level or below ground level)?
– Signage
• Data centers should be anonymous
• Look for signage that identifies the location of the data center
– Neighbourhood
– Exterior lighting (lighting deters crime)
– Fences
2. Research the data center location for environmental
hazards and determine the distance to
emergency services
• Proximity of fire stations, police stations and
hospitals from data center
How
– Look for information on the following:
• Local crime rate
– If high, you could recommend CCTV
• Proximity to industrial areas
– Industrial areas have higher crime rates and a risk of chemical spills
• Proximity to emergency services
• Proximity to transportation-related hazards
– Planes do crash and trains do derail
Physical Access Controls
• Exterior doors and walls
• Access control procedures
• Physical authentication mechanisms
• Security guards
• Other mechanisms and procedures used to
secure sensitive areas
3. Review data center doors and walls to
determine whether they protect the facility
adequately
How
•Identify all potential entry points into the facility
•Doors
– Ensure that doors are force-resistant (magnetic
locks)
– Man traps (two locking doors with a corridor)
•Windows
– Identify any windows looking into the data center and
ensure they are constructed with shatterproof glass
– Windows should be avoided; where they exist, make sure
they have blinds or curtains.
4. Ensure that physical access control
procedures are comprehensive and being
followed by data center and security staff
• If physical access procedures are not enforced, physical access will be
compromised
How
• Ensure that access authorization requirements are documented and
defined for both staff and guests
• Verify that guest access procedures include restrictions on taking pictures
and outline conduct requirements within the data center.
• Visitors should sign a log, wear a distinct badge and be escorted at all
times.
• Review a sample of both guest access and employee ID authorization
requests to ensure access control procedures are followed
• Obtain a list of all individuals who have access to the data center, select a
representative sample of employees with access and determine whether
access is appropriate.
• Review evidence that management reviews physical access authorisation
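The sampling step above can be sketched in a few lines of Python. This is only an illustration: the access list, the role names and the `approved_roles` set are hypothetical, and a real audit would export the list from the access control system and agree the approval criteria with management.

```python
import random

def sample_access_list(access_list, sample_size, seed=0):
    """Pick a reproducible random sample of people with data center access."""
    rng = random.Random(seed)
    size = min(sample_size, len(access_list))
    return rng.sample(access_list, size)

def flag_inappropriate(sample, approved_roles):
    """Return sampled entries whose job role is not on the approved list."""
    return [p for p in sample if p["role"] not in approved_roles]

# Hypothetical access list (assumption, not from the slides).
access_list = [
    {"name": "A. Hassan", "role": "dc_operator"},
    {"name": "B. Khan", "role": "facilities"},
    {"name": "C. Lee", "role": "marketing"},
    {"name": "D. Omar", "role": "security_guard"},
]
approved_roles = {"dc_operator", "facilities", "security_guard"}

sample = sample_access_list(access_list, sample_size=4)
exceptions = flag_inappropriate(sample, approved_roles)
print([p["name"] for p in exceptions])
```

A fixed seed keeps the sample reproducible so that another auditor can re-perform the selection.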
5. Evaluate Physical authentication devices to
determine whether they are appropriate and
are working properly
• Misuse of devices such as card-key readers,
proximity badges, biometric devices and so on
allows unauthorised personnel access to the data
center
How:
Make sure authentication devices at data center entry points have the following properties:
– Restricts access based on individual needs or restricts
access to particular doors and hours
– Easily deactivated when an employee is terminated or
changes jobs, or when a key, card or badge is lost
– Difficult to duplicate or steal credentials
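One way to test the deactivation property is to cross-check the badge system against HR termination records. The sketch below assumes two hypothetical exports (a badge list and a set of terminated employee IDs); the field names are illustrative only.

```python
def active_badges_for_terminated(badges, terminated_ids):
    """Return badge IDs that are still active but belong to terminated staff."""
    return sorted(
        b["badge_id"]
        for b in badges
        if b["active"] and b["employee_id"] in terminated_ids
    )

# Hypothetical exports (assumptions, not from the slides).
badges = [
    {"badge_id": "B-100", "employee_id": "E1", "active": True},
    {"badge_id": "B-101", "employee_id": "E2", "active": True},
    {"badge_id": "B-102", "employee_id": "E3", "active": False},
]
terminated_ids = {"E2", "E3"}

stale_badges = active_badges_for_terminated(badges, terminated_ids)
print(stale_badges)
```

Any badge returned here is an exception to follow up with the data center manager.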
6. Ensure that burglar alarms and
surveillance systems are protecting the
data center from physical intrusion
How
Verify that critical areas of data centers are covered by intrusion
sensors, CCTV, audio surveillance systems or combination.
– Motion sensors detect infrared motion
– Contact sensors that are placed on windows and doors to detect
when they are opened or broken
– Audio sensors to detect breaking glass or changes in normal ambient
noise
– Door prop alarms to detect if a door is left open longer than a set period of time
•Review camera quality and placement.
•Verify that surveillance systems are monitored and evaluate the
frequency of monitoring
•Verify that CCTV is recorded for play back
7. Review security guard building round logs and
other documentation to evaluate the
effectiveness of the security personnel function
• Security guards can be the most effective
physical access controls.
How
– verify that documentation of building rounds,
access logs and incident reports exists (take a
sample)
8. Verify that Heating, Ventilation and Air
Conditioning (HVAC) systems maintain
constant temperatures within the data center
How
– Temperature and humidity logs: determine how
the data center staff has established the
parameters for the equipment
– Temperature and humidity alarms
– HVAC design to verify that all areas of the data
centers are covered appropriately
– Configuration of the HVAC systems. Ensure HVAC
controls continue to function for the data center
in the event of a power loss
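A review of temperature and humidity logs might be supported by a simple range check such as the sketch below. The readings and the acceptable ranges are hypothetical; the auditor should use the parameters the data center staff have actually established for their equipment.

```python
def out_of_range(readings, temp_range, humidity_range):
    """Return timestamps of readings outside the agreed temperature or humidity range."""
    t_lo, t_hi = temp_range
    h_lo, h_hi = humidity_range
    return [
        r["time"]
        for r in readings
        if not (t_lo <= r["temp_c"] <= t_hi and h_lo <= r["rh_pct"] <= h_hi)
    ]

# Hypothetical log extract and limits (assumptions, not from the slides).
readings = [
    {"time": "08:00", "temp_c": 22.0, "rh_pct": 45.0},
    {"time": "09:00", "temp_c": 29.0, "rh_pct": 50.0},
    {"time": "10:00", "temp_c": 21.0, "rh_pct": 70.0},
]
violations = out_of_range(readings, temp_range=(18.0, 27.0), humidity_range=(40.0, 60.0))
print(violations)
```

Each flagged reading should have a matching alarm and response record if the alarms are working as intended.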
9. Ensure that a water alarm system is configured
to detect water in high risk areas of the data
center
How
– Identify potential water sources such as drains, AC units,
exterior doors and water pipes to verify sensors are placed
in locations where they will mitigate the most risk. The data
center manager should be aware of all water valves.

10.Ensure that power is conditioned to prevent data loss.


How
• Through interviews and observations, verify that power is being conditioned by surge
protectors or a battery backup system
11.Verify that battery backup systems are
providing continuous power during momentary
black-outs and brown-outs.
Power failure can cause data loss through abrupt
system shutdowns. UPS battery systems mitigate this
risk by providing 20-30 minutes of power.
How
– Interview the data center facility manager and
observe UPS battery backup systems to verify that
UPS systems are protecting all critical computer
systems.
– Review list of equipment tied into UPS and ensure
all critical systems are covered.
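Comparing the list of equipment tied into the UPS against the list of critical systems is a set difference. The following sketch uses hypothetical system names; the two lists would come from the facility manager and from the organization's own criticality ranking.

```python
def uncovered_critical_systems(critical_systems, ups_equipment):
    """Return critical systems that do not appear on the UPS equipment list."""
    return sorted(set(critical_systems) - set(ups_equipment))

# Hypothetical lists (assumptions, not from the slides).
critical_systems = ["core-switch-1", "db-server-1", "san-1"]
ups_equipment = ["core-switch-1", "db-server-1", "printer-3"]

gaps = uncovered_critical_systems(critical_systems, ups_equipment)
print(gaps)
```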
12.Ensure that generators protect against
prolonged power loss and are in good working
condition.
How
– Through observations and interviews, verify that
the data center has a generator.
– Ensure that the generator is able to power operations
for a sustained period of time by reviewing onsite
fuel storage.
– All types of generators require servicing and
maintenance: review maintenance and test logs
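Reviewing onsite fuel storage comes down to simple arithmetic: stored fuel divided by the generator's consumption rate under load. The figures below (2,000 litres, 50 litres per hour, a 48-hour target) are illustrative assumptions, not values from the slides.

```python
def runtime_hours(fuel_litres, burn_rate_litres_per_hour):
    """Estimate how long the generator can run on the fuel stored onsite."""
    if burn_rate_litres_per_hour <= 0:
        raise ValueError("burn rate must be positive")
    return fuel_litres / burn_rate_litres_per_hour

# Hypothetical figures (assumptions): tank size, burn rate under load, target runtime.
estimated = runtime_hours(fuel_litres=2000, burn_rate_litres_per_hour=50)
required_hours = 48
meets_target = estimated >= required_hours
print(estimated, meets_target)
```

If the estimate falls short of the target, the auditor can ask whether a fuel delivery contract covers the gap.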
13.Evaluate the usage and protection of
emergency power-off switches.
EPO are designed to shut off power in the event
of emergencies such as fire.
How
– Through observation, review the EPO switch(es) in
the data center.
– Ensure that they are clearly labeled and easily
accessible, yet still secured from unauthorized or
accidental usage.
14.Ensure that data center building construction
incorporates fire suppression features.
• Fire rated walls and doors to prevent fire from
moving from one area of a building to another
• Fire stops where fire rated walls or floor
assemblies are sealed to prevent spread of fire.
• Stand pipe fire hose systems to provide a ready
supply of water for fire suppression.
How
– Review fire suppression features built into the facility.
– Get information from facility manager
15.Determine that personnel are trained in how to
respond to a fire emergency.

16.Verify that fire extinguishers are strategically placed throughout the data center.
How
Review location of fire extinguishers

17. Verify that fire alarms are in place to protect the data center from the risk of fire
Heat sensors, smoke sensors and flame sensors
How
Review fire alarm sensors type, placement, maintenance records and testing procedures.
Reviews related to Data Center
Operations
• Facility monitoring
• Roles and responsibilities of data center
personnel
• Segregation of duties of data center personnel
• Responding to emergencies and disasters
• Facility and equipment maintenance
• Data center capability building
• Asset management
18. Review alarm monitoring console(s), reports and
procedures to verify that alarms are monitored
continually by data center personnel
• Alarm systems feed into a monitoring console that allows DC
personnel to respond to a condition before calling authorities.
• Absence of console would introduce risk
How
– Review alarm reports and observe the data center alarm-monitoring
console to verify that intrusion, fire, water,
humidity and other alarms are monitored continuously by security staff.
19. Verify that network, operating systems and
application monitoring identify potential problems for
systems located in data center.
• Monitoring provides insight into problems resulting from
misconfigurations and system failures
• Monitoring is important to ensure everything is running
properly.
• How
– Determine how computer systems are monitored
– Review monitoring logs such as software utilisation

20. Ensure roles and responsibilities are clearly defined


• How
– Verify that all job functions are covered
– Verify that responsibilities for each job function are clearly defined
21. Verify that job functions of personnel are segregated
properly.

• The goal is to spread high-risk duties across two or more
employees.
• How
– Verify that high-risk jobs such as access authorisation are
segregated
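A segregation-of-duties check can be expressed as a search for people who hold both halves of a conflicting pair of duties. The duty names and assignments below are hypothetical; which pairs count as conflicting is a judgment the auditor agrees with management.

```python
def sod_conflicts(assignments, conflicting_pairs):
    """Return (person, duty_a, duty_b) for each person holding a conflicting pair."""
    conflicts = []
    for person, duties in sorted(assignments.items()):
        for duty_a, duty_b in conflicting_pairs:
            if duty_a in duties and duty_b in duties:
                conflicts.append((person, duty_a, duty_b))
    return conflicts

# Hypothetical duty assignments and conflict rules (assumptions).
assignments = {
    "amal": {"request_access", "approve_access"},
    "bilal": {"approve_access"},
    "chen": {"grant_access"},
}
conflicting_pairs = [
    ("request_access", "approve_access"),
    ("approve_access", "grant_access"),
]

conflicts = sod_conflicts(assignments, conflicting_pairs)
print(conflicts)
```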
22. Ensure emergency response procedures address
reasonably anticipated threats (fire, flood, power loss)
• How
– Review response plans
– Verify that plans are present for all foreseeable threats
– Staff should be able to provide these plans
– Observe whether emergency phone numbers are posted
23.Ensure that data center personnel are trained
properly to perform their job.
How
– Review training history and schedules.
– Ensure training is relevant to job function
24.Ensure that data center capacity (power, network,
heating, ventilation, space) is planned to avoid
unnecessary outages
How
– Review monitoring thresholds and strategies that data center
management uses to determine when facilities, equipment or
networks require upgrading
– Verify that planning procedures are comprehensive and review
evidence that they are followed
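Monitoring thresholds of this kind usually compare current usage with installed capacity. The sketch below assumes an 80% trigger and made-up figures for power, rack space and network; the actual thresholds are whatever data center management has defined.

```python
def over_threshold(usage, capacity, threshold=0.8):
    """Return resources whose utilisation is at or above the upgrade threshold."""
    return sorted(
        name for name in usage
        if usage[name] / capacity[name] >= threshold
    )

# Hypothetical figures (assumptions): kW, rack units and Gbit/s.
usage = {"power_kw": 42, "rack_units": 30, "network_gbps": 170}
capacity = {"power_kw": 50, "rack_units": 100, "network_gbps": 200}

needs_review = over_threshold(usage, capacity)
print(needs_review)
```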
25. Review and evaluate asset management for data
center equipment
Asset management is the controlling, tracking, and
reporting of assets to facilitate accounting for the assets.
Without asset management, there is a risk of duplicate purchases and theft
How
•Review and evaluate the data center’s asset management policies
and ensure they comply with company policy and include the following:
– Asset procurement process (ensure process has proper
approvals prior to purchase)
– Asset tracking (ensure that the data center uses asset tags and a tracking database)
– Current inventory of all equipment
– Asset move and disposal procedures- Ensure unused equipment
is stored in a secure manner
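Testing the inventory against what is physically on the floor is a two-way reconciliation. In this sketch the tag numbers are hypothetical: `inventory_tags` stands for the asset database and `scanned_tags` for tags the auditor recorded during a walk-through.

```python
def reconcile(inventory_tags, scanned_tags):
    """Return (missing, untracked): recorded but not found, and found but not recorded."""
    inventory, scanned = set(inventory_tags), set(scanned_tags)
    missing = sorted(inventory - scanned)
    untracked = sorted(scanned - inventory)
    return missing, untracked

# Hypothetical tag lists (assumptions, not from the slides).
inventory_tags = ["DC-001", "DC-002", "DC-003"]
scanned_tags = ["DC-001", "DC-003", "DC-009"]

missing, untracked = reconcile(inventory_tags, scanned_tags)
print(missing, untracked)
```

Missing items may point to theft or undocumented disposal; untracked items suggest the procurement or tagging process was bypassed.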
Data backup and Restore
System backup is regularly performed on most systems
26. Ensure that backup procedures and capacity are
appropriate for respective systems
– Backup procedures: backup schedules, tape rotations, and
an off-site storage process.
How
• Determine whether systems are backed up periodically
• Determine whether backups are stored off site in a secure location
• Ensure backup media have enough storage space
• Verify that backups are being performed
• Retrieve and review a sample of backup system logs.
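A sample of backup logs can be checked for systems with no recent successful backup. The log format, system names and the one-day maximum age below are assumptions for illustration; the real schedule comes from the documented backup procedure.

```python
from datetime import datetime, timedelta

def stale_or_failed(log_entries, systems, as_of, max_age=timedelta(days=1)):
    """Return systems with no successful backup within max_age of as_of."""
    last_ok = {}
    for entry in log_entries:
        if entry["status"] == "success":
            previous = last_ok.get(entry["system"])
            if previous is None or entry["finished"] > previous:
                last_ok[entry["system"]] = entry["finished"]
    return sorted(
        s for s in systems
        if s not in last_ok or as_of - last_ok[s] > max_age
    )

# Hypothetical log extract (assumption, not from the slides).
log_entries = [
    {"system": "db", "status": "success", "finished": datetime(2024, 1, 10, 2, 0)},
    {"system": "mail", "status": "failed", "finished": datetime(2024, 1, 10, 2, 30)},
    {"system": "mail", "status": "success", "finished": datetime(2024, 1, 7, 2, 0)},
]
systems = ["db", "mail", "file"]

problems = stale_or_failed(log_entries, systems, as_of=datetime(2024, 1, 10, 8, 0))
print(problems)
```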
27.Verify that systems can be restored from
backup media
– There is no point in backing up if data cannot be restored
How
– Ask system administrator to order backup media
from off-site storage
– Observe restoration of data from media to test
server
– Verify that all files have been restored
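One way to verify that all files have been restored is to compare checksums of the source files with the restored copies. The sketch below uses SHA-256 from Python's standard library and two temporary directories standing in for the source system and the test server; the file names are made up.

```python
import hashlib
import tempfile
from pathlib import Path

def manifest(root):
    """Map each file's relative path to the SHA-256 digest of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }

def compare(source, restored):
    """Return (missing, mismatched) relative paths in the restored tree."""
    src, dst = manifest(source), manifest(restored)
    missing = sorted(set(src) - set(dst))
    mismatched = sorted(k for k in src if k in dst and src[k] != dst[k])
    return missing, mismatched

# Demo: the restored directory lacks one file (hypothetical data).
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
    (Path(src_dir) / "db.dump").write_bytes(b"payroll")
    (Path(src_dir) / "app.cfg").write_bytes(b"mode=prod")
    (Path(dst_dir) / "db.dump").write_bytes(b"payroll")
    result = compare(src_dir, dst_dir)
print(result)
```

An empty result for both lists is evidence that the restore was complete and the contents match.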
Disaster Recovery Planning
• The goal is to ensure systems are recovered following a disaster such as
a hurricane or flood
28.Ensure that a DRP exists and is comprehensive and that key employees
are aware of their roles in the event of a disaster
How
– Ensure that DRP exists
– Verify that the DRP covers all systems and operational areas. It should include
detailed step-by-step instructions for restoring critical systems.
– Ensure that roles and responsibilities are clearly defined.
– Ensure emergency communications are addressed in the plan
– Determine whether the plan includes criteria for determining whether a
situation is a disaster and procedures for declaring a disaster
– Verify that a current copy of DRP is maintained at secured off-site location
29.Ensure that DRPs are updated and tested
regularly
– If plans are not tested, there is no assurance that they
will work when needed.
– They should be updated at least annually
– Failure to update or test DRPs will result in slower
recovery times in the event of a disaster
How
– Review the update or version history of the plan
– Verify that tests are performed at least annually
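The "at least annually" requirement can be checked directly from the plan's version history and test records. The dates below are hypothetical, and treating "annually" as 365 days is a simplifying assumption.

```python
from datetime import date

def overdue(last_date, as_of, max_days=365):
    """Return True if more than max_days have passed since last_date."""
    return (as_of - last_date).days > max_days

# Hypothetical dates taken from a DRP version history and test log (assumptions).
as_of = date(2024, 6, 1)
last_tested = date(2023, 1, 15)
last_updated = date(2024, 3, 1)

test_overdue = overdue(last_tested, as_of)
update_overdue = overdue(last_updated, as_of)
print(test_overdue, update_overdue)
```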
30.Ensure that emergency operations plans
address various scenarios adequately
https://www.youtube.com/watch?v=8g0NrHExD3g

https://www.youtube.com/watch?v=cLory3qLoY8
