GROUP 3
Prepared by:
Dustin Earth B. Montebon
Eliza Jane H. Rebarter
Raner Dyve S. Valencia
CHAPTER 4
Security Part II: Auditing Database Systems
LEARNING OBJECTIVES
After studying this chapter, you should:
• Understand the operational problems inherent in the flat-file approach to data management
that gave rise to the database approach.
• Understand the relationships among the fundamental components of the database concept.
• Recognize the defining characteristics of three database models: hierarchical, network, and
relational.
• Understand the operational features and associated risks of deploying centralized,
partitioned, and replicated database models in the DDP environment.
• Be familiar with the audit objectives and procedures used to test data management controls.
FLAT-FILE APPROACH
The flat-file environment promotes a single-user view approach to data management whereby end users own their data files rather than share them with other users. Because each user maintains separate files, essentially the same data end up replicated in multiple files; this replication is called data redundancy.
DATABASE APPROACH
This approach centralizes the organization’s data into a common database that is shared by all users. The DBMS is a special software system that is programmed to know which data elements each user is authorized to access.
PROBLEMS & SOLUTIONS
Flat-File Approach → Database Approach
• Data Storage → Elimination of the Data Storage Problem
• Data Updating → Elimination of the Data Update Problem
• Currency of Information → Elimination of the Currency Problem
• Task-Data Dependency → Elimination of the Task-Data Dependency Problem
Audit Objective Relating to Database Access
• Verify that database access authority and privileges are granted
to users in accordance with their legitimate needs.
Audit Procedures for Testing Database
Access Controls
• Responsibility for Authority Tables and Subschemas. The
auditor should verify that database administration (DBA) personnel retain
exclusive responsibility for creating authority tables and designing user
views. Evidence may come from three sources: (1) by reviewing company
policy and job descriptions, which specify these technical responsibilities;
(2) by examining programmer authority tables for access privileges to
data definition language (DDL) commands; and (3) through personal
interviews with programmers and DBA personnel.
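As an illustration of source (2), a minimal sketch of how an auditor might script such a review, assuming the authority table can be exported as (user, privilege) rows; the table contents, DBA list, and privilege labels below are hypothetical:

```python
# Hypothetical sketch: flag non-DBA users holding DDL privileges in an
# exported authority table. All names and labels are illustrative.

authority_table = [
    ("dba_admin",   "CREATE TABLE"),   # DDL privilege - expected for DBA
    ("programmer1", "SELECT"),         # DML privilege - acceptable
    ("programmer2", "DROP TABLE"),     # DDL privilege - an exception
]
dba_personnel = {"dba_admin"}
ddl_commands = {"CREATE TABLE", "ALTER TABLE", "DROP TABLE"}

exceptions = [(user, priv) for user, priv in authority_table
              if priv in ddl_commands and user not in dba_personnel]

for user, priv in exceptions:
    print(f"Investigate: {user} holds DDL privilege {priv!r}")
```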
Audit Procedures for Testing Database
Access Controls
• Appropriate Access Authority. The auditor can select a sample of users and verify that their access privileges stored in the authority table are consistent with their job descriptions and organizational levels.
• Biometric Controls. The auditor should evaluate the costs and benefits of biometric
controls. Generally, these would be most appropriate where highly sensitive data are
accessed by a very limited number of users.
• Inference Controls. The auditor should verify that database query controls exist to prevent unauthorized access via inference. The auditor can test controls by simulating access by a sample of users and attempting to retrieve unauthorized data via inference queries.
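One common form such query controls take is a minimum query-set size, which refuses aggregates computed over too few records. A toy sketch, with the data and threshold purely illustrative:

```python
# Sketch of a minimum query-set-size inference control: an aggregate
# over too few records is refused, since an average over one employee
# would reveal that employee's salary by inference.

salaries = {"ana": 52000, "ben": 48000, "cara": 61000, "dino": 57000}
MIN_QUERY_SET = 3  # illustrative threshold

def average_salary(names):
    matched = [salaries[n] for n in names if n in salaries]
    if len(matched) < MIN_QUERY_SET:
        raise PermissionError("query set too small; inference risk")
    return sum(matched) / len(matched)

print(average_salary(["ana", "ben", "cara"]))  # allowed
# average_salary(["ana"]) would raise PermissionError
```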
Audit Procedures for Testing Database
Access Controls
• Encryption Controls. The auditor should verify that sensitive data, such as passwords, are properly encrypted. This can be done by printing the file contents to hard copy.
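A sketch of how such a hard-copy review might be supplemented by a scripted scan of the dumped field; the heuristic (treating fixed-length hex digests as protected values) and the sample data are assumptions for illustration, not a method from the text:

```python
# Toy check: flag stored password values that look like plaintext
# rather than ciphertext or hash digests. Heuristic is illustrative.

import re

dumped_passwords = ["5f4dcc3b5aa765d61d8327deb882cf99", "letmein123"]

def looks_plaintext(value):
    # Treat 32/40/64-char hex strings (common digest lengths) as protected.
    return not re.fullmatch(r"[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64}", value)

for value in dumped_passwords:
    if looks_plaintext(value):
        print(f"Possible plaintext password stored: {value!r}")
```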
Backup Controls
• Data can be corrupted and destroyed by malicious acts from external hackers and disgruntled employees, as well as by disk failures, program errors, fires, floods, and earthquakes. To recover from such disasters, organizations must implement policies, procedures, and techniques that systematically and routinely provide backup copies of critical files.
Backup Controls in the Flat-File Environment
• The backup technique employed will depend on the media and the
file structure. Sequential files (both tape and disk) use a backup
technique called grandparent–parent–child (GPC). This backup
technique is an integral part of the master file update process. Direct
access files, by contrast, need a separate backup procedure.
GPC Backup Technique
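A minimal sketch of the GPC rotation, assuming a retention depth of three generations set by policy; file names and the number of update runs are illustrative:

```python
# Each master file update produces a new "child"; the former child and
# parent age into parent and grandparent, and generations beyond the
# retention depth are released for reuse.

RETENTION = 3          # child, parent, grandparent
generations = []       # newest master file last

def run_update(cycle):
    generations.append(f"master_v{cycle}")     # child from this run
    while len(generations) > RETENTION:
        expired = generations.pop(0)           # oldest generation
        print(f"Releasing {expired}")

for cycle in range(1, 6):
    run_update(cycle)
print("Retained:", generations)  # grandparent, parent, child
```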
Direct Access File Backup
• Data values in direct access files are changed in place
through a process called destructive replacement.
• The timing of the direct access backup procedures will
depend on the processing method being used: Batch
systems and Real-Time systems.
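A toy sketch of the backup step that precedes destructive replacement: the file is copied before a record is overwritten in place. The file name, offset, and record layout are illustrative:

```python
# Copy a direct access file to a backup version, then update a record
# in place (destructive replacement) by seeking to its address.

import shutil

def update_in_place(path, offset, new_bytes):
    shutil.copy(path, path + ".bak")   # backup before the update run
    with open(path, "r+b") as f:
        f.seek(offset)                 # locate the record directly
        f.write(new_bytes)             # overwrite: destructive replacement

with open("accounts.dat", "wb") as f:  # build a small sample file
    f.write(b"AAAA1111BBBB")
update_in_place("accounts.dat", 4, b"2222")  # record at offset 4 updated
```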
Off-Site Storage
• As an added safeguard, backup files created under
both the GPC and direct access approaches should be
stored off-site in a secure location. Off-site storage was
discussed in Chapter 2 in the section dealing with
disaster recovery planning.
Audit Objective Relating to Flat-File Backup
• Verify that backup controls in place are effective in
protecting data files from physical damage, loss,
accidental erasure, and data corruption through
system failures and program errors.
Audit Procedures for Testing Flat-File Backup
Controls
• Sequential File (GPC) Backup. The auditor should select a sample of systems and determine
from the system documentation that the number of GPC backup files specified for each
system is adequate.
• Backup Transaction Files. The auditor should verify through physical observation that
transaction files used to reconstruct the master files are also retained.
• Direct Access File Backup. The auditor should select a sample of applications and identify
the direct access files being updated in each system. From system documentation and
through observation, the auditor can verify that each of them was copied to tape or disk
before being updated.
Audit Procedures for Testing Flat-File
Backup Controls
• Off-Site Storage. The auditor should verify the
existence and adequacy of off-site storage. This audit
procedure may be performed as part of the review of
the disaster recovery plan or computer center
operations controls.
Backup Controls in the Database Environment
• Backup. The backup feature makes a periodic backup of the entire database. This is
an automatic procedure that should be performed at least once a day.
• Transaction Log (Journal). The transaction log feature provides an audit trail of all
processed transactions.
• Checkpoint Feature. The checkpoint facility suspends all data processing while the
system reconciles the transaction log and the database change log against the
database.
• Recovery Module. The recovery module uses the logs and backup files to restart
the system after a failure.
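To show how these features fit together, a toy sketch of the recovery module restoring the last backup and replaying the transaction log forward; the data structures are illustrative, not a real DBMS interface:

```python
# The checkpoint bounds how far back replay must reach; here the log
# simply holds the transactions recorded since the last full backup.

backup_db = {"cash": 100}                  # periodic backup of the database
transaction_log = [("T1", "cash", +50),    # audit trail of processed
                   ("T2", "cash", -30)]    # transactions since backup

def recover(backup, log):
    db = dict(backup)                      # restore the backup copy
    for txn_id, account, amount in log:    # redo logged transactions
        db[account] = db.get(account, 0) + amount
        print(f"Replayed {txn_id}")
    return db

print(recover(backup_db, transaction_log))  # {'cash': 120}
```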
Audit Objective Relating to
Database Backup
• Verify that controls over the data resource are
sufficient to preserve the integrity and physical
security of the database.
Audit Procedures for Testing Database
Backup Controls
• The auditor should verify that backup is performed routinely
and frequently to facilitate the recovery of lost, destroyed, or
corrupted data without excessive reprocessing.
• The auditor should verify that automatic backup procedures
are in place and functioning, and that copies of the database
are stored off-site for further security.
DATA - raw, unprocessed data (unorganized)
example: 20 y/o, Anna, Philippines
INFORMATION - processed data (organized, detailed information)
example: Anna is 20 years old and lives in the Philippines.
DATABASE - an organized collection of data, typically stored electronically in a computer system.
DATABASE MANAGEMENT SYSTEM (DBMS) - provides a controlled environment to assist (or prevent) access to the database and to efficiently manage the data resource.
DBMS Features
1. Program development
Both programmers and end users may employ this feature to create applications to access the database.
2. Backup and recovery
The DBMS periodically makes backup copies of the physical database. In the event of a disaster (disk failure, program error, or malicious act), the database can be reconstructed from these backup copies.
3. Database usage reporting
Captures statistics on what data are being used, when they are used, and who uses them.
4. Database access
The most important feature of a DBMS is to permit authorized user access, both formal and informal, to the database.
3 SOFTWARE MODULES
1. Data Definition Language (DDL)
• a programming language used to define the database to the DBMS.
• Three viewing levels (database views):
- INTERNAL VIEW - the physical arrangement of records in the database
- CONCEPTUAL VIEW (schema) - the logical representation of the entire database
- USER VIEW (subschema) - the portion of the database each user views
2. Data Manipulation Language (DML)
• the proprietary programming language that a particular DBMS uses to retrieve, process, and store data to/from the database.
3. Query Language
• permits end users and professional programmers to access data in the database without the need for conventional programs.
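A compact sketch mapping the three modules onto SQLite, an illustrative choice of DBMS; actual DDL/DML syntax varies by product:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the database
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# DML: store and process data
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Anna')")

# Query language: ad hoc retrieval without a conventional program
for row in conn.execute("SELECT name FROM customer WHERE id = 1"):
    print(row)  # ('Anna',)
```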
• DATA DICTIONARY
- describes every data element in the database (an abstract definition of the representations at the physical level)
- may be kept both in paper form and online
• PHYSICAL DATABASE
- the lowest level of the database and the only level that exists in physical form
- consists of magnetic spots on metallic coated disks (the actual storage of the data)
Data Structures
• Data structures are the bricks and mortar of the database. The data structure allows records to be located, stored, and retrieved, and enables movement from one record to another.
• 2 fundamental components:
1. Data Organization - refers to the way records are physically arranged on the secondary storage device. This may be either sequential or random. The records in sequential files are stored in contiguous locations that occupy a specified area of disk space. Records in random files are stored without regard for their physical relationship to other records of the same file; random files may have records distributed throughout the disk.
2. Data Access Methods - the techniques used to locate records and to navigate through the database. No single structure is best for all processing tasks; selecting one therefore involves a trade-off between desirable features. The criteria that influence the selection of the data structure include:
1. Rapid file access and data retrieval
2. Efficient use of disk storage space
3. High throughput for transaction processing
4. Protection from data loss
5. Ease of recovery from system failure
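To make the contrast between sequential and random data organization concrete, a toy sketch in which a dict stands in for a key-to-address lookup; purely illustrative:

```python
# A sequential file must be scanned record by record, while a randomly
# organized file is reached directly through a key-to-address mapping.

sequential_file = [(100, "Anna"), (200, "Ben"), (300, "Cara")]

def sequential_read(key):
    for record_key, value in sequential_file:  # scan contiguous records
        if record_key == key:
            return value

random_file = {100: "Anna", 200: "Ben", 300: "Cara"}  # hashed addresses

print(sequential_read(300))  # touches every record up to the match
print(random_file[300])      # goes straight to the record's address
```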
DATABASE TERMINOLOGIES:
Data Attribute/Field - a single item of data, such as a customer’s name, account balance, or address.
Entity - a database representation of an individual resource, event, or agent about which we choose to collect data. Entities may be physical (inventories, customers, and employees) or conceptual (sales, accounts receivable, and depreciation expense).
Record Type (Table or File) - when the data attributes that logically define an entity are grouped together, they form a record type. For example, the data attributes describing the sales event could form the sales order record type. Multiple occurrences (more than one) of a particular type of record are physically arranged in tables or files. In other words, a company’s sales order record types are physically stored in its Sales Order table, which is part of its corporate database.
ASSOCIATION
Three basic record associations are:
• one-to-one
• one-to-many
• many-to-many
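A brief sketch of the three associations as linked record types; the data and key names are illustrative:

```python
# one-to-one: each employee has exactly one parking slot
employee_slot = {"E1": "P7"}

# one-to-many: one customer is linked to many sales invoices
customer_invoices = {"C1": ["INV-1", "INV-2"]}

# many-to-many: an inventory item appears on many invoices and an
# invoice lists many items, so a linking structure pairs the records
invoice_items = [("INV-1", "ITEM-A"), ("INV-1", "ITEM-B"),
                 ("INV-2", "ITEM-A")]

print([item for inv, item in invoice_items if inv == "INV-1"])
```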
DATABASE CONCEPTUAL MODEL
• refers to the particular method used to organize records in a database
• a.k.a. ‘logical data structures’
OBJECTIVE: develop the database efficiently so that data can be accessed quickly and easily.
3 MAIN MODELS
• HIERARCHICAL (tree structure)
• NETWORK
• RELATIONAL
1. The Hierarchical Model
This was a popular method of data representation because it reflected, more or
less faithfully, many aspects of an organization that are hierarchical in
relationship.
IBM’s Information Management System (IMS) is the most prevalent example of a hierarchical database. It was introduced in 1968 and is still a popular database model over 40 years later.
The hierarchical model is constructed of sets that describe the relationship between two linked files. Each set contains a parent and a child. This structure is also called a tree structure. The highest level in the tree is the root segment, and the lowest file in a particular branch is called a leaf.
Navigational Databases. The hierarchical data model is called a navigational
database because traversing the files requires following a predefined path. This
is established through explicit linkages (pointers) between related records.
Limitations of the Hierarchical Model
1. A parent record may have one or more child records. For example, customer is the parent of both sales invoice and cash receipts.
2. No child record can have more than one parent.
RELATIONAL MODEL
• portrays data in the form of two-dimensional tables
• its strength is the ease with which tables may be linked to one another
• based on the relational algebra functions of restrict, project, and join
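A toy model of the three functions over rows represented as dicts; this illustrates the algebra only and is not how a real DBMS executes it:

```python
customers = [{"cust_id": 1, "name": "Anna"}, {"cust_id": 2, "name": "Ben"}]
invoices  = [{"inv_id": 10, "cust_id": 1, "amount": 500}]

def restrict(table, pred):      # keep rows meeting a condition
    return [row for row in table if pred(row)]

def project(table, cols):       # keep only the named attributes
    return [{c: row[c] for c in cols} for row in table]

def join(t1, t2, key):          # combine tables on a shared key
    return [{**r1, **r2} for r1 in t1 for r2 in t2 if r1[key] == r2[key]]

linked = join(customers, invoices, "cust_id")
print(project(restrict(linked, lambda r: r["amount"] > 100),
              ["name", "amount"]))   # [{'name': 'Anna', 'amount': 500}]
```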
ADVANTAGES OF RELATIONAL TABLES
• removes three types of anomalies
• various items of interest (customers, inventory, sales) are stored in separate tables
• space is used efficiently
• very flexible; users can create ad hoc relationships
THE NORMALIZATION PROCESS
A process which systematically splits unnormalized complex tables into smaller tables that meet two conditions:
- all nonkey (secondary) attributes in the table are dependent on the primary key
- all nonkey attributes are independent of the other nonkey attributes
When unnormalized tables are split and reduced to third normal form (3NF), they must then be linked together by foreign keys.
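A small sketch of the idea: the customer name depends only on the customer number, not on the whole invoice key, so it is split into its own table and linked back through a foreign key. Data and column names are illustrative:

```python
unnormalized = [
    {"inv_id": 10, "cust_id": 1, "cust_name": "Anna", "amount": 500},
    {"inv_id": 11, "cust_id": 1, "cust_name": "Anna", "amount": 200},
]

# Split: customer attributes live once per customer; invoices keep
# cust_id as a foreign key instead of repeating the name.
customers = {row["cust_id"]: row["cust_name"] for row in unnormalized}
invoices = [{"inv_id": r["inv_id"], "cust_id": r["cust_id"],
             "amount": r["amount"]} for r in unnormalized]

for inv in invoices:  # the foreign key relinks the tables when needed
    print(inv["inv_id"], customers[inv["cust_id"]], inv["amount"])
```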
ACCOUNTANTS AND DATA NORMALIZATION
• Update anomalies can generate conflicting and obsolete database values.
• Insertion anomalies can result in unrecorded transactions and incomplete audit trails.
• Deletion anomalies can cause the loss of accounting records and the destruction of audit trails.
• Accountants should understand the data normalization process and be able to determine whether a database is properly normalized.
SIX PHASES IN DESIGNING A RELATIONAL DATABASE
1. IDENTIFY ENTITIES
- identify the primary entities of the organization
- construct a data model of their relationships
2. CONSTRUCT A DATA MODEL SHOWING ENTITY ASSOCIATIONS
- determine the associations between entities
- model associations into an ER diagram
3. ADD PRIMARY KEY ATTRIBUTES
- assign primary keys to all entities in the model to uniquely identify records
4. NORMALIZE AND ADD FOREIGN KEYS
- remove repeating groups, partial and transitive dependencies
- assign foreign keys to be able to link tables
5. CONSTRUCT THE PHYSICAL DATABASE
- create physical tables
- populate tables with data
6. PREPARE THE USER VIEWS
- normalized tables should support all required views of system users
- user views restrict users from having access to unauthorized data
DATABASES IN A DISTRIBUTED
ENVIRONMENT
2 ways to store data:
1. Centralized Database
2. Distributed Database
• Partitioned Database
• Replicated Database
Centralized Database
• All the data is stored in a single, central location.
• Easier to maintain and control because everything is in one place.
• However, it may slow down access for users who are far away
from the central location. Plus, if the central server fails, the entire
system can go down.
Temporary Inconsistency
• This occurs when multiple transactions happen at the same time, leading to moments when the data may not accurately reflect the current state.
Example scenario:
In a Restaurant, there are 2 servers; Server A and Server B.
Server A, checks the inventory and sees that there are still 5
servings of their specialty available. Now, Server A, sells 2
servings to a customer. Inventory should be updated to 3 servings
available.
However, Server B, checks the inventory at the same time as
Server A, and sees that there are 5 servings available without
knowing that 2 have already been sold. This is a “temporary
inconsistency.”
Database Lockout
• Prevents multiple servers from updating the same data at the
same time.
Example:
When Server A checks the inventory to sell the dish, the system
locks the inventory. This means that while Server A is taking the
order and processing it, Server B cannot check or change the
Inventory.
Wait Status
• When a database lockout occurs, other servers must enter a wait
status until they can access the locked data.
• Since the inventory is locked, Server B’s request goes into a wait
status. It’s like being put on hold until Server A finishes and
releases the lock on the inventory.
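A runnable sketch of lockout and wait status using a mutex, with Server B blocking until Server A releases the lock; timings and data are illustrative:

```python
import threading, time

inventory = {"specialty": 5}
lock = threading.Lock()

def sell(server, qty):
    with lock:                        # lockout: one server at a time
        on_hand = inventory["specialty"]
        time.sleep(0.1)               # simulate order processing
        inventory["specialty"] = on_hand - qty
        print(f"{server} sold {qty}; {inventory['specialty']} left")

a = threading.Thread(target=sell, args=("Server A", 2))
b = threading.Thread(target=sell, args=("Server B", 1))  # waits its turn
a.start(); b.start(); a.join(); b.join()
print("Final:", inventory["specialty"])  # 2 - no lost update
```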
Distributed Database
• The data is spread across different locations, and there are two
types:
Partitioned Databases – Different parts of the data are stored in
different places. This makes it faster for local users but harder to
manage because the data is split up.
Replicated Databases – Copies of the entire data are kept in
multiple places. This makes the system more reliable since if one
location goes down, others can still provide the data. However,
keeping these copies up-to-date can be difficult.
Partitioned Database
• The partitioned database approach is a method where the central
database is split into different sections or partitions. These
partitions are distributed to different locations, typically where the
primary users are. This has several advantages:
1. Increased user control
2. Faster response time
3. Disaster protection
Deadlock Phenomenon
• A deadlock happens when two or more locations lock the data
they need, but then wait for data locked by another location to
complete their transactions. This creates a situation where none
of the locations can proceed, and everything comes to a halt.
Deadlock Resolution
• When a deadlock occurs, the system needs to find a way to
resolve it so the transactions can move forward. The typical
solution involves terminating one or more transactions in the
deadlock.
• Once these transactions are stopped, the remaining transactions
can be completed.
• The terminated transactions will then be restarted later to ensure
everything is processed correctly.
Factors for termination:
Resources Already Invested
• The system looks at how much work has already been done in a
transaction.
Stage of Completion
• The closer a transaction is to finishing, the less likely it is to be
terminated.
Number of Deadlocks Involved
• If a transaction is part of more than one deadlock, it’s a good candidate for termination because stopping it can resolve multiple deadlock situations at once.
Example of Deadlock Resolution:
Let’s say three departments in a company are in a deadlock over
access to shared resources.
• Department 1 is halfway through using the company’s financial data.
• Department 2 has just started using marketing data.
• Department 3 is almost done using employee payroll data.
The deadlock resolution system will likely choose to stop
Department 2’s transaction because:
• It has done the least amount of work (just started), so stopping it
won’t waste too much time.
• Stopping it will allow other departments to continue and resolve
the deadlock.
• Once the deadlock is broken, Department 2 can restart its
transaction and complete it later without major disruption.
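A toy scoring of the three factors that formalizes this choice; the weights and figures are illustrative assumptions, not a prescribed algorithm:

```python
transactions = [
    {"name": "Dept 1", "invested": 50, "pct_done": 0.5, "deadlocks": 1},
    {"name": "Dept 2", "invested": 5,  "pct_done": 0.1, "deadlocks": 1},
    {"name": "Dept 3", "invested": 80, "pct_done": 0.9, "deadlocks": 1},
]

def keep_score(t):
    # High investment and near-completion argue for keeping the
    # transaction; involvement in extra deadlocks argues against it.
    return t["invested"] + 100 * t["pct_done"] - 50 * (t["deadlocks"] - 1)

victim = min(transactions, key=keep_score)
print("Terminate and restart later:", victim["name"])  # Dept 2
```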
Replicated Database
• A replicated database is like making copies of the same data and keeping one at each location in a company.
• Great for fast access to information but can cause problems if each location is making changes without staying in sync.
Database Concurrency
• Refers to ensuring that complete and accurate data is available across all user locations, especially when multiple transactions are happening simultaneously.
• To manage this, one common method used is serializing transactions.
Serializing Transactions
1. Classifying Transactions
• Transactions are grouped into different classes to identify potential conflicts.
2. Time-stamping Transactions
• Each transaction is given a unique time stamp (which includes the time and location), using a system-wide clock to ensure that all sites are synchronized.
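A minimal sketch of the time-stamping idea: each transaction is labeled with (time, site) from a synchronized clock, and applying conflicting transactions in label order serializes them identically at every site. Details are illustrative:

```python
import time

def stamp(site):
    return (time.time(), site)   # system-wide clock plus location

incoming = [(stamp("site_B"), "debit cash 30"),
            (stamp("site_A"), "credit cash 50")]

# Every site sorts by the stamp, so all sites apply the transactions
# in the same serial order regardless of arrival order.
for ts, action in sorted(incoming):
    print(ts, action)
```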