
Code        Course Title                                    Credit (T-P-Pj)
CUTM1034    Database Cluster Administration and Security    2-2-1
Syllabus
Module I: Construction of Clusters
Define the size of the cluster based on data to be stored, Identify quicker ways to add or remove nodes using cluster
monitoring tools, Deploy and manage clusters, Setup cluster configurations using cookbooks.
Module II: Manage cluster deployment
Define encryption standards at rest and in motion for data stored on the cluster, create documentation on computing
clusters, Validate computing clusters, Administer cluster access.
Module III: Monitor capacity and performance
Evaluate cluster requirement estimates through capacity planning, Monitor cluster connectivity and performance and
recommend efficiencies, Troubleshoot issues with cluster connectivity and performance with appropriate people.
Module IV: Database Security
Secure a Database System applying Information Security, Security Threats, Security Vulnerabilities and Risk
Management.
Module V: Database backup
Implement Database System backup and restoring backup, Implement Database System security by creation of roles
and functions for Database Users.
Module VI: Message Transfer
Browsing operations, downloading messages and pictures from the internet, sending and receiving emails, sending
attachments of pictures and text information, e-chatting.
Module VII: Cluster Management
Master-Master, Master-Slave, Multimaster, Master-Multislave
Textbooks
1. Implementing Database Security and Auditing, Author: Ron Ben-Natan.
2. Database Security, Authors: Silvana Castano, Giancarlo Martella, M. Fugini.
3. Fundamentals of Database Systems, Author: Ramez Elmasri.
Course Outcomes
1. Students will gain knowledge of the concepts behind the construction of clusters.
2. Students will develop the skill of devising new techniques for cluster deployment in real-world applications.
3. Students will know different schemes for message transfer and database backup.
4. Students will develop skills in establishing database security in different networks.
5. Students will be able to manage database clusters.
Introduction to Database Cluster Administration and Security

Session 1
Database Cluster Administration
• Definition of Database Cluster:
• A database cluster is a group of interconnected database servers (nodes) that work together to
store, distribute, and manage data.
• Clusters are designed to provide redundancy, fault tolerance, and scalability.
• High Availability and Fault Tolerance:
• Database clusters aim to ensure continuous availability of data, even in the face of
hardware failures or other issues.
• High availability is achieved by replicating data across multiple nodes, so that if one node
fails, the others can take over its work.
• Scalability:
• Clusters can scale horizontally by adding more nodes to accommodate increasing data
and user loads.
• Vertical scaling involves adding more resources to individual nodes for improved
performance.
• Load Balancing:
• Load balancing ensures that the workload is evenly distributed among
cluster nodes to optimize performance and prevent bottlenecks.
• Backup and Recovery:
• Robust backup and recovery strategies are essential for protecting data in
case of accidental deletion, corruption, or system failures.
• Performance Monitoring and Optimization:
• Constant monitoring of cluster performance helps identify and address
issues promptly.
• Optimization techniques, such as indexing and query tuning, enhance
overall performance.
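To illustrate the load-balancing bullet above, here is a minimal sketch of round-robin distribution, one of the simplest balancing policies. The class name `RoundRobinBalancer` and the node names are hypothetical, and real cluster load balancers usually also account for node health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out cluster nodes in a fixed rotating order."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(nodes)

    def next_node(self):
        """Return the node that should receive the next request."""
        return next(self._rotation)


# Hypothetical three-node cluster receiving six requests.
balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.next_node() for _ in range(6)]
print(assignments)
```

Each node receives every third request, so the workload is spread evenly as long as the requests cost roughly the same.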
Database Cluster Security
• Authentication and Authorization:
• Strong authentication mechanisms ensure that only authorized users can access the
cluster.
• Authorization controls dictate the level of access each user or role has within the cluster.
• Encryption:
• Data at rest and in transit should be encrypted to prevent unauthorized access and
protect sensitive information.
• Audit Trails:
• Logging and auditing mechanisms track user activities and changes to the database,
aiding in forensic analysis and compliance.
• Firewalls and Network Security:
• Implementing firewalls and network security measures protects the cluster from
external threats and unauthorized access.
• Patch Management:
• Regularly updating and patching the database software and underlying operating system
is crucial for fixing vulnerabilities and ensuring security.
• Compliance and Regulations:
• Adhering to industry-specific regulations and compliance standards is essential for
securing sensitive data.
• Incident Response and Disaster Recovery:
• Having a well-defined incident response plan and disaster recovery strategy is critical to
minimizing the impact of security incidents.
• User Education and Training:
• Educating users and administrators about security best practices helps prevent common
pitfalls and reinforces a security-conscious culture.
Size of the cluster based on data to be stored
Determining the size of a database cluster based on the amount of data to be stored involves
considerations related to data volume, growth expectations, performance requirements, and
redundancy goals. Here are key factors to consider:

• Data Volume:
Estimate the current size of the dataset you plan to store in the database cluster. Consider both the
size of individual records and the total number of records.

• Data Growth Rate:
Predict the rate at which your data is expected to grow over time. This projection helps in planning
for future storage needs.

• Performance Requirements:
Define the performance requirements of your application, including factors like data retrieval speed
and transaction throughput. Higher performance requirements may necessitate a larger cluster.
• Redundancy and Fault Tolerance:
Determine the level of redundancy and fault tolerance you want in your cluster. Additional nodes may
be required to provide high availability and fault tolerance in the event of hardware failures.

• Horizontal vs. Vertical Scaling:
Decide between horizontal scaling (adding more nodes) and vertical scaling (increasing resources on
existing nodes). Horizontal scaling is often preferred for distributed databases and clusters.

• Cluster Configuration:
Understand the configuration of your database cluster, including the number of nodes, their
specifications (CPU, RAM, storage), and the distribution of data across nodes.

• Data Distribution:
Consider how the data will be distributed across the cluster nodes. Some databases use sharding or
partitioning to distribute data effectively.
• Backup and Recovery Requirements:
Factor in the storage requirements for regular backups and ensure sufficient space is
available to store backup copies.
• Type of Data:
Different types of data (e.g., text, images, videos) have varying storage requirements.
Consider the nature of your data when estimating storage needs.
• Compression and Indexing:
Implement data compression and efficient indexing to optimize storage utilization, which can
affect the required cluster size.
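As an illustration of the Data Distribution factor above, the following sketch shows one common way to shard: hash each record key and take the result modulo the number of shards. The function name `shard_for_key`, the key format, and the choice of SHA-256 are assumptions made for this example. Production systems often prefer consistent hashing or range partitioning, because plain modulo hashing moves most keys when the node count changes.

```python
import hashlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a record key to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical keys spread over a hypothetical 4-node cluster.
keys = [f"customer:{i}" for i in range(1000)]
counts = [0] * 4
for k in keys:
    counts[shard_for_key(k, 4)] += 1

print(counts)  # number of keys that landed on each shard
```

Because the hash is deterministic, the same key always maps to the same shard, and the keys spread roughly evenly across the nodes.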
Mathematics behind determining size of the cluster
1. Estimating Data Size:
Individual Record Size (IRS): Calculate the average size of each record in your dataset.
Number of Records (NR): Determine the expected total number of records.
Estimated Data Size (EDS): Multiply IRS by NR to get the total estimated data size.
𝐸𝐷𝑆 = 𝐼𝑅𝑆 × 𝑁𝑅
2. Planning for Growth:
Data Growth Rate (DGR): Estimate the expected growth rate of your data over a specific period, expressed as a
fraction (for example, 0.25 for 25% growth).
Projected Data Growth (PDG): Calculate the projected data size at the end of that period by multiplying the
current data size (EDS) by (1 + DGR).
𝑃𝐷𝐺 = 𝐸𝐷𝑆 × (1 + 𝐷𝐺𝑅)
4. Determining Cluster Nodes:
Node Capacity (NC): Determine the capacity of an individual node in terms of storage, memory, and
processing power.
Number of Nodes (NN): Divide the projected data growth by the capacity of a single node.
𝑁𝑁 = 𝑃𝐷𝐺 / 𝑁𝐶
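The steps above can be tied together in a short worked example. This is an illustrative sketch only. The function names, the sample workload (2 KiB records, 50 million records, 25% growth, 40 GiB of storage per node), and the decision to round the node count up to a whole node are assumptions, not part of the course material. It considers storage capacity only and ignores memory, processing power, and replication.

```python
import math

GIB = 1024 ** 3  # bytes per gibibyte


def estimated_data_size(irs_bytes: int, nr: int) -> int:
    """EDS = IRS x NR, in bytes."""
    return irs_bytes * nr


def projected_data_growth(eds_bytes: float, dgr: float) -> float:
    """PDG = EDS x (1 + DGR); dgr is a fraction, e.g. 0.25 for 25%."""
    return eds_bytes * (1 + dgr)


def number_of_nodes(pdg_bytes: float, nc_bytes: float) -> int:
    """NN = PDG / NC, rounded up (assumption: no fractional nodes)."""
    if nc_bytes <= 0:
        raise ValueError("node capacity must be positive")
    return math.ceil(pdg_bytes / nc_bytes)


# Hypothetical workload, chosen only for illustration.
eds = estimated_data_size(irs_bytes=2 * 1024, nr=50_000_000)
pdg = projected_data_growth(eds, dgr=0.25)
nn = number_of_nodes(pdg, nc_bytes=40 * GIB)

print(f"EDS = {eds / GIB:.1f} GiB")
print(f"PDG = {pdg / GIB:.1f} GiB")
print(f"NN  = {nn}")
```

With these sample inputs the estimate comes to roughly 95 GiB today, about 119 GiB after growth, and three 40 GiB nodes.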
