Professional Documents
Culture Documents
LECTURE1 INTRODUCTION
Instructor ALLY .S. Nyamawe, nyamawe@udom.ac.tz +255 715 016 580
CS200 Database Systems
DATABASE
Databases and database systems have become an essential component of everyday life in modern society. In the course of a day, most of us encounter several activities that involve some interaction with a database. For example, if we go to the bank to deposit or withdraw funds, if we make a hotel or airline reservation, if we access a computerized library catalog to search for a bibliographic item, or if we buy some item-such as a book, toy, or computerfrom an Internet vendor through its Web page. CS200 Database Systems
Cont...
chances are that our activities will involve someone or some computer program accessing a database. Even purchasing items from a supermarket nowadays in many cases involves an automatic update of the database that keeps the inventory of supermarket items. These interactions are examples of what we may call traditional database applications, in which most of the information that is stored and accessed is either textual or numeric.
CS200 Database Systems
Cont...
In the past few years, advances in technology have been leading to exciting new applications of database systems. Multimedia databases can now store pictures, video clips, and sound messages. Geographic information systems (GIS) can store and analyze maps, weather data, and satellite images. Real-time and active database technology is used in controlling industrial and manufacturing processes. And database search techniques are being applied to the World Wide Web to improve the search for information that is needed by users browsing the CS200 Database Systems Internet.
Database Definition
Databases and database technology are having a major impact on the growing use of computers. It is fair to say that databases play a critical role in almost all areas where computers are used, including business, electronic commerce, engineering, medicine, law, education, and library science, etc. Generally, A database is a collection of related data.
CS200 Database Systems
Cont...
By data, we mean known facts that can be recorded and that have implicit meaning. For example, consider the names, telephone numbers, and addresses of the people you know. You may have recorded this data in an indexed address book, or you may have stored it on a hard drive, using a personal computer and software such as Microsoft Access, or Excel. This is a collection of related data with an implicit meaning and hence is a database.
CS200 Database Systems
Cont...
Also, A database can be defined as collection of information organized in such a way that it can be accessed easily. Examples Tracking customer orders Maintaining Employees Records. Maintaining Students Information
CS200 Database Systems
Database Properties
A database has the following implicit properties: A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse. Changes to the miniworld are reflected in the database. A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database. A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested. CS200 Database Systems
History of Databases
Manual systems File Processing Systems (FPS) Database Management systems (DBMS)
Manual Systems
Structure Information can be stored in dedicated room or in separate offices. Room or office will be furnished with shelves. Different shelves will hold Records for different subjects. Records will be stored in hard flat files, each file will carry one record. Each file will have a specific number to identify it. A person will use the file number to retrieve the CS200 specific file (record). Database Systems
Manual Systems
User
File keeper
Files Cabinet
CS200 Database Systems
Cont..
Characteristics of a DBMS
Computerized record-keeping system Contains facilities that allow the user to: Add, delete files, Insert, retrieve, update, and delete data Collection of databases; each can be used for separate purposes or combined Examples of DBMS are: Sql server, Ms Access, MySql, Oracle.
CS200 Database Systems
Components of a DBMS
users/program mers application program s/queries
stored database
Architecture of a DBMS
user queries schema modifications modifications
Cont...
Storage Manager: In a simple database system, the storage manager is nothing more than the file system of the underlying OS. In larger systems, for the purposes of efficiency, the DBMSs normally control storage on the disk directly. The storage manager consists of two basic components (1) the buffer manager, and (2) the file manager.
CS200 Database Systems
Cont...
File Manager: Keeps track of the location of files on the disks and obtains the block or blocks containing a file on request from the buffer manager. Disks are typically blocked into regions of contiguous space ranging between 212 and 214 bytes (between roughly 4000 to 16,000 bytes/block). Buffer Manager: Handles main memory. It obtains blocks of data from the disk, via the file manager, and chooses a page of main memory in which to store the block. The paging algorithm will determine how long a page will remain in main memory. However, the transaction manager can also force a page in main memory to be returned to disk.
CS200 Database Systems
Cont...
Query Manager: Turns a query or database manipulation, which may be expressed at a very high level (e.g., SQL) into a sequence of request for stored data such as specific tuples of a relation or parts of an index to a relation. Often the hardest part of query processing is query optimization, which involves the formulation of a good query execution strategy.
CS200 Database Systems
Cont...
Transaction Manager: There are certain guarantees that a DBMS must make when performing operations on a database. These guarantees are often referred to as the ACID properties. Atomicity: all of a transaction is executed or none of it is executed. Consistency: data cannot be in a inconsistent state. Isolation: concurrent transactions must be isolated from each other both in effect and in visibility. Durability: changes to the database caused by a transaction must not be lost even if the system fails immediately after the transaction completes.
CS200 Database Systems
Database Design
For the system to be acceptable to the endusers, the database design activity is crucial. A poorly designed database will generate error that may lead to bad decisions being made, which may have serious repercussions for the organization. On the other hand, a welldesigned database produces, in an efficient way, a system that provides the correct information for the decision-making process to succeed.
CS200 Database Systems
Database Administrator
A database administrator (DBA) controls and manages the database. Functions of a DBA Make decisions concerning the content of the database Plan storage structures and access strategies Provides support to users Defines security and integrity checks Interprets backup and recovery strategies.
CS200 Database Systems
Cont...
Application Developers Once the database has been implemented, the application programs that provide the required functionality for the end-users must be implemented. This is the responsibility of the application developers.
Cont...
End Users End users are the clients for the database and can be broadly categorized into two groups based upon how they utilize the system. Nave users are typically unaware of the DBMS. They access the database through specially written application programs which attempt to make the operations as simple as possible. They typically know nothing about the database or the DBMS. Sophisticated users are familiar with the structure of the database and the facilities offered by the DBMS. They will typically use a high-level query language like SQL to perform their required operations and may even write their CS200 Database Systems own application programs.
Cont...
Disadvantages: Reduction in speed of data access time Requires special knowledge Possible dependency of application programs to specific DBMS versions
conceptual level
internal level
db db
Data Independence
One of the major objectives of the three-level architecture is to provide data independence, which means that the upper levels are unaffected by changes to lower levels. There are two types of data independence: logical and physical. Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.
CS200 Database Systems
Data Independence(cont.)
user 1 View 1 View 1 user 2 View 2 View 2 user n View n View n external level logical data independence Conceptual Conceptual Schema Schema Internal Internal Schema Schema conceptual level physical data independence internal level
db db
CS200 Database Systems
Database Languages
A data sublanguage consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema and the DML is used to both read and update the database. These languages are called data sublanguages because they do not include constructs for all computing needs such as conditional or iterative statements, which are provided by the high-level programming languages. Most DBMSs have a facility for embedding the sublanguage in a high-level programming language such as COBOL, Pascal, C, C++, Java, or Visual Basic which is then called the host language. Most data sublanguages also provide a non-embedded or interactive version of the language to be input directly from a CS200 Database Systems terminal.
DMLs (cont.)
DMLs are distinguished by their underlying retrieval constructs. We can distinguish two basic types of DMLs: procedural and non-procedural. Procedural DMLs are languages in which the user informs the system what data is required and exactly how to retrieve that data. Non-procedural DMLs are languages in which the user informs the system only of what data is required and leaves the how to retrieve the data entirely up to the system. It is common for procedural DMLs to be embedded in highlevel programming languages. Procedural DMLs tend to be more focused on individual records while non-procedural DMLs tend to operate on sets of records. CS200 Database Systems
Types of Databases
Flat Databases
A single kind of record with a fixed number of fields. a way of organizing all information in a single table. suitable for extremely simple databases. inherit data redundancy
Relational Database
Data stored in a collection of columns and rows called a table, or a relation Tables may be electronically linked via a key field containing common data Easy to add, delete and modify the data and the table structures
CS200 Database Systems
Relational Database
Summary
In this lecture we defined a database as a collection of related data, where data means recorded facts. A typical database represents some aspect of the real world and is used for specific purposes by one or more groups of users. A DBMS is a generalized software package for implementing and maintaining a computerized database. The database and software together form a database system. We identified several characteristics that distinguish the database approach from traditional fileprocessing applications. We discussed about database history and it's types as well as it's users (a DBA, DA etc).
CS200 Database Systems
Challenge
Discuss the capabilities that should be provided by a DBMS. Discuss the main characteristics of the database and how it differs from traditional file systems. Why would you choose a database system instead of simply storing data in operating system files? When would it make sense not to use a database system?
CS200 Database Systems
Questions