Concepts of Database Management Systems
1.0 INTRODUCTION TO DBMS
1.1 OBJECTIVES
1.2 DATA PROCESSING – AN IMPORTANT ASPECT OF ANY BUSINESS
1.2.1 Data and Information
1.2.2 Data / Information Processing and Databases
1.2.3 Data – Types and Properties
1.2.3.1 Data Types
1.2.3.1.1 Data Representation
1.2.3.1.2 Data Size
1.2.4 Data Organization and Grouping
1.2.4.1 Character
1.2.4.2 Field
1.2.4.3 Record
1.2.4.4 File
1.2.4.5 Database
1.3 DATABASES AND THEIR MANAGEMENT
1.3.1 Objectives of DBMS
1.3.2 Components of DBMS
1.3.3 Types of Databases
1.3.3.1 Operational Database
1.3.3.2 Analytical Database
1.3.3.3 Distributed Database
1.3.3.4 Personal end user Database
1.3.3.5 Multimedia Database
1.3.3.6 Special Purpose Database
1.3.4 Database Models
1.4 Storage of information
1.4.1 Operational unit
1.4.2 Storage unit
1.4.3 External storage unit
1.5 Record and Record Organization
1.5.1 Definition and concepts
1.5.2 Record Organization
1.6 File and File Organization
1.6.1 Structure of sequential file
1.6.2 Processing of sequential file
1.7 Index Sequential file
1.7.1 Structure of Index sequential file
1.8 Direct file Organization
1.9 Summary
1.10 Questions
1.11 Further Readings
1.0 INTRODUCTION TO DBMS
Database management is an important aspect of data processing. It involves several data models, which have evolved into different DBMS software packages. Using these packages effectively in data processing applications demands some knowledge of the discipline and its procedures. We need to understand the relevance and scope of databases in the data processing area. We do this by first understanding the properties and characteristics of data and the nature of data organization. We then look at the objectives of database technology and its characteristics by studying the different packages available in the market.
A data structure can be defined as a specification of data. Data structures such as the array, stack, queue, tree and graph are used to organize data in main memory. Several strategies are used to organize data in secondary memory, and in this unit we will look at them. We will also learn about data representation for files on external storage devices, so that the required operations (e.g. retrieval, update) can be carried out efficiently.
1.1 OBJECTIVES
In this unit you will learn
1. The concept, relevance and scope of databases in data processing.
2. The types and characteristics of data.
3. Data structures from a storage and retrieval perspective.
4. The characteristics and objectives of databases.
5. An overview of DBMS products in the market.
6. Storage in operational units (i.e. registers).
7. The internal storage unit (i.e. main memory).
8. The external storage unit (i.e. secondary memory).
9. Record and record organization.
10. File and file organization.
11. Different types of file organization.
1.2 DATA PROCESSING – AN IMPORTANT ASPECT OF ANY BUSINESS:
Business organizations, big and small, generate a lot of data through the activities they perform. Even individuals need to handle a lot of data in their day-to-day life. A simple example is the address book that we all maintain. In this book we keep information such as the name, address and phone numbers of all the people with whom we interact. Without this book, it would be very difficult to carry on our day-to-day activity of contacting and communicating with our friends, relatives and business associates.
As the size of a business organization increases, the amount of data it generates increases rapidly, and so does the need to store and use that data. Modern businesses have recognized this need and stress the importance of data as a vital resource for conducting business profitably. Two terms, data and information, are used in this connection. Let us understand their scope and difference.

1.2.1 Data and Information:
Data are raw facts or observations, typically about physical phenomena or business transactions. More specifically, data are objective measurements of the attributes (or characteristics) of entities (such as people, places, things and events).
Example:
1. The sale of an automobile may generate a lot of data (such as the type of vehicle, model, price, date of purchase, buyer’s name and address, seller’s name and address).
2. A meteorological satellite may continuously collect and send data about atmospheric pressure, wind velocity and direction, cloud density, humidity, temperature etc.
The observed data is usually represented by symbols such as numbers, words and codes (composed of a mixture of numerical, alphabetical and other characters). It can also take other forms such as voice, images, pictures and drawings.
If the observed or collected data is converted into a useful and meaningful form, it becomes information. Data is usually subjected to a value-adding process called data processing or information processing, in which
1. its form is aggregated, manipulated and organized,
2. its content is analyzed and evaluated, and
3. it is presented in a context meaningful to a human user.
Thus information is processed data, placed in a context that gives it value for specific end users, as shown below.
[Figure: raw data items (North, Rs 15,000; Ramesh, Rs 20,000; South, Rs 50,000; Saxena, Rs 35,000; Narayan, Rs 4,000) processed into information (Sales: North Rs 95,000, with Ramesh Rs 60,000 and Saxena Rs 35,000; South Rs 50,000, with Narayan Rs 4,000)]
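The conversion of data into information can be sketched in a few lines of Python. This is only an illustration: the individual sale amounts in RAW_SALES are invented (chosen so that the totals resemble those in the figure), and the function names summarize and report are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw data: (region, salesperson, amount in Rs).
# The individual amounts are invented for illustration only.
RAW_SALES = [
    ("North", "Ramesh", 15000),
    ("South", "Narayan", 20000),
    ("North", "Saxena", 35000),
    ("North", "Ramesh", 45000),
    ("South", "Narayan", 30000),
]


def summarize(sales):
    """Aggregate raw sale records into totals per region and salesperson."""
    totals = defaultdict(lambda: defaultdict(int))
    for region, person, amount in sales:
        totals[region][person] += amount
    return {region: dict(people) for region, people in totals.items()}


def report(summary):
    """Present the aggregated figures in a form meaningful to a reader."""
    lines = []
    for region in sorted(summary):
        people = summary[region]
        lines.append(f"{region}: Rs {sum(people.values()):,}")
        for person in sorted(people):
            lines.append(f"  {person}: Rs {people[person]:,}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(summarize(RAW_SALES)))
```

Here summarize corresponds to steps 1 and 2 (aggregating and evaluating the content) and report to step 3 (presenting it in a meaningful context).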
1.2.2 Data / Information Processing and Databases:
There are many information systems, depending on the needs of different types of businesses. There are also different types of information systems, such as transaction processing systems, decision support systems and expert systems, which are used by different levels of management in a business organization. In spite of these variations, all information systems have some things in common.
1. They all use some kind of computerized technique to store the data and information generated in the system.
2. They all access the stored information in different ways for further processing or presentation.
Thus data storage and retrieval is one of the central activities in information processing. Such a collection and organization of information is called a “data bank”. In the early days of business, data banks existed in the minds of key personnel. As volume and complexity increased, tools such as books, records, manuals and drawings were devised as data banks, and manual procedures and skills evolved to retrieve information from them when needed.
However, these techniques were not reliable or fast enough when the information involved was large and complex, so business decisions could not be accurate and timely. To correct this lacuna, information systems were computerized. The speed and accuracy of computers brought a tremendous improvement in the reliability and timeliness of the information generated. This process, however, required the development of techniques and tools to handle data banks on computers, namely tools to store and retrieve information. The development of such techniques and tools resulted in what are known as DBMS packages.
Integrated data banks stored in computer systems are called “databases”. The computer software package (a set of tools and utilities) that facilitates the creation, use and management of databases is called a DBMS (Database Management System). A DBMS provides the capacity to store, retrieve, edit and sort data, and to perform computations, including statistics, on the data it extracts from storage.
The tasks handled by DBMS packages can be classified as:
a. Database Development - Define and organize the content, relationships and structure of the data needed to build a database.
b. Database Interrogation - Access the data in a database to display information in various formats. Users can selectively retrieve and display information and produce forms, reports and other documents.
c. Database Maintenance - Add, delete, update and correct the data in a database.
d. Application Development - Develop prototypes of queries, presentation forms and reports for a proposed business application.
We will look at these tasks in detail later. First, let us begin a detailed study of data.

1.2.3 Data – Types and Properties:
All data items have certain fundamental properties, and it is important to know them before creating data banks. The first and foremost property of data is its form. Every data element has a form. Data items are classified into different data types based on their form. The form decides the way the item is stored in the computer.
1.2.3.1 Data Types:
Data can be classified as textual, picture or voice data based on its form. The last two types, picture and voice, are special forms of data and are used less frequently. Textual data is by far the largest in volume and the most used, so let us focus on it first.
Textual data can be numeric or alphanumeric (a combination of numeric and alphabetic characters). Numeric data consists of numbers.
Example: The number of students in a class is 50; the marks obtained in a subject are 78; the price of a given item is Rs. 48.56.
As the examples show, pure numeric data items can be classified further into two types. Whole numbers (such as the number of students in a class or the number of vehicles in a city) are called integers. Numeric data that includes a fractional part (such as the price of an item, 48.56, or today's maximum temperature, 28.32) are called real numbers. The difference between integers and real numbers matters to us because they are represented and manipulated differently in a computer.
The next data type is alphabetic or alphanumeric. This type of data is made up of alphabetic and numeric characters (e.g. the name of a person is HARI; the registration number of a vehicle is KA – 09 F-1234). Such data may contain digits along with letters, but the digits are not used as numeric data in any calculation. This data type is called a string of alphanumeric characters.
How are these data represented inside a computer?

1.2.3.1.1 Data Representation:
All data in a computer must be represented using only two symbols, 0 and 1. This system is called binary representation. To represent all data types using only 0 and 1, some kind of coding is needed. Integers are represented directly as binary numbers. Real numbers are represented using a technique called floating point representation. Strings are represented through a character code such as ASCII (American Standard Code for Information Interchange). Each character is stored in 8 bits (binary digits); standard ASCII itself defines 7-bit codes, each held in an 8-bit byte.
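The three representations can be inspected with a short Python sketch. It assumes a 32-bit IEEE 754 format for the floating point case, which the text does not specify, and the function names are illustrative only.

```python
import struct


def int_to_bits(n, width=8):
    """Binary form of a non-negative integer, zero-padded to `width` bits."""
    return format(n, f"0{width}b")


def float_to_bits(x):
    """Bit pattern of x stored as a 32-bit IEEE 754 number (an assumption)."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", x))
    return format(as_int, "032b")


def text_to_bits(text):
    """8-bit code of each character in an ASCII string."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


if __name__ == "__main__":
    print(int_to_bits(50))         # number of students in a class
    print(float_to_bits(48.56))    # price of an item
    print(text_to_bits("HARI"))    # a name string, one code per character
```

Note that the integer 50 and the two-character string "50" produce different bit patterns, which is why a DBMS has to know the type of each data item.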
Example: In ASCII, the letter A is 01000001 and the letter B is 01000010. (You will find the details of data representation in other modules.) Pictorial and voice data are also coded into a large number of 0’s and 1’s.

1.2.3.2 Data Size:
All data items have a size. Looking at the previous examples, the number of students in a class needs 2 digits of space, and the price of an item may need 4 digits (2 before the decimal point and 2 after; the decimal point itself need not be stored). A name string may need a maximum of 30 character positions; stored inside a computer, it then needs 30 x 8 = 240 bits. Picture data may need several thousand bit positions. Size is of special importance to us because we need to provide adequate space to store these items in the system. Further, DBMS packages should be able to distinguish these data types and provide the functions needed to manipulate them.

1.2.3.3 Relationship:
Even though data items are individual entities, they never occur in isolation in the real world. They are always associated with other data items. E.g.: The data item price is related to the vehicle in question, the date of transaction and the seller. There are 3 different types of data relationships. Let us understand each of them.
The simplest is the 1 : 1 relationship. For each value of a data item there is one and only one corresponding value of the other item.
E.g.: Student ID and student name.
E.g.: Vehicle number and vehicle.
Normally all such data items are grouped and kept together as a record.
The second type of relationship is one to many (1 : M). Here, for every value of one data item there are several values of the other data item. In the reverse direction, however, each value of the other data item is related to a unique value of this data item.
E.g.:
1. A book has several chapters, but each chapter belongs to one and only one book.
2. A person can own several vehicles; each vehicle has only one owner.
One to many relationships can be represented in computers using pointers and arrays. (Details later.)
The third type of relationship is many to many (N : M). Most relationships in the real world are of this type.
E.g.:
1. A student has several teachers; a teacher may have several students.
2. A book can have several authors; an author may have written several books.
This type of relationship is difficult to represent and handle in computers. Hence, as far as possible, we try to reduce it to two one to many relations (1 : M and N : 1) and eliminate the one that is irrelevant to the user.
The database must maintain all the data and their relationships and allow the user to access data based on these relations. E.g.: Get me all vehicles owned by a person. Get me the subjects taught by a teacher.

1.2.4 Data Organization and Grouping:
Data, as already mentioned, occurs individually in the real world. It is grouped and organized so that it can be processed to generate information. The grouping of related data items from the user’s view is called logical grouping. The grouping of data items from the point of view of their storage inside the computer is called physical grouping. Just as writing is organized into letters, words, sentences, paragraphs and chapters, data can be organized into characters, fields, records, files and databases.

1.2.4.1 Character:
A character is the most basic logical data element. It consists of a single alphabetic, numeric or other symbol.
E.g.: The grade obtained in a subject could be A, B, C, D or E. The sex of a person could be M or F. The code for the subject taught during an hour.
1.2.4.2 Field:
A field is the next higher level of data. A field consists of a grouping of characters.
E.g.:
1. A person’s name field is a grouping of alphabetic characters.
2. A sales amount field is a grouping of numeric characters.
3. The teacher teaching the subject for a class.
A field represents an attribute of some entity (object, person, place or event). E.g.: An employee’s salary is an attribute, and hence a typical data field, associated with the entity employee (in a 1 : 1 relation).

1.2.4.3 Record:
Related data fields are grouped to form a RECORD. A record is thus a collection of attributes that describe an entity.
E.g.:
1. An employee record could consist of attributes such as the employee’s ID, name and salary.
2. The set of subjects taught for a class during each hour.

1.2.4.4 File:
A group of related records is a data FILE.
E.g.:
1. A group of employee records, with one record for each employee, could be an employee file.
2. The timetable for a class for a week, showing the subjects taught each hour on each day of the week.
Files are frequently classified by the application for which they are primarily used, such as a payroll file or an inventory file.

1.2.4.5 Database:
A DATABASE is an integrated collection of logically related records or objects. It consolidates records stored in various files into a common pool of data records that provides data for several users.
E.g.:
1. The timetable for an entire school, showing the details of classes, subjects, rooms, teachers etc.
2. A personnel database, which consolidates data files such as payroll files, personnel action files and employee skill files.
[Figure: a database containing an employee file (Employee Rec # 1, Employee Rec # 2; fields Name, Id, Salary) and a material file (Matl Rec # 1, Matl Rec # 2; fields Matl Id, Desc, Qty)]
Fig.1.1 Database, Files, Records and Fields
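The hierarchy of characters, fields, records, files and databases can be mirrored in a small Python sketch. The class and field names loosely follow the labels in Fig. 1.1; the sample values and the helper build_sample are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    """A record: a group of fields describing one employee."""
    emp_id: int      # field (numeric)
    name: str        # field (a group of characters)
    salary: float    # field (numeric)


@dataclass
class Material:
    """A record describing one material item."""
    matl_id: int
    desc: str
    qty: int


@dataclass
class Database:
    """A database: related files pooled together for several users."""
    employees: list = field(default_factory=list)   # a file of Employee records
    materials: list = field(default_factory=list)   # a file of Material records


def build_sample():
    """Populate a tiny database with hypothetical sample records."""
    db = Database()
    db.employees.append(Employee(1, "HARI", 20000.0))
    db.employees.append(Employee(2, "RAMESH", 25000.0))
    db.materials.append(Material(101, "Bolt", 500))
    return db


if __name__ == "__main__":
    db = build_sample()
    first = db.employees[0]       # one record from the employee file
    print(first.name)             # one field of that record
    print(first.name[0])          # one character of that field
```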
1.3 DATABASES AND THEIR MANAGEMENT:
Databases, as we have already seen, represent the techniques of storing, accessing and managing data. A DBMS is the software that provides these techniques and acts as the interface between the database and its users. It is a set of computer programs that controls the definition, construction, maintenance and use of databases, a database being a central repository of all the data of an organization and its end users.
[Figure: three application programmes accessing the database through the DBMS]
Fig 1.2 A schematic of Database Management System
Creating (defining) a database involves specifying the data types, structures and relationship constraints of the data to be stored in it. Construction of a database is the process of populating it with data on the computer storage medium. Maintenance of a database includes functions such as accessing the data and updating it to reflect changes in the real world.
E.g.: Consider a college environment in which we need to maintain data about class scheduling, such as
a) Courses and sections
b) Subjects to be taught for each course
c) Teachers teaching the subjects
d) Rooms in which classes are held
e) Timing for teaching the subject.
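As a rough sketch of how part of this scheduling data might be held, the Python fragment below stores the association between teachers and subjects as a list of pairs. A teacher can teach several subjects and a subject can be taught by several teachers, so the association is many to many; keeping it as pairs lets both one to many views be derived, in the spirit of the reduction described in section 1.2.3. The teacher and subject names are invented, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: each pair says "this teacher teaches this subject".
TEACHES = [
    ("Rao", "Physics"),
    ("Rao", "Mathematics"),
    ("Iyer", "Mathematics"),
    ("Iyer", "Chemistry"),
]


def subjects_by_teacher(pairs):
    """One to many view: each teacher mapped to the subjects taught."""
    view = defaultdict(list)
    for teacher, subject in pairs:
        view[teacher].append(subject)
    return dict(view)


def teachers_by_subject(pairs):
    """The reverse one to many view: each subject mapped to its teachers."""
    view = defaultdict(list)
    for teacher, subject in pairs:
        view[subject].append(teacher)
    return dict(view)


if __name__ == "__main__":
    # "Get me the subjects taught by a teacher."
    print(subjects_by_teacher(TEACHES)["Rao"])
    # "Get me the teachers of a subject."
    print(teachers_by_subject(TEACHES)["Mathematics"])
```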
The basic entities in this example are subjects, courses, teachers, rooms, students etc. There will be associations or relationships linking these entities.
E.g.: Subject and Teacher have an N : M association. A teacher may teach several subjects, and several teachers may teach a subject.

1.3.1 Objectives of DBMS:
A DBMS is designed to serve the management of a business organization. Its objectives can be listed as follows.
1. Provide for mass storage of relevant data.
2. Make access to the data easier for the user.
3. Provide a prompt response to the user’s request for data.
4. Allow for the modification of data in a consistent manner.
5. Eliminate or reduce redundant data.
6. Allow multiple users to be active at a time.
7. Protect data from physical hardware failure and unauthorized access.

1.3.2 Components of DBMS:
DBMS packages on personal computers allow end users to develop databases for their personal needs. These are called single user databases. Large organizations with many users, however, usually place control of enterprise database development in the hands of DATABASE ADMINISTRATORS (DBAs) and other specialists. This improves the integrity and security of organizational databases.
Database developers use a DATA DEFINITION LANGUAGE (DDL) to specify data structures and relationships, and to modify these structures if needed. The detailed information about these structures is called METADATA. It is stored in the DATA DICTIONARY component of the DBMS, which is maintained by the DBA.
Users are allowed to insert, modify, delete and retrieve data from the database according to their needs. They use a DATA MANIPULATION LANGUAGE (DML) for this purpose.
Further, the DBA needs to guard the database against media failures, accidental erasure etc. For this purpose, the DBA keeps copies of the database and of the changes made to it, for later recovery in case of failure. DATABASE UTILITIES are used to handle these backup and recovery functions.

1.3.3 Types of Databases:
Developments in information technology have resulted in several major types of databases.

1.3.3.1 Operational Databases:
These databases store the detailed data needed to support an entire organization. They are also called subject area databases (SADB), transaction databases and production databases. They carry up-to-date information on business activities, and are most frequently used by business supervisors in charge of day-to-day operations.

1.3.3.2 Analytical Databases:
These databases contain information extracted from operational databases. Managers use them to study the trends and patterns emerging in the business in order to make strategic decisions and set policy.
They are also known as data warehouses, information databases and decision support databases. They are generally used in query mode rather than update mode. Techniques such as Online Analytical Processing (OLAP) and data mining are used on these databases to generate meaningful information for business analysis, market research etc.

1.3.3.3 Distributed Databases:
Many contemporary applications are geographically distributed. The advent of networking technology has made it possible to distribute a database across several computers connected in a network. This improves local access to data, and allows remote update, without increasing the load on networks. Hence many organizations distribute copies or parts of databases to computer systems at different sites, linked to each other through a network. Such databases spread over a network of computers are known as distributed databases. Ensuring that all of the data in an organization’s distributed databases is consistently and concurrently updated is a major challenge of distributed database management.

1.3.3.4 Personal End User Databases:
These databases consist of a variety of data files created by end users on their PCs for personal use. They are generally single user databases with less stress on backup and recovery. The data in these databases may be generated with word processors, spreadsheets and other PC software packages.

1.3.3.5 Multimedia Databases:
These databases include non-conventional data such as pictures and voice tracks along with conventional alphanumeric data. They tend to be huge in size, and access is through specialized access language constructs. The data accessed then needs to be interpreted and displayed by additional front-end software such as browsers and media players. From the database management viewpoint, a set of interconnected multimedia data needs to be handled as specialized structures rather than as simple records.

1.3.3.6 Special Purpose Databases:
These databases are developed and used for certain special purpose applications. Spatial databases, temporal databases, biological databases etc. belong to this category. The data stored in these applications is of a different kind and needs to be interpreted according to the ground rules of those applications. Hence special techniques are used for the storage and access of data in these databases.
1.3.4 Database Models:
Databases are distinguished by the conceptual model of the data and the underlying relationships among the data. All models try to represent data and their relationships in a simple, elegant way.
An early data model, widely used in the 70’s, was the HIERARCHICAL model, which captures the intuitive hierarchy of data elements. The user navigates through the data structures using “tree-like” hierarchies. IMS, the early generation database from IBM, is based on this model. Hierarchical models cannot represent many to many relationships in an elegant fashion. Such relationships resulted in cumbersome structures with a lot of duplicated data and slow access.
To get over these limitations, the CODASYL committee proposed a NETWORK model, which was in use in the 70’s and 80’s. IDMS from Cullinet and DMS 1100 from Unisys Corporation are typical representatives of this generation of databases. While the network model provided much more abstraction power and very good performance for large volumes of data, it lacked elegance. It required a high level of skill to use these databases, and it was difficult to alter the structures dynamically.
E. F. Codd of IBM proposed the elegant and flexible RELATIONAL model. Its elegance, simplicity and solid theoretical foundation made it the favourite of database developers and users. Today it is the most popular database model, available on a range of machines from PCs to mainframes. DB2 of IBM, ORACLE, INFORMIX, ACCESS, LOTUS etc. are all based on this model. DBMSs built using this model use SQL (Structured Query Language) as the means to create and manipulate data. SQL is an elegant, simple yet powerful interface to relational databases.
Present day RDBMSs provide several other tools and utilities to ease application development. The most common utilities are:
A screen designer to generate user-friendly fill-in-form interfaces to access and manipulate data. E.g.: ORACLE FORMS
A report generator to access data and present it in a printed format suitable for the end user. E.g.: ORACLE REPORT GENERATOR
Utilities to load and extract bulk data, provided to speed up data loading and extraction. E.g.: Import and Export features of ORACLE.
DBA utilities to manage security and limit access to data.
Current generation DBMS packages provide most of the above utilities, along with some more, to manage databases effectively. They in fact create a total environment in which the user can comfortably handle all information processing needs.
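The roles of DDL, DML and the data dictionary described in sections 1.3.2 and 1.3.4 can be tried out with a minimal SQL sketch. It uses SQLite through Python's built-in sqlite3 module purely as a convenient stand-in for a relational DBMS; the text does not mention SQLite, and the employee table, its columns and its sample rows are invented for illustration.

```python
import sqlite3


def demo():
    """Run a few DDL and DML statements against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL: define the structure of the data.
    cur.execute(
        "CREATE TABLE employee ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
    )

    # DML: insert, modify and delete data.
    cur.executemany(
        "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
        [(1, "HARI", 20000.0), (2, "RAMESH", 25000.0)],
    )
    cur.execute("UPDATE employee SET salary = salary + 1000 WHERE id = ?", (1,))
    cur.execute("DELETE FROM employee WHERE id = ?", (2,))
    conn.commit()

    # DML: retrieve data.
    rows = cur.execute(
        "SELECT id, name, salary FROM employee ORDER BY id"
    ).fetchall()

    # Metadata: SQLite keeps table definitions in its sqlite_master catalogue,
    # which plays the role of a simple data dictionary here.
    tables = [
        row[0]
        for row in cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    conn.close()
    return rows, tables


if __name__ == "__main__":
    print(demo())
```

Other relational products accept broadly similar statements, though data types and catalogue tables differ from one DBMS to another.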
CHECK YOUR PROGRESS
1.0 Say True or False (T / F)
1. Information is the processed form of data
2. A collection of data is called database
3. Data representation affects its access
4. M:N Relationship is difficult to handle
5. Relational Model is the most popular database model
2.0 Choose the appropriate answer.
1. Data Processing involves
a) Storing b) Retrieval c) Formatting d) All of these
2. DBMS software handles data
a) Storage and Retrieval b) Processing and formatting c) Printing and Presenting d) Sorting
3. Alphanumeric data type can store
a) Numbers b) Strings c) Pictures d) Voice data
4. Student : Subject is a good example of
a) 1:1 Relationship b) 1:M Relationship c) M:N Relationship d) M:1 Relationship
5. One of the objectives of DBMS
a) Provide mass storage for data b) Present data in a report c) Provide language to write program logic d) Copy data from one disk to other
3.0
3.1 Three functions of Data Processing are
------------------------------------------------------------
3.2 List 4 tasks handled by DBMS
------------------------------------------------------------
3.3 List 3 important properties of data
------------------------------------------------------------
3.4 Different data structures in the order of complexity are
------------------------------------------------------------
3.5 Six objectives of DBMS are
------------------------------------------------------------
1.0
1.1 True
1.2 False
1.3 False
1.4 True
1.5 True
2.0
2.1 (d), 2.2 (a), 2.3 (b), 2.4 (c), 2.5 (a)
3.0
3.1
1. Aggregate, manipulate and organize the data,
2. Analyze and evaluate its contents, and
3. Present it in a context meaningful to the user.
3.2
a) Database development
b) Database interrogation
c) Database maintenance
d) Application development
3.3 Type, Size, Relationship
3.4 Character, Field, Record, File, Database
3.5
a. Provide for mass storage of relevant data.
b. Make access to the data easier for the user.
c. Provide a prompt response to the user’s request for data.
d. Allow for the modification of data in a consistent manner.
e. Eliminate or reduce redundant data.
f. Allow multiple users to be active at a time.
g. Protect data from physical hardware failure and unauthorized access.
Storage, Record and File Organization
1.4 The Storage of information
In a digital computer there are two types of memory units, namely operational units and storage units. The name commonly associated with operational units is register. A register is used for the temporary storage and manipulation of information.

1.4.1 Operational Unit
Some of the most important registers are contained in the central processing unit (CPU) of the computer. The CPU contains registers that hold the arguments (i.e. operands) of arithmetic computations. Besides storing operands and results of arithmetic operations, registers are also used to temporarily store program instructions and control information about which instruction is to be executed next. Because of their highly specialized nature, registers have a great deal of combinational logic (i.e. circuitry) associated with them. This makes them expensive relative to the storage-type memory units in a computer. Consequently, registers are used only to store information temporarily.

1.4.2 Storage Unit
The storage-type memory unit is designed to store information that is more permanent in nature. For example, a particular storage unit or set of storage units is associated with a particular variable in a program, a variable being a named quantity whose value may change during execution. However, before an arithmetic computation involving a variable is performed, the value of the variable stored in the memory unit must be transferred to a register. The transfer must take place because memory units do not have the necessary logic associated with them (or between them) to execute arithmetic operations. If the result of a computation is to be assigned to a variable, the result must be transferred from an arithmetic register back to the memory unit associated with that variable.
When a program is executed, its instructions and data generally reside in storage units. The entire set of storage units in the main frame, or main part, of the computer is often called main memory. In some instances programs can also reside in storage units that do not belong to main memory. Examples of such devices (often called secondary storage devices) are the magnetic disk and the magnetic drum. The data in the main memory, or internal memory, of a computer can be accessed very quickly; a typical access time is less than 1 microsecond (10^-6 sec). Main memory provides for the immediate storage requirements of the central processor during the execution of a program. The storage capacity of main memory is limited by two factors: the cost of memory and the technical problems of developing large capacity main memory. The storage requirements of programs and the data on which they operate exceed the capacity of main memory in virtually all computer systems. Therefore it is necessary to extend the storage capabilities of a computer by using devices external to main memory.

1.4.3 External Storage Unit
An external storage device may be loosely defined as a device, other than main memory, on which information or data can be stored and from which it can be retrieved for processing at some subsequent point in time. The storage and retrieval operations are referred to as writing and reading, respectively. External storage devices have a larger capacity and are less expensive per bit of information stored than main memory. The time required to access the information, however, is much greater with these devices. The primary uses of external storage devices include
1. Backup of programs during execution.
2. Storage of programs and subprograms for future use.
3. Storage of information in files.
The most common external storage devices, in order of their initial development and use, are magnetic tape, magnetic drum and magnetic disk.
1.5 Record and Record Organization
A record is a collection of fields. In a file, the records are organized in a logical sequence. These records are mapped onto disk blocks. Files are provided as a basic construct in operating systems. Although blocks are of a fixed size determined by the physical properties of the disk and by the operating system, record sizes vary.
1.5.1 Definition and concepts
This section gives a comprehensive and consistent overview of the hierarchy of information structures associated with file processing. A record (sometimes called a group or segment) is a collection of information items about a particular entity. For example, a record may consist of information about a passenger on an airplane flight, an article sold at a retail distribution store, or a student. An item (sometimes called a field) of a record is a unit of meaningful information about an entity. The different items of a passenger record may be the passenger's name, address, seat number, date and time. Generally an item of a record is an integer, real or character-string data element. However, items may themselves be composed of aggregates of items, such as an array of items or a subcollection of non-homogeneous items, that is, items of mixed types such as integer, real and character. The notion of a record in its most general interpretation can be loosely equated to a structure. For example, a possible structure for a passenger record is declared as follows.
Record Name : Passenger
  Name
    Initials char(4),
    Surname char(15),
  Address char(20),
  Menu char(2)
A collection of records involving a set of entities with certain aspects in common and organized for some particular purpose is called a file. For example, the collection of all passengers on a particular flight constitutes a file. A record item that uniquely identifies a record in a file is called a key. In the passenger file, individual passenger records can be uniquely identified by the passenger's name, assuming duplicate names do not occur for a particular flight. The seat number item can also be used as a key, if desired, since seat numbers are uniquely assigned for a given flight. It is common practice to order the records in a file according to a key.
Therefore, if the passenger name is selected as the key item, the record for Adams appears before the record for Brown, which appears before the record for Camp, in alphabetical order by surname. Some files are ordered on a particular item, termed the sequence item, which may not be unique for each record. For example, in a file of monthly sales for a particular company, several records containing sales information may appear for one customer. The file can be ordered by customer account number, with more than one occurrence of a customer sales record for a given account number. Thus we have observed a hierarchy of information structures in which items are composed to form records and records are composed to form a file. If a set of files is used by the application programs of some particular enterprise or application area, and if these files exhibit certain associations or relationships between their records, then such a collection of files is often referred to as a database or data bank. Figure 1.3 shows the information structure hierarchy as it applies to a file processing application.
Figure 1.3 Information structure hierarchy for file processing.
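The item, record and file hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the class name, the sample passengers and the seat item added to the record are assumptions made for the example, not part of the declaration given in the text.

```python
from dataclasses import dataclass

@dataclass
class Passenger:
    initials: str   # char(4)
    surname: str    # char(15)
    address: str    # char(20)
    menu: str       # char(2)
    seat: str       # hypothetical seat-number item, unique on a given flight

# A file: a collection of records about entities with something in common
# (here, all passengers on one flight). The sample data is invented.
flight_file = [
    Passenger("J.", "Camp", "12 Hill Road", "V", "14C"),
    Passenger("A.", "Adams", "3 Lake View", "N", "02A"),
    Passenger("R.", "Brown", "77 Main Street", "V", "09F"),
]

# Ordering the file according to a key item (the surname).
by_surname = sorted(flight_file, key=lambda record: record.surname)
surnames = [record.surname for record in by_surname]

# The seat number is unique per flight, so it can serve as an alternative key.
by_seat = {record.seat: record for record in flight_file}
```

Sorting on the surname puts Adams before Brown before Camp, as in the text, while the dictionary keyed on the seat number shows a second item acting as a key.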
Let us examine some of the factors that affect the organization of a file. The prime factor that determines the organization of a file is the nature of the operations to be performed on it, as dictated by the application. The operations normally performed are retrieval, addition, deletion and update. A particular operation involving a record or set of records is called a transaction. For example, 'Delete Rama from the student list for the 1st Year' is a transaction, as is 'Add Watson to the student list for the 1st Year'.
1.5.2 Record Organization
In a relational database, records of distinct relations are generally of different sizes. One approach to mapping the database to files is to use several files and store records of only one fixed length in any given file. An alternative is to structure the files so that they can accommodate records of multiple lengths. Files of fixed-length records are easier to implement than files of variable-length records.
Fixed Length Record
If a record item has a fixed-length value and its domain is too large for an efficient encoding, a primitive data-structure format (i.e., integer, real or char) should be selected for the representation of the item. For example, it is unreasonable to bit-encode an item representing the net sales for the month. We can declare a record containing such an item in the programming language being used.
Record : Monthly_Report
  Month Char(9),
  Net_sales Fixed(5,2)
The net sales item can range in value from -999.99 to 999.99. It is unrealistic for the programmer to bit-encode such a wide range of item values when the compiler provides an efficient binary encoding of the value in a fixed decimal format. The item Month, on the other hand, can be significantly reduced in size if we use a fixed-length binary code of '0000'B for January, '0001'B for February, ..., '1011'B for December and declare the item to be of type BIT(4). Because both of these items may be considered fixed-length items, they can technically be called precoordinated. That is, a fixed-length item can only have a finite set of values, which can be enumerated a priori.
Variable Length Record
Many applications arise in which the value associated with a record item may be a list of entities. For example, 'the degrees held' and 'the programming languages used at a computer installation' are items that can assume multiple entities. In these instances, the item value may be "B.Sc., M.Sc., Ph.D." or "COBOL, C, Pascal, Fortran", respectively.
The most popular method of handling repeating fields is to create an item that can accommodate up to some maximum number of repetitions. If we set this maximum number to three, then the example items can accommodate such information as the three most recent degrees obtained and the three most often used programming languages.
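The two ideas above, a compact fixed-length encoding and a repeating field capped at a maximum number of repetitions, might be sketched as follows. The byte layout is an assumption chosen for the example (one byte for the month code, standing in for BIT(4), and a 32-bit integer holding net sales in hundredths); the function names are hypothetical, and the degree list is assumed to be ordered oldest first.

```python
import struct
from decimal import Decimal

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Assumed layout: 1 byte for the month code (only the low 4 bits are used)
# followed by a signed 32-bit integer holding net sales in hundredths.
RECORD = struct.Struct("<Bi")

def pack_report(month, net_sales):
    """Encode one Monthly_Report record as a fixed-length byte string."""
    code = MONTHS.index(month)              # January -> 0b0000, December -> 0b1011
    hundredths = int(Decimal(net_sales) * 100)
    if abs(hundredths) > 99999:             # Fixed(5,2) range: -999.99 .. 999.99
        raise ValueError("net sales out of range for Fixed(5,2)")
    return RECORD.pack(code, hundredths)

def unpack_report(data):
    """Decode a record produced by pack_report."""
    code, hundredths = RECORD.unpack(data)
    return MONTHS[code], Decimal(hundredths) / 100

MAX_DEGREES = 3  # maximum number of repetitions of the repeating field

def degree_slots(degrees):
    """Keep the three most recent degrees and pad unused slots with ''."""
    recent = degrees[-MAX_DEGREES:]
    return recent + [""] * (MAX_DEGREES - len(recent))
```

With this layout every record occupies `RECORD.size` bytes regardless of the month name, which is the property that makes files of fixed-length records easy to implement.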
1.6 Files and file Organization
The technique used to represent and store the records of a file is called file organization. The fundamental file organization techniques are sequential, index sequential and direct. The presentation of each of these organizations begins with a description of its file structure. File organization techniques differ in two basic ways. First, the organization determines the file's record sequencing, which is the physical ordering of the records in storage. Second, the file organization determines the set of operations necessary to find a particular record. Individual records are typically identified by having particular values in search key fields. A search key field may or may not have duplicate values in the file, and it can be a group or an elementary item. Some file organization techniques provide rapid access on a variety of search keys; other techniques support direct access only on the value of a single key. The most appropriate organization for a particular file is determined by the operational characteristics of the storage medium used and the nature of the operations to be performed on the data. The most important characteristic of a storage device, which influences the selection of a device once the appropriate file organization technique has been determined, is whether the device allows direct access to a particular record occurrence without accessing all physically prior record occurrences stored on the device, or allows only sequential access to record occurrences. Magnetic disks are examples of direct access storage devices; magnetic tapes are examples of sequential storage devices.
1.6.1 Structure of Sequential Files
In a sequential file, records are stored one after the other on a storage device. Because sequential allocation is conceptually simple, yet flexible enough to cope with many of the problems associated with handling large volumes of data, the sequential file has been the most popular basic file structure used in the data-processing industry. All types of external storage devices support a sequential file organization. Some devices, by their physical nature, can support only sequential files. Information is stored on magnetic tape as a continuous series of records along the length of the tape. Accessing a particular record requires accessing all previous records in the file. Other devices that are strictly sequential in nature are tape cassettes and line printers.
The operations that can be performed on a sequential file may differ slightly, depending on the storage device used. For example, a file on magnetic tape can be either an input file or an output file, but not both at one time. A sequential file on a disk can be used strictly for input, strictly for output, or for update. Update means that, as records are read, the record most recently read can be rewritten on the same file.
1.6.2 Processing of Sequential File
Having discussed the physical layout of a sequential file and how records are transferred between the program area and the file, let us examine the types of processing for which sequential files are most suitable. Serial processing is the accessing of records, one after the other, according to the physical order in which they appear in the file. Obviously, it is an easy matter to process sequential files serially. Sequential processing is the accessing of records, one after the other, in ascending order by a key or index item of the record. For example, suppose a MASTER file of employee records is ordered by employee surname: AGARKER first, BAKER second, ..., ZIDANE last. Then sequentially processing the file by surname is equivalent to serially processing the file. Most sequential files are ordered by a key or index item, such as employee name or student identification number, when the file is created. The key or index item should be the item that is most often searched for when processing the file. To show the importance of key selection, assume instead that the MASTER file is ordered by employee identification number, and suppose we want to find the records of a number of employees given only their names. Finding the first employee's record, say AGARKER, is simply a matter of serially processing the file until the record with a name item of AGARKER appears. Consider the processing of a second record, say for BAKER.
Since the position of BAKER's record bears no relationship to the position of AGARKER's record, we have no alternative but to start serially processing once again at the beginning of the MASTER file. There are occasions in which serial processing is all that is required on a file, irrespective of the key or index item upon which the file is ordered. For example, if we are to add a pay increase of 1000 Rupees to the wage item of all employees, it is irrelevant whether the file is sequenced by name or by employee identification number.
In sequential processing, transaction records are usually grouped together and sorted according to the same index item as the records in the file. Each successive record of the file is read, compared with an incoming transaction record, and then processed in a manner that usually depends on whether the value of the record's index item is less than, equal to, or greater than the value of the index item of the transaction record. Sequential and serial processing are most effective when a high percentage of the records in the file must be processed. Since every record in the file must be scanned, a relatively large number of transactions should be grouped together for processing. If records are to be added to a file, it is necessary to create a new file unless the records are to be added at the end of the file. The important points about the sequential processing of sequential files are:
1. Sequential processing is most advantageous if a large number of transactions can be grouped to form a single 'run' on the file.
2. A new file should be created if any additions or deletions are requested.
3. Quick response time should not be expected for a transaction or a batch of transactions.
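The batch run described above can be sketched as a single pass that merges a sorted transaction batch into a sorted master file and writes a new master file. The record shape (key, data), the operation names and the sample values are assumptions made for this illustration, and the sketch further assumes at most one transaction per key in a batch.

```python
def batch_update(master, transactions):
    """Apply a sorted batch of transactions to a sorted master file.

    master:       list of (key, data) tuples in ascending key order
    transactions: list of (key, op, data) tuples in ascending key order,
                  with op one of "add", "update", "delete"
    Returns the new master file as a list; the old master is left unchanged.
    """
    new_master = []
    i = 0
    for key, op, data in transactions:
        # Master records with a smaller key are copied across unchanged.
        while i < len(master) and master[i][0] < key:
            new_master.append(master[i])
            i += 1
        found = i < len(master) and master[i][0] == key
        if op == "add" and not found:
            new_master.append((key, data))
        elif op == "update" and found:
            new_master.append((key, data))
            i += 1
        elif op == "delete" and found:
            i += 1                      # skip the record: it is not copied
        else:
            raise ValueError(f"cannot {op} record with key {key!r}")
    new_master.extend(master[i:])       # copy the rest of the old master
    return new_master

# Hypothetical MASTER file ordered by surname, with a wage item.
master = [("AGARKER", 1000), ("BAKER", 1200), ("ZIDANE", 900)]
batch = [("ADAMS", "add", 800), ("BAKER", "update", 2200), ("ZIDANE", "delete", None)]
new_master = batch_update(master, batch)
```

Every master record is read exactly once however many transactions there are, which is why grouping many transactions into one run pays off and why a single transaction cannot expect a quick response.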
1.7 Index Sequential file
The retrieval of a record from a sequential file is inefficient and time consuming for large files. To improve the query response time of a sequential file, an index can be added. The most important aspect affecting the file structure is the type of physical medium on which the file resides. The capability of directly accessing a record based on a key can be achieved only if the external storage device used supports this type of access. In particular, devices such as magnetic tape and cassette tape units allow access to a particular record only after reading all the other records that physically precede the desired record in the file. Hence direct access is impossible on these types of devices. The external storage devices that support both sequential and direct access are magnetic disk units. The file structure concepts relating to indexed sequential files are best exemplified by considering a magnetic disk as the storage medium. In fact, because of their low price/performance ratio and large total storage capacity, disks are generally chosen when using indexed sequential files. Indexing associates with the records a set of orderable quantities (the index entries), which are usually smaller in number than the records and so can be searched faster. The idea of indexing is to expedite the search process. Indexes created from a sequential (or sorted) set of primary keys are referred to as index sequential. We shall use the term index file for the indexes and data file for the data records; a pointer is an address. A sequential file that is indexed is called an index sequential file. The index provides random access to records, while the sequential nature of the file provides easy access to subsequent records as well as sequential processing. An index sequential file consists of three separate areas: the prime area, the index area and the overflow area. The overflow area is an additional feature of this file organization: it provides additional space for record additions without necessitating the creation of a new file. The prime area is the area into which data records are written when the file is first created. The file is created sequentially, that is, by writing records into the prime area in a sequence dictated by the ordering of the keys of the records. Writing begins on a cylinder of the disk. When this cylinder is filled, writing continues on the second track of the next cylinder, and continues in this fashion until the file's creation is completed. If the newly created file is accessed sequentially according to the key item, the records are processed in the order in which they were written.
Types of Indexes
The idea behind an index access structure is similar to that behind the indexes commonly used in textbooks. A textbook index lists important terms at the end of the book in alphabetical order. Along with each term, a list of page numbers where the term appears is given.
We can search the index to find a list of addresses (page numbers in this case) and use these addresses to locate the term in the textbook by searching the specified pages.
Primary Indexes
A primary index is an ordered file whose records are of fixed length with two fields. The first field is of the same data type as the ordering key field of the data file, and the second field is a pointer to a disk block address. The ordering key field is called the primary key of the data file. There is one index entry in the index file for each block in the data file. Each index entry has the value of the primary key field for the first record in a block and a pointer to that block as its two field values. We will refer to the two field values of index entry i as K(i), P(i).
Block 1 (NAME): Aaron, Ed; Abbott, Diane; ...; Acosta, Marc
Block 2 (NAME): Adams, John; Adams, Robin; ...; Akers, Jan
...
Block n (NAME): Wright, Pam; Wyatt, Charles; ...; Zimmer, Byron
Figure 1.4: Some blocks on an ordered (sequential) file of Employee records with name as the ordering field
To create a primary index on the ordered file shown in figure 1.4, we use the Name field as the primary key, because that is the ordering key field of the file. Each entry in the index will have a Name value and a pointer. Figure 1.5 illustrates this primary index. The total number of entries in the index will be the same as the number of disk blocks in the ordered data file. The first record in each block of the data file is called the anchor record of the block, or simply the block anchor. (A scheme similar to the one described here can be used with the last record in each block, rather than the first, as the block anchor.) A primary index is an example of what is called a non-dense index, because it includes an entry for each disk block of the data file rather than for every record in the data file. A dense index, on the other hand, contains an entry for every record in the file. The index file for a primary index needs substantially fewer blocks than the data file, for two reasons. First, there are fewer index entries than there are records in the data file, because an entry exists for each whole block of the data file rather than for each record. Second, each index entry is typically smaller in size than a data record because it has only two fields, so more index entries than data records will fit in one block. A binary search on the index file will hence require fewer block accesses than a binary search on the data file.
[Diagram: an index file of <K(i), P(i)> entries, each holding a block anchor primary key value (Aaron, Ed; Adams, John; Alexander, Ed; ...; Wong, James; Wright, Pam) and a block pointer to the block of the data file (fields NAME, SSN, JOB, SALARY, SEX) that begins with that record.]
Figure 1.5 : Primary index on the ordering key field of the file
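A primary (non-dense) index lookup of the kind just described might be sketched as follows. The block contents are the names visible in figure 1.4 with the elided records left out, and the function names are hypothetical; in a real file each block access would be a disk read rather than a list lookup.

```python
from bisect import bisect_right

# Data file: blocks of records ordered on the Name field (abbreviated sample).
blocks = [
    ["Aaron, Ed", "Abbott, Diane", "Acosta, Marc"],
    ["Adams, John", "Adams, Robin", "Akers, Jan"],
    ["Wright, Pam", "Wyatt, Charles", "Zimmer, Byron"],
]

def build_primary_index(data_blocks):
    """One <K(i), P(i)> entry per block: (block anchor key, block number)."""
    return [(block[0], block_no) for block_no, block in enumerate(data_blocks)]

def find_block(name, data_blocks, index):
    """Return the number of the block holding `name`, or None if absent."""
    anchors = [key for key, _ in index]
    # Binary search for the last anchor that is <= name.
    pos = bisect_right(anchors, name) - 1
    if pos < 0:
        return None                      # name sorts before the first anchor
    block_no = index[pos][1]
    # Only this one block has to be fetched and searched.
    return block_no if name in data_blocks[block_no] else None

primary_index = build_primary_index(blocks)
```

The index has one entry per block rather than one per record, so the binary search runs over far fewer and smaller entries than a binary search on the data file itself.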
A major problem with a primary index, as with any ordered file, is the insertion and deletion of records. With a primary index the problem is compounded because, if we attempt to insert a record in its correct position in the data file, we not only have to move records to make space but also have to change some index entries, since moving records will change the anchor records of some blocks. One remedy is to use an unordered overflow file. Another possibility is to use a linked list of overflow records for each block in the data file. We can keep the records within each block and its overflow linked list sorted to improve retrieval time. Record deletion can be handled using deletion markers.
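[Diagram: data file records with indexing field values 9 5 13 8 6 15 3 17 21 11 16 2 24 10 20 1 4 23 18 14 12 7 19 22 in physical order; index file entries with field values 1 to 24 in ascending order, each with a block pointer.]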
Figure 1.6: A dense secondary index on a non ordering key field of a file
A secondary index is also an ordered file with two fields, and, as in the other indexes, the second field is a pointer to a disk block. The first field is of the same data type as some non-ordering field of the data file. The field on which the secondary index is constructed is called an indexing field of the file, whether its values are distinct for every record or not. There can be many secondary indexes, and hence indexing fields, for the same file. We first consider a secondary index on a key field, that is, a field having a distinct value for every record in the data file. Such a field is sometimes called a secondary key for the file. In this case there is one index entry for each record in the data file, which has the value of the secondary key for the record and a pointer to the block in which the record is stored. A secondary index on a key field is a dense index because it contains one entry for each record in the data file. We again refer to the two field values of index entry i as K(i), P(i). The entries are ordered by the value of K(i), so we can use binary search on the index. Because the records of the data file are not physically ordered by the values of the secondary key field, we cannot use block anchors. That is why an index entry is created for each record in the data file rather than for each block, as in the case of a primary index. Figure 1.6 illustrates a secondary index on a key attribute of a data file. Notice that in figure 1.6 the pointers P(i) in the index entries are block pointers, not record pointers. Once the appropriate block is transferred to main memory, a search for the desired record within the block can be carried out. A secondary index will usually need substantially more storage space than a primary index because of its larger number of entries. However, the improvement in search time for an arbitrary record is much greater for a secondary index than for a primary index, because we would have to do a linear search on the data file if the secondary index did not exist.
For a primary index, we could still use binary search on the main file even if the index did not exist, because the records are physically ordered by the primary key field.
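A dense secondary index of this kind can be sketched with the field values that appear in figure 1.6. The block size of four records is an assumption made for the example, as are the function and variable names.

```python
from bisect import bisect_left

# Secondary key values in the physical order of the data file (from figure 1.6).
FIELD_VALUES = [9, 5, 13, 8, 6, 15, 3, 17, 21, 11, 16, 2,
                24, 10, 20, 1, 4, 23, 18, 14, 12, 7, 19, 22]
BLOCK_SIZE = 4  # assumed number of records per block

blocks = [FIELD_VALUES[i:i + BLOCK_SIZE]
          for i in range(0, len(FIELD_VALUES), BLOCK_SIZE)]

# Dense index: one <K(i), P(i)> entry per record, ordered by K(i).
# P(i) is a block pointer (here a block number), not a record pointer.
secondary_index = sorted((value, block_no)
                         for block_no, block in enumerate(blocks)
                         for value in block)
index_keys = [key for key, _ in secondary_index]

def lookup(value):
    """Return (block number, position in block) for `value`, or None."""
    pos = bisect_left(index_keys, value)          # binary search on the index
    if pos == len(index_keys) or index_keys[pos] != value:
        return None
    block_no = secondary_index[pos][1]
    # The block is fetched, then searched in main memory.
    return block_no, blocks[block_no].index(value)
```

Without the index, finding a value such as 22 would need a linear scan of every block, since the data file is not ordered on this field.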
1.7.1 Structure of Index sequential file.
An index sequential file consists of the data and one or more levels of indexes. When inserting a record, we have to maintain the sequence of records, and this may necessitate shifting subsequent records. For a large file this is a costly and inefficient process. Instead, the records that overflow their logical area are shifted into a designated overflow area, and a pointer is provided in the logical area or in the associated index entry to point to the overflow location. This is illustrated in figure 1.7: record 615 is inserted into the original logical block, causing a record to be moved to an overflow block.
Before insertion:
  Original logical block: 611 612 614 618 624
After inserting 615:
  Original logical block: 611 612 614 615 618
  Overflow block: 624
Figure 1.7 : Overflow of record
Multiple records belonging to the same logical area may be chained to maintain logical sequencing. When records are forced into the overflow area as a result of insertion, the insertion process is simplified, but the search time is increased. Deletion of records from index-sequential files creates logical gaps; the records are not physically removed but only flagged as having been deleted. If there are a number of deletions, we may have a great amount of unused space. An index-sequential file has the following components:
1. A primary data storage area. In certain systems this area may have unused space embedded within it to permit the addition of records. It may also include records that have been marked as deleted.
2. Overflow areas. These permit the addition of records to the file. A number of schemes exist for incorporating the records in these areas into the expected logical sequence.
3. A hierarchy of indices. In a random inquiry or update, the physical location of the desired record is obtained by accessing these indices.
The primary data area contains the records written by the users' programs. The records are written in data blocks in ascending key sequence. These data blocks are in turn stored in ascending sequence in the primary data area.
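The overflow and deletion-marker behaviour described above might be sketched as follows, using the keys of figure 1.7. The class and method names are hypothetical, the block capacity of five records is taken from the figure, and the overflow chain is modelled as a sorted Python list rather than as linked disk records.

```python
from bisect import insort

class LogicalBlock:
    """One logical area of the prime data area plus its overflow chain."""

    def __init__(self, capacity, keys):
        self.capacity = capacity
        self.keys = sorted(keys)    # records held in the prime area
        self.overflow = []          # chained overflow records, kept in sequence
        self.deleted = set()        # deletion markers (records stay in place)

    def insert(self, key):
        insort(self.keys, key)
        if len(self.keys) > self.capacity:
            # The highest key no longer fits and moves to the overflow area.
            insort(self.overflow, self.keys.pop())

    def delete(self, key):
        # The record is only flagged, not physically removed.
        if key in self.keys or key in self.overflow:
            self.deleted.add(key)

    def search(self, key):
        if key in self.deleted:
            return None
        if key in self.keys:
            return "prime"
        if key in self.overflow:    # extra step: follow the overflow chain
            return "overflow"
        return None

block = LogicalBlock(5, [611, 612, 614, 618, 624])
block.insert(615)
```

After the insertion the prime area holds 611 612 614 615 618 and record 624 sits in the overflow area, so a search for 624 needs the extra step that makes retrieval slower.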
1.8 Direct File Organization
In the index-sequential file organization considered in the previous sections, the mapping from search-key value to storage location is via index entries. In direct file organization the key value is mapped directly to the storage location. The usual method of direct mapping is to perform some arithmetic manipulation of the key value. This process is called hashing. Let us consider a hash function h that maps the key value k to the value h(k). The value h(k) is used as an address, and for our application we require that this value be in some range. If the address area for the records lies between s1 and s2, the requirement on the hash function is that for all values of k it should generate values between s1 and s2. It is obvious that a hash function that maps many different key values to a single address, or one that does not map the key values uniformly, is a bad hash function. A collision is said to occur when two distinct key values are mapped to the same storage location. Collisions are handled in a number of ways. The colliding records may be assigned to the next available space, or they may be assigned to an overflow area. We can immediately see that with hashing schemes there is no index to traverse. With well-designed hash functions, where collisions are few, this is a great advantage. Another problem that we have to solve is to decide what address is represented by h(k). Let the address generated by the hash function be the address of a bucket in which the (key, address) pairs of records are stored. Figure 1.8 shows a bucket containing (key, address) pairs; this arrangement allows a reorganization of the actual data file, and of the actual record addresses, without affecting the hash function. A limited number of collisions can be handled automatically by using buckets of sufficient capacity. Obviously the space required for the buckets will, in general, be much smaller than the actual data file. Consequently, their reorganization will not be that expensive. Once the bucket address is generated from the key by the hash function, a search in the bucket is also required to locate the address of the required record.
However, since the bucket size is small, this overhead is small. The use of buckets reduces the problems associated with collisions. In spite of this, a bucket may become full, and the resulting overflow can be handled by providing overflow buckets and using a pointer from the normal bucket to an entry in the overflow bucket. All such overflow entries are linked. Multiple overflow entries from the same bucket result in a long list and slow down the retrieval of these records. In an alternative scheme, the address generated by the hash function is a bucket address and the bucket is used to store the records directly, instead of storing a pointer to the block containing the record. Let s represent the value:
s = upper bucket address value - lower bucket address value + 1
Here s gives the number of buckets. A simple hash function is h(k) = k mod s, where k is the numeric representation of the key and h(k) produces a bucket address. Some simple hashing functions are given below.
1) Use the low-order part of the key. For keys that are consecutive integers with few gaps, this method can be used to map the keys to the available range.
2) Square all or part of the key and take a part of the result. The whole or some defined part of the key is squared, and a number of digits are selected from the square as the hash result. A variation is the multiplication scheme, where one part of the key is multiplied by the remaining part and a number of digits are selected from the result.
3) End folding. For a long key, we identify start, middle and end regions such that the sum of the lengths of the start and end regions equals the length of the middle region. The digits of the start and end regions are concatenated, and the resulting number is added to the number formed by the middle region. This new number, mod s, where s is the upper limit of the hash function, gives the bucket address. For example, for the key
123456 123456789012 654321
end folding gives the two values to be added as 123456654321 and 123456789012.
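The hashing functions listed above can be sketched as follows. The function names are hypothetical, taking the middle digits of the square is one possible choice of "a part of the result", and the end-folding sketch assumes a key whose length is a multiple of four so that the start and end regions are each a quarter of the key.

```python
def mod_hash(k, s):
    """h(k) = k mod s, with s the number of buckets."""
    return k % s

def mid_square_hash(k, digits=2):
    """Square the key and select `digits` digits from the middle of the square."""
    squared = str(k * k)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])

def end_fold_sum(key):
    """Concatenate the start and end regions and add the middle region."""
    if len(key) % 4 != 0:
        raise ValueError("this sketch assumes a key length divisible by 4")
    quarter = len(key) // 4
    start, middle, end = key[:quarter], key[quarter:-quarter], key[-quarter:]
    return int(start + end) + int(middle)

def end_fold_hash(key, s):
    """Bucket address obtained by end folding followed by mod s."""
    return end_fold_sum(key) % s

# The key from the text, written without the spaces between regions.
KEY = "123456123456789012654321"
folded = end_fold_sum(KEY)   # 123456654321 + 123456789012
```

For the key in the text the two operands are 123456654321 and 123456789012, and their sum is then reduced mod s to give a bucket address.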
[Diagram: a bucket (Bucket 1) of key/address pairs, with keys including 209, 610, 920 and 976, pointing to blocks of records; overflow buckets hold entries for keys 331 and 362.]
Figure 1.8: Bucket and block organization for hashing
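A bucket organization with overflow, in the spirit of figure 1.8, might be sketched as follows. The class name, the choice of three buckets with a capacity of two entries each, and the record addresses are assumptions made for the illustration; the keys are those visible in the figure, and the overflow chain is modelled as a list per bucket.

```python
class HashedFile:
    """Buckets of (key, record address) pairs with chained overflow entries."""

    def __init__(self, num_buckets, bucket_capacity):
        self.s = num_buckets
        self.capacity = bucket_capacity
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]

    def h(self, key):
        return key % self.s                 # simple hash function k mod s

    def insert(self, key, address):
        b = self.h(key)
        if len(self.buckets[b]) < self.capacity:
            self.buckets[b].append((key, address))
        else:
            # Bucket full: the entry goes on the bucket's overflow chain.
            self.overflow[b].append((key, address))

    def lookup(self, key):
        b = self.h(key)
        # Search the small bucket first, then follow its overflow chain.
        for k, address in self.buckets[b] + self.overflow[b]:
            if k == key:
                return address
        return None

hashed = HashedFile(num_buckets=3, bucket_capacity=2)
for position, key in enumerate([209, 610, 920, 976, 331, 362]):
    hashed.insert(key, address=position * 100)   # hypothetical record addresses
```

Because the buckets hold only (key, address) pairs, the data file can be reorganized by updating the stored addresses without touching the hash function.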
Advantages of hashing:
1) Key matches are extremely quick.
2) Hashing is very good for large keys, or those with multiple columns, provided the complete key value is supplied for the query.
3) No disk space is used for an index by this method.
Disadvantages of hashing:
1) It becomes difficult to predict overflow because the working of the hashing algorithm is not visible to the database administrator.
2) No sorting of data occurs either physically or logically, so sequential access is poor.
3) This organization usually takes a lot of disk space to ensure that no overflow occurs.
1.9 Summary
All businesses need to process data. As data volume increases, data processing becomes highly complex, and computers are used in this process. One important aspect of computerized data processing is the storage and retrieval of data. Databases provide this functionality, and DBMS packages are the software tools used to implement databases. Data as an entity has several important properties, such as form, size, organization and relationships. The form of data, namely numeric, alphabetic, integer and real, represents the different types of data stored in databases. The size of the data plays a central role in determining the volume of the database and the techniques needed to store it. The organizing and grouping of data into characters, fields, records and files defines the basic building blocks of the database. Databases are classified into different types based on their usage. Different data models have resulted in different kinds of databases that provide the basic service of storage and retrieval of data. In this unit we discussed the storage of information in registers, main memory and secondary memory. We also discussed three fundamental file organization techniques: sequential, index-sequential and direct file organization. The selection of the appropriate organization for a file in an information system is important to the performance of that system. The fundamental factors that influence the selection process include the following:
1. Nature of the operations to be performed.
2. Characteristics of the storage media to be used.
3. Volume and frequency of the transactions to be processed.
4. Response time requirements.
1.10 Questions
1. What is record organization? Explain variable-length and fixed-length records.
2. How is an index-sequential file organized? Explain the deletion and addition of records.
3. Explain direct file organization.
1.11 Further Readings
1. Tremblay and Sorenson, 'An Introduction to Data Structures with Applications', 2nd Edition, McGraw-Hill, 1984.
2. Bipin Desai, 'An Introduction to Database Systems', Galgotia Publications, New Delhi, 1994.
Data Definition in DBMS
2.1 Introduction to Microsoft Access
Structure:
2.1.1 Introduction
2.1.2 Microsoft Access database
2.1.3 Tables and Queries
2.1.4 Forms and Reports
2.1.5 Accessing Microsoft Access
2.1.6 Opening a database
2.1.7 Database window
2.1.8 Objects of the Access database
2.2 Working with Access database
Structure:
2.2.1 Introduction
2.2.2 Creating a Microsoft Access database
2.2.3 Creating objects
2.2.4 Customizing toolbars
2.2.5 Fields and data types
2.2.6 Creating a table
2.2.7 Field properties
2.2.8 Save and close a table
2.2.9 Add and save records
2.2.10 Edit records and close a table
2.2.11 Modify fields in a table
2.2.12 Modify columns and rows in data sheet
2.2.13 Validation rule to a field
2.3 Data Manipulation in DBMS
Structure:
2.3.1 Introduction
2.3.2 Find a value
2.3.3 Find and replace a value
2.3.4 Create and apply a filter
2.3.5 Sort records
2.3.6 Create a query
2.3.7 Query window
2.3.8 Join tables
2.3.9 Select fields
2.3.10 Specify criteria
2.3.11 Calculate totals
2.3.12 Modify and save a query
2.4 Creating and Customizing Forms
Structure:
2.4.1 Introduction
2.4.2 Creation with Form Wizard
2.4.3 View, Add, Delete and Save records
2.4.4 Save and Close a Form
2.4.5 Change Form Design
2.4.6 Select, Resize, Move and Delete controls
2.4.7 Change Fonts, Size and Color of Text
2.4.8 Showing data from more than one table
2.5 Creating Reports
Structure:
2.5.1 Introduction
2.5.2 Create a report
2.5.3 Preview, print and save a report
2.5.4 Report in design view
2.1.1 Introduction
This unit gives you an introduction to what an RDBMS is and how MS-Access, an RDBMS, differs from other packages. You will also learn to open an existing database and see all the objects present in an Access database. A database is a collection of data related to a particular topic. A database typically consists of columns, each with a heading that describes the type of information it contains, and rows, each of which contains some information. In database terminology, the columns are called fields and the rows are called records. This kind of organization in a database is called a table. A DBMS is a system that stores and retrieves information in a database. Data management involves creating, modifying, deleting and adding data in files and using this data to generate reports or answer ad hoc queries. The software that allows us to perform these functions easily is called a DBMS.
2.1.2 Microsoft Access database
Microsoft Access is a relational DBMS. Why should one choose MS-Access rather than another package such as FoxBASE or dBASE? One reason given is that, unlike those packages, MS-Access can store pictures in a table and display an image on screen along with all the other details. As an example, let us introduce the personal information system of a company. The company has many departments, and many employees work in the organization. The company wants to maintain a database that stores all the details about its employees. The details held in the database are Employee number, Employee name, Date of joining, Sex, Basic salary, Qualification and Department.
2.1.3 Tables and Queries
A Table - Data
A Microsoft Access database is a collection of tables. Each table is a collection of records, and a record is a collection of fields.
If the company wants to store the employee details, it will have to create a table, which will be part of some database. The information about one employee makes up one record of that table, and the information is stored under fields such as Employee number, Employee name, and others.
Each record in a table contains the same set of fields and each field contains the same type of information for each record.
A Query – A question and an answer
In MS-Access, a Query is a question you ask about the data in your database. The answer to the question can be from a single table or several tables; the query brings the data together.
Suppose, in the personal information system, the manager of the company wants to know the total basic salary of all the employees. Working this out by hand is difficult when there is a large number of employees to keep track of. A query puts the question to the database, and the answer it returns is the total basic salary of all the employees.
You create a query that describes the set of records you want. When you use the query to access the data, you automatically get current data from the table/s.
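The manager's question can be sketched outside Access as well. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Access, and the table contents (names and salaries) are hypothetical sample data, not taken from the text.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT, BASIC_SALARY INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Asha", 5000), (2, "Ravi", 6500), (3, "Meena", 7200)],
)

def total_basic_salary():
    """Ask the question: what is the total basic salary of all employees?"""
    return conn.execute("SELECT SUM(BASIC_SALARY) FROM EMPLOYEE").fetchone()[0]

first_answer = total_basic_salary()

# A query always reads the current data, so adding a record changes the answer.
conn.execute("INSERT INTO EMPLOYEE VALUES (4, 'John', 4000)")
second_answer = total_basic_salary()

print(first_answer, second_answer)
```

Running the same query after a new record is added returns a different total, which is the sense in which a query "automatically gets current data".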
2.1.4 Forms and reports A Form – information on the screen
There are two ways in which you can view the data stored in a table: as rows and columns in a datasheet, or one record at a time in a form.
The second way of viewing data is usually preferable. The output of a query is normally viewed in the first way, but it can be viewed in the second way by using Forms. A form is a customized way of viewing, entering and editing records in the database. You specify how the data is to be displayed when you design the form. Forms can be created to resemble the way data would be entered on a paper form, so that the user feels familiar with the operation.
A Report – Required results in print
Forms and queries present the data on screen. Reports are used to present data on printed paper. A report provides a way to retrieve and present data as meaningful information, which might include totals and subtotals calculated across a set of records.
2.1.5 Accessing Microsoft Access
You can start and quit Microsoft Access in the same way as any other Windows-based application.
To start Microsoft Access
1. Open the program group that contains the Microsoft Access icon.
2. Double click the Microsoft Access icon. Microsoft Access starts and displays Microsoft Access window, where you can create or open a database.
To quit Microsoft Access
Choose Exit from file menu.
2.1.6 Opening a database
A Microsoft Access database is a collection of objects. A database file contains the tables, queries, forms and reports that help you to use information in the database.
To open a database
1. Choose Open database from the File menu. The Open database window is displayed.
2. Select the directory that contains the database file from the directories list.
3. Select the database from the file name list box.
4. Click on Open to display the Microsoft Access Database window.
As soon as you click on Open, a database window is displayed. The database window displays a list of the tables created in the database.
2.1.7 Database window
When a database is opened, Microsoft Access displays its database window inside the Microsoft Access window. From the Access window you can create and use any object in your database and use the other features of Microsoft Access.
• Title bar is located at the top of the screen and displays the name of the program.
• Menu bar is located below the title bar. It lists the various options.
• Tool bars, generally located below the menu bar, provide quick access to the most frequently used commands and utilities. The user can customize the tool bars by dragging them to convenient positions.
• Status bar is a horizontal bar at the bottom of the screen that displays information about commands, toolbar buttons and other options.
2.1.8 Objects of the Access database
Tables, queries, forms, reports, macros and modules are objects of the Access database. The object buttons in the database window provide direct access to every object in the database.
Example To view all Tables created by you:
Click the Table button in the database window. Microsoft Access displays the list of tables stored in the database.
Similarly all other objects in the database window can be viewed by clicking on the appropriate object buttons.
To close a database
Select Close database from the File menu.
2.2 Working with Access database
Now, we are familiar with opening an existing database and all the objects in the database. Let us learn to create a new database and objects in the database. A table is a collection of data stored about a particular subject. The data in a table is presented in columns and rows. We will also learn to create the basic structure of a table, to add rows (records) and to edit them.
2.2.2 Creating a Microsoft Access database
When a Microsoft Access database is created, one file that contains all the tables in the database as well as queries, forms, reports and other objects that help us to use the information is created.
To create a Microsoft Access database
1. Select New database from the File menu.
2. A dialog box is displayed.
3. Select Blank Database and click Ok.
4. The File New Database dialog box is displayed.
5. Select the directory in which you want to create the database. In the file name box, enter a database name, which can contain up to 8 characters but no spaces. There is no need to give an extension, because Microsoft Access automatically adds an extension to the database name.
6. Click on Create to create an empty database file.
2.2.3 Creating objects
A database contains different types of objects. Now we know how to create a database. The next step is the creation of objects in this database. Tables are the first objects to be created in the database. The number of tables to be created is based on the user requirements. To get desired information from the database, the next step is to create queries, forms, reports and other objects.
Create / modify an object
To create a Microsoft Access object
1. Select the object type to create from the database window.
2. Click on the New button.
To modify the design of an object
1. Select the object type to modify from the database window.
2. Select the name of the object to modify from the list.
3. Click the Design button to display the object window in design view.
Note: You can create objects yourself or with the help of an Access wizard.
An Access wizard is like a database expert: it prompts you with questions about the object and then builds the object based on your answers. Creation of objects with the help of wizards will be covered later.
2.2.4 Customizing toolbars
Microsoft Access provides a wide variety of graphical tools, which can be used to create and modify objects in the database. When you start, Microsoft Access displays tools only for opening and creating a database. After a database is opened, new toolbars are added to the existing ones. The toolbars gain or lose focus as and when you open any object (forms, tables, queries, reports, etc.) in Design, Open or New view. Initially, the toolbar appears at the top of the Microsoft Access window and the tools are arranged in a single row. We can customize the toolbar by moving it to a vertical side of the window, to the bottom of the window or to the middle of the window, and by changing its shape.
To Customize toolbars
1. Select Toolbars from the View menu to display the Toolbars dialog box. The options in this dialog box allow the toolbars to be customized.
2. In the Toolbars dialog box we can:
• Click Large buttons to enlarge the buttons or return them to the original size.
• Show ToolTips.
• Click on the Close button to close the dialog box.
2.2.5 Fields and datatypes
The first step in designing the database is to define the table structure. Each table in the database represents a single subject, for example employee information or an invoice. Before designing a table, one should be very clear about the data that is to be stored in it, since the table structure is based on that data. For example, employee information stored in a table requires employee number, employee name, date of joining, sex, basic salary, qualification and department. These details are referred to as fields in database terminology. Fields can be of different data types, such as number, character or date. Microsoft Access uses the data type to decide how much storage to give to a field and to ensure that the right kind of data is entered in the field. For example, text cannot be entered in a numeric field. Choosing the right data type for a field is important before entering data in the table. The data type of a field that already contains data can be changed, but if the old and new data types are not compatible there may be loss of data.
Structure of an EMPLOYEE table
Field name       Field type   Size
EMP_NO           N            5
EMP_NAME         C            20
DOJ              D
SEX              C            1
BASIC_SALARY     N            7
QUALIFICATION    C            10
DEPT_CODE        C            5
We are trying to store the following details of an employee:
• Employee number (EMP_NO)
• Employee name (EMP_NAME)
• Date of joining (DOJ)
• Sex (SEX)
• Basic salary (BASIC_SALARY)
• Qualification (QUALIFICATION)
• Department (DEPT_CODE)
The EMP_NO and BASIC_SALARY fields will hold numeric data and so can be of type ‘number’.
EMP_NAME, SEX, QUALIFICATION and DEPT_CODE store character data and hence can be of type ‘text’.
DOJ is for storing a date and so can be of type ‘date’.
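The structure above can also be written down as a table definition. The sketch below is not how Access itself stores the design; it assumes Python's built-in sqlite3 module as a stand-in, and the SQL type names are an approximate mapping of N, C and D (SQLite records the declared sizes but does not enforce them).

```python
import sqlite3

# SQLite stands in for Microsoft Access; type names approximate N / C / D.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE EMPLOYEE (
        EMP_NO        INTEGER,      -- N, size 5
        EMP_NAME      VARCHAR(20),  -- C, size 20
        DOJ           DATE,         -- D
        SEX           CHAR(1),      -- C, size 1
        BASIC_SALARY  NUMERIC(7),   -- N, size 7
        QUALIFICATION VARCHAR(10),  -- C, size 10
        DEPT_CODE     VARCHAR(5)    -- C, size 5
    )
    """
)

# PRAGMA table_info returns one row per field: (cid, name, type, ...).
field_names = [row[1] for row in conn.execute("PRAGMA table_info(EMPLOYEE)")]
print(field_names)
```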
2.2.6 Creating a table
A table first created is an empty container for data. The table is designed to contain specific type of data.
To create a table
1. Click on the Table button in the database window.
2. Click on the New button to display the New Table dialog box.
3. Click the New Table button to open the table window in Design view.
4. Click Ok to display the table structure in Design view.
We now have a window where we can specify the fields in our table and what kind of data they will be storing. The creation of table structure begins from here. The window below depicts the table in design view. The table window has two portions. The upper portion has field name, data type and description of the field. The lower portion has field properties like size, format, etc. For creating the structure:
a. Enter the first field name ‘EMP_NO’ in the field name box. A field name can consist of up to 64 characters.
b. Press the Tab key to move to the data type box and select a data type, for example Number.
c. Press the Tab key to move to the Description box and type a description, for example ‘Employee number’. This description appears in the status bar when data is being entered in the field.
Press the Tab key to move to the next field. Repeat steps a, b and c to add the other fields.
To set a field property
1. Select the field in the upper portion of the table window in design view.
2. Set the field properties in the lower portion of the table window.
2.2.7 Field properties
You can control the appearance of data, specify default values and speed up searching and sorting by setting field properties in the table’s design view.
Field size: Limits the length of a text field or the range of allowable values of a number field. For example, if EMP_NAME should not exceed 20 characters, set the field size to 20.
Format: Displays number or date fields in one of the available formats.
Decimal places: Displays a certain number of places after the decimal point when a format is used for a number or currency field.
Default value: The value taken for a field when the user does not enter one. For example, if DOJ is not entered by the user, the current date should be taken as DOJ. Setting the default value to the current date automatically fills the DOJ field in new records.
Indexed: Data is indexed on this field (default is NO).
2.2.8 Save and close a table
You must save the table design before you can add any records.
To save and name a table.
1. Select Save from the File menu.
2. If you are saving the table for the first time, type a name for the table and click Ok. A table name can be up to 64 characters long.
To close a table.
Select Close from file menu.
2.2.9 Add and save records
After designing, you can add records to a table.
To add records
1. Select the table to which you want to add records from the database window.
2. Click the Open button in the database window to open the table in datasheet view.
3. Enter a value in each field, pressing the Tab key to move to the next field.
4. After you fill in all the fields, press the Tab key to move to the new blank record.
When you move to the next record, Microsoft Access saves the record just added to the datasheet. When you finish adding records, close the datasheet; you do not have to save your work.
2.2.10 Edit records and close a table
To edit / change the value in a field
When you open a data sheet, the first field of the first record is selected. Use the mouse to select the contents of the field you want to modify. Type the new value for the field. To cancel all editing changes to a field, press Esc key. To close a table, select close from File menu.
2.2.11 Modify fields in a table
If any modifications to fields in a table are desired, you can rearrange them, edit them, delete them or insert new fields also.
To edit a field
1. Select the field to edit.
2. Edit the name, data type or description of the field in the upper portion of the table window.
3. Modify the field properties in the lower portion of the table window.
4. Save it and close the table.
To move a field
1. Select the field by clicking the field selector to the left of the field name.
2. Click the field selector again, hold down the mouse button and drag the field to the new location.
3. Save it and close the table.
To delete a field
1. Select the field by clicking the field selector to the left of the field name.
2. Press the DEL key or select Delete row from the Edit menu.
3. Save it and close the table.
To insert a field
1. Select Insert row from the Edit menu. It inserts an empty row before the current row.
2. Enter the field name and other information in the empty row.
3. Set the field properties in the lower portion of the table window.
4. Save it and close the table.
2.2.12 Modify columns and rows in datasheet
If the columns in a datasheet do not fit the field values they display, the width of each column and the height of the rows can be changed. You can also rearrange the datasheet columns.
Change the width of a column
1. Position the mouse pointer at the right side of the field selector for the column to be resized. When the pointer changes shape, you can resize the column.
2. Drag the column border to the desired size, or select Column width from the Format menu and select Best fit to fit the data the column displays.
Change the row height
1. Position the pointer between two record selectors at the left side of the datasheet. When the pointer changes shape, you can change the height of the row.
2. Drag the row to the desired size. All rows in the datasheet change to the new row height.
Move a column
1. Select the column you want to move by clicking its field selector.
2. Click the field selector again and drag the column to its new position. As you drag the column, a solid bar between columns indicates its destination.
Save and close datasheet layout
1. Select Save from the File menu to save the datasheet.
2. Select Close from the File menu to close the datasheet.
2.2.13 Validation rule to a field
Microsoft Access automatically validates values based on a field’s data type. For example, text cannot be entered in a number field. You can set more specific rules for data by using the validation rule property of a field. When the validation rule property is set, it specifies the requirements for data entered into the field. For example, a validation rule can specify that the employee name must not be left blank. The message to be displayed when the rule is violated is specified in the validation text property. This text is displayed when an entry in the field breaks the validation rule.
Examples
Validation rule     Validation text
“M” or “F”          Enter “M” or “F”
<> 0                Enter a non zero value
> 100               Value must be greater than 100
To set validation rule
1. Open the table in design view.
2. Select the field to which the validation rule is to be attached.
3. Set the rule and its message in the validation rule and validation text properties in the lower portion of the table window.
4. Save and close the table.
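The effect of validation rules can be sketched with CHECK constraints in another database engine. This is an analogy only, assuming Python's built-in sqlite3 module rather than Access; the helper try_insert and the message it returns are hypothetical, standing in for the validation text.

```python
import sqlite3

# SQLite CHECK constraints play the role of Access validation rules here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE EMPLOYEE (
        EMP_NO       INTEGER CHECK (EMP_NO <> 0),
        EMP_NAME     TEXT NOT NULL CHECK (EMP_NAME <> ''),
        SEX          TEXT CHECK (SEX IN ('M', 'F')),
        BASIC_SALARY INTEGER CHECK (BASIC_SALARY > 100)
    )
    """
)

def try_insert(row):
    """Return None if the record is accepted, else a validation-text style message."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError:
        return "Entry breaks a validation rule"

accepted = try_insert((1, "Asha", "F", 5000))   # satisfies every rule
rejected = try_insert((2, "Ravi", "X", 5000))   # SEX is neither 'M' nor 'F'
stored = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(accepted, rejected, stored)
```

Only the record that satisfies every rule is stored; the rejected one never reaches the table.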
Create a table ‘STUDENT’ to store the details of marks of a student.
Field            Type      Width
Student_ID       Numeric   5
Name             Text      20
Class            Numeric   2
English          Numeric   3
Hindi            Numeric   3
Maths            Numeric   3
Science          Numeric   3
Social_science   Numeric   3
Create a table ‘TRANSACTION’ to have the following fields.
Field            Type      Width
Trans_No         Numeric   5
Item_No          Numeric   5
Item_name        Text      25
Trans_date       Date
Quantity         Numeric   5
After creating the tables, do the following:
1. Set field properties of each field.
2. Modify fields in the table.
3. Modify the table ‘STUDENT’ to include the following fields:
Field            Type      Width
Aggregate        Numeric   4
Average          Numeric   5 (2 decimal places)
4. Apply necessary validation rules to each field.
5. Add records.
2.3 Data Manipulation in DBMS
A table is used to store data. Stored data can be retrieved whenever required. There are many ways in which data stored in a table can be viewed based on some criteria. Let us learn to use find, filter, query and sort to view data.
2.3.2 Find a value
Suppose you require the details of the employee whose employee number is ‘1234’. One way of getting the details is to open the table in datasheet view and browse through all the records one by one. The other way is to use the Find option. When you want to find a specific record or certain values within the fields, you can use the Find option to go directly to a record. You can also use the Find option to navigate through records and find one matching record after another.
To find a specific value in a field
1. Select the field you want to search.
2. Select Find from the Edit menu.
3. In the Find what box, type the value you want to find.
4. Click the Find first button to move to the record, if it exists.
5. Click the Find next button to find the next occurrence of the specified value.
6. At the end, click the Close button to close the dialog box.
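What Find does, going directly to a record instead of browsing through all of them, can be sketched as a lookup by value. The example assumes Python's built-in sqlite3 module as a stand-in for Access; the function find_employee and the sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT, DEPT_CODE TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1234, "Asha", "D01"), (1240, "Ravi", "D02"), (1301, "Meena", "D01")],
)

def find_employee(emp_no):
    """Return the first record whose EMP_NO matches, or None if there is none."""
    return conn.execute(
        "SELECT EMP_NO, EMP_NAME, DEPT_CODE FROM EMPLOYEE WHERE EMP_NO = ?",
        (emp_no,),
    ).fetchone()

found = find_employee(1234)
missing = find_employee(9999)
print(found, missing)
```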
2.3.3 Find and replace a value
It may be necessary to make the same change in several places in the data. With the Replace command, Microsoft Access can find occurrences of specific text and replace them with different text. Replacements can be made either individually or globally.
To find and replace occurrences of specified text
1. In the open table, select the field in which you want to search and replace.
2. Select Replace from the Edit menu. The Replace dialog box is displayed.
3. Type the text to search for in the Find what box.
4. Type the replacement text in the Replace with box.
5. Click the Replace All button to replace all occurrences of the specified text, or use the Find Next button to step through and replace occurrences of the specified text one at a time.
When you finish replacing, click the Close button to close the dialog box.
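A global replacement in one field can be sketched as a single update statement. The example assumes Python's built-in sqlite3 module as a stand-in for Access, and the qualification values are hypothetical sample data.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, QUALIFICATION TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [(1, "B.Sc"), (2, "M.Sc"), (3, "B.Sc")],
)

# "Replace All": every QUALIFICATION equal to 'B.Sc' becomes 'BSc'.
cursor = conn.execute(
    "UPDATE EMPLOYEE SET QUALIFICATION = ? WHERE QUALIFICATION = ?",
    ("BSc", "B.Sc"),
)
replaced_count = cursor.rowcount

qualifications = [
    row[0]
    for row in conn.execute("SELECT QUALIFICATION FROM EMPLOYEE ORDER BY EMP_NO")
]
print(replaced_count, qualifications)
```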
2.3.4 Create and apply a filter
Microsoft Access provides two ways to create a customized view of data in tables. A query or a Filter for a table can be created. A filter is like a simple query except that it applies only to an open table. A filter is best for temporarily changing the set of records being viewed. In Microsoft Access, Filter is used to view a subset of records in a table by specifying the criteria and the sort order in the filter window.
1. Open the table in datasheet view.
2. Select Filter from the Records menu.
3. Select the required option to filter the records.
4. Select Apply filter / Sort from the Records menu to display only the filtered records in the table.
To remove a filter, select Remove filter / Sort from the Records menu.
2.3.5 Sort records
Records in a table can be displayed in a different order from the usual one by using the Sort command. Records can be sorted in either ascending or descending order.
To sort records in a table
1. Select the column in the datasheet to sort on.
2. Select Sort from the Records menu and then select Ascending or Descending.
The datasheet then shows the records in the sorted order, for example sorted by Emp_name.
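Ascending and descending order can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for Access, and the employee names are hypothetical sample data.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [(1, "Ravi"), (2, "Asha"), (3, "Meena")],
)

# Sort on the EMP_NAME column, first ascending and then descending.
ascending = [
    row[0] for row in conn.execute("SELECT EMP_NAME FROM EMPLOYEE ORDER BY EMP_NAME ASC")
]
descending = [
    row[0] for row in conn.execute("SELECT EMP_NAME FROM EMPLOYEE ORDER BY EMP_NAME DESC")
]
print(ascending, descending)
```

Sorting changes only the order in which the records are displayed; the stored records themselves are unchanged.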
2.3.6 Create a query
A query is a question about the data stored in the tables. The query tells exactly how the data is to be retrieved. Microsoft Access gives you a great deal of flexibility in designing queries.
Queries help to
1. Choose fields.
2. Choose records, that is, specify criteria.
3. Sort records, that is, specify order.
4. Look for data in several tables.
5. Perform calculations.
6. Make changes to data in tables.
To create a Query
1. Click the Queries button in the database window.
2. Click the New button to display the New Query dialog box.
3. Click the OK button to open a select query window and display the Show Table dialog box, which lists the tables and the queries in the database.
4. Select each table you need and click on Add to display a field list for that table.
5. Click the Close button.
2.3.7 Query window
As soon as you close the ‘Add table’ window, you will see the ‘Query window’. The query window has three views: Design view, Datasheet view and SQL view.
Design view: Use this option to create a query or change the design of an existing query. You can use graphical query tools to create a query.
Datasheet view: Use this option to see the data retrieved by the query.
SQL view: Use this option to enter SQL (Structured Query Language) statements to create or change a query.
The tool used to create a query in design view is called graphical QBE (Query by Example). With graphical QBE, queries can be created by dragging fields from the field list in the upper portion of the query window to the QBE grid in the lower portion of the window.
In the QBE grid, each column contains information about a field included in the query.
2.3.8 Join tables
To create a query from more than one table, you add the tables you want and make sure that the tables are joined to each other. We can join the tables by drawing join lines between them, although in many cases Microsoft Access creates the join lines automatically. In most cases, a join line tells Microsoft Access: ‘Select the records from both the tables that have the same values in the fields that are joined’. This is referred to as an ‘inner join’. The fields joined in this way are called join fields.
Suppose you have two tables: EMPLOYEE and DEPARTMENT. The EMPLOYEE table contains EMP_NO, EMP_NAME, DOJ, SEX, BASIC_SALARY, QUALIFICATION and DEPT_CODE. The DEPARTMENT table contains DEPT_CODE and DEPT_NAME. If you want a query that shows DEPT_NAME alongside the employee details, you will have to join the two tables on DEPT_CODE.
To join two tables in the query window
Select a field in one table and drag it to the equivalent field in the other table. It draws a join line from one table to another.
To delete a join between two tables in the query window
Select the join line and press DEL key
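The inner join described above can be sketched in SQL. The example assumes Python's built-in sqlite3 module as a stand-in for Access; the table names follow the text, but the rows are hypothetical sample data, including one employee whose DEPT_CODE has no matching department.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT, DEPT_CODE TEXT)")
conn.execute("CREATE TABLE DEPARTMENT (DEPT_CODE TEXT, DEPT_NAME TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Asha", "D01"), (2, "Ravi", "D02"), (3, "Meena", "D09")],
)
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?, ?)",
    [("D01", "Accounts"), ("D02", "Sales")],
)

# Inner join on the join field DEPT_CODE: only records with the same
# value in both tables are selected.
joined = conn.execute(
    """
    SELECT E.EMP_NAME, D.DEPT_NAME
    FROM EMPLOYEE AS E
    INNER JOIN DEPARTMENT AS D ON E.DEPT_CODE = D.DEPT_CODE
    ORDER BY E.EMP_NO
    """
).fetchall()
print(joined)
```

The employee with DEPT_CODE D09 does not appear in the result, because no DEPARTMENT record has the same value in the join field.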
2.3.9 Select fields
After tables have been added to the query, fields can be included in the query. The fields selected determine the output of the query in datasheet view. If you add more than one table, a field list is shown for each table.
To add a field to a query
1. Drag the field from the field list to a cell in the field row of the QBE grid.
2. Repeat the same until all the fields of the query are shown in the QBE grid.
2.3.10 Specify criteria
To limit the query’s dynaset (the records displayed as output) to certain records, criteria are specified. For this, an expression is used. An expression tells Microsoft Access which records to include in the query’s dynaset.
To specify criteria for a field
1. To define an expression, select the criteria cell of the field in the QBE grid.
2. Type the expression in the criteria cell.
3. To check the results, select Datasheet from the View menu.
4. Use the sort option in the QBE grid to view the data in sorted order.
2.3.11 Calculate totals
To calculate totals
1. Select Totals from the View menu to display the Total row in the QBE grid. Access automatically fills in ‘Group By’ in each box of that row.
2. Select the field to be totalled.
3. Select Sum from the list in the field’s Total cell.
4. Select Datasheet from the View menu to see the results.
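The combination of ‘Group By’ on one field and Sum on another can be sketched in SQL. The example assumes Python's built-in sqlite3 module as a stand-in for Access, and the department codes and salaries are hypothetical sample data.

```python
import sqlite3

# SQLite stands in for Microsoft Access; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, DEPT_CODE TEXT, BASIC_SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "D01", 5000), (2, "D02", 6500), (3, "D01", 7200)],
)

# DEPT_CODE is left as 'Group By'; BASIC_SALARY is set to Sum.
totals = conn.execute(
    """
    SELECT DEPT_CODE, SUM(BASIC_SALARY)
    FROM EMPLOYEE
    GROUP BY DEPT_CODE
    ORDER BY DEPT_CODE
    """
).fetchall()
print(totals)
```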
2.3.12 Modify and save a query
You can easily move or delete columns in a query.
To move a column in a query
1. Click the field selector (column heading) of the column in design view.
2. Click the field selector again, hold down the mouse button and drag the column to its new location.
To delete a column in a query
1. Click the field selector (column heading) of the column in design view.
2. Press the DEL key.
To exclude a field from the query’s Datasheet
1. Clear the field’s Show box by clicking it.
To save a query
1. Select Save from the File menu. If the query is being saved for the first time, the Save As dialog box is displayed.
2. Type a name in the query name box.
3. Click Ok to save the query in the database.
For the tables created in EXERCISE 1:
1. Apply filters to list students with marks greater than 60.
2. Apply filters to get transactions for a date.
3. Sort students by name.
4. Sort transactions by date.
5. Create queries to list students with marks > 60 and the total transaction quantity for a date.
2.4 Creating and Customizing Forms
A Query or a Filter is used to view the records of a table in raw form. To view the data in a customized way we use ‘Forms’. A Form provides an easy way to view all the values of one record at a time. Switch to the datasheet view of the form to see all the records for that form. A Form offers the most convenient layout for entering, changing and viewing the records in the database. The form design tools in Microsoft Access help to design forms that present data in an attractive format with special fonts and other effects.
2.4.2 Creation with Form Wizard
Forms can be created with or without the aid of Form Wizards. Form Wizards speed up the process of creating forms. When you use a form to enter or display data in the database, connect the form to the table or query which forms the source of data. If all the data is in one table, base the form on that table. If the data is in more than one table, base the form on a query.
To create a Form by using a Form Wizard
1. Click the Form button in the database window.
2. Click the New button to display the New Form dialog box.
3. Select a Table / Query in the list box.
4. Click Ok to create the form by choosing the required fields (double click on the required fields), a format (say tabular) and a title for the form.
5. At the end, click on the Finish button to save and open the form. The form displays the first record in the table.
2.4.3. View, Add, Delete and Save records
The above form can be used to view, change, add, and delete records in the table. The objects on the form are called Controls. These controls are used to change and view the data. The controls are:
• A label, which displays text.
• A text box, which provides a space to display or type the text, corresponding to the label, that is stored in the database.
To switch to datasheet view, select datasheet from the view menu to display form’s data in datasheet view. To switch to form view, select forms from the view menu to display records in form view. To move from record to record in form view, use navigation buttons to go to first, last, next or previous records.
To add a new record
1. Select New Record from the Insert menu.
2. A new blank record is displayed.
3. Type the value in the first text box.
4. Press the Tab key to move to the next field.
5. Repeat to enter all the other information.
6. After all the fields are entered and the Tab key is pressed to move to the next record, Microsoft Access saves the record in the table.
2.4.4 Close a Form
To close a form select close from the file menu.
2.4.5 Change Form Design
To make changes to a form, open the form in design view from the database window.
To open a form in Design view
1. Click the Form button in the database window.
2. Select the form from the forms list.
3. Click the Design button to open the form in design view.
Microsoft Access presents the form in three sections in design view:
• Form Header contains the heading label of the form. It appears at the top of the window.
• Detail section contains the fields from the table used to view data. It repeats for each record.
• Form Footer appears at the bottom of the window.
All forms have a detail section but may or may not have a form header and footer.
To add form header and footer, select Form Header / Form Footer from the list box.
2.4.6 Select, Resize, Move and Delete controls
Controls on the form are labels and text boxes. In design view, these controls can be selected and resized.
To select a control
1. Click the text box to display the size and move handles around the control.
2. Drag the handles on the top and bottom to size the text box vertically.
3. Drag the handles on the left and right sides to size the text box horizontally.
4. Drag the handles in the corners to size the text box both vertically and horizontally.
To resize a control
1. Position the pointer at the corner of the text box.
2. Drag the border to resize the control.
All the text box controls have attached label controls. They can be moved together or separately.
To move a control
1. Select the control to move.
2. Position the pointer anywhere on the control and hold down the mouse button.
3. Drag the control (the text box and label move together).
4. Release the mouse button when the control is placed at the desired place.
To move the attached label separately
1. Select the control.
2. Position the pointer at the top left corner of the label and hold down the mouse button.
3. Move the label around.
4. Release the mouse button when the label is positioned at the desired place.
To delete a control
1. Select the control to delete.
2. Press the DEL key. It deletes the text box and its attached label.
2.4.7 Change fonts, size and color of Text
Microsoft Access provides choices for the appearance of controls on forms. You can change the size, font and color of the text.
1. Select the label.
2. Click the bold or italic button to change the text.
3. Select a font from the font list to change the appearance of the text.
4. Select a size from the size list to resize the text.
5. Click the palette button on the toolbar to display the palette.
6. Select the Fore color, Back color or Border color to change the text color, fill color and border color respectively. Click the palette button again to close the palette.
2.4.8 Showing data from more than one table
Forms can be customized to use more than one table or a query. Using a subform is one way to include information from more than one table in a form. A subform is a form within a form. When a subform is used, a relationship is made between records from two or more tables. The main form and the subform are linked so that the subform displays only records that are related to those in the main form. When you create a Form/Subform using the wizard, data can be viewed in the subform in either datasheet view or form view.
To use a query to include fields from more than one table
A form can be based on a query. A query is used to display limited or sorted information from one or more tables.
To create a query
1. Click the query button, then click the New button to open the new query window.
2. Add the two tables whose data is to be displayed in the form.
3. Connect the tables with a join line.
4. Drag the fields from the field list to the QBE grid.
5. Save and close the query.
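Joining two tables in the QBE grid corresponds to a SQL join. The following is a minimal sketch outside Access, written in Python with the standard sqlite3 module. It assumes the Employee and Department tables used in the report example of this unit; the linking column Dept_no and the sample rows are hypothetical and serve only as illustration.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two related tables (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Department (Dept_no INTEGER PRIMARY KEY, Dept_name TEXT);
        CREATE TABLE Employee (
            Emp_no INTEGER PRIMARY KEY,
            Emp_name TEXT,
            Basic_salary INTEGER,
            Dept_no INTEGER REFERENCES Department(Dept_no)  -- hypothetical join column
        );
        INSERT INTO Department VALUES (10, 'Sales'), (20, 'Accounts');
        INSERT INTO Employee VALUES
            (1, 'Asha', 5000, 10),
            (2, 'Ravi', 4000, 20),
            (3, 'Meena', 6000, 10);
    """)
    return conn

def employees_with_department(conn):
    """Return fields from both tables, matched on the join column."""
    sql = """
        SELECT e.Emp_no, e.Emp_name, d.Dept_name
        FROM Employee AS e
        INNER JOIN Department AS d ON e.Dept_no = d.Dept_no
        ORDER BY e.Emp_no
    """
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    for row in employees_with_department(build_demo_db()):
        print(row)
```

The join line drawn between the tables plays the role of the ON condition, and the fields dragged to the grid play the role of the SELECT list.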
To base a form on a query
1. Click the form button in the database window.
2. Click the New button to display the New Form dialog box.
3. Select the query just created from the list box.
4. Click OK to create the form by choosing the required fields, a format and a title for the form.
5. At the end, click the Finish button to save and open the form.
Using the tables created in EXERCISE 1:
1. Create forms to view data. Add, delete and save records through the forms created.
2. Change the structure of the form in design view.
2.5 Creating Reports
Reports are used to present data on paper. A report is information organized and formatted to fit some specification. Examples are employee details, department details, etc. With Microsoft Access, different design elements such as text, data, pictures, lines, boxes and graphs are used to create reports. You can create a design for a report and save it. The design can be used again and again; each time, the data current at that time is printed.
You can create reports that
1. Organize and present data in groups.
2. Calculate running totals, group totals, grand totals, and percentage of totals.
3. Include subreports and graphs.
4. Present data in an attractive format with pictures, lines and special fonts.
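To make the terms running total, grand total and percentage of total concrete, here is a small Python sketch. It is not how Access computes them internally; the function name and the salary figures are invented for illustration.

```python
from itertools import accumulate

def totals_report(salaries):
    """Return (running totals, grand total, percentage of total per record).

    Assumes a non-empty list whose sum is not zero.
    """
    grand_total = sum(salaries)
    running = list(accumulate(salaries))  # cumulative sum after each record
    percent = [round(100 * s / grand_total, 1) for s in salaries]
    return running, grand_total, percent

if __name__ == "__main__":
    running, grand_total, percent = totals_report([5000, 4000, 6000])
    print(running, grand_total, percent)
```

A running total grows record by record, the grand total is its final value, and each percentage relates one record to that grand total.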
Create a Report
Click the report button in the database window. Click the New button to display the New Report dialog box.
Choose Report Wizard from the dialog box and click OK.
Make the following choices through the dialog box.
a. Choose the fields you want on the report. Fields can be from more than one table or query. For example, Emp_no, Emp_name and Basic_salary from the Employee table, and Dept_name from the Department table.
b. Make a choice to view the data. For example, by department.
c. Add grouping levels.
d. Select the sort order and summary options for the detail records. For example, choose ascending order of Emp_no and Emp_name, descending order of Basic_salary, and the summary options Sum, Min and Max.
e. Choose a layout for the report.
f. Select a style for the report.
g. Give a title for the report and click the Finish button to create and open the report in Print Preview.
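The grouping, sort order and summary options chosen in the wizard can be approximated in SQL with ORDER BY and GROUP BY. The sketch below uses Python's sqlite3 module rather than Access. The field names come from the example above, while the Dept_no join column and the sample rows are assumptions made for illustration.

```python
import sqlite3

def build_demo_db():
    """In-memory Employee/Department tables with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Department (Dept_no INTEGER PRIMARY KEY, Dept_name TEXT);
        CREATE TABLE Employee (
            Emp_no INTEGER PRIMARY KEY,
            Emp_name TEXT,
            Basic_salary INTEGER,
            Dept_no INTEGER  -- hypothetical column linking to Department
        );
        INSERT INTO Department VALUES (10, 'Sales'), (20, 'Accounts');
        INSERT INTO Employee VALUES
            (1, 'Asha', 5000, 10),
            (2, 'Ravi', 4000, 20),
            (3, 'Meena', 6000, 10);
    """)
    return conn

def detail_rows(conn):
    """Detail records viewed by department, sorted as chosen in the wizard."""
    sql = """
        SELECT d.Dept_name, e.Emp_no, e.Emp_name, e.Basic_salary
        FROM Employee AS e
        JOIN Department AS d ON e.Dept_no = d.Dept_no
        ORDER BY d.Dept_name, e.Emp_no ASC, e.Emp_name ASC, e.Basic_salary DESC
    """
    return conn.execute(sql).fetchall()

def summary_by_department(conn):
    """Summary options Sum, Min and Max of Basic_salary for each department."""
    sql = """
        SELECT d.Dept_name,
               SUM(e.Basic_salary), MIN(e.Basic_salary), MAX(e.Basic_salary)
        FROM Employee AS e
        JOIN Department AS d ON e.Dept_no = d.Dept_no
        GROUP BY d.Dept_name
        ORDER BY d.Dept_name
    """
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    conn = build_demo_db()
    print(detail_rows(conn))
    print(summary_by_department(conn))
```

The first query supplies the detail lines of the report, and the second supplies the summary line printed for each group.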
[Figure: report in Print Preview]
2.5.3 Preview, Print and Save a report
After the wizard creates the report, Microsoft Access displays it as it would appear in print.
To see a whole page, position the pointer over the report in Print Preview and click the report. Click the report again to zoom back in and view the data.
To scroll within a page, use the horizontal and vertical scroll bars. To move through the pages, click the page buttons.
To print a report, select Print from the File menu. A Print dialog box is shown. Choose the appropriate options and click OK to print.
To close the report, choose Close from the File menu.
2.5.4 Report in Design View
The design of the report can be modified in design view.
To open a report in design view
1. Click the report button in the database window.
2. Select the report to modify.
3. Click the Design button to open the report in design view.
In design view, the report is divided into sections: report header and footer, page header and footer, group header and footer, and the detail section. The report header and footer print information once in the report. The page header and footer print information on every page. The group header and footer print information at each change of group (the field by which the report is grouped). The detail section prints each record.
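The roles of the sections can be illustrated with a short Python sketch that lays out a grouped report as lines of text. It is only an analogy for how the sections repeat, not the way Access renders reports. The function name, the line formats and the sample records are invented, and the page header and footer are left out because the sketch does not paginate.

```python
from itertools import groupby
from operator import itemgetter

def render_report(records, title):
    """Lay out (dept_name, emp_name, salary) tuples as a grouped report."""
    lines = [title]                                   # report header: printed once
    grand_total = 0
    ordered = sorted(records, key=itemgetter(0))      # grouping needs sorted input
    for dept, group in groupby(ordered, key=itemgetter(0)):
        lines.append(f"Department: {dept}")           # group header: on each change of group
        group_total = 0
        for _, name, salary in group:
            lines.append(f"  {name}  {salary}")       # detail section: once per record
            group_total += salary
        lines.append(f"  Total for {dept}: {group_total}")  # group footer
        grand_total += group_total
    lines.append(f"Grand total: {grand_total}")       # report footer: printed once
    return lines

if __name__ == "__main__":
    sample = [("Sales", "Asha", 5000), ("Accounts", "Ravi", 4000), ("Sales", "Meena", 6000)]
    print("\n".join(render_report(sample, "Salary Report")))
```

Group totals belong in the group footer and the grand total in the report footer, which is why each appears exactly where its section repeats.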
Using the tables created in EXERCISE 1 and/or related queries, generate the following reports:
1. List of students with marks greater than 60 in English.
2. List of students whose average is greater than 80.
3. List of items for a transaction date.
4. Day-wise transactions for each month under the month's heading, showing the total transaction at the end.