1. In this approach, each application program has its own separate data file.
2. It is used in small systems that need only a few files. It therefore has a simple structure and costs less.
3. There is a lot of duplication of data because different files are created for different application programs. This may lead to inconsistency.
4. It is difficult to apply security constraints, since application programs are developed in an ad hoc manner, i.e. as and when a requirement arises.
5. There is poor coordination between application programs, so multi-user access is difficult.
6. It is difficult to change the application programs whenever integrity constraints are applied.
7. Examples are programs written in languages such as C, C++ etc. that manage their own data files.

DATABASE SYSTEM APPROACH
Major features of the database system approach are:
1. In this system there is a collection of interrelated files. Different application programs can access and modify these files.
2. It is used in large systems that interrelate many files. It is therefore complex to design and costs more.
3. Because the database is centralized, data redundancy and inconsistency are controlled.
4. With centralized data, security constraints are easy to apply, so unauthorized users cannot access the database. A user is allowed to access only the relevant part of the database and not the entire database.
5. In this approach, integrity constraints are easy to apply.
6. Examples are FoxPro, dBase, Oracle, SQL Server etc.

DBMS
A DBMS is software that manages databases, i.e. it defines the database, stores the data, supports a query language, manipulates data and produces results. Major database-level operations performed by a DBMS are:
Makes access to the data easy for the user.
Provides mass storage of relevant data.
Provides prompt response to user requests for data.
Makes the latest modifications to the database available immediately.
Reduces redundant data.
Allows multiple users to be active at one time.
Protects the data from unauthorized access.
Allows for growth in the database system.
Some commonly used DBMS are: MS-Access, FoxPro, Oracle.

OBJECTIVES OF DATABASE SYSTEM
Major objectives of a database system are:
1. Data Redundancy: In a file system, each application program has its own independent data file. These data files are in different formats and may, to some extent, contain the same data. This duplication of data in several places is known as data redundancy. Redundancy wastes space and leads to inconsistency. One major objective of a database is to reduce redundancy without any harmful loss of data.
2. Data Inconsistency: Redundancy makes the data quite difficult to handle. If the data changes in one place, the change must be applied everywhere the data is duplicated. If this is not done, the various copies of the same data no longer agree. This is known as data inconsistency. The second major purpose of the database is to remove data inconsistency.
3. Difficulty in accessing data: The traditional file approach is not convenient or efficient for accessing stored data. The third major purpose is to provide an efficient and convenient way to access data.
4. Data isolation: In a traditional file system, data is isolated, i.e. it is scattered across different files in different formats. The objective of the database is to provide all the required data in one place, i.e. in a single database.
5. Integrity problems: The stored data must satisfy certain constraints or conditions. These can be enforced by adding code to the application programs. This is manageable when done once, but as new constraints are added it becomes difficult to change the programs. The problem is multiplied when data items are spread over more than one file. The purpose of the database is to remove these integrity problems.
6. Atomicity problems: A computer system, like any other device, is subject to failure.
In many applications, it is crucial to ensure that, once a failure is detected, the data are restored to the consistent state the system had prior to the failure. If the database is not recovered to its proper state, it loses its consistency. Ensuring that an operation either completes in its entirety or leaves the data unchanged is called atomicity. It is difficult to ensure this property in conventional file handling. The purpose of the database is to solve atomicity problems and so ensure the consistency of the database.
7. Problems in concurrent access: Many systems permit multiple users to update the data at the same time. Accessing and updating data simultaneously is called concurrent data processing. When two processes change one data item simultaneously, the data may well become inconsistent. The system must supervise access to guard against this possibility. In a conventional file processing system, however, it is very difficult to provide such supervision. The purpose of the database is to provide concurrent access to multiple users under the supervision of the DBMS. The part of the DBMS called the concurrency-control manager takes care of this.
8. Problems with Security: Not every user of the database should be given access to all the data. Because application programs are added to the system as situations demand, enforcing security constraints is difficult. The purpose of the database system is to provide security with ease.

DISADVANTAGES OF TRADITIONAL FILE SYSTEM
Major disadvantages of the traditional file system are:
1. Duplicate Data: Since all the files are independent of each other, some fields are likely to appear in more than one file. There is therefore a lot of duplication of data, i.e. the same data is stored in more than one place.
2. Inconsistency: In a file system, the same data is stored in a number of different independent files. When we update some data, we may miss updating it in some files.
Due to this, the same data may have different values in different places (updated values in some files, original values in the files that were missed). This is known as inconsistency. It is quite common in a file system.
3. Separated and Isolated Data: In a file system, data is separated and isolated. To get some information, it is often necessary to extract data from two or more files. This isolation of data increases the burden on analysts and programmers, as they first have to determine the specific data required and the relations between them.
4. Format Dependency: In a file system, the physical formats of the files are coded into the programs. Any change in the file format results in a change in the application program and vice versa. This increases cost and is time-consuming.
6. Poor Data Security: In a file system, data is stored in flat or text files, which can be easily located and accessed. So there is poor data security in a file system.
7. Difficulty in Complex Objects: A file system can deal with fixed-length records only and has no facility for variable-length records. Representing complex objects may need variable-length records, so in a file system it is very difficult to deal with complex objects.
8. Data Inflexibility: Due to data isolation and program-data interdependency, file processing systems offer users very little flexibility.

DISADVANTAGES OF DATABASE SYSTEM
Major disadvantages of a database system are:
1. Complex: A database system is required to support multiple applications and provide many other features such as data independence, minimum redundancy, and interfacing with past and future systems. All this makes it quite a complex system.
2. High Cost of Software: A database system is quite complex and requires specialized software such as a DBMS, which is quite costly.
3. High Hardware Cost: To run specialized software such as a DBMS and to deal with the large amount of data in the database, database systems require high-capability hardware, which is quite costly.
4. High Conversion Cost: When a system is changed from a file system to a database system, a lot of data needs to be converted. This conversion cost is quite high.
5. Data Security: Because a large number of remote users can access the database directly, there is a strong need for security to protect it from unauthorized users.
6. Data Integrity: Since a large number of users can access the data concurrently, many users may attempt to update it simultaneously. This may cause data integrity problems, i.e. the same data differs in different places.
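
The atomicity objective described earlier in these notes can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a DBMS; the account table, the account names and the amounts are hypothetical and chosen only for illustration. A transfer that fails halfway is rolled back, so the data returns to the consistent state it had before the transfer started.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [("A", 100), ("B", 50)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # every change made inside it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when src lacks funds,
            # which happens after the credit above has already been applied.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


if __name__ == "__main__":
    bank = make_bank()
    print(transfer(bank, "A", "B", 30), balances(bank))
    print(transfer(bank, "A", "B", 500), balances(bank))
```

In the failing transfer the credit to the destination account is undone together with the rejected debit, which is hard to guarantee when each application program updates its own files directly.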
ADVANTAGES OF DATABASE SYSTEM
Major advantages of a database system are:
1. Minimum Redundancy: In a database system, all the data is centralized. So there is either no redundancy or minimum redundancy of data.
2. Efficient Updating: In a database system, data is stored with minimum redundancy, so an update rarely has to be repeated in several places. This helps ensure data integrity.
3. Sharing of Data: Data in a database can be shared by many applications without any problem. This saves a lot of the time and money that would otherwise be spent creating different files for each application. Even developing a new application is easy, as it may share the existing database and does not require new data.
4. Standards Can Be Enforced: Since all access to the database is under the control of the DBMS and the DBA, it is quite easy to enforce standards such as the naming and formatting of data. Such standards result in uniformity in the database as well as in its usage. They also make security easier.
5. Flexibility: A database system is more flexible than a file system when any change in the data is required.
6. Conflict Resolution: Since the database is under the control of the DBA, any conflicts that occur can be easily handled by the DBA, and users need not deal with them.
7. Simplified Approach: In a database system, many built-in functions are available for common tasks such as searching, sorting etc. Users find it very easy to perform such tasks and need not write programs for them, as they would in a file system.

COMPONENTS OF DATABASE SYSTEM
A database system is a computer-based record-keeping system that manages and processes the large volume of data of an entire organization. A database system has the following four major components:
1. Data: Data is a collection of facts in raw form. It is not directly useful in that form and needs some processing. It is an important component of the database system. The data acts as a bridge between the machine components (hardware and software) and the users.
2. Hardware: The physical components of the computer system (such as CPU, VDU, I/O devices etc.) are known as hardware. Since a database system can range from a single user with a desktop computer to thousands of users on mainframe computers, proper care should be taken to select appropriate hardware devices.
3. Software: A database system includes software such as the DBMS, which acts as a bridge between the user and the database. It interacts with the users, the application programs and the database to access and update data. Some commonly available DBMS are Oracle, Access, SQL Server, FoxPro, dBase etc. Major functions of a DBMS are:
(i) It manages the organization, dictionary and location of the data.
(ii) It manages data processing. It also helps extract answers that are not available in regular reports.
(iii) It speeds up programming, leaving the programmer free to concentrate on the logic of the program.
(iv) It includes special user-friendly query languages.
(v) It performs authorization services such as logging on to the DBMS, starting the database, stopping the database etc.
(vi) It performs import and export of data.
(vii) It performs user monitoring.
4. Users: Users are the persons who need information from the database. Different categories of users are given full or partial access to the database. Users fall mainly into the following three categories:
(i) Database Administrator (DBA): The DBA is a person or a group of persons responsible for the management of the database. The DBA decides which attributes are to be included in the database, their type and size, the storage media to be used, the backup and recovery techniques to be used, and who will be allowed to access the database and to what extent.
(ii) Application Programmers: Application programmers are the users responsible for developing application programs that use and manipulate the database.
These application programs operate on the data to perform various operations such as retrieving information, creating new information, and deleting or changing existing information.
(iii) End Users: End users are the users who interact with the database through application programs. End users are of two types:
(a) Casual End Users: These are users who access the database occasionally but may need different information each time. They use a query language to specify their requests. Examples include high-level managers who access the data weekly or biweekly.
(b) Naive End Users: These are the users who have the least knowledge of the system. They use only standard types of queries. They can perform limited operations and affect only a precise portion of the database. An ATM user is an example of a naive user, as he has access to only a standard (limited) portion of the database.

DATA BASE ADMINISTRATOR (DBA)
The DBA is a highly skilled person (or group of persons) who is responsible for the management of the database. He holds most of the powers, such as defining schemas, storage structures and access method strategies, physical organization, authorization and integrity constraints etc. Major functions (responsibilities) of the DBA are:
1. Deciding the information content of the database: The DBA identifies what kind of information is to be stored in the database. He identifies the entities (tables) as well as the fields to be recorded about those entities.
2. Defining the storage structure and access strategies: The DBA decides how the data is represented in the database. He also specifies the mapping between the physical schema (storage structure definition) and the conceptual schema.
3. Defining external schemas for the users: The DBA writes the necessary external schemas and ensures that the required data is made available in the required format. He also specifies the mappings between the external schemas and the conceptual schema.
4. Defining integrity constraint specifications: The DBA is responsible for the accuracy of the data in the database, which he ensures by specifying integrity constraints and checks. For example, an employee cannot work for more than 8 hours a day.
5. Granting authorization for data access: The DBA ensures that only authorized users can access the data. For this he defines the authorization checks to be carried out whenever access to sensitive data is attempted. So it is the job of the DBA to decide which part of the database should be accessible to whom.
6. Defining the strategy for backup and recovery: The DBA specifies the strategy for backup and for quick recovery of data lost due to hardware failure, software failure or human error.
7. Monitoring performance: The DBA monitors performance and works to keep it optimal. He plans future storage requirements for the database system. He also installs and upgrades the DBMS and the application tools.
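
The integrity constraint example above (an employee cannot work for more than 8 hours a day) can be sketched as a constraint declared once in the database rather than in every application program. This is a minimal sketch using Python's built-in sqlite3 module; the work_log table, its columns and the sample values are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical work_log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_log ("
        " employee TEXT NOT NULL,"
        " work_day TEXT NOT NULL,"
        # The integrity constraint is declared once, in the schema.
        " hours REAL NOT NULL CHECK (hours >= 0 AND hours <= 8),"
        # One row per employee per day, so the CHECK covers the daily total.
        " PRIMARY KEY (employee, work_day))"
    )
    return conn


def record_hours(conn, employee, work_day, hours):
    """Insert one day's hours; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO work_log VALUES (?, ?, ?)",
                (employee, work_day, hours),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(record_hours(db, "E1", "2024-01-01", 7.5))
    print(record_hours(db, "E1", "2024-01-02", 9))
```

Because the rule lives in the schema, every program that writes to the table is subject to it, and changing the limit means changing one definition instead of many programs.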