Q. What do you mean by data and its management? How are data organized?
A. What is data?

• Data generally refers to facts and figures. It may be a name, an object, a situation or anything else that one is interested in knowing.
• Data is the raw material a computer uses to produce information.
• When data is processed and presented in a meaningful way, we have information.
• Data are simply values or sets of values.
• The terms data and information are often used interchangeably, because information at one level can be used as data at another level.

Data is vital to human beings. Every moment one is eager to know something. An organization survives on data. It has to record data on stock at hand, demand for the product, marketing strategy, personnel information and others for effective decision making. Databases and database systems have become an essential component of everyday life in modern society.

A data item refers to a single unit of values. Data items that are not divided further are called elementary items, e.g. rollno, customernumber, pincode. Data items that are divided into sub-items are called group items, e.g. the address "98-C, Main Road, GandhiNagar, Berhampur". It is difficult to find the persons who belong to GandhiNagar from group items. The group item needs to be divided into elementary items as follows:

Room No.   98-C
Location   Main Road
Street     GandhiNagar
Town       Berhampur
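The split of a group item into elementary items can be sketched in Python. This is only an illustration: the `Address` class, the `split_group_item` helper and the second address are assumptions made for the example, and the helper assumes exactly four comma-separated parts, as in the address above.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Elementary items that together make up one address (a group item)."""
    room_no: str
    location: str
    street: str
    town: str


def split_group_item(text: str) -> Address:
    # Assumes exactly four comma-separated parts, as in the example address.
    room_no, location, street, town = [part.strip() for part in text.split(",")]
    return Address(room_no, location, street, town)


addresses = [
    split_group_item("98-C, Main Road, GandhiNagar, Berhampur"),
    split_group_item("12-A, Station Road, Aska Road, Berhampur"),  # made-up sample
]

# Once the street is an elementary item, selecting by street is a simple comparison.
in_gandhinagar = [a for a in addresses if a.street == "GandhiNagar"]
print(in_gandhinagar)
```

With the address stored as one string, the same selection would need error-prone text searching; with elementary items it is a direct field comparison.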

Data must be managed. Data management consists of the following activities:

• Data capture: collecting data as and when they originate.
• Data classification: classifying captured data based on its nature and intended usage.
• Data storage: the classified data has to be stored permanently so that it can be accessed whenever required.
• Data arranging: presenting the stored data in a particular order based on the user's requirements.
• Data retrieval: retrieving data for further processing or decision making.
• Data maintenance: keeping data up to date; this involves modification, insertion or deletion of data in storage.
• Data verification: checking that data is valid and accurate before storing it.
• Data coding: representing data in another form for easy reference. Coding helps in reducing the amount of storage.
• Data editing: modifying the existing data.
• Data transcription: converting data from one form into another.
• Data transmission: forwarding data and information from one place to another over a network for further processing.

Organization of data

• A collection of data may be organized in a hierarchy of fields, records and files.
• This hierarchy of fields, records and files reflects the relationship between attributes, entities and entity sets.
• An entity is something that has certain attributes or properties.
• Each attribute may be assigned a value, and the value may be numeric or non-numeric.
• An entity set is a group of entities with similar attributes.
• A field is a single elementary unit of information representing an attribute of an entity.
• A record is the collection of field values of a given entity.
• A file is the collection of records of the entities in a given entity set.
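The field, record and file hierarchy described above can be pictured with ordinary Python containers. This is a minimal sketch; the student data and the `field_values` helper are invented for illustration.

```python
from typing import Dict, List

Record = Dict[str, object]   # one entity: field name -> field value
File = List[Record]          # one entity set: a collection of records

# A file holding the records of the entity set "student" (made-up data).
student_file: File = [
    {"rollno": 1, "name": "Asha", "pincode": "760001"},
    {"rollno": 2, "name": "Ravi", "pincode": "760002"},
]


def field_values(file: File, field: str) -> list:
    """Return the value of one field (attribute) from every record in the file."""
    return [record[field] for record in file]


rollnos = field_values(student_file, "rollno")
print(rollnos)
```

Each dictionary key corresponds to an attribute, each dictionary to an entity, and the list to the entity set.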

Q. What is Database? Write its characteristics.
Generally, a database is a collection of related data. For effective decision making, one must record and maintain data on the various facts of interest. For example, an organization records data about the products it manufactures and markets and about its customers; a college about courses, students and the library; a hospital about patients and doctors.

Properties of Database: A database has the following properties:
• A database represents some aspect of the real world.
• A database is a collection of logically related data.
• A database is built and maintained for a specific purpose.

Q. What is DBMS?
A database management system is a software system that allows users to create and maintain a database. The objective of a DBMS is to facilitate the process of defining, storing and retrieving the information contained in the database among various users and applications. Another important function of the DBMS is to exert centralized control of the database, including protection against hardware/software malfunction and against access by unauthorized users.

[Figure: a database system environment. Users/programmers work through application programs; the DBMS software processes queries and accesses the stored data, which consists of the meta data and the database.]

Q. What is DBMS? Write its advantage and disadvantages.
Advantages and disadvantages of database approach

Advantages
The advantages of a DBMS over a traditional (manually maintained) database are as follows:

Redundancy can be reduced: In a non-database system, every user group maintains its own files for handling its data processing applications. Centralized control of data avoids unnecessary duplication of data and effectively reduces the total amount of data storage required.

[Figure: in the non-database approach, Application 1 and Application 2 each keep their own offline and online data.]

Inconsistency can be avoided: Duplication of data may lead to inconsistency: when one of the two entries has been updated, the other may not have been. At such times the database is said to be inconsistent, as the two entries will not agree. If the redundancy is removed, inconsistency can be reduced. Alternatively, if the redundancy is not removed but is controlled, then the DBMS can guarantee that the database is never inconsistent, by ensuring that any change made to either of the two entries is automatically applied to the other one as well. This process is known as propagating updates.

Data can be shared: The centralized database allows sharing of data among a number of application programs and users. Besides, it also allows new applications to operate against the same data.

[Figure: with a DBMS, Application 1, Application 2, ..., Application n share the online data through the database management system.]

Integrity can be maintained: Integrity implies that the data contained in the database is both accurate and consistent. The DBA implements integrity constraints that check data values to ensure that they fall within a specified range and are of the correct format.

Transaction support can be provided: A transaction is a logical unit of work, typically involving several database operations. The transaction atomicity feature guarantees that if two or more operations are part of the same transaction, then the system performs either all of them or none of them, even in abnormal situations, e.g. when the system fails.
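The integrity and atomicity points can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not a prescribed method: SQLite stands in for the DBMS, and the `account` table, the `transfer` function and the balances are invented for illustration. A `CHECK` constraint plays the role of an integrity constraint, and the `with conn:` block makes the two updates one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint: a balance may never become negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move amount from src to dst; both updates take effect or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)        # valid transfer
failed = transfer(conn, 1, 2, 500)   # debit would violate the CHECK constraint
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the second call the credit to account 2 has already been executed when the debit fails, yet the rollback undoes it, so the database never shows a half-finished transfer.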

Security can be enforced: Data is vital to an organization and it is confidential. Such confidential data must not be accessed by unauthorized users. The DBA has the responsibility to ensure that proper access procedures are followed, including proper authentication schemes for access to the DBMS and additional checks before permitting access to sensitive data.

Standards can be enforced: With a centralized database, the DBA can ensure that all applicable standards are observed in the representation of the data. Applicable standards include any or all of the following: departmental, installation, corporate, industry, national and international. Standardization is desirable as an aid to data interchange, or movement of data between systems.

Conflict resolution: Since the database is under the control of the DBA, she/he has to choose the best file structure and access method to get optimal performance for the response-critical applications, while permitting less critical applications to continue to use the database.

Disadvantages
Problems associated with centralization: Centralization means the data is accessible from a single source. This increases the potential severity of security breaches and of disruption to the operation of the organization because of downtime and failure.

Cost of software/hardware and training: This includes the cost of purchasing or developing software, the cost of upgrading hardware to allow for extensive programs, and others.

Complexity of backup, recovery and security: The processing overhead introduced by the DBMS to implement security, integrity, and sharing of the data causes a degradation of response and throughput time. The backup and recovery operations are fairly complex in a DBMS environment.

Q. What is Database System?
A database system is a collection of four major components. These are:
1. Data: facts and figures on which the database is maintained.
2. Hardware: consists of processor, memory, secondary storage devices, I/O devices etc.
3. Software: a software system that works between the user and the physical database, commonly known as DBMS.

4. Users: end users, application programmers and database administrators.

Q. Write a short note on various users of DBMS.
There are a number of users of a database. They can be classified into the following groups.

1. Database Administrator: When a database is used by many users, there must be centralized control of the database to administer the resources. It is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the DBA. They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.

2. Database Designers: responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is actually implemented and populated with data. It is the responsibility of database designers to communicate with all prospective database users in order to understand their requirements, and to come up with a design that meets these requirements.

3. Application programmers and system analysts: The system analysts determine the requirements of the end users, and develop specifications for canned transactions that meet these requirements. Application programmers implement these specifications as programs.

4. End users: people whose jobs require access to the database for querying, updating, and generating reports; the database primarily exists for their use. There are several categories of end users, depending on their degree of expertise or mode of interaction with the DBMS.
a. Casual end users: occasionally access the database, but they may need different information each time.
b. Naïve end users: they form a sizable portion of database end users. Their main job function revolves around constantly querying and updating the database, using standard types of queries and updates, called canned transactions, that have been carefully programmed and tested.
c. Sophisticated end users: include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS so as to implement their applications to meet their requirements.
d. Stand-alone end users: maintain personal databases by using ready-made program packages that provide easy-to-use menu-based interfaces.

Q. Briefly define the role / task of Database Administrator
A. The DBA is a person or group of persons who has centralized control over the database. The DBA is the custodian of the data and controls the database structure. The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use, and for acquiring software and hardware resources as needed. The DBA is accountable for problems such as breach of security or poor system response time.

Working on three-level architecture: The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the conceptual level of the database. The DBA further specifies the external view of the various users and applications and is responsible for the definition and implementation of the internal level. Mappings between the internal and conceptual levels, as well as between the conceptual and external levels, are also defined by the DBA.

Defining security and integrity constraints: The DBA is responsible for granting permission to the users of the database and stores the profile of each user in the database. Besides, he/she must implement procedures for integrity constraints.

Defining dump and reload policies: The DBA is also responsible for defining procedures to recover the database from failures due to human, natural or hardware causes with minimal loss of data.

Monitoring performance and responding to changing requirements: The DBA is responsible for organizing the system in such a way as to get the best performance.

Q. What is the three-level architecture of DBMS? How does it have to be mapped?
ANSI/SPARC [1] developed the three-level architecture, commonly known as the ANSI/SPARC architecture. The goal of this architecture is to separate user applications and the physical database. According to the different categories of users, and the abstraction at each level, the view at each of these levels is described by a scheme. A scheme is an outline or a plan that describes the records and relationships existing in the view. The word scheme, which means a systematic plan for attaining some goal, is used interchangeably in the database literature with the word schema. In this architecture, schemas can be defined at the following three levels.

1. Internal level has an internal schema, which describes the physical storage structure of the database. It is a low-level representation of the entire database. The internal view is described by means of the internal schema, which not only defines the various stored record types but also specifies what indexes exist, how stored fields are represented, what physical sequence the stored records are in, and so on.

2. Conceptual/Global level has a conceptual schema, which describes the structure of the whole database for a community of users. It is a representation of the entire information content of the database, in a form that is somewhat abstract in comparison with the way in which the data is physically stored. It hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations and constraints.

3. External/View level includes a number of external schemas or user views. The external level is the individual user level. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. An external view is thus the contents of the database as seen by some particular user.

[Figure: the three views of an employee record.
External (logical/user) views: two views, each containing Employee Name together with some of Employee Address, Employee Designation and Employee Salary.
Conceptual view: Employee Name: String; Employee Address: String; Employee Designation: String; Employee Salary: Integer; Employee No: Key.
Internal view: Name: String length 25 offset 0; Address: String length 50 offset 25; Designation: String length 10 offset 35; Employee Salary: 10.2 offset 45; Employee No: 4 offset 57.]

[1] American National Standards Institute/Standards Planning and Requirements Committee

Mapping between views
In addition to the three levels, the architecture involves certain mappings. In a DBMS based on the three-schema architecture, each user group refers only to its own external schema. Hence, the DBMS must transform a request specified on the external schema into a request against the conceptual schema, and then into a request on the internal schema for processing over the stored database. If the request is a database retrieval, the data extracted from the stored database must be reformatted to match the user's external view. The processes of transforming requests and results between levels are called mappings. Two mappings are required in a database system with three different views.

[Figure: View 1, View 2, ..., View n are mapped by the DBMS to the conceptual schema, which is mapped by the DBMS/OS to the internal schema.]

1. A mapping between the conceptual and internal levels: defines the correspondence between the conceptual view and the stored database; it specifies how conceptual records and fields are represented at the internal level.
2. A mapping between the external and conceptual levels: defines the correspondence between a particular external view and the conceptual view.

Q. What is Data independence?

Data independence can be defined as the capacity to change the schema at one level of a database system without having to change the schema at the next higher level.

In pre-database systems, the way data is physically represented in secondary storage, and the technique used to access it, are both dictated by the requirements of the application under consideration. Applications of this type are data-dependent because it is impossible to change the physical representation or access technique without affecting the application. Different applications will require different views of the same data. The DBA must have the freedom to change the physical representation or access technique in response to changing requirements, without having to modify existing applications.

The three-level architecture, along with the mappings, provides two distinct levels of data independence. These are as follows.

1. Logical data independence indicates that the conceptual schema can be changed without affecting the existing external schemas. We may change the conceptual schema to expand the database. Logical data independence is achieved by providing the external level or user view of the database. The application programs or users see the database as described by their respective external views. It also insulates application programs from operations such as combining two records into one or splitting an existing record into two or more records.

2. Physical data independence indicates that the physical storage structures or devices used for storing the data could be changed without necessitating a change in the conceptual view or any of the external views. If there is a need to change the file organization or the type of physical device used as a result of growth in the database or new technology, a change is required in the transformation functions between the physical and conceptual levels. This change is necessary to keep the conceptual level invariant. Altering the physical database organization, however, can affect the response and efficiency of existing application programs.

Q. Differentiate between DDL and DML.
A DBMS must provide languages and interfaces for its various users.

Data Definition Language (DDL): Once the design of a database is completed and a DBMS is chosen to implement the database, the first thing is to specify conceptual and internal schemas for the database and the mappings between the two. The DBMS provides a facility known as Data Definition Language, which can be used to define the conceptual scheme and also give some details about how to implement this scheme in the physical devices used to store the data. This definition includes all the entity sets and their associated attributes as well as the relationships among the entity sets. The definition also includes any constraints that have to be maintained, including the constraints on the value that can be assigned to a given attribute and the constraints on the values assigned to different attributes in the same or different records. These definitions, which can be described as meta-data about the data in the database, are expressed in the DDL of the DBMS and maintained in a compiled form. The compiled form of the definitions is known as the data dictionary, directory, or system catalog.

The DBMS maintains the information on the file structure and the method used to efficiently access the relevant data. It also provides a method whereby the application programs indicate their data requirements. The application program could use a subset of the conceptual data definition language or a separate language. The database system also contains mapping functions that allow it to interpret the stored data for the application system.

The internal schema is specified in a somewhat similar data definition language called the data storage definition language. The definition of the internal view is compiled and maintained by the DBMS. The compiled internal schema specifies the implementation details of the internal database, including the access method employed.

Data Manipulation Language (DML): Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. The DBMS provides a set of operations or a language called the data manipulation language. The language used to manipulate data in the database is called Data Manipulation Language. Typical manipulations include retrieval, insertion, deletion, and modification of the data.

Data manipulation involves retrieval of data from the database, insertion of new data into the database, and deletion or modification of existing data. The first of these data manipulation operations is called a query. A query is a statement in the DML that requests the retrieval of data from the database. The subset of the DML used to pose a query is known as a query language. The DML provides commands to select and retrieve data from the database. Commands are also provided to insert, update and delete records. They could be used in an interactive mode or embedded in conventional programming languages. The data manipulation functions provided by the DBMS can be invoked in application programs directly by procedure calls or by preprocessor statements. The latter would be replaced by appropriate procedure calls by either a preprocessor or the compiler.

The DML can be procedural: the user indicates not only what to retrieve but how to go about retrieving it. If the DML is nonprocedural, the user has to indicate only what is to be retrieved.

There are two main types of DMLs:
1. high-level or nonprocedural DML
2. low-level or procedural DML

1. Nonprocedural DML: It can be used on its own to specify complex database operations in a concise manner. Many DBMSs allow high-level DML statements either to be entered interactively from a terminal or to be embedded in a general-purpose programming language. In the latter case, DML statements must be identified within the program so that they can be extracted by a pre-compiler and processed by the DBMS. High-level DMLs, such as SQL, can specify and retrieve many records in a single DML statement and are hence called set-at-a-time or set-oriented DMLs. A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it; hence, such languages are also called declarative.

2. Procedural DML: It must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records from the database and processes each separately. Hence it needs to use programming language constructs, such as looping, to retrieve and process each record from a set of records. Low-level DMLs are also called record-at-a-time DMLs.

Whenever DML commands (high or low level) are embedded in a general-purpose programming language, that language is called the host language and the DML is called the data sublanguage.
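The difference between DDL and DML, and between set-at-a-time and record-at-a-time processing, can be illustrated with Python's built-in `sqlite3` module. This is a hedged sketch: SQLite stands in for the DBMS, Python is the host language and SQL the data sublanguage; the `employee` table and its rows are invented. SQL has no truly procedural DML, so the loop below only imitates record-at-a-time processing in the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including constraints on attribute values.
conn.execute("""
    CREATE TABLE employee (
        emp_no      INTEGER PRIMARY KEY,
        name        TEXT    NOT NULL,
        designation TEXT,
        salary      INTEGER CHECK (salary >= 0)
    )
""")

# The stored definitions live in the system catalog (sqlite_master in SQLite).
catalog = [row[0] for row in
           conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# DML: insertion of new data (made-up rows).
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Asha", "Supervisor", 9000),
    (2, "Ravi", "Clerk", 7000),
    (3, "Mina", "Manager", 20000),
])

# Set-at-a-time (nonprocedural): state what is wanted; the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE salary > 8000 ORDER BY emp_no")]

# Record-at-a-time style: the host program loops and tests each record itself.
procedural = []
for emp_no, name, salary in conn.execute(
        "SELECT emp_no, name, salary FROM employee ORDER BY emp_no"):
    if salary > 8000:
        procedural.append(name)

# DML: modification and deletion of existing data.
conn.execute("UPDATE employee SET salary = salary + 500 WHERE emp_no = 2")
conn.execute("DELETE FROM employee WHERE emp_no = 3")
remaining = conn.execute(
    "SELECT emp_no, salary FROM employee ORDER BY emp_no").fetchall()
print(catalog, declarative, procedural, remaining)
```

Both retrieval styles return the same names; the declarative form leaves the access strategy to the DBMS, while the loop spells it out in the host language.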

On the other hand, a high-level DML used in a stand-alone interactive manner is called a query language.

Q. Briefly describe Data Modeling and its classification.
A. A model is an abstraction process that hides superfluous details while highlighting details pertinent to the applications. A data model is a mechanism that provides this abstraction for database applications. Data modeling is used for representing entities of interest and their relationships in the database. A number of models for data representation have been developed. Most data representation models provide mechanisms to structure data for the entities being modeled and allow a set of operations to be defined as well as constraints to be enforced on them. These models differ in their method of representing the associations amongst entities and attributes. Data models can be classified as follows:

File-based system or primitive models: Entities or objects of interest are represented by records that are stored together in files. Relationships between objects are represented by using directories of various kinds.

Traditional data models: Traditional data models include the hierarchical, network and relational models. The hierarchical model evolved from the file-based system, and the network model is a superset of the hierarchical model. The relational model is based on the mathematical concept of a relation.

Semantic data models: These models are influenced by the semantic networks developed by artificial intelligence researchers. Semantic networks were developed to organize and represent general knowledge. Semantic data models are able to express greater interdependencies among entities of interest. Semantic modeling is known by many names, such as entity/relationship modeling, entity modeling and object modeling.

Entity-relationship model: Earlier commercial systems were based on the hierarchical and network approaches. The E-R model is a generalization of these models. E/R diagrams constitute a technique for representing the logical structure of a database in a pictorial manner. They provide a simple and readily understood means of communicating the salient features of the design of any given database. In this model, similar objects are collected into an entity set. The relationship between entity sets is represented by a named E-R relationship and is 1:1, 1:M, or M:N, mapping from one entity set to another.

Entity: An entity is a thing which can be distinctly identified. Entities are the basic units used in modeling classes. An entity may be an object with a physical existence, such as employee, dog, person etc., or an object with a conceptual existence, such as college, organization, job etc. An entity set is a group of similar objects with the same set of attributes or properties. To store data on an entity set, we have to create a model for it by gathering the important properties that concern the organization. All entities or relationships of a given type have certain kinds of properties in common.

Entities can be classified into regular entities (strong entities) and weak entities. A weak entity is an entity that is existence-dependent on some other entity, in the sense that it cannot exist if that other entity does not also exist. A regular entity, on the other hand, is an entity that is not weak. Each entity type is shown as a rectangle containing the name of the entity type in question; for a weak entity type, the border of the rectangle is doubled.

[Figure: a strong entity drawn as a single-bordered rectangle and a weak entity as a double-bordered rectangle; the example entity student has the attributes name, class, sex, roll and age.]

Attributes: Each entity has attributes that describe it. The properties that characterize an entity set are called its attributes. Attributes are also known as data elements, data fields, data items, etc. Attributes can be:

Simple or composite: A composite attribute contains a number of simple attributes. For example, the employee name contains first name, middle name and last name. The employee name is considered a composite attribute, whereas first name, middle name and last name are called simple attributes.

Single or multi-valued: Most attributes have a single value for a particular entity; such attributes are called single-valued.

For example, age is a single-valued attribute of an entity. Some attributes, however, can take several values for the same entity, e.g. room type in a lodge or qualification of a person; such attributes are multi-valued.

Base or derived: In some cases, one attribute depends on some other attribute. For example, the DA attribute is dependent on salary. The DA attribute is hence called a derived attribute, whereas salary is a base or stored attribute.

Missing / null value: In some cases a particular entity may not have an applicable value for an attribute, e.g. father's name. The value is then recorded as unknown or not applicable.

Key: Although an entity has a number of attributes, a particular attribute usually identifies each entity uniquely. That is, an entity type usually has an attribute whose values are distinct for each individual entity in the entity set. Such an attribute is called a key attribute. For example, account number in banking is said to be a key.

Attributes are shown as ellipses containing the name of the property in question and attached to the relevant entity or relationship by means of a solid line. The ellipse border is dotted or dashed if the property is derived and doubled if it is multi-valued. If the property is composite, its component properties are shown as further ellipses, connected to the ellipse for the composite property in question by means of further solid lines. Key properties are underlined. The value sets corresponding to properties are not shown.

[Figure: ellipse notation for a property, a derived property, a multivalued property, a key property and a composite property.]

Relationships: An entity can be related to others by means of relationships. An association among entities is called a relationship. The relationship set is used in data modeling to represent an association between entity sets. The association could have certain properties represented by the attributes of the relationship set. The entities involved in a given relationship are said to be participants in that relationship. The number of participants in a given relationship is called the degree of that relationship.

Each relationship type is shown as a diamond containing the name of the relationship type in question. The diamond border is doubled if the relationship in question is that between a weak entity type and the entity type on which its existence depends. The participants in each relationship are connected to the relevant relationship by means of solid lines; each such line is labeled "1" or "M" to indicate whether the relationship is one-to-one, many-to-one etc. The line is doubled if the participation is total.

[Figure: diamond notation for a relationship, with lines labeled 1 and M, and for a relationship with a weak entity.]

Entity subtypes and supertypes: Any given entity is of at least one entity type, but an entity can be of several types simultaneously. For example, if some employees are programmers, then we might say that entity type PROGRAMMER is a subtype of entity type EMPLOYEE, or that entity type EMPLOYEE is a supertype of entity type PROGRAMMER. Every programmer is an employee, but the converse is not true, and a programmer may have properties that do not apply to employees in general. Let entity type B be a subtype of entity type A. Then we draw a solid line from A to B, marked with an arrowhead at the B end. The line denotes what is sometimes called "the isa relationship".

[Figure: a subtype entity connected to its supertype entity.]

Association of data
Information is obtained from data by using the context in which the data is obtained and made available. An association between two attributes indicates that the values of the associated attributes are interdependent. This correspondence between attributes of an entity is a property of the information that is used in modeling the object. For example, in a bank it is the account number which provides details about a customer. The association among various sets of data represented in the database is called a relationship.

[Figure: instances r1 to r7 of the relationship type WORKS_FOR between EMPLOYEE entities e1 to e7 and DEPARTMENT entities d1 to d3.]

A relationship type R among n entity types E1, E2, E3, …, En defines a set of associations, or a relationship set, among entities from these entity types. Mathematically, the relationship set R is a set of relationship instances ri, where each ri associates n individual entities (e1, e2, e3, …, en), and each entity ej in ri is a member of entity type Ej, 1 ≤ j ≤ n. Hence, a relationship type is a mathematical relation on E1, E2, E3, …, En; alternatively, it can be defined as a subset of the Cartesian product E1 × E2 × E3 × … × En. Each of the entity types E1, E2, E3, …, En is said to participate in the relationship type R; similarly, each of the individual entities e1, e2, e3, …, en is said to participate in the relationship instance ri = (e1, e2, e3, …, en).

Relationship degree: The degree of a relationship type is the number of participating entity types. Hence, the WORKS_FOR relationship is of degree two. Relationships can generally be of any degree, but the ones most common are binary relationships.

Cardinality ratios for binary relationships: The cardinality ratio for a binary relationship specifies the maximum number of relationship instances that an entity can participate in. For example, in the WORKS_FOR binary relationship type, DEPARTMENT:EMPLOYEE is of cardinality ratio 1:N, meaning that each department can be related to any number of employees, but an employee can be related to only one department. The possible cardinality ratios for binary relationship types are 1:1, 1:N, N:1, and M:N.

[Figure: A 1:1 relationship, MANAGES, between EMPLOYEE and DEPARTMENT.]
[Figure: An M:N relationship, WORKS_ON, between EMPLOYEE and PROJECT.]

Case study
A company database keeps track of a company's employees, departments and projects. Suppose that after the requirements collection and analysis phase, the database designers provided the following description to be represented in the database.

1. The company is organized into departments. Each department has a unique name, a unique number, and a particular employee who manages the department. We keep track of the start date when that employee began managing the department. A department may have several locations.
2. A department controls a number of projects, each of which has a unique name, a unique number, and a single location.
3. We store each employee's name, social security number, address, salary, sex, and birth date. An employee is assigned to one department but may work on several projects, which are not necessarily controlled by the same department. We keep track of the number of hours per week that an employee works on each project. We also keep track of the direct supervisor of each employee.
4. We want to keep track of the dependents of each employee for insurance purposes. We keep each dependent's first name, sex, birth date, and relationship to the employee.

[Figure: ER diagram of the company database. Entity types: EMPLOYEE (Name composed of Fname, Mname, Lname; Ssn; Bdate; Address; Salary; Sex), DEPARTMENT (Name, Number, Location, Number of employees), PROJECT (Name, Number, Location), DEPENDENT (Name, Sex, Birthdate, Relation). Relationship types: WORKS_FOR (EMPLOYEE N : 1 DEPARTMENT), MANAGES (EMPLOYEE 1 : 1 DEPARTMENT, with attribute Startdate), CONTROLS (DEPARTMENT 1 : N PROJECT), WORKS_ON (EMPLOYEE M : N PROJECT, with attribute Hours), SUPERVISION (EMPLOYEE 1 : N EMPLOYEE, roles Supervisor and Supervisee), DEPENDENTS_OF (EMPLOYEE 1 : N DEPENDENT).]
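The definition of a relationship type as a subset of a Cartesian product, and the idea of a cardinality ratio, can be sketched in a few lines of Python. This is only an illustration: the entity identifiers and the helper names `is_relationship` and `max_participation` are assumptions made for the example, not part of the case study.

```python
from itertools import product

# Hypothetical entity sets; the identifiers are illustrative only.
EMPLOYEE = {"e1", "e2", "e3", "e4"}
DEPARTMENT = {"d1", "d2"}

# Each relationship instance pairs one employee with one department.
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2"), ("e4", "d2")}


def is_relationship(instances, *entity_sets):
    """Return True if every instance lies in the Cartesian product of the entity sets."""
    return set(instances) <= set(product(*entity_sets))


def max_participation(instances, position):
    """Largest number of instances that any one entity at `position` takes part in."""
    counts = {}
    for instance in instances:
        entity = instance[position]
        counts[entity] = counts.get(entity, 0) + 1
    return max(counts.values(), default=0)


degree = len(next(iter(WORKS_FOR)))               # number of participating entity types
per_employee = max_participation(WORKS_FOR, 0)    # departments related to one employee
per_department = max_participation(WORKS_FOR, 1)  # employees related to one department
print(degree, per_employee, per_department)
```

In this sample an employee takes part in at most one WORKS_FOR instance while a department takes part in several, which is what the 1:N ratio DEPARTMENT:EMPLOYEE describes.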

THE RELATIONAL DATA MODEL

Q. What is relational model? Define the structure of a relational database.

A. A model is an abstraction process that hides superfluous details while highlighting details concerning the application at hand. There are a number of data models, such as the file-based model, the traditional data models and the semantic data model. The traditional data models include the hierarchical, network and relational data models. Among these, the relational data model is the foundation on which modern database system technology is based. It was introduced in 1970 by Ted Codd of IBM and, after more than a decade, emerged from research as a commercial product.

The relational data model is conceptually simple and based on sound theoretical principles. It is based on the mathematical concept of a relation. In this model, the relation is the only construct required to represent the associations among the attributes of an entity as well as the relationships among different entities. It consists of three basic components:
• A set of domains and a set of relations
• Operations on relations
• Integrity rules

Formal relational term    Informal equivalent
Relation                  Table
Tuple                     Row or record
Cardinality               Number of rows
Attribute                 Column or field
Degree                    Number of columns
Primary key               Unique identifier
Domain                    Pool of legal values

Relational model concepts
The relational model represents the database as a collection of relations. A relation may be visualized as a named table. The following structure shows a relation EMPLOYEE.

[Figure: the relation EMPLOYEE with attributes for employee code, name, salary and designation and five tuples (E1 to E5), annotated with the terms relation, tuples, attributes, domains, primary key, cardinality and degree.]
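The formal terms in the table above can be tried out on a small relation held in Python. This is a minimal sketch; the attribute names and the three rows are assumptions chosen for illustration and are not the data of the EMPLOYEE figure.

```python
# Illustrative EMPLOYEE relation: a heading (attribute names) and a body (set of tuples).
# The rows below are assumed sample data.
HEADING = ("emp_code", "name", "salary", "designation")
BODY = {
    ("E1", "Satya", 9000, "Supervisor"),
    ("E2", "Ramesh", 7000, "Office Assistant"),
    ("E3", "Mahesh", 20000, "Executive"),
}

degree = len(HEADING)    # number of attributes (columns)
cardinality = len(BODY)  # number of tuples (rows)

# A primary key must take a different value in every tuple.
emp_codes = {row[HEADING.index("emp_code")] for row in BODY}
emp_code_is_unique = len(emp_codes) == cardinality

# Adding a tuple that is already present does not change a set.
BODY_AGAIN = BODY | {("E1", "Satya", 9000, "Supervisor")}

print(degree, cardinality, emp_code_is_unique, len(BODY_AGAIN))
```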

Domains, attributes, tuples and relations
A relation is thought of as a table of values; each row in the table represents a collection of related data values. The table name and column names are used to help in interpreting the meaning of the values in each row. All values in a column are of the same data type. In relational database systems, attributes correspond to fields. From the relational model point of view, a row is called a tuple, a column header is called an attribute, and the table is called a relation. The data type describing the types of values that can appear in each column is represented by a domain of possible values.

Domain: An object or entity is characterized by its attributes. For a given application, an attribute may only be allowed to take a value from a set of permissible values. This set of allowable values for the attribute is the domain of the attribute. A domain Di is a set of values of the same data type. Domain Di is said to be simple if all its elements are non-decomposable. A domain, like a data type, may be unstructured (atomic) or structured. Atomic domains are general sets, such as the set of integers, real numbers, character strings, and so on. The domain for the attribute address, for instance, which specifies street number, street name, city, state and zip or postal code, is considered a composite domain. A common method of specifying a domain is to specify a data type from which the data values forming the domain are drawn. For example:
Employee_Age : each must be a value between 18 and 60, i.e. EMP_AGE : {x | x is a positive integer ∧ 18 ≤ x ≤ 60}
Employee_name : set of character strings that represent names of persons
Phone_number : set of seven-digit numbers valid within a particular area

Attribute: Attributes are defined on some underlying domain. That is, they can assume values from the set of values in the domain. Attributes defined on the same domain are comparable, as these attributes draw their values from the same set.

Relations: A relation schema R, denoted by R(A1, A2, …, An), is made up of a relation name R and a list of attributes A1, A2, …, An. Each attribute Ai is the name of a role played by some domain D in the relation schema R. D is called the domain of Ai and is denoted by dom(Ai). A relation schema is used to describe a relation; R is called the name of this relation.
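The example domains can be written as membership tests, and a relation schema as a mapping from attribute names to domains. The sketch below assumes that a name is any non-empty string and that phone numbers are stored as strings; the function and variable names are hypothetical.

```python
def in_emp_age(value):
    """EMP_AGE: integers x with 18 <= x <= 60."""
    return isinstance(value, int) and not isinstance(value, bool) and 18 <= value <= 60


def in_phone_number(value):
    """Phone_number: strings of exactly seven decimal digits."""
    return isinstance(value, str) and len(value) == 7 and all(c in "0123456789" for c in value)


def in_employee_name(value):
    """Employee_name: non-empty character strings (a simplifying assumption)."""
    return isinstance(value, str) and value.strip() != ""


# Hypothetical relation schema: attribute name -> membership test of dom(attribute).
EMPLOYEE_SCHEMA = {
    "Employee_name": in_employee_name,
    "Employee_Age": in_emp_age,
    "Phone_number": in_phone_number,
}


def conforms(tuple_, schema):
    """True if the tuple has exactly the schema's attributes and every value is in its domain."""
    return tuple_.keys() == schema.keys() and all(schema[a](v) for a, v in tuple_.items())


ok = conforms({"Employee_name": "Satya", "Employee_Age": 35, "Phone_number": "2201234"}, EMPLOYEE_SCHEMA)
too_young = conforms({"Employee_name": "Hari", "Employee_Age": 17, "Phone_number": "2201234"}, EMPLOYEE_SCHEMA)
print(ok, too_young)
```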

Properties of relations
Relations possess certain properties, all of them immediate consequences of the definition of relation.
1. There are no duplicate tuples: this property follows from the fact that the body of the relation is a mathematical set; sets in mathematics do not include duplicate elements.
2. Tuples are unordered, top to bottom: this property also follows from the fact that the body of the relation is a mathematical set; sets in mathematics are not ordered.
3. Attributes are unordered, left to right: this property follows from the fact that the heading of a relation is also a set. Thus there is no such thing as "the first attribute" or "the second attribute", because attributes are referenced by name, not by position.
4. Each tuple contains exactly one value for each attribute: this property follows immediately from the definition of a tuple; a tuple is a set of n components or ordered pairs of the form ai:vi (i = 1, 2, 3, …, n).

MANUFACTURER
Product_name    Manufacturer
Sprinz          Cavinkare Pvt. Ltd.
Ariel           Procter & Gamble
Clinic          Hindustan Lever Ltd.
Ponds           Hindustan Lever Ltd.

PRODUCT
Product_name    Item
Sprinz          Deodorant
Sprinz          Soap
Ariel           Detergent powder
Ariel           Soap
Sprinz          Talcum powder
Clinic          Clinic Plus shampoo
Clinic          Clinic Coconut Oil
Ponds           Talcum powder

PRODUCT also has a Price attribute, holding the values 15.00, 25.00, 10.00, 2.00, 3.00, 27.00, 25.00 and 70.00; the pairing of prices with rows beyond the first (Sprinz Deodorant, 15.00) is not recoverable here.
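MANUFACTURER and PRODUCT share the attribute Product_name, so they can be combined by a natural join. A minimal Python sketch follows, with tuples held as dictionaries so that attributes are referenced by name rather than by position. The Price attribute is left out to keep the example short, and `natural_join` is a hypothetical helper, not a library function.

```python
MANUFACTURER = [
    {"Product_name": "Sprinz", "Manufacturer": "Cavinkare Pvt. Ltd."},
    {"Product_name": "Ariel", "Manufacturer": "Procter & Gamble"},
    {"Product_name": "Clinic", "Manufacturer": "Hindustan Lever Ltd."},
    {"Product_name": "Ponds", "Manufacturer": "Hindustan Lever Ltd."},
]

PRODUCT = [
    {"Product_name": "Sprinz", "Item": "Deodorant"},
    {"Product_name": "Sprinz", "Item": "Soap"},
    {"Product_name": "Ariel", "Item": "Detergent powder"},
    {"Product_name": "Ariel", "Item": "Soap"},
    {"Product_name": "Sprinz", "Item": "Talcum powder"},
    {"Product_name": "Clinic", "Item": "Clinic Plus shampoo"},
    {"Product_name": "Clinic", "Item": "Clinic Coconut Oil"},
    {"Product_name": "Ponds", "Item": "Talcum powder"},
]


def natural_join(r, s):
    """Combine every pair of tuples that agree on all attribute names they share."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
    return result


joined = natural_join(MANUFACTURER, PRODUCT)
for row in joined:
    print(row["Product_name"], "|", row["Manufacturer"], "|", row["Item"])
```

Every PRODUCT tuple matches exactly one MANUFACTURER tuple here, so the join has as many tuples as PRODUCT.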

Join relations MANUFACTURER and PRODUCT

Product_name    Manufacturer            Item
Sprinz          Cavinkare Pvt. Ltd.     Deodorant
Sprinz          Cavinkare Pvt. Ltd.     Soap
Ariel           Procter & Gamble        Detergent powder
Ariel           Procter & Gamble        Soap
Sprinz          Cavinkare Pvt. Ltd.     Talcum powder
Clinic          Hindustan Lever Ltd.    Clinic Plus shampoo
Clinic          Hindustan Lever Ltd.    Clinic Coconut Oil
Ponds           Hindustan Lever Ltd.    Talcum powder

Each row also carries the Price value of the corresponding PRODUCT tuple. The number of tuples in the join of MANUFACTURER and PRODUCT is the same as the number in PRODUCT, because every tuple in PRODUCT has the same value of the Product_name attribute as exactly one tuple in MANUFACTURER.

Set theory
A set is a well-defined collection of objects. It is commonly represented by a list of its elements or by the specification of some membership condition. The intension of a set defines the permissible occurrences by specifying a membership condition. The extension of the set specifies one of the numerous possible occurrences by explicitly listing the set members. For example:
Intension of set G: {g | g is an odd positive integer less than 10}
Extension of set G: {1, 3, 5, 7, 9}

A set is determined by its members. The number 3 is a member of the set G, and this is denoted by 3 ∈ G. Given an object g and the set G, exactly one of the statements "g is a member of G" or "g is not a member of G" is true.

Operations on sets include the union, intersection, Cartesian product, and difference operations. The union of two sets G and H (G ∪ H) is the set that contains all elements belonging to G, to H, or to both. If sets G and H have any elements in common, the union will not duplicate those members. The intersection of the sets G and H (G ∩ H) is the set composed of all elements belonging to both G and H.

If G and H are two sets, then G is included in H (G ⊆ H) if and only if each member of G is also a member of H. Should there be an element h such that h ∈ H but h ∉ G, then G is a proper subset of H (G ⊂ H). e.g.
G = {Ariel}
H = {Ponds, Sprinz, Ariel}
G ∪ H = {Ponds, Sprinz, Ariel}
G ∩ H = {Ariel}
Note that G ∩ H ⊆ G and G ∩ H ⊆ H, and in the above example G ⊂ H.

The difference of two sets G and H (G − H) is the set that contains all elements that are members of G but not of H. e.g.
G = {Ariel}
H = {Ponds, Sprinz, Ariel}
Then the sets G − H and H − G are
G − H = ∅ (the null set)
H − G = {Ponds, Sprinz}

The Cartesian product of two sets G and H (G × H) is defined in terms of ordered pairs or 2-tuples. An ordered pair is conventionally denoted by enclosing it in parentheses, e.g. (g, h). The product G × H is the set consisting of all ordered pairs (g, h) for which g ∈ G and h ∈ H. e.g.
G = {Ariel, Sprinz}
H = {Soap, Deodorant}
G × H = {(Ariel, Soap), (Ariel, Deodorant), (Sprinz, Soap), (Sprinz, Deodorant)}
H × G = {(Soap, Ariel), (Soap, Sprinz), (Deodorant, Ariel), (Deodorant, Sprinz)}
Note that the individual n-tuples in the Cartesian product are ordered. Therefore, G × H and H × G are entirely different sets.

In set theory, relations between sets can be of many kinds, such as a subset of (⊂), complement of (¬), and so on. Pairing relations can also be defined in terms of some specific criterion. If G and H are sets of objects, g ∈ G and h ∈ H, then the possible pairing relations of degree 2 are:
(g, g)   (g, h)   (h, g)   (h, h)
Each is a relation. In the four relationships above, the Cartesian products are G × G, G × H, H × G and H × H respectively. We can see that a pairing relation must be a subset of the Cartesian product of the sets involved in the relationship.
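The set operations above map directly onto Python's built-in `set` type, which gives a quick way to check the worked examples. The names `A`, `B` and `sold_as` are introduced only for this sketch; `sold_as` is a made-up pairing relation.

```python
from itertools import product

# Intension (membership condition) and extension (explicit listing) of the same set.
G_intension = {g for g in range(1, 10) if g % 2 == 1}
G_extension = {1, 3, 5, 7, 9}

G = {"Ariel"}
H = {"Ponds", "Sprinz", "Ariel"}

union = G | H                 # G ∪ H
intersection = G & H          # G ∩ H
g_minus_h = G - H             # G − H
h_minus_g = H - G             # H − G
g_proper_subset_of_h = G < H  # G ⊂ H

A = {"Ariel", "Sprinz"}
B = {"Soap", "Deodorant"}
a_times_b = set(product(A, B))  # A × B as a set of ordered pairs
b_times_a = set(product(B, A))  # B × A

# A pairing relation is any subset of the Cartesian product (hypothetical example).
sold_as = {("Ariel", "Soap"), ("Sprinz", "Deodorant")}

print(sorted(union), sorted(intersection), sorted(h_minus_g))
print(len(a_times_b), a_times_b == b_times_a, sold_as <= a_times_b)
```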

Keys
A key is a single attribute or combination of two or more attributes of an entity set that is used to identify one or more instances of the set. An attribute which is unique and will identify an instance of the entity set is known as a primary key. There may be two or more attributes or combinations of attributes that uniquely identify an instance of an entity set. These attributes or combinations of attributes are called candidate keys. In such a case we must decide which of the candidate keys will be used as the primary key. The remaining candidate keys would be considered alternate keys. If we add additional attributes to a primary key, the resulting combination would still uniquely identify an instance of the entity set. Such augmented keys are called super keys; a primary key is, therefore, a minimum super key. A secondary key is an attribute or combination of attributes that may not be a candidate key but that classifies the entity set on a particular characteristic. For example, department in the employee entity is a secondary key.

Base Tables
SQL tables are allowed to include duplicate rows. They therefore do not necessarily have a primary key. SQL tables are considered to have a left-to-right column ordering. Base tables are defined by means of the CREATE TABLE statement:
CREATE TABLE <base table name> (<base table element commalist>);
where each <base table element> is either a <column definition> or a <constraint>. The <constraint>s specify certain integrity constraints that apply to the base table in question.
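The difference between a base table without a key and one with a primary key and an alternate key can be observed with SQLite through Python's standard `sqlite3` module. This is a sketch under assumptions: SQLite is only one SQL dialect, and the table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A base table with no constraints: duplicate rows are accepted.
conn.execute("CREATE TABLE emp_plain (emp_code TEXT, name TEXT)")
conn.execute("INSERT INTO emp_plain VALUES ('E1', 'Satya')")
conn.execute("INSERT INTO emp_plain VALUES ('E1', 'Satya')")
duplicate_rows = conn.execute("SELECT COUNT(*) FROM emp_plain").fetchone()[0]

# emp_code is the primary key; ssn is a second candidate (alternate) key;
# department is a secondary key, so repeated values are allowed.
conn.execute(
    "CREATE TABLE emp_keyed ("
    " emp_code TEXT PRIMARY KEY,"
    " ssn TEXT NOT NULL UNIQUE,"
    " name TEXT NOT NULL,"
    " department TEXT)"
)
conn.execute("INSERT INTO emp_keyed VALUES ('E1', '111', 'Satya', 'Sales')")


def rejected(sql):
    """Run an INSERT and report whether an integrity constraint refused it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


same_primary_key = rejected("INSERT INTO emp_keyed VALUES ('E1', '222', 'Ramesh', 'Sales')")
same_alternate_key = rejected("INSERT INTO emp_keyed VALUES ('E2', '111', 'Ramesh', 'Sales')")
same_department = rejected("INSERT INTO emp_keyed VALUES ('E2', '222', 'Ramesh', 'Sales')")

print(duplicate_rows, same_primary_key, same_alternate_key, same_department)
```

The two constraints play the role of the `<constraint>` elements in the CREATE TABLE statement: they are what turns the candidate keys into rules the table enforces.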

Relational Algebra
Relational algebra is a collection of operations to manipulate relations. Relational algebra is a procedural language; furthermore, it defines the complete scheme for each of the result relations. The relational algebraic operations can be divided into basic set-oriented operations and relational-oriented operations. The former are the traditional set operations; the latter, those for performing joins, selection, projection and division.

Basic operations
Basic operations are the traditional set operations: union, difference, intersection and Cartesian product. Three of these four basic operations (union, intersection, and difference) require that operand relations be union compatible. Two relations P(P) and Q(Q) are said to be union compatible if both are of the same degree (arity) n and there is a one-to-one correspondence between their attributes, with corresponding attributes defined over the same domain. If P = {P1, P2, …, Pn} and Q = {Q1, Q2, …, Qn}, then Dom(Pi) = Dom(Qi) for i = 1, 2, …, n.

Ex.
P                 Q
Id   Name         Id   Name
1    Satya        2    Syam
2    Syam         5    Ram
4    Nirmal       8    Madhu
5    Ram
7    Hari

Union
If we assume that P(P) and Q(Q) are two union-compatible relations, then the union of P(P) and Q(Q) is the set-theoretic union of P(P) and Q(Q). The resultant relation, R = P ∪ Q, has tuples drawn from P and Q such that
R = {t | t ∈ P ∨ t ∈ Q} and max(|P|, |Q|) ≤ |R| ≤ |P| + |Q|
The result relation R contains tuples that are in either P or Q or in both of them. The duplicate tuples are eliminated.

R = P ∪ Q
Id   Name
1    Satya
2    Syam
4    Nirmal
5    Ram
7    Hari
8    Madhu
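Union compatibility and the union itself can be sketched in Python using the relations P and Q from the example. The representation is an assumption made for illustration: a scheme is a tuple of (attribute name, domain) pairs, with Python types standing in for domains, and a relation body is a set of tuples.

```python
# Scheme: ordered (attribute name, domain) pairs; Python types stand in for domains.
P_SCHEME = (("Id", int), ("Name", str))
Q_SCHEME = (("Id", int), ("Name", str))

P = {(1, "Satya"), (2, "Syam"), (4, "Nirmal"), (5, "Ram"), (7, "Hari")}
Q = {(2, "Syam"), (5, "Ram"), (8, "Madhu")}


def union_compatible(scheme_a, scheme_b):
    """Same degree, and corresponding attributes drawn from the same domain."""
    if len(scheme_a) != len(scheme_b):
        return False
    return all(dom_a is dom_b for (_, dom_a), (_, dom_b) in zip(scheme_a, scheme_b))


def union(p, p_scheme, q, q_scheme):
    """Set-theoretic union of two union-compatible relations; duplicates vanish."""
    if not union_compatible(p_scheme, q_scheme):
        raise ValueError("relations are not union compatible")
    return p | q


R = union(P, P_SCHEME, Q, Q_SCHEME)
for row in sorted(R):
    print(row)
```

The tuples (2, Syam) and (5, Ram) occur in both operands but only once in R, so |R| is 6 rather than 8.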

The degree of the relations P(P), Q(Q) and R(R) is the same. The cardinality of the resultant relation depends on the duplication of tuples in P and Q. From the above expression, we can see that if all the tuples in Q were contained in P, then R = P and |R| = |P|, while if the tuples in P and Q were disjoint, then |R| = |P| + |Q|.

Difference
The difference operation removes common tuples from the first relation. R = P − Q such that
R = {t | t ∈ P ∧ t ∉ Q} and 0 ≤ |R| ≤ |P|

R = P − Q
Id   Name
1    Satya
4    Nirmal
7    Hari

Intersection
The intersection operation selects the common tuples from the two relations. R = P ∩ Q, where
R = {t | t ∈ P ∧ t ∈ Q} and 0 ≤ |R| ≤ min(|P|, |Q|)

R = P ∩ Q
Id   Name
2    Syam
5    Ram

The intersection operation is really unnecessary. It can be expressed as
P ∩ Q = P − (P − Q)

Cartesian product
The Cartesian product of two relations is the concatenation of tuples belonging to the two relations. A new resultant relation scheme is created consisting of all combinations of the tuples, i.e. the result relation is obtained by concatenating each tuple in relation P with each tuple in relation Q.
R = P × Q
where a tuple r ∈ R is given by {t1 || t2 | t1 ∈ P ∧ t2 ∈ Q}. Here, || represents the concatenation operation.
The scheme of the result relation is given by: R = P || Q
The degree of the result relation is given by: |R| = |P| + |Q| (numbers of attributes in the schemes)
The cardinality of the result relation is given by: |R| = |P| * |Q| (numbers of tuples)
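The difference, intersection and Cartesian product operations can be sketched with Python sets of tuples, using the same P and Q as above. The one-attribute relation `DESIGNATION` and the function names are assumptions made for this example.

```python
P = {(1, "Satya"), (2, "Syam"), (4, "Nirmal"), (5, "Ram"), (7, "Hari")}
Q = {(2, "Syam"), (5, "Ram"), (8, "Madhu")}
DESIGNATION = {("Manager",), ("Executive",)}  # a relation with a single attribute


def difference(p, q):
    """Tuples of p that do not occur in q."""
    return {t for t in p if t not in q}


def intersection(p, q):
    """Tuples that occur in both p and q."""
    return {t for t in p if t in q}


def cartesian_product(p, q):
    """Concatenate every tuple of p with every tuple of q."""
    return {t1 + t2 for t1 in p for t2 in q}


p_minus_q = difference(P, Q)
p_and_q = intersection(P, Q)
via_difference = difference(P, difference(P, Q))  # P − (P − Q)
p_times_d = cartesian_product(P, DESIGNATION)

print(sorted(p_minus_q))
print(sorted(p_and_q), p_and_q == via_difference)
print(len(p_times_d), {len(t) for t in p_times_d})
```

The last line reports 10 tuples of degree 3: the cardinalities multiply (5 × 2) while the degrees add (2 + 1).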

E.g. the Cartesian product of P and Q:

P                 Q
Id   Name         Designation
1    Satya        Manager
2    Syam         Executive
4    Nirmal

R = P × Q
Id   Name     Designation
1    Satya    Manager
1    Satya    Executive
2    Syam     Manager
2    Syam     Executive
4    Nirmal   Manager
4    Nirmal   Executive

The union and intersection operations are associative and commutative; therefore, given relations R(R), S(S) and T(T):
R ∪ (S ∪ T) = (R ∪ S) ∪ T = (S ∪ R) ∪ T = T ∪ (S ∪ R) = …
R ∩ (S ∩ T) = (R ∩ S) ∩ T = …
The difference operation, in general, is non-commutative and non-associative:
R − S ≠ S − R
R − (S − T) ≠ (R − S) − T

Additional relational algebraic operations
The basic set operations, which provide a very limited data manipulation facility, have been supplemented by the definition of the following operations: projection, selection, join and division. These operations are represented by the symbols π, σ, ⋈ and ÷ respectively. Projection and selection are unary operations; join and division are binary.

Projection (π)
The projection of a relation is defined as a projection of all its tuples over some set of attributes, i.e. it yields a "vertical subset" of the relation. The projection operation is used either to reduce the number of attributes in the resultant relation or to reorder attributes. In the first case the arity (degree) of the relation is reduced. When the number of attributes in the relation is reduced, the cardinality may also be reduced; this is due to the deletion of duplicate tuples in the projected relation.
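Projection can be sketched in Python on the six-tuple result of the Cartesian product example. The function `project` is a hypothetical helper written for this illustration; it returns the new heading together with the projected body, and duplicates disappear because the body is a set.

```python
HEADING = ("Id", "Name", "Designation")
R = {
    (1, "Satya", "Manager"), (1, "Satya", "Executive"),
    (2, "Syam", "Manager"), (2, "Syam", "Executive"),
    (4, "Nirmal", "Manager"), (4, "Nirmal", "Executive"),
}


def project(heading, body, attrs):
    """Keep only the named attributes, in the order given; duplicate tuples collapse."""
    positions = [heading.index(a) for a in attrs]
    return tuple(attrs), {tuple(t[i] for i in positions) for t in body}


# Fewer attributes: degree drops from 3 to 1 and duplicate tuples are removed.
d_heading, designations = project(HEADING, R, ("Designation",))

# The same operation can also reorder attributes.
n_heading, names = project(HEADING, R, ("Name", "Id"))

print(d_heading, sorted(designations))
print(n_heading, sorted(names))
```

Both projections have lower cardinality than R (2 and 3 tuples against 6), which illustrates the deletion of duplicate tuples described above.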
