You are on page 1of 5

In this video, we'll learn about the Relational Model.

The Relational Model is more than 35 years old, and it's really the foundation of database management systems. It's spawned a many billion dollar industry. The relational model underlies all commercial database systems at this point in time. It's actually an extremely simple model and that's one of its benefits. Furthermore, it can be queried. By that I mean we can ask questions of databases in the model using High Level Languages. High Level Languages are simple, yet extremely expressive for asking questions over the database. And finally, very importantly there are extremely efficient implementations of the relational model and of the query languages on that model. So let's move ahead and understand the basic constructs in the relational model. So, the primary construct is in fact, the relation. A database consists of a set of relations or sometimes referred to as "tables", each of which has a name. So, we're gonna use two relations in our example. Our example is gonna be a fictitious database about students applying to colleges. For now we're just gonna look at the students and colleges themselves. So we're gonna have two tables, and let's call those tables the Student table and the College table. Now, as an aside, there's a healthy debate in the database world about whether tables relations ought to be named using the singular or the plural. I personally don't have a stake in that debate. I'm going to use the singular. Next, we have the concept of attributes. So every relation and relational database has a predefined set of columns or attributes each of which has a name. So, for our student table, let's say that each student is gonna have an ID, a name, a GPA and a photo. And for our college table, let's say that every college is going to have a name, a state, and an enrollment. We'll just abbreviate that ENR. So those are the labeled columns. Now the actual data itself is stored in what are called the tuples (or the rows) in the tables. So let's put a couple of the data tables, data tuples in our tables. So let's start with the students and lets say that our first student has ID 123. Name is Amy -

GPA 3.9, and she's happy with that. So she has a smiley photo. And our second student is 234, Bob - his GPA is 3.4. He's not quite as happy. And typically of course a table will have thousands, maybe millions, even sometimes billions of rows, each row containing a value for each attribute. In our college table, let's suppose we have .... well of course we're going to start with Stanford in the state of California and Stanford's enrollment is 15,000. We'll include our cross-bay rival Berkeley again, in the state of California. Berkeley's enrollment is a whopping 36,000. And last of all, we are going to not be West Coast biased. We'll include MIT in the state of Massachusetts with an enrollment of 10,000. Now of course there's gonna be again many more tuples in the college table, and many more tuples in the student table. Okay, next, let me mention that in a relational database, typically each attribute or column has a type sometimes referred to as a domain. For example, the ID might be an integer, the name might be a string, GPA might be a float, photo might be a jpeg file. We do also in most relational databases have a concept of enumerated domain. So for example, the state might be an enumerated domain for the 50 abbreviations for states. Now, it's typical for relational databases to have just atomic types in their attributes as we have here, but many database systems do also support structured types inside attributes. Okay, a little bit more terminology. The schema of a database is the structure of the relation. So the schema includes the name of the relation and the attributes of the relation and the types of those attributes. Where the instance is the actual contents of the table at a given point in time. So, typically you set up a schema in advance, then the instances of the data will change over time. Now, I mentioned that most columns have types. But, there's also a special value

that's in any type of any column and that's a special value known as null, and nulls are actually quite important in relational databases. Null values are used to denote that a particular value is maybe unknown or undefined. And, so let's suppose, let's add another tuple to our database. Let's say 345, another student named Craig, and for, whatever reason Craig doesn't have a GPA. Maybe Craig is home schooled, maybe Craig doesn't want to reveal his GPA. So then the database would contain a null value for Craig and we'll just put a neutral face there. Or, for example, maybe Bob doesn't want to have his photo in the database, so then Bob would have a null value for his photo, again nulls can go anywhere. Now null values are useful but one has to be very careful in a database system when you run queries over relations that have null values. In a later video we'll go into this in more detail, but I just wanted to give a just sort of example of what can happen. So, let's suppose we're asking a query over our student table of all students whose GPA is greater than 3.5. So when we run that query on our database obviously we'll get Amy out, obviously we won't get Bob out, but should we get Craig? The answer is No. We don't know for a fact that Craig's GPA is greater than 3.5, so we'll only get one student out from that query. Now let's suppose we had another query, where we were gonna ask for the GPA less than or equal to 3.5. So, similarly where we would not have Amy in result and we would certainly have Bob in the result and similarly would not have Craig in the result because we don't know that his GPA is less than or equal to 3.5. So far so good, but it gets a little weird is when we add an or here in our query, we say I want everyone who's GPA is greater

than 3.5 or who's GPA is less than or equal to 3.5. And even though it looks like every tuple should satisfy this condition, that it's always true, that's not the case when we have null values. So, that's why one has to be careful when one uses null values in relational databases. Let me erase this now and let's move on to our next concept which is the concept of Key. Key is again another important concept in relational databases. And, a key is an attribute in of a relation where every value for that attribute is unique. So if we look at the student relation, we can feel pretty confident that the ID is going to be a key. In other words, every tuple is going to have a unique for ID. Thinking about the college relation, it's a little less clear. We might be tempted to say that the name of the college is an ID, that actually college names probably are not unique across the country. There's probably a lot of or several colleges named Washington college for example. You know what, we're allowed to have sets of attributes that are unique and that makes sense in the college relation. Most likely the combination of the name and state of a college is unique, and that's what we would identify as the key for the college relation. Now, you might wonder why it's even important to have attributes that are identified as keys. There's actually several uses for them. One of them is just to identify specific tuples. So if you want to run a query to get a specific tuple out of the database you would do that by asking for that tuple by its key. And related to that database systems for efficiency tend to build special index structures or store the database in a particular way. So it's very fast to find a tuple based on its key. And lastly, if one relation in a relational database wants to refer to tuples of another, there 's no concept of pointer in relational databases. Therefore, the first relation will typically refer to a tuple in the second relation by its unique key. As our videos develop we'll see the importance of keys. Okay, just to wrap up, I'll mention how one creates relations or tables in the SQL language. It's very simple, you just say "create table," give the

name of the relation and a list of the attributes. And if you want to give types for the attributes. It's similar except you follow each attribute name with its type. So to wrap up, the relational model has been around a long time. Has started a huge industry. It's used by all database systems. As you've seen it's a very simple model and will shortly see that it can be queried with very nice languages. And, finally, it's been implemented very efficiently.