Design
Claire Belliard
Introduction
Data in files
[Figure: classified data and its users]
Users: administrators, programmers, end users (application related)
A bit of history
The concept of the database was born in the 1960s, with the widespread use of
magnetic disks allowing direct access to data.
Late 1960s: appearance of the first DBMSs, based on the network and hierarchical models.
The main DBMSs today:
• Object-oriented relational DBMS: data is represented in different tables that can be
linked together.
• NoSQL DBMS: data is organised in other structures:
  • key-value: for example, a dictionary that associates a definition (value) with each
  word (key)
  • graph-oriented: associates each element with related elements (e.g. a person's
  friends)
  • document-oriented...
[Figure: the main DBMSs in 2019]
The different DBMSs differ in cost, the volume of data that can be managed, the number
of users who can simultaneously query the database, the ease of interfacing with other
applications, etc.
Main publishers: IBM, Oracle, Microsoft
PC: Access, FoxPro, Paradox, ...
Mainframe: Oracle, DB2, Sybase, SQL Server, ...
Free and open source: MySQL, PostgreSQL, MariaDB.
A DBMS consists mainly of an engine, usually paired with a graphical interface such as
the phpMyAdmin web application for MySQL.
DBMS server: the computer hardware on which a DBMS is installed. It must have the
qualities of a file server (good disk access) and of an application server (well-sized
central unit, sufficient RAM).
Object-oriented relational DBMS
Object-oriented DBMS
Functional domain
We will organise our set of objects in a structured way
[Figure: objects of the functional domain (customer, category, order, bill, article), each with its list of attributes; example categories: multimedia, HIFI, tv, house, garden, decoration, garden lounge]
Object-oriented approach (OOA): basic concepts
The concepts manipulated by the database
Class: Musician
Attributes: Name, Age, Gender, Passions
Each row (also called a tuple) represents an instance, giving the value of each attribute of the
object:
        Name         Age   Gender
row 1   Bob Smith    32    Man
row 2   Diane Cope   28    Woman
Merise
MERISE is a French method born in the 1970s, developed by Hubert Tardieu. It was then
put forward in the 1980s, at the request of the Ministry of Industry, which wanted a
method for designing Information Systems.
MERISE is therefore a method of analysis and design of Information Systems, based on the
principle of the separation of data and processing.
It has a certain number of models (or schemas) which are divided into 3 levels:
• The conceptual level,
• The logical level,
• The physical level
The conceptual data model (CDM) is based on two main concepts: entities and associations,
hence its second name: the Entity/Association diagram.
Let's take the example of a developer who has to model a music group.
He is given the following management rules:
• For each musician, we must know the name, age, gender, passion
• For each group, we must know the name
• A group can be made up of one or more musicians, whose names are known
These rules are sometimes given to you, but you may have to establish them yourself.
In that case, question the future users of your project until the rules are
sufficiently precise.
Conceptual modelling
The id attribute
In the data dictionary, in order to be able to identify a row in a unique way, we will
generally add an attribute (a column) reserved to receive a unique identifier.
Thus, if we take our previous data dictionary, we schematize for example the entities
"Musician" and "Group" as follows:
Entity Musician: id_m, name_m, age_m, gender_m, passion_m
Entity Group: id_g, name_g
Association between the Musician entity and the Group entity
Musician (id_m, name_m, age_m, gender_m, passion_m) -- form -- Group (id_g, name_g)
The relationship then reads as follows:
• "A musician forms groups "
• "A group is formed by musicians "
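As a rough illustration, the two entities and the "form" association could be sketched in Python as follows. The attribute names come from the data dictionary above; the passion values, the group name and the helper function `musicians_of` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Musician:
    id_m: int        # unique identifier of the musician
    name_m: str
    age_m: int
    gender_m: str
    passion_m: str


@dataclass(frozen=True)
class Group:
    id_g: int        # unique identifier of the group
    name_g: str


# Sample instances (passions and group name are made-up values).
bob = Musician(1, "Bob Smith", 32, "Man", "guitar")
diane = Musician(2, "Diane Cope", 28, "Woman", "drums")
band = Group(1, "The Examples")

# The "form" association, stored as a set of (id_m, id_g) pairs.
form = {(bob.id_m, band.id_g), (diane.id_m, band.id_g)}


def musicians_of(group, musicians, association):
    """Read the association from the group side: 'a group is formed by musicians'."""
    return [m for m in musicians if (m.id_m, group.id_g) in association]


print([m.name_m for m in musicians_of(band, [bob, diane], form)])
```

Only the identifiers appear in the association, which is why each entity needs its own id attribute.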
Cardinality   Other notation   Meaning
0,0           0                No instance
0,1                            None or only one instance
1,1           1                Exactly one instance
0,n           0,*              None, one or more instances (no limit)
1,n           1,*              At least one instance (no maximum limit)
n,n           n *              At least several instances (no maximum limit)
Binary associations are classified into three categories, according to the cardinalities
attached to them:
• [1,1]: one-to-one
• [1,n]: one-to-many (or many-to-one)
• [n,n]: many-to-many
1- [1,1]: one-to-one
Person --0,1-- Sitting --0,1-- Chair
musician --0,n-- form --1,n-- group
woman --1,n-- Give birth --1,1-- children (many-to-one binary association)
• For one woman, how many children? 1 to n children. A woman can give birth to one or several children.
• For one child, how many women? Exactly 1 woman. A child is born to a single woman.
woman --0,n-- welcome --0,n-- children (many-to-many binary association)
• For one woman, how many children? 0 to n children. A woman may have no children or several children in her home.
• For one child, how many women? 0 to n women. A child may be cared for by none, one or several women in a home.
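The two questions above ("for one X, how many Y?") can be tried out on sample data. The sketch below counts, for each instance, how many times it takes part in an association. The names and the function `observed_cardinality` are invented for the example, and a sample can only suggest a cardinality: the real one comes from the management rules.

```python
from collections import Counter


def observed_cardinality(instances, pairs, position):
    """Return (min, max) occurrences of the association per instance.

    position 0 looks at the left element of each pair, position 1 at the right one.
    """
    counts = Counter(pair[position] for pair in pairs)
    per_instance = [counts[instance] for instance in instances]
    return min(per_instance), max(per_instance)


women = ["Anna", "Beth"]
children = ["Carl", "Dora", "Ezra"]

# "Give birth" association as (woman, child) pairs.
give_birth = [("Anna", "Carl"), ("Anna", "Dora"), ("Beth", "Ezra")]

print(observed_cardinality(women, give_birth, 0))     # (1, 2): at least one, possibly several
print(observed_cardinality(children, give_birth, 1))  # (1, 1): exactly one woman per child
```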
Data types
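If you try to store a value outside the range allowed by the type of your field, the DBMS will
store the nearest allowed value. This is MySQL's behaviour in non-strict SQL mode; in strict
mode (the default since MySQL 5.7) the statement is rejected with an error.
For example, if you try to store 12457 in a TINYINT, the stored value will be 127.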
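The "nearest value" rule can be imitated in a few lines of Python. This is only an illustration of the arithmetic, not MySQL code; the function name `clamp` is chosen for the example.

```python
# Range of a signed TINYINT (1 byte).
TINYINT_MIN, TINYINT_MAX = -128, 127


def clamp(value, low, high):
    """Return the value of [low, high] that is nearest to `value`."""
    return max(low, min(value, high))


print(clamp(12457, TINYINT_MIN, TINYINT_MAX))  # 127
print(clamp(-300, TINYINT_MIN, TINYINT_MAX))   # -128
print(clamp(42, TINYINT_MIN, TINYINT_MAX))     # 42 (already in range)
```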
Numeric data types
Integers
The UNSIGNED attribute
• You can also specify that your columns are UNSIGNED.
• UNSIGNED states that the column will never hold a negative value.
• In this case, the length of the interval remains the same, but the possible values
are shifted, with the minimum being 0 (for example, a TINYINT goes from -128..127 to 0..255).
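The shift can be checked from the storage size of each integer type. The sketch below assumes the usual MySQL sizes (1, 2, 3, 4 and 8 bytes); the function name `integer_range` is chosen for the example.

```python
# Storage size in bytes of the MySQL integer types.
INTEGER_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def integer_range(type_name, unsigned=False):
    """Return (minimum, maximum) for an integer type, signed or unsigned."""
    bits = 8 * INTEGER_BYTES[type_name]
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name in INTEGER_BYTES:
    print(name, integer_range(name), integer_range(name, unsigned=True))
```

Both intervals contain the same number of values (256 for a TINYINT); only their position changes.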
The display size (the x in INT(x)) is usually used in combination with the ZEROFILL attribute.
This attribute pads the number with zeros on the left when it is displayed: the default
padding character, a space, is replaced by '0'.
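The effect of ZEROFILL on display can be imitated with Python's string padding. This is only an analogy; the function name `zerofill` is chosen for the example.

```python
def zerofill(value, display_width):
    """Pad a non-negative integer with leading zeros up to display_width digits."""
    return str(value).zfill(display_width)


print(zerofill(42, 5))      # 00042
print(zerofill(123456, 5))  # 123456: a wider number is shown in full, not cut
```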
In UTF-8 encoded text, accented characters such as é take two bytes in memory.
It is therefore possible for a CHAR(5) to occupy more than 5 bytes in memory (but
impossible to store more than 5 characters in it).
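The difference between characters and bytes is easy to observe in Python. The word used here is just an example.

```python
word = "\u00e9l\u00e8ve"  # the French word "élève": 5 characters, 2 of them accented

print(len(word))                  # 5 characters
print(len(word.encode("utf-8")))  # 7 bytes: each accented letter takes 2 bytes
```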
Binary types are defined in the same way as text string types.
• VARBINARY(x) and BINARY(x) are used to store binary strings of up to x bytes
(with the same memory management as VARCHAR(x) and CHAR(x)).
• For longer strings, there are the TINYBLOB, BLOB, MEDIUMBLOB and LONGBLOB types,
with the same storage limits as the TEXT types.
Warning
Databases cannot read or understand the unstructured contents of a binary string; they
can only store and return it as a whole.
The only information the database can work with is metadata such as the file name, file
type and file size, if you record them in separate columns.
Database features such as sorting, filtering and searching for specific contents are
therefore not possible on a binary string.
The TIME type can store not only a time of day but also a duration, for example a number of days
expressed in hours. It is therefore not limited to 24 hours (MySQL accepts values from
-838:59:59 to 838:59:59).
As in DATETIME, the hour must be given first, then the minutes, then the seconds, each
part being separated from the others by the : character.
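To see why a TIME value can exceed 24 hours, the sketch below converts an hours:minutes:seconds string into a Python duration. The function name `parse_time` is chosen for the example and it does not check MySQL's limits.

```python
from datetime import timedelta


def parse_time(text):
    """Convert '[-]H:MM:SS' into a timedelta (a duration, not a time of day)."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in text.lstrip("-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


print(parse_time("50:30:00"))  # 2 days, 2:30:00
print(parse_time("08:15:00"))  # 8:15:00
```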
When MySQL encounters an incorrect date/time, or one that is not within the validity range
of the field, the "zero" value of the type is stored instead (for example 0000-00-00 for a DATE).
This applies in non-strict SQL mode; in strict mode the statement is rejected.
Warning
It is important to understand the uses and particularities of each data type, in order to
choose the best possible type when defining the columns of your tables.
Indeed, choosing the wrong data type could lead to:
• wasted memory (e.g. if you store very small data in a column designed to store large
amounts of data);
• performance problems (e.g. it is faster to search on a number than on a string);
• behaviour contrary to that expected (e.g. sorting on a number stored as an INT, or on a
number stored as a VARCHAR, will not give the same result);
• the impossibility of using features specific to a data type (e.g. storing a date as a string
deprives you of the many temporal functions available).
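The sorting difference mentioned above can be reproduced with any language that compares strings character by character. The values are arbitrary examples.

```python
as_numbers = [9, 10, 100]
as_strings = ["9", "10", "100"]

print(sorted(as_numbers))  # [9, 10, 100]: numeric order
print(sorted(as_strings))  # ['10', '100', '9']: character-by-character order
```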
For these different fields, find the most appropriate data type:
• The type should match the values you are recording, so that they are processed
optimally.
• The limits should be as restrictive as possible, but you must remember to leave enough
room for the column to meet all your needs.
• The weight (number of bytes) should be as small as possible.
IP Addresses
E-mail address
Postcode
Name of a city
Currency
Phone number
Age
IP addresses
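Most often, IPv4 addresses are stored in a CHAR(15) column (15 bytes).
Space can be saved by storing an IPv4 address as an INT UNSIGNED integer (4 bytes), for example
with MySQL's INET_ATON() and INET_NTOA() functions, but the textual form (the . or :
separators) is lost and must be rebuilt when reading.
Value examples:
100.64.0.0/10
255.255.255.255
2000::/3
A prefix length such as /10 is not part of the address and needs extra room or its own column.
ip v4: CHAR(15)
ip v6: CHAR(39), since a fully written IPv6 address is 39 characters long (CHAR(11) would be
too short), or VARBINARY(16) with INET6_ATON()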
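The sizes involved can be checked with Python's standard `ipaddress` module. The IPv6 address below is an arbitrary example taken from the documentation range.

```python
import ipaddress

# An IPv4 address fits in a 4-byte unsigned integer.
v4 = ipaddress.IPv4Address("255.255.255.255")
as_int = int(v4)
back = str(ipaddress.IPv4Address(as_int))
print(as_int)  # 4294967295, the maximum of an INT UNSIGNED
print(back)    # 255.255.255.255, rebuilt from the integer

# An IPv6 address needs 16 bytes in binary form, 39 characters fully written out.
v6 = ipaddress.IPv6Address("2001:db8::1")
print(len(v6.packed))    # 16
print(v6.exploded)       # 2001:0db8:0000:0000:0000:0000:0000:0001
print(len(v6.exploded))  # 39
```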
E-mail address
An e-mail address consists of three parts:
• a local part: RFC 2821 indicates that it must not exceed 64 characters
• the @ sign, which separates the two other parts
• a domain name: limited to 255 characters
The maximum length would therefore be 64 + 1 + 255, i.e. 320 characters. A VARCHAR(320)
is therefore enough; a TEXT is only needed on MySQL versions older than 5.0.3, where a
VARCHAR was limited to 255 characters.
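The length limits can be expressed as a small check. This sketch only tests the lengths quoted above, not the full syntax of an e-mail address; the function name `fits_limits` and the sample address are invented for the example.

```python
MAX_LOCAL = 64    # local part, before the @
MAX_DOMAIN = 255  # domain name, after the @


def fits_limits(address):
    """Return True if both parts of the address respect the length limits."""
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        return False  # no @ sign, or one of the two parts is empty
    return len(local) <= MAX_LOCAL and len(domain) <= MAX_DOMAIN


print(MAX_LOCAL + 1 + MAX_DOMAIN)                # 320
print(fits_limits("claire@example.com"))         # True
print(fits_limits("x" * 65 + "@example.com"))    # False: local part too long
```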
Postcode
• In France, it is 5 digits. A TINYINT UNSIGNED (1 byte, values from 0 to 255) is too small
for values up to 99999; the smallest integer type that fits is MEDIUMINT UNSIGNED (3 bytes,
values from 0 to 16777215), with ZEROFILL to restore leading zeros such as in 01000.
A CHAR(5) takes more memory space (5 bytes) but keeps the leading zeros as they are.
• Internationally, postcodes can also contain letters and be up to 10 characters long (for
example, the ZIP+4 code in the USA).
A CHAR(10) will therefore be more suitable.
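Two pitfalls of storing a postcode as a number can be seen in a few lines: the value range and the leading zeros. The postcode 01000 is just an example value.

```python
postcode = "01000"

as_int = int(postcode)
print(as_int)             # 1000: the leading zero is gone
print(as_int > 255)       # True: does not fit in a TINYINT UNSIGNED
print(99999 <= 16777215)  # True: every 5-digit code fits in a MEDIUMINT UNSIGNED

# The text form has to be rebuilt by padding to 5 digits.
restored = str(as_int).zfill(5)
print(restored)           # 01000
```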
Name of a city
According to Wikipedia, the longest city name in France is 45 characters long.
For the rest of the world, a field sized to accommodate 60 characters should be wide
enough.
Currency
The ISO 4217 codes can be used.
The value EUR will be used for the euro and USD for the US dollar.
The column type in MySQL will therefore be a CHAR(3).
Phone number
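INT(10) UNSIGNED ZEROFILL
The ZEROFILL attribute is used to add the missing leading zeros to make 10 digits.
This only suits 10-digit national numbers starting with 0 (an INT UNSIGNED stops at
4294967295); numbers in international format, with a + prefix, need a string type.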
Age
TINYINT UNSIGNED