
Data Types

Introduction A data type is a category of data in computer programming. There are many data types,
so they are clustered into four broad categories: numeric, alphanumeric (characters and
strings), dates, and boolean (logical true/false). Within those categories the
differences between the data types are extremely important because the choice of
data type dictates what can be done with the data.

Data types are used when declaring variables and when defining XML Schema elements and RDBMS
fields. A variable is a kind of container to hold a value. Example: String userName
= "Smith" means that the variable userName has the String data type and has been assigned
(=) a value, the sequence of characters S m i t h.

A list of data types appears at the bottom of this document.

When creating a relational database management system (RDBMS) database and defining the tables in the
DB, or when creating XML Schema, the creator must know what data to gather, how they'll be manipulated
and presented to users, and, to make this all possible, how the data should be stored by the
computer; in other words, the data type. For example, a person's name consists of a first name, a middle
initial, and a last name: "Suky A. Cat." A street address usually has the street number, street name, and
possibly an apartment number, e.g., 8 Newton Street. The parts of the name and the address should be
stored as the "String" data type; think of a "String" as a continuous sequence of alphanumerics. [Note that
even if the data look like a number, say a telephone number, Social Security number, or street number,
store them as Strings. Numbers should be saved as numbers only if calculations are going to be
performed on them, for example "hours worked" or "pay rate".]
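
The rule in the note above can be illustrated with a minimal Java sketch. The class name, the variable names, and the sample values (a ZIP code, hours, and a pay rate) are assumptions chosen for illustration; they do not come from the handout.

```java
// Illustrative sketch: identifiers that merely look numeric are kept as
// Strings; values that take part in arithmetic are kept as numbers.
class StorageChoice {
    // Weekly pay is a calculation, so both inputs are numeric types.
    static double weeklyPay(double hoursWorked, double payRate) {
        return hoursWorked * payRate;
    }

    public static void main(String[] args) {
        String zipCode = "02115";   // hypothetical value; a String keeps the leading zero
        int zipAsNumber = 2115;     // the same digits held as an int: the leading zero is lost
        System.out.println(zipCode + " vs " + zipAsNumber);
        System.out.println(weeklyPay(15.5, 20.0));
    }
}
```

Nobody adds two ZIP codes together, so nothing is lost by storing them as text, while the leading zero is preserved.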

Dates The Date type conforms to an ISO standard for storing date information in this form: YYYY-MM-DD,
or year, month, and day; 2010-12-25 is Christmas Day, December 25, 2010. The storage of data (internal
representation) can be different from the user's view of the data (external representation). Even though
we store data as YYYY-MM-DD, the computer program or script can convert the data before
displaying it to the user. The date type is extremely useful and commonly found: JavaScript, Java, MySQL,
and PHP all have one, as does nearly every programming and scripting language. [See JavaScript, Java, MySQL, PHP for
examples of the Date object's use.] Date objects also record the time, down to the millisecond. This
Java example shows how to create a date object and store its text form in a String variable:

String dateTimeStamp = new java.sql.Timestamp(System.currentTimeMillis()).toString();

The actual result in this example is 2009-07-10 02:55.192837. Take a moment to break the command into
pieces; it is a great way to learn to think the way programmers (and, seemingly, computers) do.

In the very middle of the command is System.currentTimeMillis(). Notice the word System. Strictly speaking, System is
not one of Java's reserved words; it is the name of a class that already exists in the code libraries the programmer
uses, so programmers should avoid reusing the name for their own variables or classes. (Each language has its own set of
reserved words, such as int, class, and new in Java.) Here System roughly means "ask the runtime environment to do
something". The parentheses ( ) after a name mean "do something or get something". So we can interrogate the
command: Ask the System to do something. What? Get the current time.

Notice the .toString(). The . tells you that there is an object and we want to use a method
(a function doing some work) belonging to that object. (The . is called the dot operator. You see the same thing in
SQL, e.g., table.fieldname, as in student_workers.idno.) In this case the object is a "Timestamp", a class that resides in
the sql library, which is part of the Java programming language's library of pre-compiled code; the
Timestamp object has a method to convert the time into something we can understand, a String. We then
store the newly converted data in a variable called "dateTimeStamp".
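
The one-line command can also be written one step at a time, which makes each piece easier to see. This is only a sketch: the class name TimestampSteps and the helper method stamp are illustrative names, not part of the handout.

```java
import java.sql.Timestamp;

class TimestampSteps {
    // Converts a millisecond count into the text form produced by Timestamp.toString().
    static String stamp(long millis) {
        Timestamp ts = new Timestamp(millis);  // wrap the number in a Timestamp object
        return ts.toString();                  // ask the object for its String form
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis(); // milliseconds since 1970-01-01 00:00:00 UTC
        String dateTimeStamp = stamp(now);     // same result as the one-line version
        System.out.println(dateTimeStamp);
    }
}
```

The printed text depends on the moment the program runs and on the computer's time zone, so no fixed output is shown here.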

This display is useful but not attractive. All programming languages have methods that will convert the
date stamp into other formats such as "Jan. 03, 2010" or "January 3, 2010" or "10.01.03", etc.
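
As a hedged illustration of such conversions, the sketch below uses the java.time classes found in Java 8 and later (newer than the Timestamp approach shown earlier). The class and method names are illustrative, and the patterns were chosen to reproduce two of the formats mentioned above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class DateFormats {
    // Formats a date with the given pattern, using English month names.
    static String format(LocalDate date, String pattern) {
        return date.format(DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.parse("2010-01-03");   // stored form: YYYY-MM-DD
        System.out.println(format(date, "MMMM d, yyyy")); // January 3, 2010
        System.out.println(format(date, "yy.MM.dd"));     // 10.01.03
    }
}
```

The stored value never changes; only the pattern used for display does.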

When storing the date in an SQL table, the date is set off by single quotes (as all String and Date values are
in SQL). In this example, a field called start_date is assigned a date for user ID 100 in table student_staff:
UPDATE student_staff SET start_date = '2010-01-15' WHERE userID = 100;

String To indicate a String use double quotes, e.g., "cat". To a computer, the data in the string is immaterial.
Computers fetch the String from RAM by the memory address of the first letter:
S u k y A . C a t
x0010 x0011 x0012 and so on ...

Strings can be concatenated to form new strings [1]. Say we have String a = "Hello"; and String b = "Tom"; we
could join these strings into a new one using the command "print", which sends whatever is in the parentheses to
(or prints it on) the standard output device (the default is the monitor). Note that + adds no space between the pieces:
Command                        Display
System.out.print( a + b );     HelloTom

We can mix-and-match variables with "string literals." Here we want to let Tom know he's late:
Command                                                   Display
System.out.println( a + ", " + b + ". You're late.");     Hello, Tom. You're late.
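
The following sketch shows the same greeting built two ways: with the + operator, and with a StringBuilder, the unsynchronized sibling of the StringBuffer mentioned in the note at the end of this document. The class and method names are illustrative assumptions.

```java
class Greeting {
    // Joins the pieces with the + operator, as in the text.
    static String withPlus(String a, String b) {
        return a + ", " + b + ". You're late.";
    }

    // Builds the same String with a StringBuilder, which avoids creating
    // intermediate String objects when many pieces are appended.
    static String withBuilder(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append(", ").append(b).append(". You're late.");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Hello", "Tom"));    // Hello, Tom. You're late.
        System.out.println(withBuilder("Hello", "Tom")); // Hello, Tom. You're late.
    }
}
```

For a handful of pieces the + operator is perfectly readable; a builder tends to pay off when strings are assembled inside loops.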

It is common to create web pages and other documents on-the-fly ("in real time") by merging String literals,
variables, and data retrieved from database tables. In this example, we mix html tags, literals, and the
last_name field from a database into a single big String that is then streamed back to the user, via a Servlet
OutputStream object ("sos"). [See Topic web servers for specifics.]

sos.println("<b> Welcome, Ms. "+rs.getString("last_name") + " to our program. </b>");

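Character (Byte) In many languages a character occupies a single byte; in Java, a char is a single 16-bit
Unicode character and a byte is a separate 8-bit numeric type. To create a character variable, use single quotes: char
myCharacter = 'a'. This is important because a character is not a String. For instance,
say you have a String s = "My Grade is " and a char grade = 'A'. In Java the command
print(s + grade) works, because + converts the character to text when the other operand is a String; but adding two
characters together ('a' + 'b') performs arithmetic on their numeric codes instead of joining them. To be explicit,
convert the character into a String first, e.g., with String.valueOf(grade), and then join the strings. (This is often
loosely called casting, although strictly speaking it is a conversion.)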
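
In Java specifically, the behavior of + with characters can be checked with a short sketch like this one. The class and method names are illustrative assumptions.

```java
class CharDemo {
    // Joining a String and a char: Java converts the char to text automatically.
    static String join(String s, char c) {
        return s + c;
    }

    // Adding two chars is integer arithmetic on their character codes.
    static int addCodes(char x, char y) {
        return x + y;
    }

    public static void main(String[] args) {
        char grade = 'A';
        System.out.println(join("My Grade is ", grade));  // My Grade is A
        System.out.println(String.valueOf(grade) + "+");  // explicit conversion first: A+
        System.out.println(addCodes('a', 'b'));           // 195 (97 + 98), not "ab"
    }
}
```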

Numbers
Numbers can consume tremendous amounts of storage space and RAM, so it is important to use the
right type for your numbers. The most commonly found data type for numbers is int, or integer. The int
type holds whole numbers only, e.g., 0, 39, 122. Two other important data types, useful in math and
finance, are float and double. These data types store values that have two parts: the part before the
decimal point and the part after. For example, if you want to store pi (π) as 3, use int; if you want 3.1415... use
float; if you want 3.141592...9 use double. Usually, a float occupies 32 bits and a double 64 bits. In other words,
they can hold extremely large numbers!
As an example, say you want to determine someone's grade on a quiz. There are 100 questions and the
student answered 83 of them correctly. The fraction correct is "number_correct/100" [83/100]. If we
declare two integers for the number_correct and the total_no_of_questions and store the result of their
division in another int, we will not see the result we expect: the value is 0 instead of the
expected 0.83 (83%).
int number_correct = 83;
int total_no_of_questions = 100;
int percent_correct = number_correct/total_no_of_questions; // = 0

The int type has no storage room for the digits after the decimal point (.83), leaving only the 0.

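However, if we change percent_correct to the float data type and convert one of the integers to a float before
dividing, there's no problem. (Changing only the type of percent_correct is not enough: dividing two ints still
produces the int 0, which would then be stored as 0.0.)

float percent_correct = (float) number_correct / total_no_of_questions; // = 0.83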
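
The difference between integer division and floating-point division can be tried directly. This is a small sketch with illustrative class and method names; the numbers are the quiz figures from the text.

```java
class PercentCorrect {
    // Integer division: the fractional part is discarded.
    static int asInt(int correct, int total) {
        return correct / total;
    }

    // Casting one operand to float first makes the division keep its fraction.
    static float asFloat(int correct, int total) {
        return (float) correct / total;
    }

    public static void main(String[] args) {
        System.out.println(asInt(83, 100));    // 0
        System.out.println(asFloat(83, 100));  // 0.83
        float trap = 83 / 100;                 // still integer division; the int 0 becomes 0.0
        System.out.println(trap);              // 0.0
    }
}
```

The type of the variable receiving the result does not change how the division itself is carried out; only the types of the operands do.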

Other important data types

There are many data types available from programming libraries and a limitless number of data
types created as objects. Here are a few useful data types and how they might be used.

Array An array is an "ordered arrangement of data elements." It is analogous to an egg crate: a
single container for a group of other things. Say you're creating a table of staff members. You'll want last
name, first name, middle initial, department, and email. If the number of Strings isn't going to
change, the array is a good choice to use. Here we create an array, called staff_member, and assign it five
slots (like 5 eggs in the egg crate). Each slot has its own number, starting with 0. The slots are empty at this
point:

String[] staff_member = new String[5];

String[]        the [ ] means "this is an array"; the String says this is an array of strings
staff_member    this is the name of the array variable
=               assign what follows to the variable
new             get ready to reserve enough memory for this array
String[5]       of five Strings.

To the computer, the array looks something like this (in Java, an unfilled String slot holds null):

staff_member[0] = null;
staff_member[1] = null;
...
staff_member[4] = null;

Now, we can assign real data into our array:


staff_member[0] = "Smith";
staff_member[1] = "Nancy";
staff_member[2] = "A";
staff_member[3] = "English Dept.";
staff_member[4] = "smith@myschool.edu";

In the real world you would continue to add names or, more efficiently, create a "staff_member object" to
cluster all the Strings and other data. But for our purposes, you see how the array was declared and
instantiated; then data were stored in the array. To get the data back out of the array, we refer to the name
of the array and the "index" of the data we want. Here, the last name is stored in the first member of
the array, index 0, and the first name in the second, index 1:

System.out.print("The teacher's name is " + staff_member[1] + " " + staff_member[0]);

would appear as: The teacher's name is Nancy Smith
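
Putting the array steps together gives a small runnable sketch. The class and method names (StaffArray, build, fullName) are illustrative, and the variable is renamed staffMember to follow the usual Java naming style.

```java
class StaffArray {
    // Builds the five-slot array from the text.
    static String[] build() {
        String[] staffMember = new String[5];  // five empty (null) slots, indexed 0 to 4
        staffMember[0] = "Smith";
        staffMember[1] = "Nancy";
        staffMember[2] = "A";
        staffMember[3] = "English Dept.";
        staffMember[4] = "smith@myschool.edu";
        return staffMember;
    }

    // Reads two slots back out by index: first name, then last name.
    static String fullName(String[] staffMember) {
        return staffMember[1] + " " + staffMember[0];
    }

    public static void main(String[] args) {
        String[] staffMember = build();
        System.out.println("The teacher's name is " + fullName(staffMember));
        System.out.println("Slots: " + staffMember.length);  // Slots: 5
    }
}
```

Asking for staffMember[5] would fail at run time, because the last valid index of a five-slot array is 4.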

Arrays are useful but may become unwieldy if there are a lot of data. They are especially useful for
commonly used data that are not going to be changed. For example, dates are often displayed using
arrays:

String[] days_en = {"Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"};
String[] days_fr = {"dimanche", "lundi", "mardi", "mercredi", "jeudi", "vendredi", "samedi"};

Now we might say on screen, System.out.println("Bonjour, c'est " + days_fr[1]) to see Bonjour, c'est lundi.

Vector An array's size is not changed after it is created. The Vector data type is like an array
but can easily be resized dynamically, and it is not limited to a single data type. Think of a vector as an egg
crate that can hold eggs, toast, coffee, anything. Vectors are designed to hold objects, and so vectors see
everything as an object, regardless of what you store. This means you store data as you want, but when you retrieve
the data from the vector you cast it back into the data type you need.

Here are the steps:

Declare a vector and name it "vStaff":   Vector vStaff = new Vector();
Store data in the vector:                vStaff.addElement("Smith");   // this is a String
                                         vStaff.addElement("Nancy");
                                         vStaff.addElement( 25 );      // this is an integer
Retrieve the data:                       String name = (String)vStaff.elementAt(1) + " " +
                                                       (String)vStaff.elementAt(0);

Here, the (String) is an explicit cast, converting the selected element's data from Object to String. Now we
might say System.out.println("The teacher is " + name); and see "The teacher is Nancy Smith".
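
The vector steps can be run as written below. For comparison, the sketch also shows a typed list (ArrayList with generics, available since Java 5), which is a different technique from the one in the text: it gives up mixed contents in exchange for not needing casts. All class and method names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class VectorDemo {
    // A Vector of Object accepts mixed types; elements need a cast when read back.
    static String nameFromVector() {
        Vector<Object> vStaff = new Vector<>();
        vStaff.addElement("Smith");
        vStaff.addElement("Nancy");
        vStaff.addElement(25);  // the int is automatically wrapped in an Integer object
        return (String) vStaff.elementAt(1) + " " + (String) vStaff.elementAt(0);
    }

    // A typed list holds one element type, so no cast is needed.
    static String nameFromList() {
        List<String> staff = new ArrayList<>();
        staff.add("Smith");
        staff.add("Nancy");
        return staff.get(1) + " " + staff.get(0);
    }

    public static void main(String[] args) {
        System.out.println("The teacher is " + nameFromVector());
        System.out.println("The teacher is " + nameFromList());
    }
}
```

Casting the third vector element, the number 25, to String would compile but fail at run time, which is the main risk of storing mixed types.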

Object For completeness' sake, here is the idea of a staff_member expressed as our own object.
[See Topic Objects for fundamentals about objects.] Like the vector example, we want to store
heterogeneous data types all of which refer to the same entity, the "staff member". So we create an object,
StaffMember. In this example in Java, the class is declared public, meaning the class can be used by other
parts of the program.

public class StaffMember {
    String last_name;
    String first_name;
    String dept_name;
    float hourly_wage;
    float hours_worked;
    boolean checkPrinted;

    // Constructors have no return type, not even void.
    public StaffMember() {
    }

    public StaffMember(String lname, String fname, String dept) {
        this.last_name = lname;
        this.first_name = fname;
        this.dept_name = dept;
    }

    public void setHourlyWage(float wage) {
        this.hourly_wage = wage;
    }

    public void setHoursWorked(float hoursworked) {
        this.hours_worked = hoursworked;
    }

    public float getWeeklyPay() {
        return hourly_wage * hours_worked;
    }

    public boolean isCheckPrinted() {
        return checkPrinted;
    }

    public void printCheck() {
        printTheCheck();
        checkPrinted = true;
    }

    protected void printTheCheck() {
        // this part would print the check.
        // being protected, it can be called only from this class, its subclasses,
        // and other classes in the same package.
    }
}

To use this object, we create an "instance" of the object. For example, say we have two staff members,
Smith and Jones. We can create an object for each of them:

StaffMember smith = new StaffMember("Smith", "Nancy", "English Dept");


StaffMember jones = new StaffMember("Jones", "Bill", "French Dept");

Later in the program we could issue commands that make sense both to the computer and to the user (the f suffix
marks each number as a float, matching the type the methods expect):
smith.setHourlyWage(25.15f);
smith.setHoursWorked(15.5f);
jones.setHourlyWage(15.95f);

We can imagine now someone printing checks:


if ( !(smith.isCheckPrinted() )) { // ! means "not"
smith.printCheck();
}

To automate the process completely, let's put all the staff member objects in a single array and complete payroll
(further staff members, such as Benoît, Hussey, and Bix, would be created and added in the same way):
StaffMember[] staffers = { smith, jones };
StaffMember tempStaffer;
for (int i = 0; i < staffers.length; i++) {
    tempStaffer = staffers[i];
    if ( !(tempStaffer.isCheckPrinted()) ) {
        tempStaffer.printCheck();
    }
}
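
The payroll loop can be tried end to end with a sketch like the following. The nested StaffMember class here is a trimmed-down, hypothetical stand-in for the fuller class shown earlier, the println call is only a placeholder for real check printing, and runPayroll is an illustrative helper name.

```java
class PayrollDemo {
    // Trimmed-down stand-in for the StaffMember class in the text.
    static class StaffMember {
        private final String lastName;
        private boolean checkPrinted;

        StaffMember(String lastName) {
            this.lastName = lastName;
        }

        boolean isCheckPrinted() {
            return checkPrinted;
        }

        void printCheck() {
            System.out.println("Printing check for " + lastName); // placeholder only
            checkPrinted = true;
        }
    }

    // Prints a check for every staff member who has not had one yet and
    // returns how many checks were printed on this pass.
    static int runPayroll(StaffMember[] staffers) {
        int printed = 0;
        for (StaffMember staffer : staffers) {
            if (!staffer.isCheckPrinted()) {
                staffer.printCheck();
                printed++;
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        StaffMember[] staffers = { new StaffMember("Smith"), new StaffMember("Jones") };
        System.out.println(runPayroll(staffers)); // 2: both checks printed
        System.out.println(runPayroll(staffers)); // 0: nothing left to print
    }
}
```

Because each object remembers whether its check has been printed, running payroll twice does not print duplicate checks.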

_________________________________________________________________________________________

Related topics    Scripting; Objects; SQL

Readings          None

What to know      Encapsulation, inheritance, polymorphism; how objects are vital to information processing

Demonstration     None

_________________________________________________________________________________________
Data types in Java (and C++):

This is an optional reading. The first part is from the Sun Microsystems Java website. The second part is from
the LIS458 Relational Database class materials.

Primitive Data Types


The Java programming language is strongly-typed, which means that all variables must first be declared before they
can be used. This involves stating the variable's type and name, as you've already seen:

int gear = 1;
Doing so tells your program that a field named "gear" exists, holds numerical data, and has an initial value of "1". A
variable's data type determines the values it may contain, plus the operations that may be performed on it. In
addition to int, the Java programming language supports seven other primitive data types. A primitive type is
predefined by the language and is named by a reserved keyword. Primitive values do not share state with other
primitive values. The eight primitive data types supported by the Java programming language are:
byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a
maximum value of 127 (inclusive). The byte data type can be useful for saving memory in large arrays,
where the memory savings actually matters. They can also be used in place of int where their limits help to
clarify your code; the fact that a variable's range is limited can serve as a form of documentation.
short: The short data type is a 16-bit signed two's complement integer. It has a minimum value of -32,768
and a maximum value of 32,767 (inclusive). As with byte, the same guidelines apply: you can use a
short to save memory in large arrays, in situations where the memory savings actually matters.
int: The int data type is a 32-bit signed two's complement integer. It has a minimum value of
-2,147,483,648 and a maximum value of 2,147,483,647 (inclusive). For integral values, this data type is
generally the default choice unless there is a reason (like the above) to choose something else. This data type
will most likely be large enough for the numbers your program will use, but if you need a wider range of
values, use long instead.
long: The long data type is a 64-bit signed two's complement integer. It has a minimum value of
-9,223,372,036,854,775,808 and a maximum value of 9,223,372,036,854,775,807 (inclusive). Use this data
type when you need a range of values wider than those provided by int.

float: The float data type is a single-precision 32-bit IEEE 754 floating point. Its range of values is
beyond the scope of this discussion, but is specified in section 4.2.3 of the Java Language Specification. As
with the recommendations for byte and short, use a float (instead of double) if you need to save
memory in large arrays of floating point numbers. This data type should never be used for precise values,
such as currency. For that, you will need to use the java.math.BigDecimal class instead. Numbers and
Strings covers BigDecimal and other useful classes provided by the Java platform.
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is
beyond the scope of this discussion, but is specified in section 4.2.3 of the Java Language Specification. For
decimal values, this data type is generally the default choice. As mentioned above, this data type should
never be used for precise values, such as currency.
boolean: The boolean data type has only two possible values: true and false. Use this data type for
simple flags that track true/false conditions. This data type represents one bit of information, but its "size"
isn't something that's precisely defined.
char: The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0)
and a maximum value of '\uffff' (or 65,535 inclusive).
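
The limits listed above are available as constants on Java's wrapper classes, so they can be printed rather than memorized. The sketch below (class and method names are illustrative) also shows what happens when an int goes past its maximum.

```java
class PrimitiveRanges {
    // Adding 1 to the largest int silently wraps around to the smallest int.
    static int wrapAround() {
        int max = Integer.MAX_VALUE;
        return max + 1;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println("char:  0 to " + (int) Character.MAX_VALUE);
        System.out.println(wrapAround() == Integer.MIN_VALUE); // true: the int overflowed
        System.out.println((long) Integer.MAX_VALUE + 1);      // 2147483648: long has room
    }
}
```

Java does not raise an error on integer overflow, which is one reason to move to long when values may exceed the int range.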
In addition to the eight primitive data types listed above, the Java programming language also provides special
support for character strings via the java.lang.String class. Enclosing your character string within double quotes will
automatically create a new String object; for example, String s = "this is a string";. String objects
are immutable, which means that once created, their values cannot be changed. The String class is not technically
a primitive data type, but considering the special support given to it by the language, you'll probably tend to think of
it as such. You'll learn more about the String class in Simple Data Objects.
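
Immutability can be observed with a minimal sketch (illustrative class and method names): a method that appears to change a String actually returns a new String and leaves the original untouched.

```java
import java.util.Locale;

class ImmutableString {
    // Returns an upper-case copy; the String passed in is not modified.
    static String shout(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String s = "this is a string";
        String t = shout(s);
        System.out.println(s);  // this is a string
        System.out.println(t);  // THIS IS A STRING
    }
}
```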
Default Values
It's not always necessary to assign a value when a field is declared. Fields that are declared but not initialized will be
set to a reasonable default by the compiler. Generally speaking, this default will be zero or null, depending on the
data type. Relying on such default values, however, is generally considered bad programming style.
The following chart summarizes the default values for the above data types.
Data Type                 Default Value (for fields)
byte                      0
short                     0
int                       0
long                      0L
float                     0.0f
double                    0.0d
char                      '\u0000'
String (or any object)    null
boolean                   false
Local variables are slightly different; the compiler never assigns a default value to an uninitialized local variable. If
you cannot initialize your local variable where it is declared, make sure to assign it a value before you attempt to use
it. Accessing an uninitialized local variable will result in a compile-time error.
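
Field defaults can be observed with a short sketch. The class name and field names are illustrative assumptions.

```java
class Defaults {
    // Fields that are declared but never assigned receive default values.
    int count;
    double ratio;
    boolean flag;
    char letter;
    String name;

    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count);        // 0
        System.out.println(d.ratio);        // 0.0
        System.out.println(d.flag);         // false
        System.out.println((int) d.letter); // 0, the numeric code of '\u0000'
        System.out.println(d.name);         // null

        int local;
        local = 5;  // a local variable must be assigned before it is read,
                    // otherwise the compiler rejects the program
        System.out.println(local);          // 5
    }
}
```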

Literals
You may have noticed that the new keyword isn't used when initializing a variable of a primitive type. Primitive
types are special data types built into the language; they are not objects created from a class. A literal is the source
code representation of a fixed value; literals are represented directly in your code without requiring computation. As
shown below, it's possible to assign a literal to a variable of a primitive type:
boolean result = true;
char capitalC = 'C';

byte b = 100;
short s = 10000;
int i = 100000;
The integral types (byte, short, int, and long) can be expressed using decimal, octal, or hexadecimal number
systems. Decimal is the number system you already use every day; it's based on 10 digits, numbered 0 through 9.
The octal number system is base 8, consisting of the digits 0 through 7. The hexadecimal system is base 16, whose
digits are the numbers 0 through 9 and the letters A through F. For general-purpose programming, the decimal
system is likely to be the only number system you'll ever use. However, if you need octal or hexadecimal, the
following example shows the correct syntax. The prefix 0 indicates octal, whereas 0x indicates hexadecimal.
int decVal = 26; // The number 26, in decimal
int octVal = 032; // The number 26, in octal
int hexVal = 0x1a; // The number 26, in hexadecimal
The floating point types (float and double) can also be expressed using E or e (for scientific notation), F or f
(32-bit float literal) and D or d (64-bit double literal; this is the default and by convention is omitted)
double d1 = 123.4;
double d2 = 1.234e2; // same value as d1, but in scientific notation
float f1 = 123.4f;
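
A quick check that the different notations really name the same values, as a minimal sketch with illustrative names. It also shows a common trap: a leading zero turns an ordinary-looking number into octal.

```java
class Literals {
    // The same number written in decimal, octal, and hexadecimal.
    static boolean sameValue() {
        int decVal = 26;
        int octVal = 032;
        int hexVal = 0x1a;
        return decVal == octVal && octVal == hexVal;
    }

    public static void main(String[] args) {
        System.out.println(sameValue());       // true
        System.out.println(010);               // 8: the leading zero makes this octal
        System.out.println(123.4 == 1.234e2);  // true: same double, two notations
    }
}
```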
Literals of types char and String may contain any Unicode (UTF-16) characters. If your editor and file system
allow it, you can use such characters directly in your code. If not, you can use a "Unicode escape" such as
'\u0108' (capital C with circumflex), or "S\u00ED se\u00F1or" (Sí señor in Spanish). Always use 'single
quotes' for char literals and "double quotes" for String literals. Unicode escape sequences may be used
elsewhere in a program (such as in field names, for example), not just in char or String literals.
The Java programming language also supports a few special escape sequences for char and String literals: \b
(backspace), \t (tab), \n (line feed), \f (form feed), \r (carriage return), \" (double quote), \' (single quote),
and \\ (backslash).
There's also a special null literal that can be used as a value for any reference type. null may be assigned to any
variable, except variables of primitive types. There's little you can do with a null value beyond testing for its
presence. Therefore, null is often used in programs as a marker to indicate that some object is unavailable.
Finally, there's also a special kind of literal called a class literal, formed by taking a type name and appending
".class"; for example, String.class. This refers to the object (of type Class) that represents the type itself.

[String: TBA]

Now let's see some of MySQL's data types. MySQL (like all relational database management systems)
supports different kinds of data, with the same consequence for design: how much room will the data occupy on the
hard drive? We could classify SQL data into numeric, date and time, and string types. Just to give you an idea of
how much storage the different values require, here is a list. [See MySQL's or your database vendor's homepage or
help site to learn the exact sizes for your operating system and database version.]
Numeric types:
1. bit (same as TINYINT(1)); can be used as a boolean: SELECT IF(0, true, false); means
"If the tested value is true, give us (or return) the number 1; otherwise
show the number 0."
2. smallint (signed or unsigned); signed range is -32768 to 32767; unsigned range is 0 to 65535
3. mediumint (signed or unsigned); signed range is -8388608 to 8388607; unsigned range is 0 to 16777215
4. int (normal-sized integer; signed or unsigned); signed range is -2147483648 to 2147483647; unsigned
range is 0 to 4294967295
5. bigint (a large integer!); signed range is -9223372036854775808 to 9223372036854775807;
unsigned range is 0 to 18446744073709551615.
6. float; range is -3.402823466E+38 to -1.175494351E-38, 0, and 1.175494351E-38 to 3.402823466E+38;
unsigned allows only the non-negative part of that range

7. double
8. double precision
9. decimal
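
The integer ranges in the list above follow directly from how many bytes each type occupies. The Java sketch below assumes MySQL's documented storage sizes (TINYINT 1 byte, SMALLINT 2, MEDIUMINT 3, INT 4, BIGINT 8); the class and method names are illustrative.

```java
import java.math.BigInteger;

class IntegerRanges {
    private static final BigInteger TWO = BigInteger.valueOf(2);

    // Smallest signed value that fits in the given number of bytes: -2^(bits-1).
    static BigInteger signedMin(int bytes) {
        return TWO.pow(8 * bytes - 1).negate();
    }

    // Largest signed value: 2^(bits-1) - 1.
    static BigInteger signedMax(int bytes) {
        return TWO.pow(8 * bytes - 1).subtract(BigInteger.ONE);
    }

    // Largest unsigned value: 2^bits - 1.
    static BigInteger unsignedMax(int bytes) {
        return TWO.pow(8 * bytes).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        String[] names = { "TINYINT", "SMALLINT", "MEDIUMINT", "INT", "BIGINT" };
        int[] sizes = { 1, 2, 3, 4, 8 };  // storage in bytes (assumed from MySQL's documentation)
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + signedMin(sizes[i]) + " to "
                    + signedMax(sizes[i]) + "; unsigned 0 to " + unsignedMax(sizes[i]));
        }
    }
}
```

BigInteger is used because the unsigned BIGINT maximum does not fit in a Java long.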
Date types:
1. date 1000-01-01 to 9999-12-31, in YYYY-MM-DD format
2. datetime 1000-01-01 00:00:00 to 9999-12-31 23:59:59, in YYYY-MM-DD HH:MM:SS format
3. timestamp from 1970-01-01 00:00:01
4. time -838:59:59 to 838:59:59
5. year in 2- or 4-digit formats; 1901 to 2155, or 70 to 69 (referring to 1970 to 2069)
One reason you need to know this is that certain database functions, such as SUM() and AVG(), do not
work with temporal values unless the values are first converted to numbers.
String types:
charset      Identifies which encoding to use (synonym for "Character Set"). This is important
             because we've moved away from ASCII to Unicode. This shows how you would create a
             database table and specify that a column will accept UTF-8 characters:
                 CREATE TABLE t (
                     c1 VARCHAR(20) CHARSET utf8
                 );
             If you want ASCII, specify CHARACTER SET latin1 (a superset of ASCII)
             For Unicode, specify CHARACTER SET ucs2
char         a fixed-length string that is always right-padded with spaces to the specified length when
             stored; the range is 0-255 characters
varchar      a variable-length character column, 0-255 characters
binary       similar to char but stores binary byte data
varbinary    similar to varchar but stores binary byte data
tinyblob     a blob column with a maximum length of 255 (2^8 - 1) bytes
tinytext     a text column with a maximum length of 255 characters
blob         a column of up to 65,535 (2^16 - 1) bytes
mediumblob   up to 16,777,215 (2^24 - 1) bytes
longblob     up to 4,294,967,295 (2^32 - 1) bytes, or 4GB
longtext     up to 4,294,967,295 (2^32 - 1) characters, or 4GB
enum         an enumeration, or a string object, that can have one value chosen from a list of values
             that are assigned to it during creation (e.g., ENUM('value1','value2',...))
set          a string object that can have zero or more values, each of which must be chosen from the
             list of values assigned during creation (e.g., SET('value1','value2',...)). A set
             column can have a maximum of 64 members; they're represented internally as integers.

Datatypes in MySQL

CHAR[length]              A fixed-length field from 0 to 255 characters long
VARCHAR[length]           A variable-length field from 0 to 255 characters long
TINYTEXT                  A string with maximum length of 255 characters.
TEXT                      A string with maximum length of 65,535 characters.
MEDIUMTEXT                A string with maximum length of 16,777,215 characters.
LONGTEXT                  A string with maximum length of 4,294,967,295 characters.

TINYINT[length]           Range of -128 to 127, or 0-255 unsigned (no positive or negative sign)
SMALLINT[length]          Range of -32,768 to 32,767, or 0-65,535 unsigned.
MEDIUMINT[length]         Range of -8,388,608 to 8,388,607, or 0-16,777,215 unsigned.
INT[length]               Range of -2,147,483,648 to 2,147,483,647, or 0-4,294,967,295 unsigned.
BIGINT[length]            Range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, or
                          0-18,446,744,073,709,551,615 unsigned.
FLOAT                     A small number with floating decimal point.
DOUBLE[length, decimals]  A large number with floating decimal point.
DECIMAL[length, decimals] A DOUBLE stored as a string, allowing for a fixed decimal point.
DATE                      In YYYY-MM-DD format.
DATETIME                  In YYYY-MM-DD HH:MM:SS format.
TIMESTAMP                 In YYYYMMDDHHMMSS format; note that the last possible date falls early in 2038.
TIME                      In HH:MM:SS format.
ENUM                      "Enumeration"; each column can have one of several possible values.
SET                       Like ENUM, except each column can have more than one of several possible values.
[1] Advanced point: It is computationally more efficient to use a StringBuffer object.
