
Understanding data types

with Andrei Pociu

C++

Contents

Introduction: Book's purpose; What you should know; What you should have
Author: Me; Me and my projects; BITZone; Contact
What data types are: Data types; Adequate data type for adequate value
Signed / Unsigned: Zero does make a difference; Positive and negative; Signed and unsigned; Advantages and disadvantages
Integers: int; Positive / Negative; Signed / Unsigned; Unsigned int; INT_MIN / INT_MAX / sizeof; Overflows; short; SHRT_MIN / SHRT_MAX; long; LONG_MIN / LONG_MAX; char; Manipulating char; bool


Introduction
Book's purpose

This is a short book for beginners in C++, or in programming in general, who have not fully understood data types in C++. It covers only the types built into C++ and does not teach you how to create your own data types. There are examples of how you should use data types and how you shouldn't. The book is divided into separate parts, and this is part 1, which teaches you the C++ integer types. Floating-point types will be covered in part 2.

What you should know

This book is for those who know the basics of C++. You can still learn data types from it without knowing the basics, but I recommend that you read a complete C++ book first. When you reach the data types lesson in that book and have read it, read this short book too, to be sure you completely understand data types.

What you should have

I'm using Microsoft Windows XP as the operating system, so the sizes and ranges quoted in this book are the ones you get on Windows. They can differ on other operating systems such as UNIX or Mac OS. You need a compiler, of course, and in this book we use Visual C++ 6 Enterprise Edition. It makes a difference which compiler you use, because the sizes of data types often differ from one compiler to another.


Author
Me

I dare to write a whole page about the author, Andrei Pociu. If you want, you may skip this page, because it has nothing to do with understanding C++ data types. I'm a 16-year-old IT enthusiast who likes to program and wants to start a career in programming. Please excuse my English if I made any mistakes (and I certainly did). It's not my native language, but I'm improving it all the time.

Me and my projects

I like to write books and tutorials about everything I learn, as soon as I know the subject well. This helps me understand the subject completely, and I can teach others too. After part 1 of this book, I will develop part 2, which deals with float. I will write many other books for various programming languages.

BITZone

BITZone is my IT enthusiasts' community. Any beginner or experienced IT enthusiast, a programmer for example, can join our community. In this community we make different projects such as websites, programs and games. You can freely join the team. BITZone has had several sites, but now I'm building a major site, with an improved design and an improved PHP / MySQL engine. I am working on the new BITZone site at the same time as I'm writing this book, so it's still under construction. Anyhow, you may visit BITZone at www.BITZone.tk and join the forums, which are already online.

Contact

You can contact me on my website, or by mail at pociu@go.ro. You can also find me on IRC, on the Undernet server, on channel #BITZone, or by phone at +40723.359.419.


What data types are


Data types

Many programming languages use different data types. These data types can hold different values. Some data types hold small values and use less memory, while others hold bigger values but use more memory.

Adequate data type for adequate value

There is no reason to store a small value in a big data type, because the extra space it occupies in the computer's memory is wasted. Let's suppose you have a five-character word, like aloha, and the data types available to you can hold one, five, or ten characters. Which should you use? Of course, you can't use the data type that holds one character, because it has room to store only the first character, a. You can use the data type that holds ten characters, and the situation would look like this:

[a][l][o][h][a][ ][ ][ ][ ][ ]

Not too clever. You're wasting room for five additional characters. The five slots that remain empty are still reserved, because you declared 10 characters to be used. Therefore, the remaining 5 empty slots cannot be used by another instance of a data type. They are reserved for this instance only. On the other hand, you can expand the word later, if you wish, to a maximum of 10 characters. For example, you can write alohaaaaa! and fill all the empty slots. However, if you know this value doesn't change, why should you use the ten-character data type and not the five-character one?

You save memory space and, at the same time, your program gains performance.


This was only an example to show you why data types are important. It is not a real-life example from C++. In the next lessons, we will work with more realistic examples.
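The aloha example can still be mimicked in C++ with fixed-size character buffers. The following is only a sketch: the names reserved_bytes and unused_slots are made up for this illustration, and the terminating '\0' that real C strings need is left out so that the numbers match the ones in the text.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of bytes reserved by a buffer of N characters.
// sizeof(char) is 1 by definition, so a char[N] always occupies N bytes.
template <std::size_t N>
constexpr std::size_t reserved_bytes()
{
    return sizeof(char[N]);
}

// Hypothetical helper: slots that stay empty when `word` is stored in a
// buffer with room for `capacity` characters (0 if the word does not fit).
inline std::size_t unused_slots(const char* word, std::size_t capacity)
{
    const std::size_t length = std::strlen(word);
    return length <= capacity ? capacity - length : 0;
}
```

Under these assumptions, unused_slots("aloha", 10) reports the five wasted slots of the ten-character buffer, while unused_slots("aloha", 5) reports none.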


Signed / Unsigned
Zero does make a difference

Let's suppose we have a data type that can hold 6 different numbers. That means it can hold the numbers 0, 1, 2, 3, 4 and 5. Many people make the mistake of thinking that the 6 numbers are the numbers from 1 to 6, and others make a bigger mistake and say that it can hold the numbers from 0 to 6. Computers treat the number 0 like any other number. You must remember that. In this book 0 is counted together with the positive numbers: it is the first one, and 1 is the second. We have 6 slots for 6 numbers, and these numbers start at 0 and end at 5:

0  1  2  3  4  5

I hope you agree now that if we say a data type holds x values, it actually holds the values from 0 to x-1. For example, if a data type holds 1000 values, it actually holds the numbers from 0 to 999.

Positive and negative

Until now, we only saw fictional examples of data types that hold positive values. However, negative numbers are also used often in computer programming. Are there any special data types that hold only negative numbers? No. The usual data types, which you will learn about in the following lessons, come in two kinds: ones that hold only positive numbers, and ones that hold both positive and negative numbers. Let's take our previous example, the data type that holds 6 numbers, from 0 to 5. Let's suppose that this fictional data type must hold negative numbers too. However, there are only 6 slots, and all of them are taken by positive numbers, so there is no room left for negative ones. No problem: we reserve room for the - (minus) sign, because negative numbers look the same as positive numbers except for the minus sign in front of them. The 6 slots are then split in half, three for negative numbers and three for positive ones. Therefore, instead of the numbers 0, 1, 2, 3, 4 and 5 we have:

-3  -2  -1  0  1  2

You can see that 0 is counted here with the positive numbers, just as I said earlier.

Signed and unsigned

You have probably realized that you have just learned what signed and unsigned data types are. Let's review the previous examples:

0  1  2  3  4  5


This is an unsigned data type, because it accepts only positive values (and zero). Now look at this example:

-3  -2  -1  0  1  2

This is a signed data type: it is split into two kinds of numbers, positive and negative. When you use an unsigned data type, you must specify it, and you'll see later in this book how this is done. When you use a signed data type, you don't have to specify it, because data types that don't have this specified are signed by default (they support both positive and negative numbers).

Advantages and disadvantages

You probably already understand the advantages and disadvantages of unsigned and signed data types. Unsigned data types are better in some cases. Take, for example, a program that calculates the number of guests, tables, chairs, plates, etc., for a party. You can never have a negative number of guests, so you should use the unsigned data type and gain more numbers on the positive side. You don't need the useless negative numbers. On the other hand, if you write a program that holds your bank account information, with your daily income, spending and debt, you should use the signed version of that data type, because if you are in debt, the money left in your account will be represented with a minus sign (e.g. -12530$). For the computer it makes no difference whether you use the signed or the unsigned version of a data type, as far as memory space and performance are concerned.
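The counting rules of this chapter can be written down as a small sketch. The Range type and the two function names are invented for this illustration. The code assumes, as the examples above do, that the number of values is even and that zero takes one place on the positive side.

```cpp
// Hypothetical type for this illustration: lowest and highest value of a range.
struct Range
{
    long long lowest;
    long long highest;
};

// An unsigned type with `count` values holds the numbers 0 .. count-1.
constexpr Range unsigned_range(long long count)
{
    return Range{0, count - 1};
}

// The signed version splits the same `count` values into two halves.
// Zero takes one place on the positive side. Assumes `count` is even.
constexpr Range signed_range(long long count)
{
    return Range{-(count / 2), count / 2 - 1};
}
```

For the fictional six-value type, unsigned_range(6) gives 0 to 5 and signed_range(6) gives -3 to 2. The total number of values stays the same; only its position on the number line changes.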

Integers


What are they

If you look in a dictionary for a definition of integer, you will find something like this: "whole number: any positive or negative whole number or zero". This is the definition you probably met in your elementary school textbooks. I hope that after this definition you agree that integers are numbers like 12, 3, 2134, 0, -230, etc. There are five integer types: int, short, long, char and bool. The first four can each be signed or unsigned, which makes eight different types for you to use, plus bool.

Data types don't have the same size everywhere, because their size depends on the machine and the compiler. We talked about this in the introduction, but I'm reminding you that if you use the code in this book with a compiler other than Microsoft Visual C++ 5 or 6, or on an operating system other than Windows (NT, 95, 98, Me, 2000, XP or 2003), you might get different results and be confused by them. If you compile most of our example code with an older compiler like Borland C++ 3.1, you will notice big differences. We will mention some of them, but not all.

Before we move on

The following lessons will help you understand and use the different data types. To follow them, you must know how the computer stores data: what bits and bytes are, and the other basics of how computers store data.

int

The int data type is used often. It stores whole numbers only. It has a size of 4 bytes on Windows with VC++ 5.0 and 6.0. With the old Borland C++ 3.1 for DOS, int has a size of 2 bytes. However, that's not our subject here. As you should know, a byte holds 8 bits. An int occupies 4 bytes, which means 32 bits (4 bytes * 8 bits). Again, as you should already know, 32 bits mean 4294967296 different combinations, because 32 bits give 2^32 combinations


(that means 2 to the power of 32, or 2 multiplied by itself 32 times). The result is:

2^32 = 4294967296

That means you can store any number from 0 to 4294967295 in an int. Why up to 4294967295 and not 4294967296? Because zero is a number too, and it occupies one of the places in this int. When programming, you must get used to including 0 everywhere, just like 1, 12, or any other number. For example, let's imagine a data type that can store 6 numbers. That means it can store the numbers from 0 to 5: 0, 1, 2, 3, 4, 5 are 6 numbers. So, with int we can store numbers from 0 to 4294967295. Yet, not exactly. Remember, there are negative and positive numbers. If we use int for storing numbers from 0 to 4294967295 and we need a negative number, what shall we do?

Positive / Negative

We have reached the part in which you'll learn how negative numbers are represented. I need an example to demonstrate how signed and unsigned data types work, and because you have just learned about the int data type, let's use it. We said that an int occupies 4 bytes, that is 32 bits, and 32 bits mean 2^32 different values. Moreover, 2^32 equals 4294967296, so that is how many numbers we can count starting from zero (zero is considered positive here). Earlier we asked ourselves, "How can we represent negative numbers?" Well, negative numbers are the same as positive numbers, except for the - (minus) sign in front of them. 12 is the same as -12 except for the - sign, right? What if we take one bit from the 32 that an int has and use it for the sign, to denote whether the number is negative or positive? In simplified form, that is how programming languages handle negative and positive numbers. If we take that one bit, we have 31 left (32 bits - 1 bit) for representing the numbers.
Because a bit can have two values (1 and 0, true and false), we have 2^31 different values for representing the numbers:

2^31 = 2147483648

That is half of the values we had earlier. But now we have two kinds of values with 2147483648 combinations each, one with plus and one with minus, because we reserved a bit for the sign (- or +). Therefore, we can hold values from -2147483648 to +2147483647. If you are asking yourself why +2147483647 rather than +2147483648, remember what we said earlier: 0 is counted with the positive numbers, so it takes one of the 2147483648 positive places. I hope that clears it up once and for all. Finally, you can see that we again have the same total of 4294967296 values, but now they are divided into two equal parts, positive and negative.
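The arithmetic above can be checked with a few lines of C++. This is a sketch with made-up function names (value_count, unsigned_max, signed_min, signed_max). It uses the fixed-width types from <cstdint> so that the helpers themselves do not overflow, and it is only meant for bit counts from 1 to 63.

```cpp
#include <cstdint>

// Number of different values that `bits` bits can represent: 2^bits.
// Valid for bits <= 63 (a 64-bit counter cannot hold 2^64 itself).
constexpr std::uint64_t value_count(unsigned bits)
{
    return std::uint64_t{1} << bits;
}

// Largest value when all the bits are used for the number (unsigned).
constexpr std::uint64_t unsigned_max(unsigned bits)
{
    return value_count(bits) - 1;
}

// Smallest value when one bit is taken for the sign.
constexpr std::int64_t signed_min(unsigned bits)
{
    return -static_cast<std::int64_t>(value_count(bits - 1));
}

// Largest value when one bit is taken for the sign. Zero uses one place.
constexpr std::int64_t signed_max(unsigned bits)
{
    return static_cast<std::int64_t>(value_count(bits - 1)) - 1;
}

// The numbers quoted in the text for a 32-bit int.
static_assert(value_count(32) == 4294967296ULL, "2^32 values");
static_assert(signed_min(32) == -2147483648LL, "lowest signed value");
static_assert(signed_max(32) == 2147483647LL, "highest signed value");
```

The static_assert lines are checked by the compiler, so the file compiles only if the figures from the text are right. Calling the same helpers with 16 instead of 32 gives the ranges of a 2-byte type.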


Signed / Unsigned

You have just learned what signed and unsigned data types are. "I did?" you may ask. Yes, you did. Negative and positive values: this is what signed and unsigned data types are about. Let's review. Signed values are the ones that have that sign bit, + or -, which splits the 4294967296 values into two halves. The integer types int, short and long are signed by default, so there is no need to specify that you want them signed. Unsigned data types are the ones that hold only positive values. An unsigned int can hold 4294967296 values (the numbers from 0 to 4294967295). Why don't unsigned data types need a sign to denote that they are positive, so that they can use all 32 bits? Because if a number cannot be negative, what can it be? Positive, of course. You usually don't write a + in front of positive numbers, because numbers written without a sign are positive by default, agreed? Unlike signed, unsigned is not the default in programming languages. If you want your data type to be unsigned, just specify it. You will see in the next lesson how this is done.

Unsigned int

I just said that data types are signed by default, which means we don't need to tell the data type anything:

    int x;

We have just defined a signed int variable named x. It's signed because we didn't specify anything, and signed is the default for the int type. Of course, if you want, you can specify this, but it's definitely a waste of code and time:

    signed int x;

This variable can hold any number from -2147483648 to +2147483647. However, in this lesson we talk about how to define an unsigned int variable. I believe you have already guessed how this is done:

    unsigned int y;

If you write signed or unsigned without naming a type such as short or long, C++ uses int, so you can also write:

    unsigned y;

This variable, y, can hold any number from 0 to 4294967295.

INT_MIN / INT_MAX / sizeof

How can you find the minimum and maximum numbers an int can hold? You might say that you already know: a signed int can hold a minimum of -2147483648 and a maximum of +2147483647, and an unsigned int has a minimum of 0 and a maximum of 4294967295. However, this is true only on PCs with Windows and newer compilers, like VC++ 5 or 6, or Borland C++ Builder 6. What if you use a different compiler, OS, or even some other type of computer? We write a little program that tells us the minimum and maximum values for that data type. For int, we have INT_MIN and INT_MAX for that. Let's see the complete code for our little program. The program was compiled and run on a PC with Windows XP, using VC++ 5.0, VC++ 6.0, Borland C++ Builder 6 and the old Borland C++ 3.1 for DOS. Here you will see some major differences:

    #include <iostream>
    #include <climits>   // INT_MIN, INT_MAX
    using namespace std;

    int main()
    {
        // sizeof gives the size in bytes
        cout << "int has " << sizeof(int) << " bytes.\n";
        cout << "Minimum int value = " << INT_MIN << "\n";
        cout << "Maximum int value = " << INT_MAX << "\n";
        return 0;
    }

Now it's time to see the output of the compiled code with different compilers. In Visual C++ 5 and 6, and in Borland C++ Builder 6, we get the same output, because these are the newer compilers:

    int has 4 bytes.
    Minimum int value = -2147483648
    Maximum int value = 2147483647

Compile this with Borland C++ 3.1 for DOS and note the difference:

    int has 2 bytes.
    Minimum int value = -32768
    Maximum int value = 32767

That's because in the days of Borland C++ 3.1, the int data type had 2 bytes. I believe you can now see the importance of using INT_MIN and INT_MAX when dealing with a compiler you are not familiar with. You never know what the compiler may produce, and you may get errors without knowing where they come from. Often the cause is that the compiler uses a different size for the data type than you expected. In that case, the problem you run into is an overflow.

Didn't I forget something? Yes, sizeof. sizeof is an operator that returns the size in bytes of a type or variable. If you use sizeof with a variable, you can use it like this:


    sizeof x;

where x is a variable, for example. However, if you use it to find the size of a data type directly, you must put the type in parentheses, as in the following example:

    sizeof(int);

In our example above we used the second form.

Overflows

Let's see again what the dictionary says about this word: "flow or pour over: to pour out over the limits or edge of a container because the container is too full of liquid". Pretty much the same thing happens with a data type. An overflow is the result of assigning a number that is too big for that data type. For the data type you have just learned, int, let's store a value bigger than 2147483647, say 3424783547.

    #include <iostream>
    #include <climits>
    using namespace std;

    int main()
    {
        int x = 3424783547;   // too big for a 4-byte int
        cout << "variable x holds value " << x << ".\n";
        return 0;
    }

The VC++ 6.0 compiler gives the following output:

    variable x holds value -870183749.

This is clearly wrong, and the cause is an overflow. By the way, if the int is unsigned, like this:

    #include <iostream>
    #include <climits>
    using namespace std;

    int main()
    {
        unsigned int x = 3424783547;   // fits: below 4294967295
        cout << "variable x holds value " << x << ".\n";
        return 0;
    }

the output is now the right one:

    variable x holds value 3424783547.
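One way to avoid the surprise above is to test a value before storing it. The sketch below is one possible approach, not the only one. The function names fits_in_int and fits_in_unsigned_int are invented for this illustration. They rely on std::numeric_limits from the standard <limits> header, which reports the same limits as INT_MIN, INT_MAX and UINT_MAX, and on a C++11 or newer compiler.

```cpp
#include <limits>

// Hypothetical helper: true when `value` can be stored in an int
// without overflowing, whatever size int has with this compiler.
constexpr bool fits_in_int(long long value)
{
    return value >= std::numeric_limits<int>::min()
        && value <= std::numeric_limits<int>::max();
}

// Hypothetical helper: the same check for unsigned int.
constexpr bool fits_in_unsigned_int(long long value)
{
    return value >= 0
        && static_cast<unsigned long long>(value)
               <= std::numeric_limits<unsigned int>::max();
}
```

With a 4-byte int, fits_in_int(3424783547LL) is false while fits_in_unsigned_int(3424783547LL) is true, which matches the two programs above. With a compiler where int has another size, the answers follow that compiler's limits.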

short

Even its name tells you that short is a data type shorter than int, which means it can store less. Actually, the full name is short int (a shorter int). A short is 2 bytes long, that is 16 bits. Therefore, if it is unsigned, it can hold values from 0 to 65535 (65536 different values). By default, like int, it is signed, and it can hold numbers ranging from -32768 to +32767. To find the minimum and maximum values a short can hold on different compilers, you can use:

SHRT_MIN / SHRT_MAX

These are the same as INT_MIN and INT_MAX, but for the short data type.

    #include <iostream>
    #include <climits>   // SHRT_MIN, SHRT_MAX
    using namespace std;

    int main()
    {
        cout << "short has " << sizeof(short) << " bytes.\n";
        cout << "Minimum short value = " << SHRT_MIN << "\n";
        cout << "Maximum short value = " << SHRT_MAX << "\n";
        return 0;
    }


This is the complete code we use for testing the size of short. Again, let's see the result, this time only in VC++ 5 and 6, because you saw earlier how the output differs when compiling with other (usually older) compilers. Moreover, I told you at the beginning of the book that we'll work with VC++ only.

    short has 2 bytes.
    Minimum short value = -32768
    Maximum short value = 32767

This is the output VC++ 5 and 6 show in a DOS box under Windows XP.

long

It stored more than int, and therefore much more than short. Just as with short, the full name is long int (a longer int). Why did I say "it stored more"? Because it doesn't store more any longer. In the old days of DOS, int was 2 bytes long and long was really long, 4 bytes. Now int is 4 bytes, and long is 4 bytes too, at least with compilers like VC++ 5 or 6. Therefore, it doesn't make a big difference whether you use int or long, because now they are the same. What's funny is that in the old days of DOS, short was the same size as int. Now short is smaller than int, and long is the same size as int.

LONG_MIN / LONG_MAX

These are the same as INT_MIN and INT_MAX, and SHRT_MIN and SHRT_MAX, but for the long data type. With many newer compilers, they produce the same output as for int.

    #include <iostream>
    #include <climits>   // LONG_MIN, LONG_MAX
    using namespace std;

    int main()
    {
        cout << "long has " << sizeof(long) << " bytes.\n";
        cout << "Minimum long value = " << LONG_MIN << "\n";
        cout << "Maximum long value = " << LONG_MAX << "\n";
        return 0;
    }

And the output with VC++ 5 and 6:

    long has 4 bytes.
    Minimum long value = -2147483648
    Maximum long value = 2147483647

This is the output VC++ 5 and 6 show in a DOS box under Windows XP.
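The three test programs for int, short and long differ only in the type they name, so they can be folded into one template. The following is a sketch that assumes a C++11 or newer compiler. IntegerInfo and describe are names invented for this illustration, and the helper is meant only for signed types no wider than long long.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical record for this illustration: limits and size of one type.
struct IntegerInfo
{
    long long minimum;
    long long maximum;
    std::size_t size_in_bytes;
};

// Collects what INT_MIN / INT_MAX / sizeof (and their short and long
// counterparts) report, for any signed integer type T up to long long.
template <typename T>
constexpr IntegerInfo describe()
{
    return IntegerInfo{std::numeric_limits<T>::min(),
                       std::numeric_limits<T>::max(),
                       sizeof(T)};
}

// The language guarantees only this ordering, not the exact sizes.
static_assert(sizeof(short) <= sizeof(int), "short is never bigger than int");
static_assert(sizeof(int) <= sizeof(long), "int is never bigger than long");
```

Printing describe<short>(), describe<int>() and describe<long>() with the compiler you actually use shows at a glance whether long is wider than int there or, as with VC++ 5 and 6, the same size.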


char

Until now we only stored numbers. What about characters? For those we use char. The char data type is used especially for characters. A char always occupies exactly 1 byte. The 8 bits of a char allow it 256 different values. This means 256 characters, which is enough for English and many Western European languages, although not for every language in the world.

Computers use numerical codes for storing characters. Actually, they use integers. Every character has its own numerical code. You have probably heard of the ASCII standard. ASCII, the American Standard Code for Information Interchange, is the most frequently used character set. The character A has the code 65 in ASCII, and B has the code 66. Uppercase characters have different codes from lowercase ones: a has its own code, 97, while A is 65. Signs are also characters, so #, %, & and ? are characters too. Even the space we leave between words is a character, and, by the way, it has the ASCII value 32.

Manipulating char

This code defines a variable of type char that holds the character X. cout outputs the character that the variable holds:

    #include <iostream>
    using namespace std;

    int main()
    {
        char x = 'X';
        cout << "Variable x holds character: " << x << "\n";
        return 0;
    }

The following code asks you to enter a character and then displays it:

    #include <iostream>
    using namespace std;

    int main()
    {
        char x;
        cout << "Enter a character:\n";
        cin >> x;
        cout << "You typed " << x << ".\n";
        return 0;
    }

You can also type the ASCII code of a character and have the program output the character represented by that code. Alternatively, in reverse, you can type the character and see its ASCII code. Let's see some example code:

    #include <iostream>
    using namespace std;

    int main()
    {
        int x;
        cout << "Enter ASCII character code:\n";
        cin >> x;
        // char(x) converts the code into the character it stands for
        cout << "Character is " << char(x) << ".\n";
        return 0;
    }

First, an ordinary int variable is declared. Then the user is asked to type a character code, which is stored in the int variable named x. In the second output, char(x) converts the content of the variable to char. Therefore, char takes the code (let's suppose you typed 65) and treats 65 as the ASCII code of a character (A). Now, to type a character and see its ASCII code:

    #include <iostream>
    using namespace std;

    int main()
    {
        char x;
        cout << "Enter character:\n";
        cin >> x;
        // int(x) converts the character into its numerical code
        cout << "ASCII code for the letter is " << int(x) << ".\n";
        return 0;
    }
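Because characters are stored as numbers, you can also do arithmetic with them. The sketch below wraps the two conversions from the programs above in small functions and adds an uppercase conversion based on the distance between a (97) and A (65). The function names are invented for this illustration, and the code assumes the ASCII character set.

```cpp
// Code of a character, the same conversion as int(x) in the text.
constexpr int code_of(char c)
{
    return static_cast<int>(c);
}

// Character for a code, the same conversion as char(x) in the text.
constexpr char char_of(int code)
{
    return static_cast<char>(code);
}

// In ASCII every lowercase letter lies 32 places after its uppercase
// letter (97 - 65 = 32). Other characters are returned unchanged.
constexpr char to_upper_ascii(char c)
{
    return (c >= 'a' && c <= 'z')
        ? static_cast<char>(c - ('a' - 'A'))
        : c;
}
```

In real programs the standard library function std::toupper from <cctype> does the uppercase conversion for you. The hand-written version is here only to show that a character is an integer underneath.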


bool

bool is a newer C++ data type, and a very useful one. It is named in honor of the great mathematician George Boole. There are only two boolean values, true and false. A bool needs only 1 bit of information, because there are only two possible values, 0 and 1 (false and true), although in memory it usually occupies a whole byte.

    bool x = true;
    bool y = false;

A bool value can be assigned to an int:

    int x = true;
    int y = false;

Now x is 1 and y is 0. In the other direction, any positive or negative value is considered true. Only 0 represents a false value.

    bool x = 1;     // true
    bool y = 235;   // true
    bool z = -53;   // true
    bool q = 0;     // false
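The two conversions described above can be tried out with a short sketch. The function names as_bool, as_int and count_true are invented for this illustration.

```cpp
// Any non-zero int converts to true. Only 0 converts to false.
constexpr bool as_bool(int value)
{
    return static_cast<bool>(value);
}

// true converts to 1 and false converts to 0.
constexpr int as_int(bool flag)
{
    return static_cast<int>(flag);
}

// Because true is 1 and false is 0, adding bools counts the true ones.
constexpr int count_true(bool a, bool b, bool c)
{
    return as_int(a) + as_int(b) + as_int(c);
}

// A bool carries one bit of information but occupies at least one byte.
static_assert(sizeof(bool) >= 1, "sizeof never reports less than 1 byte");
```

For example, as_bool(235) and as_bool(-53) are both true, as_bool(0) is false, and count_true(true, false, true) is 2.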