
INTRODUCTION TO C

C was developed by Dennis Ritchie at AT&T's Bell Laboratories in the USA in 1972. C is regarded as a
reliable, simple and easy-to-use programming language. C also allows its programmers to access memory
locations directly.

C CHARACTER SET
The character set is the set of all characters that are allowed in C programming language constructs. These
characters are used to form variables, expressions and statements. The C character set is classified into 4
categories:
1. Letters: a to z and A to Z, that is, all English letters in both upper and lower case.
2. Digits: all the decimal digits from 0 to 9.
3. White spaces: the blank areas that are used to separate words or tokens in a C program. White space
can be produced with the space bar, tabs, new lines or comments.
4. Special characters: characters with certain predefined purposes. Some examples of special
characters, in the form of operators and punctuators, are as follows:

+, -, *, /, %, <, >, <=, >=, ==, !=, =, @, [, ], (, ), $, ", ;, :, &, ^ etc.


TOKENS

A token is the smallest element of a C program that is meaningful to the compiler. Before the
compiler can translate the C program into machine language, it identifies the tokens that make up
the program during its lexical analysis phase. The C compiler recognizes the following categories of
tokens:
1. Identifiers
2. Keywords
3. Literals
4. Operators
5. Punctuators
A C program consists of tokens that must be identified before they are parsed. The tokens are separated
by any of the following, collectively known as “White Space”:
1. Blanks
2. Horizontal or vertical tabs
3. New lines
4. Comments
IDENTIFIERS
Identifiers are the names given to the user-defined elements of a program. They are basic building blocks
of a program and are used to name variables, functions or symbolic constants. Identifiers, often referred
to as user-defined elements, are different from the keywords used in the C language. The following naming
rules apply when you name a variable, function or constant:

1. The name must start with a letter or an underscore.
2. The name can consist of letters, underscores and digits.
3. The name cannot contain blank spaces.
4. The name should be no longer than 31 characters, because older (C89) compilers are only required to
treat the first 31 characters of a name as significant.
5. No keyword can be used as an identifier name.
6. Identifier names are case sensitive.
Some examples of valid identifiers: name, obj1, n_1, Name_student etc.
Some examples of invalid identifiers: 1name, n.g, int, float
The last two are invalid because they are keywords of the C language. The concept of keywords is explained
next.

KEYWORDS
Keywords are reserved words of a programming language that have a special predefined meaning
understood by the compiler. Keywords cannot be used as names for user-defined functions, variables or
other identifiers. The following is the list of keywords available in C.

auto      const     double    float     int       short     struct
unsigned  break     continue  else      for       long      signed
switch    void      case      default   enum      goto      register
sizeof    typedef   volatile  char      do        extern    if
return    static    union     while

List of keywords used in C language

The availability of additional keywords also depends on the compiler that you are using.

OPERATORS
Operators are special symbols that are used to perform operations in C. For example, in c = a + b;
the names ‘a’, ‘b’ and ‘c’ are operands and ‘a+b’ is an expression. Two operators are used here. The
first is the ‘+’ operator, which adds the values of the identifiers ‘a’ and ‘b’. The second is the ‘=’
operator, which assigns the result of the addition to the identifier ‘c’.
There are many more operators available in C, discussed later in this book.
LITERALS/SYMBOLIC CONSTANTS
A literal is a constant value written directly in the program. The value of a literal never changes
during the program. C has 4 kinds of literals:
1. String literal
2. Integer constant
3. Floating constant
4. Character constant
A string literal is simply an array of characters. It ends with the special character '\0', called the
null character, which is added automatically at the end of the string.
For example:
"abhi" is a string literal.
An integer literal contains an integer value, that is, a whole number like 20.

A float literal contains a real value, that is, a number like 23.8. A literal such as 23.8 has type double
by default; adding the suffix f, as in 23.8f, makes it a single-precision float.

A character literal contains a single character as its value. When stored in a char variable it takes
1 byte in memory.

Character literals fall into the following categories.

Character constants are simply single character values given in single quotes, like ‘a’, ‘b’ etc. Character
constants can also be escape sequences: special character literals with a predefined meaning that begin
with a ‘\’ backslash and are mainly used with printf. The following is the list of escape sequences
available in C.
Escape Sequence    Represents
\a                 Bell (alert)
\b                 Backspace
\f                 Formfeed
\n                 New line
\r                 Carriage return
\t                 Horizontal tab
\v                 Vertical tab
\'                 Single quotation mark
\"                 Double quotation mark
\\                 Backslash
\?                 Literal question mark

Table 2.3 Escape sequences used in C

A symbolic constant is a name that substitutes for a sequence of characters that cannot be changed.
The characters may represent a numeric constant, a character constant or a string. When a program is
compiled, the source file is first preprocessed and each occurrence of a symbolic constant is replaced
by its corresponding character sequence. Symbolic constants are usually defined at the beginning of the
program using the #define preprocessor directive. They may then appear later in the program in place of
the numeric constants, character constants, etc.
For example,

#define num 15

Wherever num is written in the program, it is replaced by 15 before the source file is compiled.

PUNCTUATORS
Punctuators are also called separators. They are used to separate two identifiers or two units of data,
to mark the beginning and end of a programming construct, and to separate statements.

Punctuator symbol   Name              Explanation
,                   Comma             Variable separator, like int a, b
;                   Semicolon         Used for terminating a statement, like a=b;
:                   Colon             Used for defining labels of goto statements and in the syntax
                                      of the switch statement. Both are covered in the control
                                      statements chapter.
()                  Parentheses       Used in expressions, like (a+b) +c
[]                  Square brackets   For declaration of arrays, like int a[12]
{}                  Curly braces      For defining compound statements

DATA TYPES IN C
Figure 2.1 data types in C

Type name                          Description                      Bytes  Range of values
int, short int, signed int         Both +ve and -ve whole numbers   2      -32,768 to 32,767
unsigned int, unsigned short int   Only +ve whole numbers           2      0 to 65,535
long int, signed long int          Both +ve and -ve whole numbers   4      -2,147,483,648 to 2,147,483,647
unsigned long int                  Only +ve whole numbers           4      0 to 4,294,967,295
char, signed char                  Character or small integer       1      -128 to 127
unsigned char                      Character with +ve values only   1      0 to 255
float                              Both +ve and -ve real numbers    4      3.4E +/- 38 (7 digits)
double                             Both +ve and -ve real numbers    8      1.7E +/- 308 (15 digits)
long double                        Both +ve and -ve real numbers    10     3.4 x 10^-4932 to 1.1 x 10^+4932

Table 2.5 Range and description of various data types

BASIC DATA TYPES

INTEGER OR SHORT INTEGER

On a compiler that uses 2-byte integers, as assumed in Table 2.5, an integer is a group of 16 contiguous
bits or 2 bytes in memory used to represent a whole number. The maximum numeric value that can be
represented with 16 bits is 65535. A short integer is signed by default and can store a value from
–32768 to 32767. If you explicitly declare an integer or short integer as unsigned, it can store a value
from 0 to 65535.

LONG INTEGER
An integer variable can also be declared with the long keyword. In Table 2.5 the long keyword designates
a 32-bit integer; if declared unsigned, its value ranges from 0 to 4,294,967,295. A long integer can be
declared as a signed or unsigned long integer, and it behaves like a simple integer with a larger range.

CHARACTER

It is a group of 8 bits or 1 byte used to hold a single character. A character is an individual symbol and it
could be any of the following:
A lowercase letter: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, and z
An uppercase letter: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, and Z
A digit: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9
A special character: ` ~ # $ ! @ % ^ & * ( { [ ) } ] | \ : ; " ' + - < _ ? > , / =
To declare a variable as a character, use the C keyword char followed by a valid C name. Here is an
example:
char option;

FLOATING POINT NUMBER


The most fundamental floating-point variable is declared with the float keyword. A float is a 4-byte or
32-bit real number whose magnitude ranges from roughly 3.4 x 10^-38 to 3.4 x 10^38. Float variables can
be declared as follows:

float size, weight;

You can also initialize the variable while declaring it as follows:
float size = 10.7;
Out of 32 bits, 1 bit is used to store the sign, 8 bits to store the exponent and 23 bits to store
the mantissa.
DOUBLE FLOATING POINT NUMBER
For a variable that exceeds the range of float or requires more precision, use the double keyword.
A double-precision number is an 8-byte or 64-bit real number whose magnitude ranges from roughly
1.7 x 10^-308 to 1.7 x 10^308.
We can declare a double-precision number as follows:
double size;
Out of 64 bits, 1 bit is used to store the sign, 11 bits to store the exponent and 52 bits to store
the mantissa.

VOID
Generally void means nothing or empty. The void data type does not correspond to any kind of value, so
we cannot declare an ordinary variable with the void data type. The void data type is used with pointers
and functions. Pointers and functions are discussed later in this book in great detail.

DERIVED DATA TYPE


This category of data types is derived from the fundamental data types.
ARRAY
An array is a collection of homogeneous data elements, that is, elements of the same data type. Its
elements are indexed sequentially starting from 0. Arrays are discussed later in this book.
FUNCTION
A function can be thought of as a block of statements that performs a specific operation. Functions
are of two types: library functions and user-defined functions. Functions are also discussed later in
this book.
POINTER
A pointer is a special variable that holds the address of another variable, another pointer or a function.
Pointers are among the most important concepts in C and are also discussed later in this book.

USER-DEFINED DATA TYPE


This category of data types allows a programmer to define a new data type built from existing data types.
STRUCTURE
A structure is created using the struct keyword. A structure is similar to an array, except that it is
a collection of variables of heterogeneous or different data types.
Syntax:
struct structure_name
{
    datatype member_1;
    datatype member_2;
    .
    .
    datatype member_n;
};
UNION
A union is created using the union keyword. A union is similar to a structure, except that the members
or data elements of a union share the same memory space.
Syntax:
union union_name
{
    datatype member_1;
    datatype member_2;
    .
    .
    datatype member_n;
};
ENUM
It stands for enumerated data type. The members of an enum data type are constants written as identifiers
that have signed integer values.
Syntax:
enum enum_tag { member1, member2, ..., member_n };

TYPE SPECIFIER

Type specifiers are also known as type modifiers. These modifiers are used along with the basic data types
when declaring data elements.

1. short
2. long
3. signed
4. unsigned
The signed modifier indicates that the value can be negative or positive, and the unsigned modifier
indicates that the value cannot be negative. For example,
unsigned int a; //value can range from 0 to 65535
signed int a; //value can range from -32768 to 32767
Using short and long with the basic data types changes the amount of memory taken by the data
elements and hence the range of values they can store. The amount of memory taken also depends
upon the compiler that we are using. short is only used with the integer data type.
The memory taken by a short integer is less than or equal to that of a normal integer, and the memory
taken by a long integer is greater than or equal to that of a normal integer.

TYPE QUALIFIERS

A type qualifier assigns a property to an identifier. There are two type qualifiers, namely
const and volatile. They are used as follows:
Const qualifier: This qualifier is used when we want the value of a variable to remain constant throughout
the execution of the program. To declare a variable with the const qualifier, simply write the const
keyword in the declaration statement. We can declare an identifier with the const qualifier using either
of the two forms given below:
const datatype identifier_name = value;
Example, const int a = 10;
datatype const identifier_name = value;
Example, int const a = 10;
An identifier can also be defined as a constant using the #define macro. Macros are explained later in
this book along with the difference between creating a constant with the const keyword and with a
#define macro.
Volatile qualifier: This qualifier indicates that the value of the data may be updated asynchronously.
We declare an identifier with the volatile keyword when we expect that its value can be changed at any
time by an external source such as another process, hardware or an interrupt, irrespective of the code
written for the program. An identifier with the volatile qualifier can be declared as follows:
volatile datatype identifier_name;
datatype volatile identifier_name;
The volatile qualifier can also be used along with the const qualifier. This means that the value of
the identifier cannot be changed by the code, but can be changed by an external factor. For example,
const int volatile a = 20;

VARIABLES
An entity whose value may vary during the execution of a program is called a variable. Variable names
are identifiers that refer to locations in memory. These locations can contain integer, real or
character values. The type of a variable decides the values that can be stored at the location it refers
to. For example, an integer variable can hold an integer constant, a real variable can hold a real
constant and a character variable can hold only a character constant.
For example: name, age, mark1 etc.
DECLARING A VARIABLE
Before a variable can be used, it must be declared. Declaring a variable means specifying its data type
and a valid identifier. For example,
int age;
char gender;
float size;

Errors in C

In C or C++, we face different kinds of errors. These errors can be categorized into five
different types, listed below:

• Syntax Error
• Run-Time Error
• Linker Error
• Logical Error
• Semantic Error

Let us see these errors one by one.

Syntax error
This kind of error occurs when the code violates the syntax rules of the C language. These errors
are reported by the compiler during compilation, so they are also known as compile-time errors.
In this example, we will see the syntax error produced when the semicolon at the end of a
statement is left out.

Example
#include<stdio.h>
int main() {
    printf("Hello World")
}

Output
[Error] expected ';' before '}' token
Runtime error
This kind of error occurs while the program is executing. As it is not a compilation error,
the compilation completes successfully. We can observe this error by dividing a number
by 0.
Example

#include<stdio.h>
int main() {
    int x = 52;
    int y = 0;
    printf("Div : %d", x/y);   /* integer division by zero */
}

Output
Program crashes during runtime.
Linker error
This kind of error occurs when the program compiles successfully but fails while the linker
combines the different object files with the main object file. When this error occurs, the
executable is not generated. Typical causes are a wrong function prototype, an incorrect
header file etc. If main() is written as Main(), a linker error is generated.

Example

#include<stdio.h>
int Main() {
    int x = 52;
    int y = 0;
    printf("Div : %d", x/y);
}

Output
C:\crossdev\src\mingw-w64-v3-git\mingw-w64crt\crt\crt0_c.c: undefined reference to
`WinMain'
Logical error
Sometimes we do not get the desired output even though the syntax and everything else is correct,
because of a mistake in the logic of the program. These are called logical errors. For example,
putting a semicolon right after a loop header is syntactically correct, but it creates an
empty loop. In that case, the program will not show the desired output.

Example

#include<stdio.h>
int main() {
    int i;
    for(i = 0; i<5; i++);   /* stray semicolon: the loop body is empty */
    {
        printf("Hello World");
    }
}

Output
Here we want the line to be printed five times, but it is printed only once, because the
block of code is not part of the loop.
Semantic error
This kind of error occurs when a statement is syntactically correct but has no meaning, much
like a sentence that follows the rules of grammar yet makes no sense. For example, placing an
expression on the left side of the assignment operator generates a semantic error.

Example
#include<stdio.h>
int main() {
    int x, y, z;
    x = 10;
    y = 20;
    x + y = z;
}
Output
[Error] lvalue required as left operand of assignment
