
Know C Language

1. ELEMENTS OF C LANGUAGE
1.1 INTRODUCTION
Nowadays computers play a vital role in the fields of space research, engineering,
medicine, industry, business and even music and painting. Applications such as guiding
Inter-Continental Ballistic Missiles (ICBMs) in defense and launching satellites into space
cannot even be imagined without computers.
Computers are used for solving problems quickly and accurately, irrespective of the magnitude
of the input. A sequence of instructions is communicated to the computer to solve a problem.
Programming languages were developed to communicate these instructions. A set of instructions
written in a programming language is called a program. A group of programs developed for a
certain purpose is referred to as software, whereas the electronic components of the computer
are referred to as hardware. Software activates the hardware of the computer to carry out the
desired task. In a computer, hardware without software is like a body without a soul.
Software can be system software or application software. System software is a collection of
system programs. A system program is a program designed to control and make effective use of
the processing capabilities of the computer itself. System programming is the activity
of designing and implementing system programs. Application software is a collection of
prewritten programs meant for specific applications.
Computer hardware can understand instructions only in the form of machine code, i.e.
0s and 1s. The programming language used to communicate directly with the hardware is
called a low level language or machine language. Machine language is very difficult to
understand because its instructions are sequences of 0s and 1s, and errors in machine
language programs are hard to identify. Low level languages are also machine dependent:
a program written for one type of machine does not run on another. To overcome these
difficulties, high level languages like BASIC, FORTRAN, COBOL and C were developed.
C LANGUAGE
The programming language C was developed in 1972 by Dennis Ritchie at AT&T
Bell Laboratories, New Jersey. It was mainly influenced by the languages BCPL and B. It
was named C to present it as the successor of the B language, which was designed earlier
by Ken Thompson in 1970 for the first UNIX system. The chronological order of the
development of C is given in Fig 1.1.
C proved to be an excellent programming language for writing system programs.
Hence it gained wide popularity among programmers in research centers, universities and
colleges. The UNIX operating system, its C compiler and most UNIX application software are
written in C.

ALGOL 60 (Algorithmic Language)
Developed by an international committee, 1960
        |
CPL (Combined Programming Language)
Developed at Cambridge and the University of London, 1963
        |
BCPL (Basic CPL)
Developed by Martin Richards, 1967
        |
B
Developed by Ken Thompson, Bell Labs, 1970
        |
C
Developed by Dennis Ritchie, Bell Labs, 1972

Fig 1.1 Chronological order of the development of C
The American National Standards Institute (ANSI) established a committee in 1983 to
standardize the C language. The result is the ANSI standard for C, approved in 1989.

1.2 FEATURES/CHARACTERISTICS OF C LANGUAGE


C is attractive and popular for the following reasons:

C is simple to learn and use.
C programs are efficient and execute fast.
C supports only a small set of basic data types.
C is rich in built-in functions and standard library functions.
C is highly portable. Portability means that a C program written in one environment
can be executed in another environment. For example, a C program written in a
DOS or Windows 95 environment can also be executed in a UNIX environment with
little or no modification.
C is a structured programming language.
C is extensible: programmers can write their own files or functions and include them
in other programs.
C is also called a middle level language because it has features of both high level and
low level languages. High level languages are easy and quick to learn, their syntax
resembles English statements, and their programs are not machine dependent. Low level
languages give direct access to the machine, but their programs are very difficult to
create and modify.
In C, memory addresses can be accessed directly by using pointers.
C programs are made up of functions.
C permits recursion.
C is case sensitive: it treats upper case and lower case letters as distinct.

However, C has its own limitations, as given below:

The associativity of operators is not uniform.
The precedence of some operators is counter-intuitive.
The same operator is used for multiple purposes.
There is no built-in input/output facility; input and output are done through library functions.

1.3 CHARACTER SET OF C


Any language like English, Hindi, Telugu, Tamil or German requires an alphabet to form
words. Likewise, a programming language needs a set of characters to write a program.
The set of characters used in a language is known as its character set, and each of these
characters can be represented in the computer. Every language uses its character set to form
the words or symbols that make up its vocabulary. C has its own character set. The
character set for ANSI Standard C (ANSI C) is given below.
Upper-case letters: A to Z
Lower-case letters: a to z
Decimal digits: 0 to 9
Special characters: + - * / % = < > blank : ; , . ? ! # \ ( ) { } _ [ ] & | ^ ~ ' "

1.4 IDENTIFIERS

Identifiers are symbolic names used to refer to entities such as data types, constants,
variables, functions and arrays in a program. An identifier is a sequence of characters. In C,
identifiers are formed from letters, digits and the underscore ( _ ). The first character
must always be a letter or an underscore. The remaining characters may be letters, digits
and underscores. Special characters other than the underscore are not allowed
in an identifier. Upper case and lower case letters are treated as distinct by C.
An identifier may contain many characters, but a compiler may consider only the first
few of them and ignore the rest. The characters considered by the compiler are known as
significant characters. Some older C compilers treat only the first eight characters of an
identifier as significant, so the names of functions and external variables were normally
kept shorter than eight characters. The ANSI standard requires at least the first 31
characters of an internal identifier to be significant.
Valid Identifiers
_BUFFER    Pay    X    A    Sum    net_pay_record_2    Area


Invalid Identifiers
3A -- The first character must be a letter or an underscore
a -- Special character is not allowed
loan-no -- Special character - (hyphen) is not allowed
book name -- Blank space is not allowed

1.5 VARIABLES
A variable is a data name that may be used to store a data value. A variable may take
different values at different times during execution. Variables are identifiers used for storing
and manipulating data in a program. They are the nouns of a programming language: the
entities that act or are acted upon. Variable names may consist of letters, digits and the
underscore ( _ ), which counts as a letter and is used to improve the readability of long
variable names. The variable names area and AREA are not identical because C distinguishes
between upper case and lower case letters.

1.6 RESERVED WORDS


There are certain words that have a standard, predefined meaning in C. They
are called reserved words or keywords and they can be used only for their intended purposes.
The user cannot redefine them or use them as names of variables or functions. Reserved
words are written in lower case letters only. The reserved words available in ANSI C are
given below.

auto        double      int         struct
break       else        long        switch
case        enum        register    typedef
char        extern      return      union
const       float       short       unsigned
continue    for         signed      void
default     goto        sizeof      volatile
do          if          static      while

1.7 DATA TYPES


C supports several different types of data, each of which may be represented
differently within the computer's memory. The basic data types are listed below together
with their typical memory requirements. (Note that the memory requirements for each data
type may vary from one C compiler to another.)
ANSI C data types fall into four categories:
1. Primary or scalar or simple data types
2. Secondary or derived data types
3. User-defined data types
4. Empty data type
Data type
    Scalar: int, float, char, double
    Derived: array, pointer, function, structure
    User-defined
    Empty

1.7.1 Scalar Data Types


A scalar data type is used for representing a single value only. The four scalar data types
are
int (integer)
char (character)
float (single precision)
double (double precision)
These are also known as primitive, primary, fundamental or basic data types.
Integer data type (int)
This data type holds whole numbers, i.e. values written without a decimal point. Integer
quantities can also be declared as short int, long int or unsigned int.
A short int typically occupies 16 bits (2 bytes): a signed short int ranges from -32,768 to
+32,767, whereas an unsigned short int ranges from 0 to 65,535. A plain int occupies either
16 bits (2 bytes) or 32 bits (4 bytes) depending on the compiler; a 16-bit int ranges from
-32,768 to +32,767. A signed long int occupies at least 32 bits (4 bytes); with 32 bits it
ranges from -2,147,483,648 to +2,147,483,647.
Character data type (char)
The char data type is used to represent individual characters. Hence, the char type
generally requires only one byte of memory. A char permits a range of values
extending from 0 to 255 in the case of unsigned char and from -128 to +127 in the case of
signed char.
Float data type (float)
This data type holds real numbers, i.e. values written with a decimal point. A float
typically occupies 32 bits (4 bytes) and covers magnitudes from about 3.4E-38 to
3.4E+38.
Double data type (double)
The type double can be used to increase accuracy. A double typically uses 64 bits,
giving a precision of about 15 decimal digits. Such values are known as double precision
numbers. To extend the precision further, the user can use long double, which uses 80 bits
on some compilers. The range for the double type is about 1.7E-308 to 1.7E+308 and the
range for the 80-bit long double type is about 3.4E-4932 to 1.1E+4932.
The following table gives the size and range of each data type, as found on a typical
16-bit compiler.
Type                              Size (bytes)   Range
char (or) signed char             1              -128 to 127
unsigned char                     1              0 to 255
int (or) signed int               2              -32,768 to 32,767
unsigned int                      2              0 to 65,535
short int (or) signed short int   2              -32,768 to 32,767
unsigned short int                2              0 to 65,535
long int (or) signed long int     4              -2,147,483,648 to +2,147,483,647
unsigned long int                 4              0 to 4,294,967,295
float                             4              3.4E-38 to 3.4E+38
double                            8              1.7E-308 to 1.7E+308
long double                       10             3.4E-4932 to 1.1E+4932

1.7.2 Derived data types
Derived data types are built from the scalar data types by grouping or relating
elements of those types. Note that a derived data type may be used for representing
a single value or multiple values.

The main derived data types are:
Arrays
Structures
Unions
Pointers
Functions

1.7.3 User-Defined data types


User-defined data types allow programmers to define their own type names and to
specify the set of values that a variable of such a type may store. The two kinds of
user-defined data types are
Type definition data type (typedef)
Enumerated data type (enum)


1.7.4 Void data type
Void means nothing. The void or empty data type is used in user-defined functions. It is
used as the return type when a function returns nothing. It is also used in place of the
argument list when a function does not take any arguments.

1.8 TYPE QUALIFIERS


The keywords short, long, signed and unsigned may precede some of the scalar data types
to specify the number of bits used for representing the data in memory and whether a sign
is stored. They are known as qualifiers or modifiers. Suppose 16 bits are used to represent
an integer in a system. In the case of a signed int, the most significant bit (bit 15) is used
as the sign bit and the other 15 bits (bits 14 to 0) represent the magnitude, as shown below.

bit 15    bits 14 ... 0
Sign      Magnitude (15 bits)
Fig: (a) Signed int representation

In the case of an unsigned int, all 16 bits are used for representing the magnitude, as
shown below.

bits 15 ... 0
Magnitude (16 bits)
Fig: (b) Unsigned int representation

The following table shows the qualifiers and data type representations.
Data Type   Qualifier that may precede   Resulting Data Type
int         long                         long int
            short                        short int
            unsigned                     unsigned int
            signed                       signed int
            unsigned long                unsigned long int
            unsigned short               unsigned short int
            signed long                  signed long int
            signed short                 signed short int
char        signed                       signed char
            unsigned                     unsigned char
double      long                         long double

1.9 CONSTANTS
Constants are fixed values used directly in a program; they remain unchanged
during the execution of the program. There are four basic types of constants in C.
They are
integer constants
floating point constants
character constants
string constants

1.9.1 Integer Constants


An integer constant is a sequence of digits without a decimal point, representing a value.
Rules
Commas and blank spaces cannot be included within an integer constant.
An integer constant can be preceded by a plus sign, if desired, or a minus sign. If no sign
appears, the integer is assumed to be positive.
The value should not exceed the maximum and minimum values that can be
represented by the number of bits the system uses for the int data type.
An integer constant may end with l or L representing a long integer, u or U representing
an unsigned int, and ul or UL representing an unsigned long int.

A decimal integer constant is formed using the digits 0 to 9.
Valid Decimal Integer Constants
0    +1    -1    625    -4236    9848382    600004    94137994UL

Invalid Decimal Integer Constants
345,123 -- Comma is not allowed
34 51 23 -- Blank spaces are not allowed
34-51+23 -- Special characters are not allowed
75.123 -- Decimal point is not allowed

An octal (base 8) integer constant is formed from the octal digits 0 through 7, with
a leading 0 (zero).
Valid Octal Integer Constants
00    01    0627    07757
Invalid Octal Integer Constants
627 -- The first digit must be 0 (zero)
0628 -- Only the digits 0 through 7 may be used
6.27 -- Decimal point is not allowed

A hexadecimal integer constant is formed from the hexadecimal digits 0 through
9 and A through F (either upper case or lower case), with a leading 0X or 0x (the digit
zero followed by X or x). The characters A through F represent the values 10 through 15.
Valid Hexadecimal Integer Constants
0x0    0X0    0XAB25    0X9A    0X5dff    0XABCD
Invalid Hexadecimal Integer Constants
626 -- Does not lead with 0x or 0X
0X6.26 -- Decimal point is not allowed
0xabpq -- Only a through f and 0 to 9 are allowed

1.9.2 Floating Point Constants


Any number with a decimal point or an exponent is called a floating point constant.
In the exponential form, the exponent is written as an e or E followed by a positive or
negative integer. The value preceding the exponent is known as the mantissa.
For example, 2.5e4 is equivalent to 2.5 x 10^4
and 5E-3 is equivalent to 5 x 10^-3
Rules
Both the integer part and the fractional part consist of a sequence of digits.
The exponent of a floating constant written in exponential form must be an integer.
Either the integer part or the fractional part (not both) may be missing.
Either the decimal point or the exponent (not both) may be missing.
Special characters other than +, - and . are not allowed.
A floating constant should not exceed the limits of the values supported by the system
used.
A floating constant may end with f or F to mark it as single precision.

Valid Floating Constants
1.    0.5    .25    4E-6f    60000.    0.000415    527.415F    0.008e-4

Invalid Floating Constants
5 -- Either a decimal point or an exponent must be present
6000E-8.2 -- The exponent must be an integer quantity
4E 12 -- Blank space is not allowed
5,123.45 -- Comma is not allowed

A double precision floating constant is similar to a single precision one but has a wider
range of values and greater precision. In ANSI C a floating constant without a suffix has
type double; the suffix f or F makes it a float, and the suffix l or L makes it a long double.
Normally a single precision value is stored as a 32-bit number with 6 digits of precision,
whereas a double precision value is stored as a 64-bit number with 14 or more digits of
precision.
1.9.3 Character Constants
A character written within single quotes is called a character constant. A character
constant represents an integer value equal to the numeric value of the character in the
machine's character code.
If the system uses the ASCII character code, the character constant 'A' represents the
integer value 65.
Valid Character Constants
'X'    'Z'    '8'    '?'    '!'    ' ' (blank space)    '+'
Invalid Character Constants
'' -- No character inside single quotes
c -- No single quotes
"k" -- Only single quotes are allowed
'sum' -- Only a single character is allowed

1.9.4 String Constants


A string constant is a sequence of zero or more characters enclosed within double
quotes. The quotes are not part of the string. A string constant is also known as a string
literal.
Valid String Constants
"C Programming Language"
"X"
"626"
"" (empty or null string)
Invalid String Constants
Programming -- Double quotes are missing
'area' -- Only double quotes are allowed
"computer -- One double quote is missing

The compiler automatically places a special null character (\0) at the end of each
string constant. Because the internal representation of a string constant ends with the
null character, it is easy to find the end of a string. The null character is known as the
delimiter of the string.

The difference between a character constant and a string constant is illustrated
below.
'K' -- It gives the integer value of the character K in the machine's character code.
"K" -- It is a string constant that contains the character K and the null character.
Escape Sequences
A character constant represents a single character and a string constant represents
zero or more characters. An escape sequence is a character representation that may appear in
a character constant or in a string constant. Certain non-printing characters, as well as
the backslash (\) and the apostrophe ('), can be expressed in terms of escape sequences. An
escape sequence always begins with a backslash, which is followed by one or more special
characters.
The commonly used escape sequences are listed below.
Character             Escape Sequence   ASCII Value (decimal)
Bell (alert)          \a                007
Backspace             \b                008
Horizontal tab        \t                009
Vertical tab          \v                011
New line (linefeed)   \n                010
Form feed             \f                012
Carriage return       \r                013
Quotation mark (")    \"                034
Apostrophe (')        \'                039
Question mark (?)     \?                063
Backslash (\)         \\                092
Null                  \0                000

1.10 SYMBOLIC CONSTANTS


A symbolic constant is a name that substitutes for a sequence of characters. The
characters may represent a numeric constant, a character constant or a string constant. Thus
a symbolic constant allows a name to appear in place of such a constant. When a program
is compiled, each occurrence of a symbolic constant is replaced by its corresponding
character sequence. The general syntax of a symbolic constant definition is

#define name text

where name represents a symbolic name, typically written in upper case letters, and
text represents the sequence of characters that is associated with the symbolic name.
The rules for writing a #define statement are described below.

Symbolic names are written in upper case by convention, to distinguish them from ordinary
C variables.
A symbolic definition does not end with a semicolon.
Special characters cannot be used in a symbolic name.
Spaces should not be placed between # and define (some older compilers reject them).


Valid Symbolic Constants


#define PI 3.142857
#define MAXROW 10
#define ALARM '\007'
#define     COL 10

Invalid Symbolic constants


#define MAX45
#define SIZE=20
#define STR NAME;

1.11 DECLARATION AND INITIALIZATION OF VARIABLES


A declaration associates a group of variables with a specific data type. All variables
must be declared before they can appear in executable statements. A declaration consists of
a data type, followed by one or more variable names, ending with a semicolon. The memory
space required depends on the data type. The general format of a declaration is given
below.

data_type var1, var2, ..., varn;
where data_type may be one of the data types such as int, char, float or double and
var1, var2, ..., varn are the variable names.
Examples
int x, y, count, year ;
char c, ch, s ;
float area, volume ;
double wave_length, light_speed ;
Qualifiers may precede the fundamental data type. The general format for declaring
variables with a qualifier is

qualifier data_type var1, var2, ..., varn;

where the qualifier may be long, short, signed or unsigned. Note that signed and unsigned
apply only to char and the integer types; they cannot precede float or double.
Examples
long int k1, k2;
signed int s1, s2, s3;
unsigned int u1, u2;
long double ld1, ld2;
signed char cs;
unsigned long int ul1, ul2;
Assigning a value to a variable for the first time in a program is known as
initialization. The value used for the assignment is known as the initializer. In C, a
variable may be initialized in its declaration itself. The initializer must be a constant or
an expression made up of previously defined values.
Examples
int x=10;
char ch='x';
char c1='A', c2='A'+25;
int a=5, b=a*50;
int n={7};


1.12 EXPRESSIONS AND STATEMENTS


An expression represents a single data item, such as a number or a character. It may
also consist of some combination of constants and variables interconnected by one or more
operators.
Examples
a+b
x=y
x=a+b
x<=y
x==y
A statement causes the computer to carry out some action. When an expression is
followed by a semicolon (;) it becomes a simple statement. There are three different
classes of statements in C: expression statements, compound statements and
control statements.
An expression statement consists of an expression followed by a semicolon.
Examples
a=b;
c=a+b;
A compound statement consists of several individual statements enclosed within a
pair of braces { }.
Example
{
a=3; b=10; c=a+b;
}
Control statements are used to create special program features, such as logical
tests, loops and branches.
Example
a=10;
if(a>5)
{
b=a+10;
}

1.13 COMMENTS
A comment briefly explains what a program or a statement does.
Comments should be included in appropriate places to improve the readability of a program.
In C programs the characters between /* and */ are treated as a comment and are ignored by
the compiler when the program is compiled. In addition, the character pair // introduces a
comment that extends only to the end of the line; this form was standardized in C99 and is
accepted by many earlier compilers as an extension.
Examples
//C programming is very good
/* C language was developed by Dennis Ritchie*/


1.14 C TOKENS
A token is an individual entity of a program. A compiler identifies and splits a program
into a number of tokens. A token may be a single character or a group of characters which
has a specific meaning. The following are the tokens that can be identified by a C compiler
during the translation process.

Identifiers
Keywords
Constants
String constant
Operators
Separators
