
BSc IT 5th Sem Assignment Solved Answers

KU 5TH SEM ASSIGNMENT - BSIT (TA) - 51


(GRAPHICS & MULTIMEDIA)
Assignment: TA (Compulsory)
1.What is the meaning of interactive computer graphics? List the various applications of the
computer graphics.
The term interactive graphics refers to devices and systems that make man-machine graphic communication more convenient than conventional methods. For example, to draw a straight line between two points one would normally have to input the coordinates of the two end points. In interactive graphics, a graphical input technique lets the user draw the line simply by indicating the two end points on the display screen.
Various applications of computer graphics are listed below:
i). Building Design and Construction
ii). Electronics Design
iii). Mechanical Design
iv). Entertainment and Animation
v). Aerospace Industry
vi). Medical Technology
vii). Cartography
viii). Art and Commerce.
2. Explain in detail the Hardware required for effective graphics on the computer system.
The hardware components required to generate interactive graphics are the input device, the output device (usually a display) and the computer system. The human operator is also an integral part of the interactive system. The text and graphics displayed act as an input to the human vision system and, therefore, the reaction of the human being will depend on how quickly one can see and appreciate the graphics present on the display.
3. Compare Raster scan system with random scan system.
In a raster scan display, the electron beam is swept across the screen, one row at a time from top to bottom. The picture definition is stored in a memory area called the refresh buffer or frame buffer. In a random scan display unit, the CRT's electron beam is directed only to the parts of the screen where a picture is to be drawn. It draws a picture one line at a time, and so such units are referred to as vector displays.
Raster scan
The most common type of graphics monitor employing a CRT is the raster-scan display, based on television technology. In a raster-scan system, the electron beam is swept across the screen, one row at a time from top to bottom. The picture definition is stored in a memory area called the refresh buffer or frame buffer. Each point on the screen is called a pixel. On a black-and-white system with one bit per pixel, the frame buffer is called a bitmap. For systems with multiple bits per pixel, the frame buffer is referred to as a pixmap.

Refreshing on a raster-scan display is carried out at a rate of 60 to 80 frames per second. Some displays use an interlaced refresh procedure: first, all points on the even-numbered scan lines are displayed, then all the points along the odd-numbered lines are displayed. This is an effective technique for avoiding flicker.
Random scan display
When operated as a random-scan display unit, a CRT has the electron beam directed only to the parts of the screen where a picture is to be drawn. Random scan monitors draw a picture one line at a time and for this reason they are also referred to as vector displays (or stroke-writing or calligraphic displays). The component lines of a picture can be drawn and refreshed by a random-scan system in any specified order. A pen plotter operates in a similar way and is an example of a random-scan, hard-copy device.
Refresh rate on a random-scan system depends on the number of lines to be displayed.
Picture definition is now stored as a set of line- drawing commands in an area of memory
referred to as the refresh display file. Sometimes the refresh display file is called the display
list, display program, or simply the refresh buffer.
4. How many colors are possible if
a. 24 bits / pixel is used
b. 8 bits / pixel is used
Justify your answer
a). 24-bit color provides 16.7 million colors per pixel. The 24 bits are divided into 3 bytes, one each for the red, green, and blue components of a pixel, giving 2^24 = 16,777,216 combinations.
b). 256 colors, since 8 bits per pixel gives 2^8 = 256 combinations.
The widely accepted industry standard uses 3 bytes, or 24 bits, per pixel; one byte for each primary color gives 256 different intensity levels per primary. Thus a pixel can take on a color from 256 x 256 x 256, or 16.7 million, possible choices. In bi-level image representation, one bit per pixel is used to represent black-and-white images. In a gray-level image, 8 bits per pixel allow a total of 256 intensity or gray levels. Image representation using a lookup table can be viewed as a compromise between our desire to have a lower storage requirement and our need to support a reasonably sufficient number of simultaneous colors.
5. List and explain different text mode built-in functions of C Programming language.
The different text mode built-in functions of the C programming language are listed below:
i). textmode(int mode);
This function sets the number of rows and columns of the screen. The mode variable can take the values 0, 1, 2, or 3.
0: represents 40 column black and white
1: represents 40 column color
2: represents 80 column black and white
3: represents 80 column color
Example: textmode(2); // sets the screen to 80 column black and white
ii). clrscr();
This function clears the entire screen and locates the cursor at the top left corner (1,1).
Example: clrscr(); // clears the screen

iii). gotoxy(int x, int y);


This function positions the cursor at the location specified by x and y. x represents the column number and y represents the row number.
Example: gotoxy(10,20); // cursor is placed in the 10th column of the 20th row
iv). textbackground (int color);
This function changes the background color of the text mode. Valid colors for the CGA are
from 0 to 6 namely BLACK, BLUE, GREEN, CYAN, RED, MAGENTA and BROWN.
Example: textbackground(1); or textbackground(BLUE); // changes the background color to blue
v). textcolor (int color);
This function sets the subsequent text color, numbered between 0 and 15; adding 128 makes the text blink.
Example: textcolor(3); // sets the next text color to cyan
vi). delline ();
It is possible to delete a line of text; after the deletion, all the subsequent lines are pushed up by one line.
Example: /* deletes the 5th line */ gotoxy(1,5); delline();
vii). insline()
Inserts a blank line at the current cursor position.
Example: /* inserts a line at the 3rd row */ gotoxy(1,3); insline();
6. Write a C program to create Indian national flag.
#include <graphics.h>
#include <conio.h>
void main()
{
int gd=DETECT,gm,x,y;
initgraph(&gd,&gm,"c:\\tc\\bgi");
x=getmaxx();
y=getmaxy();
clearviewport();
/* blue line-filled background */
setfillstyle(LINE_FILL,BLUE);
bar(0,0,639,479);
/* flag outline and saffron (color 6) top band */
setcolor(6);
rectangle(50,50,300,200);
setfillstyle(SOLID_FILL,6);
bar(50,50,300,100);
/* white middle band */
setfillstyle(SOLID_FILL,WHITE);
bar(50,100,300,150);
/* green bottom band */
setfillstyle(SOLID_FILL,GREEN);
bar(50,150,300,200);
/* flag pole (setfillstyle, not setfillpattern, which expects a pattern array) */
setcolor(BLUE);
rectangle(45,45,50,400);
setfillstyle(SOLID_FILL,MAGENTA);
bar(45,45,50,400);
/* Ashoka Chakra: circle with spokes at the centre of the white band */
setcolor(BLUE);
circle(175,125,25);
line(175,125,200,125);
line(175,125,175,150);
line(175,125,150,125);
line(175,125,175,100);
line(175,125,159,107);
line(175,125,193,143);
line(175,125,159,143);
line(175,125,193,107);
/* yellow title strip */
setcolor(YELLOW);
rectangle(0,0,640,43);
setfillstyle(SOLID_FILL,YELLOW);
bar(0,0,640,43);
setcolor(BLACK);
settextstyle(1,HORIZ_DIR,5);
outtextxy(150,0,"INDIAN FLAG");
getch();
closegraph();
}

KU 5TH SEM ASSIGNMENT - BSIT (TB) - 51


(GRAPHICS & MULTIMEDIA)
Assignment: TB (Compulsory)
1. What is the need for computer graphics?
Computers have become a powerful tool for the rapid and economical production of pictures, and computer graphics remains one of the most exciting and rapidly growing fields. The old Chinese saying "one picture is worth a thousand words" can be modified in this computer era into "one picture is worth many kilobytes of data". It is natural to expect that graphical communication, which is an older and more popular method of exchanging information than verbal communication, will often be more convenient when computers are utilized for this purpose, because one must represent objects in two-dimensional and three-dimensional spaces. Computer graphics has revolutionized almost every computer-based application in science and technology.
2. What is a graphics processor? Why is it needed?
To provide a visual interface, additional processing capability has to be added to the existing CPU. The solution is to provide a dedicated graphics processor. This helps in managing the screen faster than an equivalent software algorithm executed on the CPU, and a certain amount of parallelism can be achieved in completing graphic commands. Several manufacturers of personal computers use a proprietary graphics processor. For example, the Intel 82786 is essentially a line-drawing processor, while the Texas Instruments TMS34010 is a high-performance general-purpose graphics processor.
3. What is a pixel?
Pixel (picture element): A pixel may be defined as the smallest object or color spot that can be displayed and addressed on a monitor. Any image displayed on the monitor is made up of thousands of such small pixels. The closely spaced pixels divide the image area into a compact and uniform two-dimensional grid of pixel lines and columns.
4. Why C language is popular for graphics programming?
Turbo C++ is for C++ and C programmers. It is compatible with the ANSI C standard and fully supports the Kernighan and Ritchie definitions. It includes C++ class libraries, mouse support, multiple overlapping windows, a multi-file editor, hypertext help, far objects and error analysis. Turbo C++ comes with a complete set of graphics functions to facilitate the preparation of charts and diagrams, and supports the same graphics adapters as Turbo Pascal. The graphics library consists of over 70 graphics functions, ranging from high-level support such as setting a viewport, drawing 3-D bar charts and drawing polygons, to bit-oriented functions like getimage and putimage. The library supports numerous fill patterns and line styles and provides several text fonts, allowing text to be justified and oriented horizontally or vertically. It may be noted that the graphics functions use far pointers and are therefore not supported in the tiny memory model.
5. Define resolution.
Resolution: Image resolution refers to the pixel spacing, i.e. the distance from one pixel to the next. A typical PC monitor displays screen images with a resolution somewhere between 25 pixels per inch and 80 pixels per inch. A pixel is the smallest element of a displayed image, while dots (red, green and blue) are the smallest elements of the display surface (monitor screen). The dot pitch is a measure of screen resolution: the smaller the dot pitch, the higher the resolution, sharpness and detail of the displayed image.
6. Define aspect ratio.
Aspect ratio: The aspect ratio of an image is the ratio of the number of X pixels to the number of Y pixels. The standard aspect ratio for PCs is 4:3, though some use 5:4. Monitors are calibrated to this standard so that when you draw a circle it appears as a circle and not an ellipse.
7. Why refreshing is required in CRT?
When the electron beam strikes a dot of phosphor material, it glows for a fraction of a second and then fades. As the brightness of the dots begins to reduce, the screen image becomes unstable and gradually fades out. In order to maintain a stable image, the electron beam must sweep the entire surface of the screen and then return to redraw it a number of times per second. This process is called refreshing the screen. If the electron beam takes too long to return and redraw a pixel, the pixel begins to fade, which results in flicker in the image. To avoid flicker, the screen image must be redrawn sufficiently quickly that the eye cannot tell that a refresh is going on. The refresh rate is the number of times per second that the screen is refreshed.
Some monitors use a technique called interlacing to refresh every line of the screen: in the first pass, odd-numbered lines are refreshed, and in the second pass, even-numbered lines are refreshed. This allows the apparent refresh rate to be doubled because only half the screen is redrawn at a time.
8. Name the different positioning devices.
The mouse, the tablet and the joystick are called positioning devices. They are able to position the cursor at any point on the screen, so that we can operate at that point or on a chain of points. Often one also needs devices that can point to a given position on the screen. This becomes essential when a diagram is already on the screen but some changes are to be made: instead of trying to find out its coordinates, it is easier to simply point to that portion of the picture and ask for changes. The simplest of such devices is the light pen, whose principle is extremely simple.
9. What are pointing devices?
A pointing device is an input interface (specifically, a human interface device) that allows a user to input spatial (i.e., continuous and multi-dimensional) data to a computer. CAD systems and graphical user interfaces (GUIs) allow the user to control and provide data to the computer using physical gestures (point, click and drag), for example by moving a hand-held mouse across the surface of the physical desktop and activating switches on the mouse. Movements of the pointing device are echoed on the screen by movements of the pointer (or cursor) and other visual changes.
10. What is multimedia?
The word multimedia seems to be everywhere nowadays. It is a compound of the Latin prefix multi, meaning many, and the Latin-derived word media, which is the plural of the word medium. So multimedia simply means using more than one kind of medium. Multimedia is the mixture of two or more media effects (hypertext, still images, sound, animation and video) to be interacted with on a computer terminal.
11. What are sound cards?
Sound cards: The first Sound Blaster was an 8-bit card with 22 kHz sampling, equipped with a number of drivers and utilities. This became a kind of model for other sound cards. Next came the Sound Blaster Pro, again with 8-bit sound but with a higher sampling rate of 44 kHz, which supports a wider frequency range. Then there was the Yamaha OPL3 chipset with more voices. Another development was the built-in CD-ROM interface, through which huge files could be played directly via the sound card.
12. What is sampling?
Sampling: Sampling is like breaking a sound into tiny pieces and storing each piece as a small digital sample of sound. The rate at which a sound is sampled affects its quality: the higher the sampling rate (the more pieces of sound that are stored), the better the quality of the sound. Higher-quality sound also occupies more space on the hard disk because of the greater number of samples.
13. What is morphing?
Morphing: The best examples would be the Kawasaki advertisement, where the motorbike changes into a cheetah, or the MRF muscle changing into a real muscle. Morphing is making one image change into another by identifying key points, so that the displacement of the key points is taken into consideration for the change.
14. What is rendering?
Rendering: The process of converting your designed objects, with texturing and animation, into an image or a series of images is called rendering. Various parameters are available here, such as resolution, colors, type of render, etc.
15. What is warping?
Warping: Certain parts of an image can be marked for a change and made to change into a different one. For example, if the eyes of an owl had to morph into the eyes of a cat, the eyes alone can be marked and warped.
16. Why we use scanner?
Photographs, illustrations and paintings continue to be made the old-fashioned way, even by visual artists who are otherwise immersed in digital imaging technology. Traditional photographs, illustrations and paintings are easily imported into computers through the use of a device called a scanner.
A scanner scans over an image such as a photo, drawing or logo, converting it into a digital image that can be seen on the screen. Using a good paint program or image editor we can then add or remove colors, apply filters, mask colors, and so on.
17. What is gamut in Photoshop?
The gamut is the range of colors that a color system can display or print. Colors that fall outside this range (out-of-gamut colors) cannot be reproduced accurately.

18. What is a layer?


The concept of layering is similar to that of compositing: we make the different layers by keying out a uniform color and making it transparent so that the layer beneath becomes visible. In case of future modifications we will be able to work with individual layers and need not work with the image as a whole.

19. What are editing tools? Why it is needed?


You can use the editing tools to draw on a layer, and you can copy and paste selections to a layer. Some of the editing tools are:
i). Eraser tool: The eraser tool changes pixels in the image as you drag through them. You can choose to change the color and transparency of the affected pixels, or to revert the affected area to its previously saved version.
ii). Smudge tool: The smudge tool simulates the action of dragging a finger through wet paint. The tool picks up color from where the stroke begins and pushes it in the direction in which you drag.

20. What is file format?


File format: When you create an image, whether by scanning it into your computer, drawing it from scratch on your monitor or capturing it with a camera, or when you record voice or music or record from a connected musical instrument, it must be saved to your disk. Otherwise it would remain an ethereal artifact that could never again be seen or listened to: once the computer's power is turned off, it is gone forever unless it is saved. The method by which the software organizes the data in the saved file is called the file format.

KU 5TH SEM ASSIGNMENT - BSIT (TA) - 52 (WEB PROGRAMMING)
Assignment: TA (Compulsory)
1. What is the meaning of Web? Explain in detail the building elements of the web.
The Web is a complex network of international, cross-platform and cross-cultural communicating devices, connected to each other without any ordering or pattern.
The two most important building blocks of the web are HTML and HTTP.
HTML: HTML stands for Hyper Text Markup Language. HTML is a very simple language used to describe the logical structure of a document. Although HTML is often called a programming language, it really is not. Programming languages are Turing-complete, or computable; that is, they can be used to compute something such as the square root of pi or some other such task. Typically programming languages use conditional branches and loops and operate on data contained in abstract data structures. HTML is much simpler than all of that: it is simply a markup language used to define a logical structure rather than compute anything.
HTTP: HTTP is a request-response type protocol. It is the language spoken between a web browser (client software) and a web server (server software) so that they can communicate with each other and exchange files. A client/server system using HTTP works something like this: a big computer (called a server) sits in some office somewhere with a bunch of files that people might want access to. This computer runs a software package that listens all day long for requests over the wires.
2. "HTML is the language of the Web". Justify the statement.
Although HTML is often called a programming language, it really is not. Programming languages are Turing-complete, or computable; that is, they can be used to compute something such as the square root of pi or some other such task. Typically programming languages use conditional branches and loops and operate on data contained in abstract data structures. HTML is much simpler than all of that: it is simply a markup language used to define a logical structure rather than compute anything. For example, it can describe which text the browser should emphasize, which text should be considered body text versus header text, and so forth.
The beauty of HTML, of course, is that it is generic enough to be read and interpreted by a web browser running on any machine or operating system. This is because it focuses only on describing the logical nature of the document, not on its specific style; the web browser is responsible for adding style. For instance, emphasized text might be bolded in one browser and italicized in another; it is up to the browser to decide.
3. Give the different classification of HTML tags with examples for each category
LIST OF HTML TAGS:
Tags for Document Structure
HTML
HEAD
BODY
Head Section Tags
TITLE
BASE
META
STYLE
LINK
Block-Level Text Elements
ADDRESS
BLOCKQUOTE
DIV
H1 through H6
P
PRE
XMP
Lists
DD
DIR
DL
DT
LI
MENU
OL
UL
Text Characteristics
B
BASEFONT
BIG
BLINK

CITE
CODE
EM
FONT
I
KBD
PLAINTEXT
S
SMALL

4. Write a CGI application which accepts 3 numbers from the user and displays the biggest number, using GET and POST methods.
#!/usr/bin/perl
use strict;
use CGI;

my $cgi = CGI->new;
print $cgi->header;
print $cgi->start_html( "Biggest of Three Numbers" );

my $one   = $cgi->param( 'one' );
my $two   = $cgi->param( 'two' );
my $three = $cgi->param( 'three' );

if( defined $one && defined $two && defined $three )
{
    my $big = findBiggest( $one, $two, $three );
    print "The biggest number is $big";
}
else
{
    # The form below uses POST; change -method to 'GET' and the same
    # program works with the GET method (the values then arrive in
    # the query string instead of the request body).
    print $cgi->start_form( -method => 'POST' );
    print 'Enter First Number ',  $cgi->textfield( 'one' ),   $cgi->br;
    print 'Enter Second Number ', $cgi->textfield( 'two' ),   $cgi->br;
    print 'Enter Third Number ',  $cgi->textfield( 'three' ), $cgi->br;
    print $cgi->submit( 'Find Biggest' );
    print $cgi->end_form;
}
print $cgi->end_html;

sub findBiggest {
    my ( $x, $y, $z ) = @_;
    my $big = $x;
    $big = $y if $y > $big;
    $big = $z if $z > $big;
    return $big;
}
5. What is JavaScript? Give its importance on the web.
JavaScript is an easy-to-learn way to script your web pages, that is, to have them perform actions that cannot be handled with HTML alone. With JavaScript you can make text scroll across the screen like ticker tape, make pictures change when you move over them, or add any number of other dynamic enhancements.
JavaScript is generally used only inside HTML documents.
i) JavaScript controls document appearance and content.
ii) JavaScript controls the browser.
iii) JavaScript interacts with document content.
iv) JavaScript interacts with the user.
v) JavaScript reads and writes client state with cookies.
vi) JavaScript interacts with applets.
vii) JavaScript manipulates embedded images.
6. Explain briefly Cascading Style Sheets
Cascading Style Sheets (CSS) are the part of DHTML that controls the look and placement of the elements on a page. With CSS you can set virtually any style property of any element on an HTML page. One of the biggest advantages of CSS over the traditional way of changing the look of elements is that you separate content from design. For instance, you can link one CSS file to all the pages on your site to set their look; if you then want to change, say, the font size of your main text, you change it once in the CSS file and all pages are updated.
7. What is CGI? List the different CGI environment variables
CGI, or Common Gateway Interface, is a specification which allows web users to run programs on a web server. CGI is the part of the web server that can communicate with other programs running on the server. With CGI, the web server can call up a program while passing user-specific data to it. The program then processes that data, and the server passes the program's response back to the web browser.
When a CGI program is called, the information that is made available to it can be roughly broken into three groups:
i). Information about the client, server and user.
ii). Form data supplied by the user.
iii). Additional pathname information.
Most information about the client, server and user is placed in CGI environment variables; user-supplied form data and extra pathname information are also passed through environment variables.


i). GATEWAY_INTERFACE: The revision of the Common Gateway Interface that the server uses.
ii). SERVER_NAME: The server's hostname or IP address.
iii). SERVER_PORT: The port number of the host on which the server is running.
iv). REQUEST_METHOD: The method with which the information request was issued.
v). PATH_INFO: Extra path information passed to the CGI program.
8. What is Perl? Explain Perl control structures with the help of an example.
Perl control structures include conditional statements, such as if/elsif/else blocks, as well as loops like foreach, for, while and until.
i). Conditional statements
- if: The structure is always started by the word if, followed by a condition to be evaluated, then a pair of braces indicating the beginning and end of the code to be executed if the condition is true.
if (condition) {
    # code to be executed
}
- unless: unless is similar to if; use it when you want to execute code only if a certain condition is false.
if ($varname != 23) {
    # code to execute if $varname is not 23
}
- The same test can be done using unless:
unless ($varname == 23) {
    # code to execute if $varname is not 23
}
ii). Looping: Loops allow you to repeat code for as long as a condition is met. Perl has several loop control structures: foreach, for, while and until.
- while loop: A while loop executes as long as a particular condition is true:
while (condition) {
    # code to run as long as condition is true
}
- until loop: An until loop is the reverse of while. It executes as long as a particular condition is not true:
until (condition) {
    # code to run as long as condition is not true
}

KU 5TH SEM ASSIGNMENT - BSIT (TB) - 52 (WEB PROGRAMMING)
Assignment: TB (Compulsory)

Part - A
a) What is the difference between Internet and Intranet?
Internet: The Internet is a global network of networks. It began as a tool for collaborative academic research and has become a medium for exchanging and distributing information of all kinds. It is an interconnection between several computers of different types belonging to various networks all over the globe.
Intranet: An intranet is not global. It is a mini web that is limited to the machines and software of a particular organization or company.
b) List any five HTML tags.
Five HTML tags are:
i). UL (unordered list): The UL tag displays a bulleted list. You can use the tag's TYPE attribute to change the bullet style.
ii). TYPE (strictly an attribute rather than a tag): defines the type of bullet used for each list item. The value can be one of CIRCLE, DISC or SQUARE.
iii). LI (list item): The LI tag indicates an itemized element, which is usually preceded by a bullet, a number, or a letter. LI is used inside list elements such as OL (ordered list) and UL (unordered list).
iv). TABLE (table): The TABLE tag defines a table. Inside the TABLE tag, use the TR tag to define rows, the TH tag to define row or column headings, and the TD tag to define table cells.
v). HTML (outermost tag): The HTML tag identifies a document as an HTML document. All HTML documents should start with the <HTML> tag and end with the </HTML> tag.
c) Write the difference between HTML and DHTML.
HTML: HTML stands for Hyper Text Markup Language. It is a markup language; an HTML page cannot change after it has loaded. HTML can be used with or without JavaScript.
DHTML: DHTML stands for Dynamic Hyper Text Markup Language. DHTML isn't really a language or a thing in itself; it is just a mix of technologies (HTML, CSS and JavaScript). Dynamic HTML is simply HTML that can change even after a page has been loaded into a browser. DHTML is used together with JavaScript.
d) Explain the different types of PERL variables.
Perl has three types of variables:
i). Scalars
ii). Arrays
iii). Hashes.
i). Scalars: A scalar variable stores a single (scalar) value. Perl scalar names are prefixed with a dollar sign ($); for example, $username and $url are both scalar variable names. A scalar can hold data of any type, be it a string, a number, or whatnot. We can also use scalars in double-quoted strings:
my $fnord = 23;
my $blee = "The magic number is $fnord.";
Now if you print $blee, you will get "The magic number is 23." Perl interpolates the variables in the string, replacing the variable name with the value of that variable.
ii). Arrays: An array stores an ordered list of values. While a scalar variable can only store one value, an array can store many. Perl array names are prefixed with a @-sign, e.g.:
my @colors = ("red", "green", "blue");
foreach my $i (@colors) { print "$i\n"; }
iii). Hashes: Hashes are an advanced form of array. One of the limitations of an array is that the information contained within it can be difficult to get to. For example, imagine that you have a list of people and their ages. The hash solves this problem very neatly by allowing us to access that @ages array not by an index, but by a scalar key. For example, to look up the ages of different people we can use their names as the keys of a hash.
e) How are JSPs better than servlets?
With servlets alone, Java programming knowledge is needed to develop and maintain all aspects of the application, since the processing code and the HTML elements are lumped together. Changing the look and feel of the application, or adding support for a new type of client, requires the servlet code to be updated and recompiled. It is also hard to take advantage of web-page development tools when designing the application interface: if such tools are used to develop the web page layout, the generated HTML must then be manually embedded into the servlet code, a process which is time consuming, error prone, and extremely boring. Adding JSP to the puzzle solves these problems, so JSPs are better than servlets.
Part - B
1. a) Explain GET and POST method with the help of an example.
When a client sends a request to the server, the client can also send additional information with the URL to describe exactly what is required as output from the server, by using the GET method. The additional sequence of characters appended to the URL is called a query string. However, the length of the query string is limited to 240 characters. Moreover, the query string is visible in the browser and can therefore be a security risk. To overcome these disadvantages, the POST method can be used. The POST method sends the data as packets through a separate socket connection, and the complete transaction is invisible to the client. The disadvantage of the POST method is that it is slower compared to the GET method because the data is sent to the server as separate packets.
b) Explain in detail the role played by CGI programming in web programming.
CGI opened the gates to more complex web applications. It enabled developers to write scripts which can communicate with server applications and databases. In addition, it enables developers to write scripts that can parse a client's input, process it, and present it in a user-friendly way.
The Common Gateway Interface, or CGI, is a standard for external gateway programs to interface with information servers such as HTTP servers. A plain HTML document that the web daemon retrieves is static, which means it exists in a constant state: a text file that doesn't change. A CGI program, on the other hand, is executed in real time, so it can output dynamic information.
CGI programming allows us to automate passing information to and from web pages. It can also be used to capture and process that information, or pass it off to other software (such as an SQL database).
CGI programs (sometimes called scripts) can be written in any programming language, but the two most commonly used are Perl and PHP. Despite all the flashy graphics, Internet technology is fundamentally a text-based system. Perl was designed to be optimal for text processing, so it quickly became a popular CGI tool. PHP is a scripting language designed specifically to make web programming quick and easy.
2. a) With the help of an example explain the embedding of an image in an HTML tag.
<HTML>
<HEAD>
</HEAD>
<BODY>
<IMG SRC="Images/123.jpg" ALT="Image" />
</BODY>
</HTML>
b) Create a HTML page to demonstrate the usage of Anchor tags.
<HTML>
<HEAD></HEAD>
<BODY>
<A NAME="section2">
<H2>A Cold Autumn Day</H2></A>
If this anchor is in a file called "nowhere.htm," you could define a link that jumps to the
anchor as follows:
<P>Jump to the second section <A HREF="nowhere.htm#section2">
A Cold Autumn Day</A> in the mystery "A man from Nowhere."
</BODY>
</HTML>
3. a) Explain the usage of script tags.
Using the SCRIPT Tag: The following example uses the SCRIPT tag to define a JavaScript script in the HEAD tag. The script is loaded before anything else in the document is loaded. The JavaScript code in this example defines a function, changeBGColor(), that changes the document's background color.
The body of the document contains a form with two buttons. Each button invokes the changeBGColor() function to change the background of the document to a different color.
<HTML>
<HEAD><TITLE>Script Example</TITLE>
<SCRIPT LANGUAGE="JavaScript">
function changeBGColor(newcolor) {
document.bgColor = newcolor;
return false;
}
</SCRIPT>
</HEAD>
<BODY>
<P>Select a background color:</P>
<FORM>
<INPUT TYPE="button" VALUE="blue" onClick="changeBGColor('blue');">
<INPUT TYPE="button" VALUE="green" onClick="changeBGColor('green');">
</FORM>
<NOSCRIPT><I>Your browser is not JavaScript-enabled.
These buttons will not work.</I></NOSCRIPT>
</BODY>
</HTML>
b) What is JavaScript? List the uses of JavaScript.
JavaScript is a scripting language (like a simple programming language). It is a language that can be used for client-side scripting. JavaScript is only used inside of HTML documents. With JavaScript, we can make text scroll across the screen like ticker tape.
The uses of JavaScript are:
i). Control Document Appearance and Content
ii). Control the Browser
iii). Interact with Document Content
iv). Interact with User
v). Read and Write Client State with Cookies
vi). Interact with Applets
4. a) With the help of an example explain any five CGI environment variables.
i). SERVER_NAME : The server's host name or IP address.
ii). SERVER_PORT : The port number of the host on which the server is running.
iii). SERVER_SOFTWARE : The name and version of the server software that is answering
the client request.
iv). SERVER_PROTOCOL : The name and revision of the information protocol that request
came in with.
v). GATEWAY_INTERFACE : The revision of the common gateway interface that the server
uses.
Example:
#!/usr/local/bin/perl
print "Content-type: text/html", "\n\n";
print "<HTML>", "\n";
print "<HEAD><TITLE>About this Server</TITLE></HEAD>", "\n";
print "<BODY><H1>About this Server</H1>", "\n";
print "<HR><PRE>";
print "Server Name: ", $ENV{'SERVER_NAME'}, "<BR>", "\n";
print "Running on Port: ", $ENV{'SERVER_PORT'}, "<BR>", "\n";
print "Server Software: ", $ENV{'SERVER_SOFTWARE'}, "<BR>", "\n";
print "Server Protocol: ", $ENV{'SERVER_PROTOCOL'}, "<BR>", "\n";
print "CGI Revision: ", $ENV{'GATEWAY_INTERFACE'}, "<BR>", "\n";
print "<HR></PRE>", "\n";
print "</BODY></HTML>", "\n";
exit (0);
b) Write a CGI application which accepts three numbers from the user and displays the biggest number using GET and POST methods.
#!/usr/bin/perl
# CGI.pm reads parameters from either GET (query string) or POST (body)
use CGI;
$cgi = new CGI;
print $cgi->header;
print $cgi->start_html( "Biggest of Three Numbers" );
my $one = $cgi->param( 'one' );
my $two = $cgi->param( 'two' );
my $three = $cgi->param( 'three' );
if( defined $one && defined $two && defined $three )
{
$biggest = &findBiggest( $one, $two, $three );
print "The biggest number is $biggest";
}
else
{
# METHOD may be GET or POST; CGI.pm handles both transparently
print '<FORM METHOD="POST">';
print 'Enter First Number <INPUT TYPE="text" NAME="one"><BR>';
print 'Enter Second Number <INPUT TYPE="text" NAME="two"><BR>';
print 'Enter Third Number <INPUT TYPE="text" NAME="three"><BR>';
print '<INPUT TYPE="submit" VALUE="Find Biggest">';
print '</FORM>';
}
print $cgi->end_html;
sub findBiggest {
my ($x, $y, $z) = @_;
my $big = $x;
$big = $y if $y > $big;
$big = $z if $z > $big;
return $big;
}
5. a) List the differences between web server and application server.
The main differences between Web servers and application servers are:
A Web server is where Web components are deployed and run. An application server is where components that implement the business logic are deployed. For example, in a JSP-EJB Web application, the JSP pages will be deployed on the Web server, whereas the EJB components will be deployed on the application server.
A Web server usually supports only HTTP (and sometimes SMTP and FTP). However, an application server supports HTTP as well as various other protocols such as SOAP.
In other words, the differences between an application server and a Web server are:
i). A Web server serves pages for viewing in a web browser; an application server exposes business logic to client applications through various protocols.
ii). A Web server exclusively handles HTTP requests; an application server serves business logic to application programs through any number of protocols.
iii). The Web server delegation model is fairly simple: when a request comes into the Web server, it simply passes the request to the program best able to handle it (a server-side program). It may not support transactions and database connection pooling.
iv). An application server is more capable of dynamic behaviour than a Web server. We can also configure an application server to work as a Web server. Simply put, an application server is a superset of a Web server.
b) What is a WAR file? Explain its importance.
A WAR or Web Application Archive file is a packaged servlet Web application. Servlet applications are usually distributed as WAR files.
A WAR file (which stands for "web application archive") is a JAR file used to distribute a collection of JavaServer Pages, servlets, Java classes, XML files, tag libraries and static Web pages (HTML and related files) that together constitute a Web application.
6. a) Explain the implicit objects out, request and response in a JSP page.
Following are the implicit objects in a JSP page:
out: This implicit object represents a JspWriter that provides a stream back to the requesting client. The most common method of this object is out.println(), which prints text that will be displayed in the client's browser.
request: This implicit object represents the javax.servlet.http.HttpServletRequest interface. The request object is associated with every HTTP request. One common use of the request object is to access request parameters. You can do this by calling the request object's getParameter() method with the parameter name you are seeking. It will return a string with the value matching the named parameter.
response: This implicit object represents the javax.servlet.http.HttpServletResponse object. The response object is used to pass data back to the requesting client. A common use of this object is writing HTML output back to the client browser.
b) With the help of an example explain JSP elements.
JSP elements are of 3 types:
Directive: Specifies information about the page itself that remains the same between requests. For example, it can be used to specify whether session tracking is required or not, buffering requirements, and the name of the page that should be used to report errors. Example: <%@ page/include/taglib %>
Action: Performs some action based on information that is required at the exact time the JSP page is requested by a browser. An action, for instance, can access parameters sent with the request to look up a database.
Scripting: Allows you to add small pieces of code to a JSP page, for example <% ... %> scriptlets, <%= ... %> expressions and <%! ... %> declarations.

KU 5TH SEM ASSIGNMENT - BSIT (TA) - 53 (DATA WAREHOUSING & DATA MINING)
1. With neat diagram explain the main parts of the computer.
A computer will have 3 basic main parts:
i). A central processing unit that does all the arithmetic and logical operations. This can be thought of as the heart of any computer, and computers are identified by the type of CPU that they use.
ii). The memory, which holds the programs and data. All the computers that we come across these days are what are known as stored program computers. The programs are stored beforehand in the memory, and the CPU accesses these programs line by line and executes them.
iii). The input/output devices: These devices facilitate the interaction of the users with the computer. The input devices are used to send information to the computer, while the output devices accept the processed information from the computer and make it available to the user.
Diagram:-

2. Briefly explain the types of memories.


There are two types of memories: primary memory, which is embedded in the computer and which is the main source of data to the computer, and secondary memory like floppy disks, CDs etc., which can be carried around and used in different computers. Secondary memories cost much less than the primary memory, but the CPU can access data only from the primary memory. The main advantage of computer memories, both primary and secondary, is that they can store data indefinitely and accurately.
3. Describe the basic concept of databases.
The Concept of Database:
We have seen in the previous section how data can be stored in a computer. Such stored data becomes a database, a collection of data. For example, if all the marks scored by all the students of a class are stored in the computer memory, it can be called a database. From such a database, we can answer questions like: who has scored the highest marks? In which subject has the maximum number of students failed? Which students are weak in more than one subject? etc. Of course, appropriate programs have to be written to do these computations. Also, as the database becomes too large and more and more data keeps getting included at different periods of time, there are several other problems about maintaining these data, which will not be dealt with here.
Since handling of such databases has become one of the primary jobs of the computer in recent years, it becomes difficult for the average user to keep writing such programs. Hence, special languages called database query languages have been devised, which make such programming easy; these languages help in getting specific queries answered easily.
4. With example explain the different views of data.
Data is normally stored in tabular form. Unless storage in other formats becomes advantageous, we store data in what are technically called relations, or in simple terms, tables.
The views are mainly of 2 types:
i). Simple view
ii). Complex view
Simple view:
- It is created by selecting only one table.
- It does not contain functions.
- DML operations (SELECT, INSERT, UPDATE, DELETE, MERGE, CALL, LOCK TABLE) can be performed through a simple view.
Complex view:
- It is created by selecting more than one table.
- It can contain functions.
- You cannot always perform DML operations through a complex view.
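The two kinds of views can be sketched with SQLite from Python (table and column names are invented for illustration; note that, unlike Oracle, SQLite views are read-only, so the DML point is not demonstrated here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE student (roll INTEGER, name TEXT, marks INTEGER);
CREATE TABLE dept    (roll INTEGER, dept TEXT);
INSERT INTO student VALUES (1, 'Anil', 72), (2, 'Bina', 85);
INSERT INTO dept    VALUES (1, 'IT'), (2, 'CS');

-- Simple view: built on a single table, no functions
CREATE VIEW v_student AS SELECT roll, name, marks FROM student;

-- Complex view: joins more than one table and uses a function
CREATE VIEW v_topper AS
  SELECT d.dept, MAX(s.marks) AS best
  FROM student s JOIN dept d ON s.roll = d.roll
  GROUP BY d.dept;
""")

simple_rows  = con.execute("SELECT * FROM v_student").fetchall()
complex_rows = con.execute("SELECT * FROM v_topper ORDER BY dept").fetchall()
print(simple_rows)
print(complex_rows)
```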
5. Briefly explain the concept of normalization.
Normalization is dealt with in several chapters of any book on database management systems. Here, we will take the simplest definition, which suffices for our purpose, namely: any field should not have subfields.
Again consider the following student table. Here, under the field marks, we have 3 subfields: marks for subject1, marks for subject2 and marks for subject3. However, it is preferable to split these subfields into regular fields as shown below.
Quite often, the original table which comes with subfields will have to be modified suitably, by the process of normalization.
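The splitting of subfields into regular fields can be sketched in Python (the field names and marks are invented for illustration):

```python
# Unnormalized: the 'marks' field packs three subfields together
students = [
    {"roll": 1, "name": "Anil", "marks": [72, 65, 80]},
    {"roll": 2, "name": "Bina", "marks": [85, 78, 90]},
]

# Normalized: each subfield becomes a regular field of its own
normalised = [
    {"roll": s["roll"], "name": s["name"],
     "subject1": s["marks"][0],
     "subject2": s["marks"][1],
     "subject3": s["marks"][2]}
    for s in students
]

print(normalised[0])
```

After the split, every field holds a single atomic value, which is exactly the "no subfields" rule stated above.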

6. Explain the concept of data ware house delivery process in detail.


The concept of the data ware house delivery process:
This section deals with the data ware house from a different viewpoint - how the different components that go into it enable the building of a data ware house. The study helps us in two ways:
i) to have a clear view of the data ware house building process.
ii) to understand the working of the data ware house in the context of the components.
Now we look at the concepts in detail:
i). IT Strategy: The company must have an overall IT strategy, and data ware housing has to be a part of the overall strategy.
ii). Business case analysis: This looks like an obvious thing, but is most often misunderstood. The overall understanding of the business and the importance of the various components therein is a must. This will ensure that one can clearly justify the appropriate level of investment that goes into the data ware house design and also the amount of returns accruing.
iii). Education: This has two roles to play - one, to make people, especially top-level policy makers, comfortable with the concept. The second role is to aid the prototyping activity.
iv). Business Requirements: As has been discussed earlier, it is essential that the business requirements are fully understood by the data ware house planner. This would ensure that the ware house is incorporated adequately in the overall setup of the organization.
v). Technical blueprints: This is the stage where the overall architecture that satisfies the requirements is delivered.
vi). Building the vision: Here the first physical infrastructure becomes available. The major infrastructure components are set up, and the first stages of loading and generation of data start up.
vii). History load: Here the system is made fully operational by loading the required history into the ware house - i.e. whatever data is available over the previous years is put into the data ware house to make it fully operational.
viii). Adhoc Query: Now we configure a query tool to operate against the data ware house.
ix). Automation: This phase automates the various operational processes like:
a) Extracting and loading of data from the sources.
b) Transforming the data into a suitable form for analysis.
c) Backing up, restoration and archiving.
d) Generating aggregations.
e) Monitoring query profiles.
x). Extending Scope: There is no single mechanism by which this can be achieved. As and when needed, a new set of data may be added, new formats may be included, or even major changes may be involved.
xi). Requirement Evolution: Business requirements will constantly change during the life of the ware house. Hence, the process that supports the ware house also needs to be constantly monitored and modified.
7. What are the three major activities of a data ware house? Explain.
The three major activities of a data ware house are:
i) Populating the ware house (i.e. inclusion of data).
ii) Day-to-day management of the ware house.
iii) Ability to accommodate changes.
i). The processes to populate the ware house have to be able to extract the data, clean it up, and make it available to the analysis systems. This is done on a daily / weekly basis depending on the quantum of the data population to be incorporated.
ii). The day-to-day management of the data ware house is not to be confused with maintenance and management of hardware and software. When large amounts of data are stored and new data are being continually added at regular intervals, maintenance of the quality of data becomes an important element.
iii). Ability to accommodate changes implies the system is structured in such a way as to be able to cope with future changes without the entire system being remodeled. Based on these, we can view the processes that a typical data ware house scheme should support as follows.
8. Explain the extract and load process of data ware house.
Extract and Load Process: This forms the first stage of the data ware house. External physical systems, like the sales counters which give the sales data, the inventory systems that give inventory levels etc., constantly feed data to the warehouse. Needless to say, the format of these external data is to be monitored and modified before loading it into the ware house. The data ware house must extract the data from the source systems, load them into its databases, remove unwanted fields (either because they are not needed or because they are already there in the database), add new fields / reference data and finally reconcile with the other data. We shall see a few more details of these broad actions in the subsequent paragraphs.
i). A mechanism should be evolved to control the extraction of data, check their consistency etc. For example, in some systems, the data is not authenticated until it is audited.
ii). Having a set of consistent data is equally important. This especially matters when we have several online systems feeding the data.
iii). Once data is extracted from the source systems, it is loaded into a temporary data storage before it is cleaned and loaded into the warehouse.
9. In what ways does data need to be cleaned up and checked? Explain briefly.
Data needs to be cleaned up and checked in the following ways:
i) It should be consistent with itself.
ii) It should be consistent with other data from the same source.
iii) It should be consistent with other data from other sources.
iv) It should be consistent with the information already available in the data ware house.
While it is easy to list out the needs of clean data, it is more difficult to set up systems that automatically clean up the data. The normal course is to suspect the quality of data if it does not meet the normal standards of common sense, or if it contradicts the data from other sources or the data already available in the data ware house. Normal intuition doubts the validity of the new data, and effective measures like rechecking, retransmission etc. are undertaken. When none of these are possible, one may even resort to ignoring the entire set of data and get on with the next set of incoming data.
10. Explain the architecture of data warehouse.
The architecture for a data ware house is indicated below. Before we proceed further, we should be clear about the concept of architecture. It only gives the major items that make up a data ware house. The size and complexity of each of these items depend on the actual size of the ware house itself, the specific requirements of the ware house and the actual details of implementation.

11. Briefly explain the functions of each manager of data warehouse.


The Warehouse Manager: The ware house manager is a component that performs all operations necessary to support the ware house management process. Unlike the load manager, the warehouse management process is driven by the extent to which the operational management of the data ware house has been automated.
The ware house manager can easily be termed the most complex of the ware house components, and it performs a variety of tasks. A few of them are listed below.
i) Analyze the data to confirm data consistency and data integrity.
ii) Transform and merge the source data from the temporary data storage into the ware
house.
iii) Create indexes, cross references, partition views etc.
iv) Check for normalizations.
v) Generate new aggregations, if needed.
vi) Update all existing aggregations
vii) Create backups of data.
viii) Archive the data that needs to be archived.
12. Explain the star schema to represent the sales analysis.
Star schemas are database schemas that structure the data to exploit a typical decision support enquiry. When the components of typical enquiries are examined, a few similarities stand out:
i) The queries examine a set of factual transactions - sales, for example.
ii) The queries analyze the facts in different ways - by aggregating them on different bases / graphing them in different ways.
The central concept of most such transactions is a fact table. The surrounding references are called dimension tables. The combination can be called a star schema.
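A minimal star schema for sales analysis can be sketched with SQLite from Python (all table names and figures are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Dimension tables surround the central fact table
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_product (prod_id  INTEGER PRIMARY KEY, name TEXT);

-- Fact table: one row per sales transaction, keyed by the dimensions
CREATE TABLE fact_sales (store_id INTEGER, prod_id INTEGER, amount REAL);

INSERT INTO dim_store   VALUES (1, 'Mysore'), (2, 'Hubli');
INSERT INTO dim_product VALUES (10, 'Soap'), (11, 'Oil');
INSERT INTO fact_sales  VALUES (1, 10, 50.0), (1, 11, 30.0), (2, 10, 20.0);
""")

# A typical decision-support enquiry: aggregate the facts along a dimension
rows = con.execute("""
  SELECT d.city, SUM(f.amount)
  FROM fact_sales f JOIN dim_store d ON f.store_id = d.store_id
  GROUP BY d.city ORDER BY d.city
""").fetchall()
print(rows)
```

Every query joins the fact table to one or more dimension tables and then aggregates, which is exactly the access pattern the star layout is built for.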
13. What do you mean by partitioning of data? Explain briefly.
Partitioning of data:
In most ware houses, the size of the fact data tables tends to become very large. This leads to several problems of management, backup, processing etc. These difficulties can be overcome by partitioning each fact table into separate partitions.
Data ware houses tend to exploit these ideas by partitioning the large volume of data into data sets. For example, data can be partitioned on a weekly / monthly basis, so as to minimize the amount of data scanned before answering a query. This technique allows the data to be scanned to be minimized, without the overhead of using an index. This improves the overall efficiency of the system. However, having too many partitions can be counterproductive, and an optimal size of the partitions and the number of such partitions is of vital importance.
Partitioning generally helps in the following ways:
i) Assists in better management of data.
ii) Ease of backup / recovery, since the volumes are less.
iii) Star schemas with partitions produce better performance.
iv) Since several hardware architectures operate better in a partitioned environment, the overall system performance improves.
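Monthly partitioning can be sketched in Python (the data is invented for illustration; a real ware house would partition physical tables, not in-memory lists):

```python
from collections import defaultdict

# Sales facts tagged with a month
sales = [
    ("2023-01", "Soap", 50), ("2023-01", "Oil", 30),
    ("2023-02", "Soap", 20), ("2023-03", "Oil", 40),
]

# Partition the fact data on a monthly basis
partitions = defaultdict(list)
for month, item, amount in sales:
    partitions[month].append((item, amount))

# A query for February scans only that partition, not the whole table
feb_total = sum(amount for _, amount in partitions["2023-02"])
print(feb_total)   # 20
```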
14. Describe the terms data mart and meta data.
Data mart:
A data mart is a subset of the information content of a data ware house, stored in its own database. The data of a data mart may have been collected through a ware house or, in some cases, directly from the source. In a crude sense, if you consider a data ware house as a wholesale shop of data, a data mart can be thought of as a retailer.
Meta data:
Meta data is simply data about data. Data normally describes the objects, their quantity, size, how they are stored etc. Similarly, meta data stores data about how the data (of objects) is stored, etc.
Meta data is useful in a number of ways. It can map data sources to the common view of information within the warehouse. It is helpful in query management, to direct a query to the most appropriate source etc.
The structure of meta data is different for each process. It means that for each volume of data, there are multiple sets of meta data describing the same volume. While this is a very convenient way of managing data, managing meta data itself is not a very easy task.

15. Enlist the differences between fact and dimension.
This ensures that key dimensions are not fact tables.
Consider the following example:
Let us elaborate a little on the example. Consider a customer A. If there is a situation where the warehouse is building the profiles of customers, then A becomes a fact - against the name A, we can list his address, purchases, debts etc. One can ask questions like how many purchases A has made in the last 3 months etc. Then A is a fact. On the other hand, if it is likely to be used to answer questions like how many customers have made more than 10 purchases in the last 6 months, and one uses the data of A, as well as that of other customers, to give the answer, then A becomes a dimension. The rule is, in such cases, avoid making A a candidate key.
16. Explain the designing of star-flake schema in detail.
A star flake schema, as we have defined previously, is a schema that uses a combination of denormalized star and normalized snowflake schemas. They are most appropriate in decision support data ware houses. Generally, the detailed transactions are stored within a central fact table, which may be partitioned horizontally or vertically. A series of combinatory database views is created to allow the user's access tools to treat the fact table partitions as a single, large table.
The key reference data is structured into a set of dimensions. These can be referenced from the fact table. Each dimension is stored in a series of normalized tables (snowflakes), with an additional denormalized star dimension table.

17. What is query redirection? Explain.
Query Redirection:
One of the basic requirements for the successful operation of a star flake schema (or any schema, for that matter) is the ability to direct a query to the most appropriate source. Note that once the available data grows beyond a certain size, partitioning becomes essential. In such a scenario, it is essential that, in order to optimize the time spent on querying, the queries are directed to the appropriate partitions that store the data required by the query.
The basic method is to design the access tool in such a way that it automatically defines the locality to which the query is to be redirected.

18. In detail, explain the multidimensional schema.
Multidimensional schemas:
Before we close, we see the interesting concept of multiple dimensions. This is a very convenient method of analyzing data when it goes beyond the normal tabular relations.
For example, a store maintains a table of each item it sells over a month, in each of its 10 outlets. This is a 2-dimensional table. On the other hand, if the company wants the data of all items sold by its outlets, it can be obtained simply by superimposing the 2-dimensional tables for each of these items one behind the other. Then it becomes a 3-dimensional view. The query, instead of looking for a 2-dimensional rectangle of data, will then look for a 3-dimensional cuboid of data.
There is no reason why the dimensioning should stop at 3 dimensions. In fact, almost all queries can be thought of as approaching a multi-dimensioned unit of data from a multi-dimensioned volume of the schema.
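The 3-dimensional cuboid can be sketched in Python (outlet, item and month names are invented; a dictionary keyed by the three dimensions stands in for the cube):

```python
# A 3-dimensional "cuboid" of data: (outlet, item, month) -> sales
cube = {}
for outlet in ("O1", "O2"):
    for item in ("Soap", "Oil"):
        for month in ("Jan", "Feb"):
            cube[(outlet, item, month)] = 10  # dummy sales figure

# A 2-dimensional "rectangle" is one slice of the cube: fix one dimension
soap_slice = {k: v for k, v in cube.items() if k[1] == "Soap"}

print(len(cube), len(soap_slice))   # 8 4
```

Fixing one coordinate reduces the cube to a plane; a query against an n-dimensional schema works the same way, carving out the sub-volume it needs.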
19. Why is partitioning needed in a large data warehouse?
Partitioning is needed in any large data ware house to ensure that the performance and manageability are improved. It can help query redirection to send the queries to the appropriate partition, thereby reducing the overall time taken for query processing.
20. Explain the types of partitioning in detail.
i). Horizontal partitioning:
This essentially means that the table is partitioned after the first few thousand entries, then the next few thousand entries, and so on. This is because, in most cases, not all the information in the fact table is needed all the time. Thus horizontal partitioning helps to reduce the query access time, by directly cutting down the amount of data to be scanned by the queries.
ii). Vertical partitioning:
As the name suggests, a vertical partitioning scheme divides the table vertically, i.e. each row is divided into 2 or more partitions.
iii). Hardware partitioning:
Needless to say, the data ware house design process should try to maximize the performance of the system. One of the ways to ensure this is to try to optimize the design of the database with respect to a specific hardware architecture.
21. Explain the mechanism of row splitting.
Row Splitting:
The method involves identifying the not-so-frequently used fields and putting them into another table. This would ensure that the frequently used fields can be accessed more often, at much less computation time.
It can be noted that row splitting may not reduce or increase the overall storage needed, but normalization may involve a change in the overall storage space needed. In row splitting, the mapping is 1 to 1, whereas normalization may produce one-to-many relationships.
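Row splitting can be sketched in Python (the field names and the choice of rarely used fields are invented for illustration):

```python
# One wide row; suppose 'photo' and 'remarks' are rarely accessed
customer = {"id": 7, "name": "Anil", "balance": 1200.0,
            "photo": "<blob>", "remarks": "none"}

RARE = ("photo", "remarks")

# Row splitting: a 1-to-1 split into a frequently used table and a
# rarely used table, both keyed by the same id
hot  = {k: v for k, v in customer.items() if k not in RARE}
cold = {"id": customer["id"], **{k: customer[k] for k in RARE}}

print(hot)    # the frequently accessed fields only
print(cold)   # the rarely accessed fields, under the same key
```

Because both halves share the key, the original row can always be reassembled, which is what makes the mapping 1 to 1.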
22. Explain the guidelines used for hardware partitioning.
Guidelines used for hardware partitioning:
Needless to say, the data ware house design process should try to maximize the performance of the system. One of the ways to ensure this is to try to optimize the design of the database with respect to a specific hardware architecture. Obviously, the exact details of optimization depend on the hardware platform. Normally the following guidelines are useful:
i). Maximize the processing, disk and I/O operations.
ii). Reduce bottlenecks at the CPU and I/O.
23. What is aggregation? Explain the need of aggregation. Give example.
Aggregation: Data aggregation is an essential component of any decision support data ware house. It helps us to ensure cost-effective query performance, which in other words means that the costs incurred to get the answers to a query would be more than offset by the benefits of the query answer. Data aggregation attempts to do this by reducing the processing power needed to process the queries. However, too much aggregation would only lead to unacceptable levels of operational costs, while too little aggregation may not improve the performance to the required levels. A fine balancing of the two is essential to maintain the requirements stated above. One thumb rule that is often suggested is that about three out of every four queries should be optimized by the aggregation process, whereas the fourth will take its own time to get processed.
The second, though minor, advantage of aggregations is that they allow us to get the overall trends in the data. While looking at individual data such overall trends may not be obvious, aggregated data will help us draw certain conclusions easily.
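The idea of a pre-computed aggregation can be sketched in Python (store names and amounts are invented for illustration):

```python
from collections import defaultdict

# Detailed fact rows: (store, amount)
facts = [("Mysore", 50), ("Mysore", 30), ("Hubli", 20), ("Hubli", 25)]

# Pre-computed aggregation (a summary table): total sales per store
summary = defaultdict(float)
for store, amount in facts:
    summary[store] += amount

# Queries about store totals now read the small summary table instead
# of scanning every detailed fact row
print(dict(summary))   # {'Mysore': 80.0, 'Hubli': 45.0}
```

The summary is built once, so the cost of computing it is paid a single time and then amortized over the many queries that read it.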
24. Explain the different aspects for designing the summary table.
Summary tables are designed by following the steps given below:
i). Decide the dimensions along which aggregation is to be done.
ii). Determine the aggregation of multiple facts.
iii). Aggregate multiple facts into the summary table.
iv). Determine the level of aggregation and the extent of embedding.
v). Design time into the table.
vi). Index the summary table.
25. Give the reasons for creating the data mart.
The following are the reasons for which data marts are created:
i). Since the volume of data scanned is small, they speed up query processing.
ii). Data can be structured in a form suitable for a user access tool.
iii). Data can be segmented or partitioned so that they can be used on different platforms, and also different control strategies become applicable.
26. Explain the two stages in setting up data marts.
There are two stages in setting up data marts:
i). To decide whether data marts are needed at all. The above listed facts may help you to decide whether it is worthwhile to set up data marts or operate from the warehouse itself. The problem is almost similar to that of a merchant deciding whether he wants to set up retail shops or not.
ii). If you decide that setting up data marts is desirable, then the following steps have to be gone through before you can freeze the actual strategy of data marting:
a) Identify the natural functional splits of the organization.
b) Identify the natural splits of data.
c) Check whether the proposed access tools have any special database structures.
d) Identify the infrastructure issues, if any, that can help in identifying the data marts.
e) Look for restrictions on access control. They can serve to demarcate the warehouse details.
27. What are the disadvantages of data marts?
There are certain disadvantages:
i). The cost of setting up and operating data marts is quite high.
ii). Once a data mart strategy is put in place, the data mart formats become fixed. It may be fairly difficult to change the strategy later, because the data mart formats also have to be changed.
28. What is the role of access control issues in data mart design?
Role of access control issues in data mart design:
This is one of the major constraints in data mart design. Any data warehouse, with its huge volume of data, is, more often than not, subject to various access controls as to who can access which part of the data. The easiest case is where the data is partitioned so clearly that a user of each partition cannot access any other data. In such cases, each of these partitions can be put in a data mart, and the user of each can access only his data.
In the data ware house, the data pertaining to all these marts are stored, but the partitions are retained. If a super user wants to get an overall view of the data, suitable aggregations can be generated.
29. Explain the purpose of using metadata in detail.
Metadata will be used for the following purposes:
i). Data transformation and loading.
ii). Data management.
iii). Query generation.
30. Explain the concept of metadata management.
Meta data should be able to describe data as it resides in the data warehouse. This will help
the warehouse manager to control data movements. The purpose of the metadata is to
describe the objects in the database. Some of the descriptions are listed here.
Tables
- Columns
* Names
* Types
Indexes
- Columns
* Name
* Type
Views
- Columns
* Name
* Type
Constraints
- Name
- Type
- Table
* Columns
31. How does the query manager use the metadata? Explain in detail.
Metadata is also required to generate queries. The query manager uses the metadata to build a
history of all queries run and to generate a query profile for each user or group of users.
We simply list a few of the commonly used metadata items for queries. The names are
self-explanatory.

o Query
  - Tables accessed
  - Columns accessed
    * Name
    * Reference identifier
  - Restrictions applied
    * Column name
    * Table name
    * Reference identifier
    * Restrictions
  - Join criteria applied
    * Column name
    * Table name
    * Reference identifier
    * Column name
    * Table name
    * Reference identifier
  - Aggregate function used
    * Column name
    * Reference identifier
    * Aggregate function
  - Group by criteria
    * Column name
    * Reference identifier
    * Sort direction
  - Syntax
  - Resources
    * Disk
    * Read
    * Write
    * Temporary
32. Why do we need different managers for a data warehouse? Explain.
Need for managers for a data warehouse: Data warehouses are not just large databases. They
are complex environments that integrate many technologies. They are not static, but
continuously change both content-wise and structure-wise. Thus, there is a constant need for
maintenance and management. Since huge amounts of time, money and effort are involved in
the development of data warehouses, sophisticated management tools are always justified in
the case of data warehouses.
When computer systems were in their initial stages of development, there used to be an army
of human managers who went around doing all the administration and management. But such
a scheme became both unwieldy and prone to errors as the systems grew in size and
complexity. Further, most of the management principles were ad hoc in nature and subject to
human error and fatigue.
33. With neat diagram explain the boundaries of process managers.
A schematic diagram that defines the boundaries of the three types of managers is given below.

34. Explain the responsibilities of each manager of data ware house.


Warehouse Manager: The warehouse manager is responsible for maintaining the data of the
warehouse. It should also create and maintain a layer of meta data. Some of the
responsibilities of the warehouse manager are:
o Data movement
o Meta data management
o Performance monitoring
o Archiving.
Data movement includes the transfer of data within the ware house, aggregation, creation and
maintenance of tables, indexes and other objects of importance. It should be able to create
new aggregations as well as remove the old ones. Creation of additional rows / columns,
keeping track of the aggregation processes and creating meta data are also its functions.
25. What are the different system management tools used for a data warehouse?
The different system management tools used for a data warehouse are:
i). configuration managers
ii). schedule managers
iii). event managers
iv). database managers
v). backup recovery managers
vi). resource and performance monitors.

KU 5TH SEM ASSIGNMENT - BSIT (TB) - 53 (DATA WAREHOUSING & DATA MINING)
PART - A
I. Note: Answer all the questions.
a) What is Normalization? What are the different forms of Normalization ?
The usual approach to normalization in database applications is to ensure that the data is
divided into two or more tables, such that when the data in one of them is updated, it does not
lead to anomalies of data (the student is advised to refer to any book on database management
systems for details, if interested).
The idea is to ensure that when combined, the data available is consistent. However, in data
warehousing, one may even tend to break the large table into several denormalized smaller
tables. This may lead to lots of extra space being used, but it helps in an indirect way: it
avoids the overheads of joining the data during queries.
b) Define Data warehouse. What are roles of education in a data warehousing delivery
process?
Data Warehouse: In its simplest form, a data warehouse is a collection of key pieces of
information used to manage and direct the business for the most profitable outcome. It would
decide the amount of inventory to be held, the number of employees to be hired, the amount
to be procured on loan, etc.
The above definition may not be precise, but that is how data warehouse systems are. There
are different definitions given by different authors, but we keep this idea in mind and
proceed. It is a large collection of data and a set of process managers that use this data to
make information available. The data can be meta data, facts, dimensions and aggregations.
The process managers can be load managers, warehouse managers or query managers. The
information made available is such that it allows the end users to make informed decisions.
Roles of education in a data warehousing delivery process: Education has two roles to play.
One is to make people, especially top-level policy makers, comfortable with the concept; the
second is to aid the prototyping activity. To take care of the education role, an initial (usually
scaled-down) prototype is created and people are encouraged to interact with it. This helps
achieve both the activities listed above: the users become comfortable with the use of the
system, and the warehouse developer becomes aware of the limitations of his prototype,
which can then be improved upon.
c) What are process managers? What are the different types of process managers?
Process Managers: These are responsible for the smooth flow, maintenance and upkeep of
data into and out of the database.
The main types of process managers are:
i). Load manager: to take care of source interaction, data transformation and data load.
ii). Warehouse manager: to take care of data movement, meta data management and
performance monitoring.
iii). Query manager: to control query scheduling and monitoring.
We shall look into each of them briefly. Before that, we look at a schematic diagram that
defines the boundaries of the three types of managers.
d) Give the architectures of data mining systems.
e) What are the guidelines for KDD environment ?
It is customary in the computer industry to formulate rules of thumb that help information
technology (IT) specialists to apply new developments. In setting up a reliable data mining
environment we may follow the guidelines so that KDD system may work in a manner we
desire.
i). Support extremely large data sets
ii). Support hybrid learning
iii). Establish a data warehouse
iv). Introduce data cleaning facilities
v). Facilitate working with dynamic coding
vi). Integrate with decision support system
vii). Choose extendible architecture
viii). Support heterogeneous databases
ix). Introduce client/server architecture
x). Introduce cache optimization
PART - B

II. Answer any FIVE full questions.


1. a) With the help of a diagram explain architecture of data warehouse.
The architecture for a data warehouse is indicated below. Before we proceed further, we
should be clear about the concept of architecture: it only gives the major items that make up a
data warehouse. The size and complexity of each of these items depend on the actual size of
the warehouse itself, the specific requirements of the warehouse and the actual details of
implementation.
Before looking into the details of each of the managers, we can get a broad idea of their
functionality by mapping the processes that we studied in the previous chapter to the
managers. The extracting and loading processes are taken care of by the load manager. The
processes of cleanup and transformation of data, as also of backup and archiving, are the
duties of the warehouse manager, while the query manager, as the name implies, takes care
of query management.
b) Indicate the important functions of a Load Manager and a Warehouse Manager.
Important functions of the Load Manager:
i) To extract data from the source(s).
ii) To load the data into a temporary storage device.
iii) To perform simple transformations to map it to the structures of the data warehouse.
Important functions of the Warehouse Manager:
i) Analyze the data to confirm data consistency and data integrity.
ii) Transform and merge the source data from the temporary data storage into the warehouse.
iii) Create indexes, cross references, partition views, etc.
iv) Check for normalizations.
v) Generate new aggregations, if needed.
vi) Update all existing aggregations.
vii) Create backups of data.
viii) Archive the data that needs to be archived.
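The split of duties above can be sketched in code: the load manager extracts and applies simple transformations, and the warehouse manager checks integrity before merging. This is a minimal illustration with invented row formats, not a real warehouse API.

```python
# Minimal sketch of the load-manager / warehouse-manager split:
# extract -> temporary staging -> simple transform -> integrity check -> merge.

def extract(source_rows):
    """Load manager: pull raw rows into a temporary staging list."""
    return list(source_rows)

def transform(row):
    """Load manager: simple mapping to the warehouse structure."""
    return {"item": row["item"].strip().upper(), "amount": float(row["amt"])}

def load_into_warehouse(staging, warehouse):
    """Warehouse manager: confirm integrity, then merge staged rows."""
    for row in staging:
        clean = transform(row)
        if clean["amount"] >= 0:          # crude data-integrity check
            warehouse.append(clean)
    return warehouse

warehouse = []
staging = extract([{"item": " pen ", "amt": "10.5"}, {"item": "ink", "amt": "-1"}])
load_into_warehouse(staging, warehouse)
print(warehouse)   # the negative-amount row is rejected
```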
2. a) Differentiate between vertical partitioning and horizontal partitioning.
In horizontal partitioning, we simply put the first few thousand entries in one partition, the
second few thousand in the next, and so on. This can be done by partitioning by time, wherein
all data pertaining to the first month / first year is put in the first partition, the second in the
second partition, and so on. The other alternatives can be based on different-sized
dimensions, partitioning on other dimensions, partitioning on the size of the table, and
round-robin partitions. Each of them has certain advantages as well as disadvantages.
In vertical partitioning, some columns are stored in one partition and certain other columns of
the same row in a different partition. This can be achieved either by normalization or by row
splitting. We will look into their relative trade-offs.
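The two partitioning styles can be sketched on a tiny table of rows. The column names and values are invented; horizontal partitioning splits whole rows by the time dimension, while vertical partitioning (row splitting) puts different columns of the same row in different partitions.

```python
# A small "table" of rows (dicts) to partition.
rows = [
    {"id": 1, "month": "Jan", "item": "pen", "amount": 10},
    {"id": 2, "month": "Jan", "item": "ink", "amount": 5},
    {"id": 3, "month": "Feb", "item": "pen", "amount": 7},
]

# Horizontal partitioning by time: whole rows, split on the month value.
horizontal = {}
for r in rows:
    horizontal.setdefault(r["month"], []).append(r)

# Vertical partitioning (row splitting): some columns in one partition,
# the rest in another; the key column is repeated so rows can be rejoined.
keys_part  = [{"id": r["id"], "item": r["item"]} for r in rows]
facts_part = [{"id": r["id"], "amount": r["amount"]} for r in rows]

print(sorted(horizontal))               # the time partitions that exist
print(len(keys_part), len(facts_part))  # every row appears in both vertical parts
```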
b) What is schema? Distinguish between facts and dimensions.
A schema, by definition, is a logical arrangement of facts that facilitates ease of storage and
retrieval, as described by the end users. The end user is not bothered about the overall
arrangement of the data or the fields in it. For example, a sales executive trying to project
the sales of a particular item is only interested in the sales details of that item, whereas a tax
practitioner looking at the same data will be interested only in the amounts received by the
company and the profits made.
The star schema looks like a good solution to the problem of warehousing. It simply states
that one should identify the facts and store them in the read-only area, with the dimensions
surrounding that area. Whereas the dimensions are liable to change, the facts are not. But
given a set of raw data from the sources, how does one identify the facts and the dimensions?
It is not always easy, but the following steps can help in that direction.
i) Look for the fundamental transactions in the entire business process. These basic entities
are the facts.
ii) Find out the important dimensions that apply to each of these facts. They are the
candidates for dimension tables.
iii) Ensure that the facts do not include candidates that are actually dimensions, with a set of
facts attached to them.
iv) Ensure that the dimensions do not include candidates that are actually facts.
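The fact/dimension split the steps above arrive at can be sketched as a tiny star schema. The table contents and names (`fact_sales`, `dim_item`, `dim_time`) are invented; the point is that fact rows hold measures plus foreign keys into the surrounding dimension tables.

```python
# Dimension tables: descriptive attributes, liable to change.
dim_item = {101: {"name": "pen", "category": "stationery"}}
dim_time = {1:   {"month": "Jan", "year": 2024}}

# Fact table: the fundamental transactions, holding measures and
# foreign keys into the dimensions.
fact_sales = [
    {"item_id": 101, "time_id": 1, "units": 3, "amount": 30.0},
]

def resolve(fact):
    """Join one fact row to its dimensions, as a star-schema query would."""
    return {**fact,
            "item": dim_item[fact["item_id"]]["name"],
            "month": dim_time[fact["time_id"]]["month"]}

print(resolve(fact_sales[0])["item"])   # pen
```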
3. a) What is an event in data warehousing? List any five events.
An event is defined as a measurable, observable occurrence of a defined action. If this
definition seems quite vague, it is because it encompasses a very large set of operations. The
event manager is software that continuously monitors the system for the occurrence of an
event and then takes any action that is suitable (note that the event is a measurable and
observable occurrence). The action to be taken is also normally specific to the event.
A partial list of the common events that need to be monitored is as follows:
i). Running out of memory space.
ii). A process dying.
iii). A process using excessive resources.
iv). I/O errors.
v). Hardware failure.
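An event manager of the kind described above can be sketched as a polling loop: each registered check detects one measurable occurrence and has an action tied to it. The checks, thresholds and actions here are invented for illustration.

```python
# Sketch of an event manager: run every registered check against the
# current system state and fire the action bound to each event.

def check_memory(state):
    return state["free_mb"] < 100          # "running out of memory space"

def check_process(state):
    return not state["process_alive"]      # "a process dying"

actions_fired = []
monitors = [
    (check_memory,  lambda: actions_fired.append("page the DBA")),
    (check_process, lambda: actions_fired.append("restart process")),
]

def event_manager(state):
    """One monitoring pass: on each observed event, take its specific action."""
    for check, action in monitors:
        if check(state):
            action()

event_manager({"free_mb": 50, "process_alive": True})
print(actions_fired)   # only the low-memory event occurred
```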
b) What is summary table? Describe the aspects to be looked into while designing a summary
table.
The main purpose of using summary tables is to cut down the time taken to execute a specific
query. The main methodology involves minimizing the volume of data being scanned each
time the query is to be answered. In other words, partial answers to the query are already
made available. For example, in the above-cited example of the mobile market, suppose one
expects that
i) citizens above 18 years of age,
ii) with salaries greater than 15,000, and
iii) with professions that involve travelling
are the potential customers. Then, every time the query is to be processed (maybe every
month or every quarter), one will have to look at the entire database to compute these values
and then combine them suitably to get the relevant answers. The other method is to prepare
summary tables, which hold the values pertaining to each of these sub-queries beforehand,
and then combine them as and when the query is raised.
Summary tables are designed by following the steps given below:
i) Decide the dimensions along which aggregation is to be done.
ii) Determine the aggregation of multiple facts.
iii) Aggregate multiple facts into the summary table.
iv) Determine the level of aggregation and the extent of embedding.
v) Design time into the table.

vi) Index the summary table.
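The aggregation step these design rules describe can be sketched as follows: pre-aggregate the base fact rows along a chosen dimension so later queries scan a handful of summary rows instead of the whole base table. The row layout and dimension name are invented for the example.

```python
# Sketch: build a summary table by aggregating facts along one dimension.
base_facts = [
    {"profession": "sales", "age": 30, "salary": 20000},
    {"profession": "sales", "age": 40, "salary": 18000},
    {"profession": "clerk", "age": 25, "salary": 12000},
]

def build_summary(facts, dimension):
    """Aggregate a count and a salary total per value of the dimension."""
    summary = {}
    for row in facts:
        slot = summary.setdefault(row[dimension], {"count": 0, "salary": 0})
        slot["count"] += 1
        slot["salary"] += row["salary"]
    return summary

summary = build_summary(base_facts, "profession")
print(summary["sales"])   # partial answer held ready for future queries
```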


4. a) List the significant issues in automatic cluster detection.
Most of the issues related to automatic cluster detection are connected to the kinds of
questions we want answered in the data mining project, or to data preparation for their
successful application.
i). Distance measure
Most clustering techniques use the Euclidean distance formula for the distance measure (the
square root of the sum of the squares of the distances along each attribute axis).
Non-numeric variables must be transformed and scaled before the clustering can take place.
Depending on these transformations, the categorical variables may dominate the clustering
results, or they may even be completely ignored.
ii). Choice of the right number of clusters
If the number of clusters k in the K-means method is not chosen to match the natural
structure of the data, the results will not be good. The proper way to alleviate this is to
experiment with different values of k. In principle, the best k value will exhibit the smallest
intra-cluster distances and the largest inter-cluster distances.
iii). Cluster interpretation
Once the clusters are discovered they have to be interpreted in order to have some value for
the data mining project.
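The distance measure discussed above can be sketched directly, together with the assignment step it drives in K-means: each point goes to its nearest centroid. The points and centroids are invented; this is not a full K-means implementation, only its distance core.

```python
# Euclidean distance and nearest-centroid assignment (the K-means
# assignment step built on that distance measure).
from math import sqrt

def euclidean(a, b):
    """Square root of the sum of squared differences along each axis."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: euclidean(point, centroids[i]))

centroids = [(0.0, 0.0), (10.0, 10.0)]
print(euclidean((3, 4), (0, 0)))            # 5.0
print(nearest_centroid((9, 9), centroids))  # 1
```

A full K-means run would alternate this assignment step with recomputing each centroid as the mean of its assigned points.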
b) Define data marting. List the reasons for data marting.
The data mart stores a subset of the data available in the warehouse, so that one need not
always scan through the entire content of the warehouse. It is similar to a retail outlet. A data
mart speeds up queries, since the volume of data to be scanned is much smaller. It also helps
to have tailor-made processes for different access tools, to impose control strategies, etc.
Following are the reasons for which data marts are created:
i) Since the volume of data scanned is small, they speed up the query processing.
ii) Data can be structured in a form suitable for a user access tool.
iii) Data can be segmented or partitioned so that they can be used on different platforms and
also different control strategies become applicable.
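A data mart in this sense is simply a filtered, smaller copy of the warehouse. A minimal sketch, with an invented `region` column standing in for the mart's partitioning criterion:

```python
# Sketch: carve a data mart out of the warehouse so that queries for
# one segment scan only that segment's rows.
warehouse = [
    {"region": "north", "item": "pen", "amount": 10},
    {"region": "south", "item": "ink", "amount": 5},
    {"region": "north", "item": "ink", "amount": 8},
]

def build_mart(rows, **criteria):
    """Copy out only the rows matching the mart's partition criteria."""
    return [r for r in rows
            if all(r[k] == v for k, v in criteria.items())]

north_mart = build_mart(warehouse, region="north")
print(len(north_mart))   # 2 rows scanned instead of the full warehouse
```

The same filter doubles as an access control: a user of the north mart never sees the south rows.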
5. a) Explain how to categorize data mining system.
There are many data mining systems available or being developed. Some are specialized
systems dedicated to a given data source or confined to limited data mining functionalities;
others are more versatile and comprehensive. Data mining systems can be categorized
according to various criteria; among others, the classifications are the following:
a) Classification according to the type of data source mined: this classification categorizes
data mining systems according to the type of data handled such as spatial data, multimedia
data, time-series data, text data, World Wide Web, etc.
b) Classification according to the data model drawn on: this classification categorizes data
mining systems based on the data model involved such as relational database, object-oriented
database, data warehouse, transactional, etc.
c) Classification according to the kind of knowledge discovered: this classification
categorizes data mining systems based on the kind of knowledge discovered or data mining
functionalities, such as characterization, discrimination, association, classification, clustering,
etc. Some systems tend to be comprehensive systems offering several data mining

functionalities together.
d) Classification according to mining techniques used: Data mining systems employ and
provide different techniques. This classification categorizes data mining systems according to
the data analysis approach used such as machine learning, neural networks, genetic
algorithms, statistics, visualization, database oriented or data warehouse-oriented, etc.
b) List and explain different kind of data that can be mined.
Different kinds of data that can be mined are listed below:
i). Flat files: Flat files are actually the most common data source for data mining algorithms,
especially at the research level.
ii). Relational Databases: A relational database consists of a set of tables containing either
values of entity attributes, or values of attributes from entity relationships.
iii). Data Warehouses: A data warehouse as a storehouse, is a repository of data collected
from multiple data sources (often heterogeneous) and is intended to be used as a whole under
the same unified schema.
iv). Multimedia Databases: Multimedia databases include video, images, audio and text
media. They can be stored on extended object-relational or object-oriented databases, or
simply on a file system.
v). Spatial Databases: Spatial databases are databases that in addition to usual data, store
geographical information like maps, and global or regional positioning.
vi). Time-Series Databases: Time-series databases contain time-related data such as stock
market data or logged activities. These databases usually have a continuous flow of new data coming
in, which sometimes causes the need for a challenging real time analysis.
vii). World Wide Web: The World Wide Web is the most heterogeneous and dynamic
repository available. A very large number of authors and publishers are continuously
contributing to its growth and metamorphosis and a massive number of users are accessing its
resources daily.
6. a) Give the syntax for task relevant data specification.
Syntax for task-relevant data specification: The first step in defining a data mining task is the
specification of the task-relevant data, that is, the data on which mining is to be performed.
This involves specifying the database and tables or data warehouse containing the relevant
data, conditions for selecting the relevant data, the relevant attributes or dimensions for
exploration, and instructions regarding the ordering or grouping of the data retrieved. DMQL
provides clauses for the specification of such information, as follows:
i). use database (database_name) or use data warehouse (data_warehouse_name): The use
clause directs the mining task to the database or data warehouse specified.
ii). from (relation(s)/cube(s)) [where(condition)]: The from and where clauses respectively
specify the database tables or data cubes involved, and the conditions defining the data to be
retrieved.
iii). in relevance to (attribute_or_dimension_list): This clause lists the attributes or
dimensions for exploration.
iv). order by (order_list): The order by clause specifies the sorting order of the task relevant
data.
v). group by (grouping_list): the group by clause specifies criteria for grouping the data.
vi). having (conditions): The having clause specifies the condition by which groups of data
are considered relevant.
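The clauses listed above can be assembled mechanically into a query string. A minimal sketch: the clause keywords follow the list in the text, while the database, relation and attribute names (`sales_db`, `transactions`, etc.) are invented for the example.

```python
# Sketch: compose a DMQL-style task-relevant data specification.
def dmql(database, relation, relevance, where=None, order_by=None):
    """Join the use / from-where / in-relevance-to / order-by clauses."""
    parts = [f"use database {database}",
             f"from {relation}" + (f" where {where}" if where else ""),
             f"in relevance to {', '.join(relevance)}"]
    if order_by:
        parts.append(f"order by {order_by}")
    return "\n".join(parts)

q = dmql("sales_db", "transactions", ["item", "amount"],
         where="amount > 100", order_by="amount")
print(q)
```

A GUI of the kind discussed in the next answer could emit exactly such strings from form fields, which is why a query language makes a convenient core beneath a graphical front end.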
b) Explain the designing of GUI based on data mining query language.

A data mining query language provides the necessary primitives that allow users to
communicate with data mining systems. But novice users may find a data mining query
language difficult to use and its syntax difficult to remember. Instead, users may prefer to
communicate with data mining systems through a graphical user interface (GUI). In relational
database technology, SQL serves as a standard core language for relational systems, on top
of which GUIs can easily be designed. Similarly, a data mining query language may serve as
a core language for data mining system implementations, providing a basis for the
development of GUIs for effective data mining.
A data mining GUI may consist of the following functional components:
a) Data collection and data mining query composition - This component allows the user to
specify task-relevant data sets and to compose data mining queries. It is similar to GUIs used
for the specification of relational queries.
b) Presentation of discovered patterns - This component allows the display of the discovered
patterns in various forms, including tables, graphs, charts, curves and other visualization
techniques.
c) Hierarchy specification and manipulation - This component allows for concept hierarchy
specification, either manually by the user or automatically. In addition, this component
should allow concept hierarchies to be modified by the user or adjusted automatically based
on a given data set distribution.
d) Manipulation of data mining primitives - This component may allow the dynamic
adjustment of data mining thresholds, as well as the selection, display and modification of
concept hierarchies. It may also allow the modification of previous data mining queries or
conditions.
e) Interactive multilevel mining - This component should allow roll-up or drill-down
operations on discovered patterns.
f) Other miscellaneous information - This component may include on-line help manuals,
indexed search, debugging and other interactive graphical facilities.
7. a) Explain how decision trees are useful in data mining.
Decision trees are powerful and popular tools for classification and prediction. The
attractiveness of tree-based methods is due in large part to the fact that they are simple and
that decision trees represent rules. Rules can readily be expressed so that humans can
understand them, or in a database access language like SQL, so that records falling into a
particular category may be retrieved.
b) Identify an application and also explain the techniques that can be incorporated in solving
the problem using data mining techniques.
Write yourself...
8. Write short notes on:
i) Data Mining Querying Language
ii) Schedule Manager
iii) Data Formatting.
i) Data Mining Querying Language
A data mining language helps in effective knowledge discovery from data mining systems.
Designing a comprehensive data mining language is challenging because data mining covers
a wide spectrum of tasks, from data characterization to mining association rules, data
classification and evolution analysis. Each task has different requirements. The design of an
effective data mining query language requires a deep understanding of the power, limitations
and underlying mechanisms of the various kinds of data mining tasks.
ii) Schedule manager
Scheduling is the key to successful warehouse management. Almost all operations in the
warehouse need some type of scheduling. Every operating system has its own scheduler and
batch control mechanism, but these schedulers may not be capable of fully meeting the
requirements of a data warehouse. Hence it is more desirable to have specially designed
schedulers to manage the operations.
iii) Data formatting
This is the final data preparation step, and it represents syntactic modifications to the data
that do not change its meaning but are required by the particular modelling tool chosen for
the DM task. These include:
a). reordering of the attributes or records: some modelling tools require reordering of the
attributes (or records) in the dataset: putting the target attribute at the beginning or at the
end, randomizing the order of records (required by neural networks, for example);
b). changes related to the constraints of modelling tools: removing commas or tabs and
special characters, trimming strings to the maximum allowed number of characters, replacing
special characters with an allowed set of special characters.
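The formatting steps above can be sketched in one small function: move the target attribute to the end, trim strings to a maximum length, and randomize the record order. The field names and length limit are invented; a real tool would dictate its own constraints.

```python
# Sketch of data formatting: syntactic changes only, meaning untouched.
import random

def format_for_tool(records, target, max_len=8, seed=42):
    """Put the target attribute last, trim strings, shuffle records."""
    out = []
    for rec in records:
        row = {k: (v[:max_len] if isinstance(v, str) else v)
               for k, v in rec.items() if k != target}
        row[target] = rec[target]          # target attribute goes last
        out.append(row)
    random.Random(seed).shuffle(out)       # reproducible record shuffle
    return out

data = [{"name": "Alexandrina", "label": 1}, {"name": "Bo", "label": 0}]
formatted = format_for_tool(data, target="label")
print(all(list(r)[-1] == "label" for r in formatted))   # True
print(max(len(r["name"]) for r in formatted))           # 8
```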

Software Quality and Testing


Assignment: TA (Compulsory)
1. What is software testing? Software testing is tougher
than hardware testing; justify your answer.
Ans:- Software testing is the process of executing a program with the
intent of finding errors. It is used to ensure the correctness of a software
product. Software testing is also done to add value to software so that its
quality and reliability are raised.
Software testing is a critical element of software quality assurance and
represents the ultimate process to ensure the correctness of the product.
The quality of the product always enhances the customer's confidence in
using the product, thereby improving the business economics. In other
words, a good quality product means zero defects, which is derived from a
better quality process in testing. Testing the product means adding value
to it, which means raising the quality or reliability of the program. Raising
the reliability of the product means finding and removing errors. Hence
one should not test a product to show that it works; rather, one should
start with the assumption that the program contains errors and then test
the program to find as many errors as possible.
2. Explain the test information flow in a typical software test life
cycle.
Ans:- Testing is a complex process and requires effort similar to software
development. A typical test information flow is shown in the figure below.
[Figure: test information flow, ending in predicted reliability]
Software Configuration includes a Software Requirements Specification, a
Design Specification, and source code. A Test Configuration includes a test
plan and procedures, test cases, and testing tools. It is difficult to predict
the time needed to debug the code; hence it is difficult to schedule.
Once the right software is available for testing, a proper test plan and test
cases are developed. Then the software is subjected to testing with
simulated test data. After the test execution, the test results are
examined: they may show defects, or the software may pass without any
defect. Software with defects is subjected to debugging and tested again
for correctness. This process continues till the testing reports zero defects
or the time allotted for testing runs out.
3.What is risk in software testing? How risk management
improves the quality of the software?
Ans:- The risks associated with a software application being developed are
called software risks. These risks can lead to errors in the code, affecting
the functioning of the application.
Following are the factors that lead to software risks:
i. Skills of software
ii. Disgruntled
iii. Poorly defined project objectives
iv. Project risks
v. Technical risks
5. Explain black box and white box testing, with an example. Which
method is better? List out the drawbacks of each one.
Ans:- Black box testing treats the system as a black box, so it doesn't
explicitly use knowledge of the internal structure or code. In other words,
the test engineer need not know the internal working of the black box or
application.
The main focus in black box testing is on the functionality of the system
as a whole. The term behavioral testing is also used for black box testing,
and white box testing is also sometimes called structural testing.
Behavioral test design is slightly different from black-box test design
because the use of internal knowledge isn't strictly forbidden, but it is
still discouraged.
Disadvantages of Black Box Testing
- The test inputs need to be drawn from a large sample space.
- It is difficult to identify all possible inputs in limited testing time, so
writing test cases is slow and difficult.
- There are chances of having unidentified paths during this testing.

White box testing involves looking at the structure of the code. When you
know the internal structure of a product, tests can be conducted to ensure
that the internal operations are performed according to the specification
and that all internal components have been adequately exercised.
Drawbacks of WBT:
It is not possible to test each and every path of the loops in a program;
this means exhaustive testing is impossible for large systems.
This does not mean that WBT is not effective: selecting important logical
paths and data structures for testing is practically possible and effective.
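The black-box view above can be illustrated on one tiny function: the tester exercises it only through inputs and the outputs the specification promises. The function and its test cases are invented for the sketch.

```python
# Black-box vs white-box thinking on a tiny function.

def absolute(x):
    # Implementation under test; its internals are invisible to a
    # black-box tester, who works purely from the specification.
    return x if x >= 0 else -x

# Black-box cases chosen from the input domain: positive, negative,
# and the boundary value zero.
black_box_cases = [(5, 5), (-5, 5), (0, 0)]
for given, expected in black_box_cases:
    assert absolute(given) == expected

# A white-box tester would instead make sure both branches of the
# `if x >= 0` structure are exercised, which these cases also achieve.
print("all black-box cases pass")
```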
6. What is cyclomatic complexity? Explain with an illustration.
Discuss its role in software testing and generating test cases.
Ans:- The cyclomatic complexity gives a quantitative measure of the
logical complexity. This value gives the number of independent paths in
the basis set, and an upper bound for the number of tests needed to
ensure that each statement is executed at least once. An independent
path is any path through a program that introduces at least one new set
of processing statements or a new condition.
For a flow graph with 9 nodes, 11 edges, 3 predicate nodes and 4 regions,
a cyclomatic complexity of 4 can be calculated as:
1. The number of regions of the flow graph, which is 4.
2. #Edges - #Nodes + 2, which is 11 - 9 + 2 = 4.
3. #Predicate Nodes + 1, which is 3 + 1 = 4.
The above complexity provides the upper bound on the number of test
cases to be generated, or independent execution paths in the program.
The independent paths (4 paths) for the program shown in the figure are:
1. 1, 8
2. 1, 2, 3, 7b, 1, 8
3. 1, 2, 4, 5, 7a, 7b, 1, 8
4. 1, 2, 4, 6, 7a, 7b, 1, 8

Cyclomatic complexity provides an upper bound for the number of tests
required to guarantee the coverage of all program statements.
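The two counting formulas above can be sketched directly; the edge, node and predicate counts are the ones quoted in the answer.

```python
# Cyclomatic complexity V(G) computed two ways for the flow graph
# described above: 11 edges, 9 nodes, 3 predicate nodes.

def cyclomatic_edges_nodes(edges, nodes):
    return edges - nodes + 2               # V(G) = E - N + 2

def cyclomatic_predicates(predicate_nodes):
    return predicate_nodes + 1             # V(G) = P + 1

print(cyclomatic_edges_nodes(11, 9))   # 4
print(cyclomatic_predicates(3))        # 4
```

Both formulas agree with the region count of 4, which is the point of having three independent ways to compute V(G).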
7. What is coupling and cohesion? Mention different types of
coupling and cohesion. Explain their role in testing.
Ans:- Cohesion is a measure of how strongly related the pieces of
functionality expressed by the source code of a software module are.
Methods of measuring cohesion vary, from qualitative measures that
classify the source text using a rubric with a hermeneutics approach, to
quantitative measures that examine textual characteristics of the source
code to arrive at a numerical cohesion score.
Types of cohesion
Coincidental cohesion (worst)
Coincidental cohesion is when parts of a module are grouped arbitrarily;
the only relationship between the parts is that they have been grouped
together (e.g. a Utilities class).
Logical cohesion
Logical cohesion is when parts of a module are grouped because they
logically are categorized to do the same thing, even if they are different
by nature (e.g. grouping all mouse and keyboard input handling routines).
Temporal cohesion
Temporal cohesion is when parts of a module are grouped by when they
are processed - the parts are processed at a particular time in program
execution (e.g. a function which is called after catching an exception, and
which closes open files, creates an error log, and notifies the user).
Procedural cohesion
Procedural cohesion is when parts of a module are grouped because they
always follow a certain sequence of execution (e.g. a function which
checks file permissions and then opens the file).
Communicational cohesion
Communicational cohesion is when parts of a module are grouped
because they operate on the same data (e.g. a module which operates on
the same record of information).
Sequential cohesion

Sequential cohesion is when parts of a module are grouped because the
output from one part is the input to another part, like an assembly line
(e.g. a function which reads data from a file and processes the data).
Coupling, or dependency, is the degree to which each program module
relies on each of the other modules. Coupling is usually contrasted
with cohesion: low coupling often correlates with high cohesion, and vice
versa.
Types of coupling
Content coupling (high)
Content coupling (also known as Pathological coupling) is when one
module modifies or relies on the internal workings of another module
(e.g., accessing local data of another module).
Therefore changing the way the second module produces data (location,
type, timing) will lead to changing the dependent module.
Common coupling
Common coupling (also known as Global coupling) is when two modules
share the same global data (e.g., a global variable).
Changing the shared resource implies changing all the modules using it.
External coupling
External coupling occurs when two modules share an externally imposed
data format, communication protocol, or device interface. This is basically
related to communication with external tools and devices.
Control coupling
Control coupling is one module controlling the flow of another, by passing
it information on what to do (e.g., passing a what-to-do flag).
Stamp coupling (Data-structured coupling)
Stamp coupling is when modules share a composite data structure and
use only a part of it, possibly a different part (e.g., passing a whole record
to a function that only needs one field of it).
This may lead to changing the way a module reads a record because a
field that the module doesn't need has been modified.

Data coupling
Data coupling is when modules share data through, for example,
parameters. Each datum is an elementary piece, and these are the only
data shared (e.g., passing an integer to a function that computes a square
root).
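The difference between stamp coupling and data coupling can be shown in a few lines. This is a hedged sketch; the record layout and function names are assumptions made for the example, not part of the text.

```python
# Contrasting stamp coupling with data coupling (illustrative names).

record = {"name": "Ada", "age": 36, "email": "ada@example.com"}

# Stamp coupling: the whole composite record is passed, but only one
# field is actually used, so the function depends on the record's shape.
def greet_stamp(person_record):
    return "Hello, " + person_record["name"]

# Data coupling: only the elementary datum that is needed is passed.
def greet_data(name):
    return "Hello, " + name

print(greet_stamp(record))           # -> Hello, Ada
print(greet_data(record["name"]))    # -> Hello, Ada
```

Both calls produce the same result, but only the stamp-coupled version must change if the record's structure changes.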
8. Compare and contrast between Verification and Validation with
examples.
Ans:- Verification testing ensures that the expressed user requirements,
gathered in the Project Initiation phase, have been met in the Project
Execution phase. One way to do this is to produce a user requirements
matrix or checklist and indicate how you would test for each requirement.
For example, if the product is required to weigh no more than 15 kg
(about 33 lbs.), the test could be, "Weigh the object: does it weigh 15 kg
or less?", and note yes or no on the matrix or checklist.
Validation testing ensures that any implied requirement has been met. It
usually occurs in the Project Monitoring and Control phase of project
management. Using the above product as an example, you ask the
customer, "Why must it be no more than 15 kg?" One answer is, "It
must be easy to lift by hand." You could validate that requirement by
having twenty different people lift the object and asking each one, "Was
the object easy to lift?" If 90% of them said it was easy,
you could conclude that the object meets the requirement.
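The verification checklist described above can be sketched in code. The 15 kg limit comes from the text; the data structure, field names, and check function are assumptions made for this illustration.

```python
# A minimal sketch of a verification matrix: each expressed requirement
# is paired with an automated yes/no check, as in the checklist described.

requirements = [
    ("weight must be 15 kg or less", lambda product: product["weight_kg"] <= 15),
]

def verify(product):
    """Return (requirement, passed) pairs, noting yes (True) or no (False)."""
    return [(desc, check(product)) for desc, check in requirements]

print(verify({"weight_kg": 14.2}))
# -> [('weight must be 15 kg or less', True)]
```

Validation, by contrast, usually cannot be automated this way, since it checks implied requirements against real users (such as the lifting survey above).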
9. What is stress testing? Where do you need this testing? Explain.
Ans:- Stress testing is a form of testing that is used to determine the
stability of a given system or entity. It involves testing beyond normal
operational capacity, often to a breaking point, in order to observe the
results.
The need for stress testing:
- A web server may be stress tested using scripts, bots, and various
denial-of-service tools to observe the performance of a web site during
peak loads.
Stress testing may be contrasted with load testing:
- Load testing examines the entire environment and database while
measuring the response time, whereas stress testing focuses on
identified transactions, pushing to a level so as to break transactions or
systems.
- During stress testing, if transactions are selectively stressed, the
database may not experience much load, but the transactions are
heavily stressed. On the other hand, during load testing the database
experiences a heavy load, while some transactions may not be
stressed.
- System stress testing, also known as stress testing, is loading
concurrent users over and beyond the level that the system can handle,
so that it breaks at the weakest link within the entire system.
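The idea of loading concurrent users beyond capacity until the system breaks can be sketched as follows. The target service, its capacity figure, and the 2x overload factor are all invented for this example.

```python
# A hedged sketch of stress testing: ramp concurrent "users" to twice the
# assumed capacity and record the levels at which the system breaks.

import threading

CAPACITY = 8          # assumed normal operational capacity (illustrative)
failures = []
lock = threading.Lock()

def fragile_service(active_count):
    """A stand-in service that breaks when pushed beyond capacity."""
    if active_count > CAPACITY:
        raise RuntimeError("overloaded")

def user(active_count):
    try:
        fragile_service(active_count)
    except RuntimeError:
        with lock:
            failures.append(active_count)   # record the breaking point

# Push to 2x capacity, beyond normal operation, to find the weakest link.
threads = [threading.Thread(target=user, args=(n,))
           for n in range(1, 2 * CAPACITY + 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(failures))  # load levels at which the system broke
```

A load test, by contrast, would hold the system at or near CAPACITY across the whole environment and measure response times rather than hunting for the breaking point.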
10. Software testing is more difficult than implementation. Yes or no?
Justify your answer.
Ans:- Software testing is any activity aimed at evaluating an attribute or
capability of a program or system and determining that it meets its
required results. Although crucial to software quality and widely deployed
by programmers and testers, software testing still remains an art, due to
limited understanding of the principles of software. The difficulty in
software testing stems from the complexity of software: we cannot
completely test a program with moderate complexity. Testing is more than
just debugging. The purpose of testing can be quality assurance,
verification and validation, or reliability estimation. Testing can be used as
a generic metric as well. Correctness testing and reliability testing are two
major areas of testing. Software testing is a trade-off between budget,
time and quality.
11. What are test cases? Explain the importance of Domain knowledge in
test case generation. Mention the difficulties in preparing test cases.
Ans:- A test case is a detailed procedure that fully tests a feature or an
aspect of a feature in a test process. These cases describe how to
perform a particular test; a test case should be developed for each type
of test listed in the test process. Domain knowledge helps in preparing
the test design easily; that is why it is important.
Difficulties in preparing test data (test data set examples):
1) No data: Run your test cases on blank or default data. See if proper
error messages are generated.
2) Valid data set: Create it to check if application is functioning as per
requirements and valid input data is properly saved in database or files.
3) Invalid data set: Prepare invalid data set to check application
behavior for negative values, alphanumeric string inputs.

4) Illegal data format: Make one data set of illegal data format. System
should not accept data in invalid or illegal format. Also check proper error
messages are generated.
5) Boundary Condition data set: Data set containing out of range data.
Identify application boundary cases and prepare data set that will cover
lower as well as upper boundary conditions.
6) Data set for performance, load and stress testing: This data set
should be large in volume.
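The first five test data set categories above can be exercised against a single input validator. The validator, its rules (integer ages 0 to 130), and the case labels are assumptions made for this sketch.

```python
# Illustrative test data sets covering the categories listed above,
# run against an assumed input validator.

def validate_age(value):
    """Accept only integers in the range 0-130 (illustrative rule)."""
    if not isinstance(value, int):
        return "illegal format"        # category 4: illegal data format
    if value < 0 or value > 130:
        return "out of range"          # categories 3 and 5
    return "ok"                        # category 2: valid data

cases = {
    "no data":        None,    # 1) blank/default input
    "valid":          25,      # 2) valid data set
    "invalid":        -5,      # 3) negative value
    "illegal format": "25a",   # 4) alphanumeric string
    "boundary low":   0,       # 5) lower boundary condition
    "boundary high":  131,     # 5) just past the upper boundary
}

for label, value in cases.items():
    print(label, "->", validate_age(value))
```

A performance data set (category 6) would reuse the same valid shape but scale the volume up, for example by generating millions of such records.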
