
Business Intelligence and Data

Visualization
Lab Manual
Department of Computer Science and Engineering
The NorthCap University, Gurugram
BIDV Lab Manual (CSL 232) | i
2021-22

Business Intelligence and Data


Visualization Lab Manual
CSL 232

Dr. Poonam Chaudhary
Dr. Srishti

Department of Computer Science and Engineering

The NorthCap University, Gurugram- 122001, India

Session 2022-23

Published by:

School of Engineering and Technology

Department of Computer Science & Engineering

The NorthCap University Gurugram

• Laboratory Manual is for Internal Circulation only

© Copyright Reserved

No part of this Practical Record Book may be reproduced, used, or stored without prior permission of The NorthCap University.

Copying or facilitating copying of lab work comes under cheating and is considered as use of
unfair means. Students indulging in copying or facilitating copying shall be awarded zero marks
for that particular experiment. Frequent cases of copying may lead to disciplinary action.
Attendance in lab classes is mandatory.

Labs are open up to 7 PM upon request. Students are encouraged to make full use of labs beyond
normal lab hours.

PREFACE

The Business Intelligence and Data Visualization Lab Manual is designed to meet the course and
program requirements of the NCU curriculum for B.Tech second-year students of the CSE branch. The
lab work aims to give students brief practical experience in basic lab skills. It
provides space and scope for self-study so that students can come up with new and creative
ideas.

The lab manual follows a “teach yourself” pattern, and students who come properly prepared
are expected to be able to perform the experiments without difficulty. A brief introduction
to each experiment, with information about self-study material, is provided. The laboratory
exercises include installing Tableau Desktop or the KNIME Analytics Platform and becoming
familiar with their interfaces, followed by experiments that strengthen the basics of data
visualization. Students are then introduced to different types of maps, charts, parameters,
trend lines and forecasting in Tableau. Building basic workflows for data pre-processing,
analysis and mining tasks using KNIME is also discussed. Finally, students are required to
complete a guided and an unguided project. Students are expected to come thoroughly prepared
for the lab. General discipline, safety guidelines and report writing are also discussed.

The lab manual is a part of the curriculum of The NorthCap University, Gurugram. The teacher’s
copy of the experimental results and answers to the questions is available as sample guidelines.

We hope that the lab manual will be useful to students of the CSE, IT, ECE and BSc branches. The
authors request readers to kindly forward their suggestions and constructive criticism for
further improvement of the workbook.

The authors express deep gratitude to the Members, Governing Body-NCU, for their encouragement
and motivation.

Authors
The NorthCap University
Gurugram, India

CONTENTS
S.N. Details

Syllabus
1 Introduction
2 Lab Requirements
3 General Instructions
4 List of Experiments
5 List of Flip Assignments
6 List of Projects
7 Rubrics
8 Annexure 1 (Format of Lab Report)
9 Annexure 2 (Format of Project Report)

SYLLABUS
1. Department: Department of CSE

2. Course Name: Business Intelligence and Data Visualization
3. Course Code: CSL 232
4. L-T-P: 2-0-4
5. Credits: 4

6. Type of Course (check one): Programme Core / Programme Elective / Open Elective

7. Frequency of offering (check one): Odd / Even / Either Sem. / Every Sem.

8. Brief Syllabus:
Introduction to data analysis, Data processing, Fundamentals of Data Visualization, Compare
and Contrast, Business Intelligence, User Interface – Tableau Desktop, Dashboards and
Stories, Building a Dashboard, Dashboard Layouts and Formatting, Exploratory vs.
Explanatory, Statistical tests, Preprocessing, Multidimensional Visualization, Infographics, Level
of Details, Building Gapminder in Tableau, Basic Geo-Coding for Tableau, Animations,
Introduction to KNIME Analytics Platform, KNIME workbook, Data exploration, modeling and
reporting in KNIME, Database operations, web, date and time, loops in KNIME, advanced
reporting, Introduction to SQL, Joins, subqueries, stored routines, SQL and Tableau problems.

9. Total lecture and Practical Hours for this course: 90 Hours


The class size for laboratory experiments is maximum 30 learners.

10. Course Outcomes (COs)


The outcomes below describe how this course will be practically useful to the student once it
is completed.

CO 1: Understand and apply data aggregation, extraction and data pre-processing related to
descriptive analytics.
CO 2: Describe, implement and analyze descriptive and predictive level of competency on the
use of Tableau software for business data visualization.
CO 3: Apply and compare BI tools to generate Level of Details, Building Gapminder in Tableau,
Calculated Field and develop informative reports for storytelling.
CO 4: Use data science skills using KNIME Analytics and design and evaluate predictive models
for business data.
CO 5: Organize real-world data with SQL joins, subqueries, stored routines, KNIME interface with
Tableau visualization and develop a framework for building analytical KPIs and
communicating results effectively.
11. UNIT WISE DETAILS No. of Units: 05

Unit Number: 1 Title: Introduction to Business Analytics and Business Intelligence (CO1)
No. of hours:4
Content Summary:

Brief Introduction of course, Introduction to Data Analysis, Types of Business Analytics, Business
Intelligence, Business Analytics and Intelligence in decision making process, what is Data
Visualization and why is it important, Visual Perception, Brief History of Data Visualization, Design
Principles – Pre-attentive Attributes and Thinking Systems, Data Processing, Descriptive statistics
introduction for problem solving, correlation, Statistical test.
Unit Number: 2 Title: Information Design using Tableau Software (CO2)
No. of hours: 6
Content Summary: Visualization Introduction, Data Visual Analytic Pipeline, Types of Data
Visualization, Installation and configuration of Tableau Desktop, The Fundamental of Data
Visualization – Reviewing the Halloween Exercise, Compare and Contrast, Data Quality, User
Interface –Tableau Desktop. Dashboards and Stories Building a Dashboard, Dashboard Layouts
and Formatting, Exploratory vs. Explanatory, Preprocessing.

Unit Number: 3 Title: Advanced Visualization (CO3) No. of hours: 8


Content Summary: Multidimensional Visualization, Infographics, Level of Details, Building
Gapminder in Tableau, Basic Geo-Coding for Tableau, Animations, Capstone Calculated
Field, Capstone Story UK Bank, Capstone NYC Salary Viz, Capstone Grouping

Unit Number: 4 Title: Data Analytics Using KNIME Analytics Platform (CO4)
No. of hours: 8
Content Summary: Types of Analytics, Predictive analytics: regression, classification and clustering,
Introduction to KNIME Analytics Platform, data blending, Data manipulation and aggregation, data
mining, KNIME workbook, Data exploration, modeling and reporting in KNIME, Database
operations, web, date and time, Flow Variables, loops in KNIME, advanced reporting.

Unit Number: 5 Title: Data Analysis using SQL and Tableau (CO5)
No. of hours: 4
Content Summary: Introduction to SQL, Joins, subqueries, stored routines, SQL and Tableau
problems, Case Study.

12. Guided Project (No. of Hours): Capstone Projects (2)

13. Unguided Project (No. of Hours): Tableau and KNIME projects (10)

14. Brief Description of Self-learning component by students (through books/resource material
etc.): Topics: NA

15. Suggested Readings:


Text Books:
1. James Evans, Business Analytics, Global Edition, Pearson, 2nd Edition, 2016

Reference Books:

1. U Dinesh Kumar, Business Analytics: The Science of Data-Driven Decision Making, WILEY
INDIA, First Edition, 2017
2. Donabel Santos, Tableau 10 Business Intelligence Cookbook, Packt Publishing Limited,
First Edition, 2016
3. Gábor Bakos, KNIME Essentials, Packt Publishing Limited, First Edition, 2013

1. INTRODUCTION

That ‘learning is a continuous process’ cannot be overemphasized. The theoretical
knowledge gained during lecture sessions needs to be strengthened through practical
experimentation. Thus, practical work forms an integral part of the learning process.

OBJECTIVE:
The purpose of conducting experiments can be stated as follows:

• To familiarize the students with the basic concepts of data visualization and mining.
The lab sessions will be based on exploring the concepts discussed in class.
• Learning and understanding the interface of Tableau Desktop.
• Learning and understanding the interface of the KNIME Analytics Platform.
• Hands-on experience.

2. LAB REQUIREMENTS

Requirements            Details

Software Requirements   Tableau Desktop, the KNIME Analytics Platform, SQL Server
Operating System        Windows (64-bit), Linux, macOS
Hardware Requirements   8 GB RAM (Recommended), 2.60 GHz processor (Recommended)
Required Bandwidth      NA

3. GENERAL INSTRUCTIONS

3.1 General discipline in the lab

• Students must arrive on time and contact the concerned faculty about the experiment
they are supposed to perform.
• Students will not be allowed to enter the lab late.
• Students will not leave the class until the period is over.
• Students should come prepared for their experiment.
• Experimental results should be entered in the lab report format and
certified/signed by the concerned faculty/lab instructor.
• Students must get the connections of the hardware setup verified before
switching on the power supply.
• Students should maintain silence while performing the experiments. If
discussion becomes necessary, they should speak in a low voice without
disturbing the adjacent groups.
• Violating the above code of conduct may attract disciplinary action.
• Damaging lab equipment or removing any component from the lab may invite
penalties and strict disciplinary action.

3.2 Attendance
• Attendance in the lab class is compulsory.
• Students should not attend a lab group/section other than the one
assigned at the beginning of the session.
• If a student misses lab classes on account of illness or family problems,
he/she may be assigned a different group to make up the loss, in
consultation with the concerned faculty/lab instructor. Alternatively, he/she may work
in the lab during spare/extra hours to complete the experiment. No attendance
will be granted in such cases.

3.3 Preparation and Performance

• Students should come to the lab thoroughly prepared for the experiments they
are assigned to perform on that day. A brief introduction to each experiment,
with information about self-study references, is provided on the LMS.
• Students must bring the lab report to each practical class, with written
records of the previous experiments complete in all respects.
• Each student is required to write a complete report of the experiment he/she has
performed and bring it to the next lab class for evaluation.
Sufficient space is provided in the workbook for independent writing of theory,
observations, calculations and conclusions.
• Students should follow the zero-tolerance policy for copying/plagiarism. Zero
marks will be awarded for copied work. Repeated cases will lead to
disciplinary action.
• Refer to Annexure 1 for the Lab Report Format.

4. LIST OF EXPERIMENTS

Sr. No. | Title of the Experiment | Software used | Unit covered | CO covered | Time required
1. | Select the dataset and create a Tableau viz to calculate central tendency and measures of dispersion, create a tree map, and draw a histogram using Tableau. | Tableau | 1 | CO1 | 3 hrs
2. | Select the dataset Superstore and create a Tableau viz to draw a tree map, pie chart, box plot and Gantt chart using Tableau. | Tableau | 1 | CO1 | 3 hrs
3. | Create location hierarchies; build and present a basic map view using Tableau. | Tableau | 1 | CO1 | 3 hrs
4. | Take two datasets and blend and join them in Tableau. | Tableau | 2 | CO2 | 3 hrs
5. | Take the Superstore dataset, create context and simple filters, and perform table calculations. | Tableau | 2 | CO2 | 3 hrs
6. | Take the Superstore dataset and create a dashboard. Present the same with storytelling in Tableau. | Tableau | 3 | CO3 | 3 hrs
7. | Create an advanced Level of Detail view for better analysis. | Tableau | 3 | CO3 | 3 hrs
8. | Create a KNIME workflow where you read the provided autos data set. Exclude the columns 'normalized' and 'bore'. Filter the rows on the price column and keep only instances describing cars that cost less than $10,000. Write the file out to CSV and report back the number of rows. | KNIME Analytics | 4 | CO4 | 3 hrs
9. | Train a Decision Tree on 75% of the data and use the remaining 25% for testing in the KNIME Analytics tool. Set the parameters of the Decision Tree Learner node to use gain ratio with no pruning and a minimum of 4 attributes per leaf node. What is the overall model accuracy? | KNIME Analytics | 4 | CO4 | 3 hrs
10. | Partition the data into 80/20% training and test data sets. Use the K-means clustering algorithm to train the model to produce 3 clusters. Use the color, shape and scatter plot nodes in KNIME Analytics to visualize the results. | KNIME Analytics | 4 | CO4 | 3 hrs
11. | Linking SQL Server with Tableau and Query Execution | Tableau and SQL Software | 1,2,3,4,5 | CO5 | 4 hrs
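
The column exclusion and row filter asked for in experiment 8 are performed with KNIME nodes in the lab. As a cross-check outside KNIME, the same logic can be sketched in plain Python. This is only an illustration under stated assumptions: the four sample rows are made up, and the column names 'normalized', 'bore' and 'price' are taken from the experiment text rather than from the actual file, so they may need adjusting.

```python
import csv
import io

# Hypothetical miniature of the autos data; the real file and its exact
# column names come from the course material and may differ.
RAW = """make,normalized,bore,price
audi,164,3.19,13950
honda,101,2.91,6479
mazda,104,3.03,6095
bmw,192,3.50,16430
"""

DROP_COLUMNS = {"normalized", "bore"}  # columns to exclude
PRICE_LIMIT = 10_000                   # keep cars cheaper than this


def filter_autos(text, drop=DROP_COLUMNS, limit=PRICE_LIMIT):
    """Drop the unwanted columns and keep rows with price below the limit."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if float(row["price"]) < limit:
            rows.append({k: v for k, v in row.items() if k not in drop})
    return rows


def to_csv(rows):
    """Serialise the filtered rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


cheap = filter_autos(RAW)
print(to_csv(cheap))
print("rows kept:", len(cheap))
```

Comparing the row count printed here with the count reported by the KNIME workflow is a quick way to confirm that the filter was configured as intended.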
5. LIST OF FLIP ASSIGNMENTS

1. Select the dataset Superstore and create a Tableau viz to draw a tree map, pie chart,
box plot, gantt chart using Tableau.
2. Your company has tasked you with analyzing your company’s shipping status and
how many shipments are completed per week. Your task is to create a new
calculation and title it “Ship Status.” This new calculation should show whether the
shipment was “Shipped on Time,” “Shipped Early,” or “Shipped Late.” You will also
create a filled line chart showing the total number of shipments by order date.
Furthermore, you will need to colorize your chart by shipping status, and add Order
date, Ship Mode, and Ship Status to the filter. Filter by the Q4 2014 data only.
(Be sure to download the dataset below, as it is different from previous ones used in
this course.)
3. Sales Superstore Dataset.xlsx. The variables you will need to use to complete the
task are:
• Order Date – Year
• Order Date – Week
• Number of Records
• Ship Status (Calculated Field)
• Ship Mode
4. Combine the data and follow the best practices to present your story. Create
calculated fields for KPIs to build a figure that will be used to measure progress in
the data. Assemble a dashboard. Analyze concepts and techniques for compelling
storytelling with data.
5. Create a KNIME workflow where you read the provided autos data set. Exclude the
columns 'normalized' and 'bore'. Filter the rows on the price column and keep only
instances describing cars that cost less than $30,000. Write the file out to CSV and
report back the number of rows.
6. Capstone Projects in KNIME Analytics, Tableau and SQL.
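
The “Ship Status” calculation in flip assignment 2 is built as a Tableau calculated field. The logic behind it can be sketched in Python as follows. The assignment does not state how “on time” is judged, so this sketch assumes a scheduled number of shipping days per ship mode; the SCHEDULED_DAYS values and the sample orders are hypothetical and should be replaced by whatever the provided dataset defines.

```python
from collections import Counter
from datetime import date

# Assumed scheduled shipping time, in days, for each ship mode (hypothetical).
SCHEDULED_DAYS = {
    "Same Day": 0,
    "First Class": 1,
    "Second Class": 3,
    "Standard Class": 6,
}


def ship_status(order_date, ship_date, ship_mode):
    """Compare actual shipping days with the scheduled days for the mode."""
    actual = (ship_date - order_date).days
    scheduled = SCHEDULED_DAYS[ship_mode]
    if actual > scheduled:
        return "Shipped Late"
    if actual < scheduled:
        return "Shipped Early"
    return "Shipped on Time"


# Hypothetical orders: (order date, ship date, ship mode).
orders = [
    (date(2014, 10, 6), date(2014, 10, 9), "Second Class"),
    (date(2014, 10, 7), date(2014, 10, 10), "First Class"),
    (date(2014, 11, 3), date(2014, 11, 7), "Standard Class"),
    (date(2014, 9, 30), date(2014, 10, 2), "First Class"),  # outside Q4
]

# Keep Q4 2014 only, then count shipments per (ISO week, status).
counts = Counter()
for order_date, ship_date, mode in orders:
    if order_date.year == 2014 and order_date.month >= 10:
        week = order_date.isocalendar()[1]
        counts[(week, ship_status(order_date, ship_date, mode))] += 1

for (week, status), n in sorted(counts.items()):
    print(week, status, n)
```

The counts per week and status correspond to what the filled line chart plots once Ship Status is placed on Color.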
6. LIST OF PROJECTS

1. Identify key metrics in the data and create KPIs to track those metrics. Use those
KPIs to create dashboards that allow for comparative views and “brushing and
linking” of the data. Develop a proper context for an explanatory analysis that will
form the basis for the design decisions, and explore the data through KPIs and
worksheets that demonstrate the visual and cognitive design principles learned
throughout the course. In particular, make use of advanced features such as
hierarchies, actions, charts, calculated fields, filters, parameters and LODs.

2. Project on Data Analytics using KNIME: Partition the data into 80/20% training
and test data sets. Utilize the K-means clustering algorithm to train the model to
produce 3 clusters. Use the color, shape and scatter plot nodes in KNIME Analytics to
visualize results.

3. Choose the parameters on which to train a Logistic Regression model, and predict on
the basis of those parameters:
a. Use the Normalizer (PMML) node to z-score normalize all numerical
columns.
b. Partition the dataset into a training set (80%) and a test set (20%). Apply
stratified sampling on the color column.
c. Train a logistic regression model on the training set, and apply the model
to the test set.
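
Steps a and b of project 3 are configured through KNIME nodes. The sketch below shows, in plain Python, what z-score normalization and a stratified 80/20 split compute. It is an illustration only: the toy rows and the "alcohol" column are hypothetical, the population standard deviation is used here and may differ from the convention of the KNIME node, and model training (step c) is not covered.

```python
import random
import statistics
from collections import defaultdict


def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def stratified_split(rows, key, train_fraction=0.8, seed=42):
    """Split rows so that each class in `key` keeps the same share in both sets."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    rng = random.Random(seed)
    train, test = [], []
    for members in groups.values():
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical toy data: 10 "red" and 10 "white" rows with one numeric column.
rows = [
    {"color": c, "alcohol": 9.0 + i * 0.1}
    for c in ("red", "white")
    for i in range(10)
]

# Step a: z-score normalize the numeric column.
scaled = z_score([r["alcohol"] for r in rows])
for row, z in zip(rows, scaled):
    row["alcohol_z"] = z

# Step b: 80/20 split, stratified on the color column.
train, test = stratified_split(rows, key="color")
print(len(train), len(test))
```

Stratifying on the color column keeps the red/white proportion the same in the training and test sets, which matters when one class is much rarer than the other.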

7. RUBRICS

Marks Distribution

Continuous Evaluation (15 Marks)
Each experiment shall be evaluated for 10 marks and at the end of the semester
proportional marks shall be awarded out of total 15.
Following is the breakup of 10 marks for each experiment:
4 Marks: Observation & conduct of experiment. Teacher may ask questions about
the experiment.
3 Marks: For report writing
3 Marks: For the 15-minute quiz to be conducted in every lab.

Project Evaluations (80 Marks)
Both the projects shall be evaluated for 30 marks each, and at the end of the
semester a viva will be conducted related to the projects as well as concepts
learned in labs; this component carries 20 marks.
Annexure 1

Business Intelligence and Data Visualization


(CSL 232)

Lab Practical Report

Faculty name:

Student name: Ansh rohatgi

Roll No.: 20csu169

Semester: Fifth

Group: DS-B4

Department of Computer Science and Engineering


The NorthCap University, Gurugram- 122001, India
Session 2022-23
INDEX
S.No | Experiment | Page No. | Date of Experiment | Date of Submission | Marks | CO Covered | Sign
1 | Select the dataset and create a Tableau viz to calculate central tendency, measure of dispersion and create a Tree map and draw a histogram using Tableau. | | 24-08-22 | 24-08-22 | | |
2 | Select the dataset Superstore and create a Tableau viz to draw a tree map, pie chart, box plot, gantt chart using Tableau | | 01-09-22 | 01-09-22 | | |
3 | Create location hierarchies; build and present a basic map view using Tableau | | 01-10-22 | 01-10-22 | | |
4 | Take two dataset and blend and join them in Tableau. | | 08-10-22 | 08-10-22 | | |
5 | Take superstore dataset and create context and simple filters, perform table calculation. | | 08-10-22 | 08-10-22 | | |
6 | Take the superstore dataset and create a dashboard. Present the same with storytelling in Tableau. | | 08-10-22 | 08-10-22 | | |
7 | 9) Train a Decision Tree on 75% of the data and use the remaining 25% for testing in Knime Analytics Tool. 10) Partition the data into 80/20% training and test data sets. Utilize the K-means clustering algorithm to train the model to produce 3 clusters. Use color, shape and scatter plot nodes in Knime Analytics to visualize results. | | | | | |

EXPERIMENT NO. 1

Student Name and Roll Number: Ansh rohatgi 20csu169

Semester /Section: fifth DS-B4

Link to Code:

Date:

Faculty Signature:

Marks:

Objective(s):
• Demonstrate the ability to use technical skills in descriptive analytics to support business
decision-making.
• Demonstrate the ability to calculate central tendency, measure of dispersion and draw a
histogram using Tableau

Outcome:

Students will be able to demonstrate different descriptive analytics concepts like mean, median,
mode, standard deviation.
Problem Statement:
Select the dataset and create a Tableau viz to calculate central tendency, measure of dispersion
and create a Tree map and draw a histogram using Tableau.

Background Study:

Step 1: Connect to the sample SAT score dataset from the LMS and examine the distribution of SAT scores.

Step 2: Right-click a measure and open the Create menu, from which you can create bins, parameters,
calculated fields, and groups. Choose Bins; Tableau suggests a bin size.

Step 3: The scores are bucketed in increments of 50 or 100 points, and each bar represents the
number of students scoring in that bin. The histogram shows that a score of 1600 is rare and that
most scores fall in the middle of the range.

Step 4: From the Worksheet menu, navigate to Export > Data, choose the destination file, and the
data will be saved as an Access file.
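
To check the values Tableau reports, the same descriptive statistics and histogram bins can be computed by hand. The following is a minimal sketch using Python's standard library; the ten scores are made up for illustration and are not the LMS dataset, and the bin size of 100 is one of the two sizes mentioned in Step 3.

```python
import statistics
from collections import Counter

# Hypothetical SAT scores; the lab uses the dataset provided on the LMS.
scores = [880, 940, 1010, 1050, 1050, 1100, 1120, 1180, 1250, 1420]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion.
std_dev = statistics.stdev(scores)       # sample standard deviation
value_range = max(scores) - min(scores)

BIN_SIZE = 100
# Each score falls in the bin starting at the nearest lower multiple of BIN_SIZE.
bins = Counter((s // BIN_SIZE) * BIN_SIZE for s in scores)

print("mean:", mean, "median:", median, "mode:", mode)
print("std dev:", round(std_dev, 1), "range:", value_range)
for start in sorted(bins):
    print(f"{start}-{start + BIN_SIZE - 1}: {'#' * bins[start]}")
```

The text histogram printed at the end has the same shape as the bar chart Tableau draws when the binned dimension is placed on Columns and the record count on Rows.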

Question Bank:

Q1. ________ is the term used to describe creating visual representations of data.

a. Business Intelligence (BI)


b. Big Data
c. The Big Idea
d. Data Visualization

Q2. Data visualization is a term used to describe the practice of depicting text-based data in
visual context.

a. True
b. False

Q3. ________ refers to the phenomenon when a person is able to recall visual content
better than content presented only as text.

a. The Picture Superiority Effect


b. Business Analytics
c. Data Visualization
d. The Big Idea

Q4. ________ is the term used to describe a set of tools and techniques (such as queries
and reports) used to convert data from various databases into meaningful information.
a. The Big Idea
b. Business Intelligence (BI)
c. Big Data
d. Data Visualization

Q5. Like Tableau, Python and R can process both unstructured data and big data.

a. False
b. True

Q6. What involves asking questions and answering those questions using statistical and
quantitative tools for explanatory and predictive analysis?

a. Information Management
b. Multi-threading
c. Business Intelligence
d. Aggregation
e. Business Analytics

Q7. The Picture Superiority Effect involves asking questions and answering those
questions using statistical and quantitative tools for explanatory and predictive
analysis.

a. True
b. False
Q8. What are the three components of big data? (Check all that apply)

a. Increasing at a very high velocity


b. Pulled from a single source
c. Increasing in volume
d. With a lot of variety

Q9. One of Tableau Public’s limitations is you cannot save visualization locally.

a. False
b. True

Q10. Which visualization tool's strength is visualization and storytelling?


a. R
b. Python
c. Tableau
d. Microsoft Excel

Student Work Area


Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 2

Student Name and Roll Number: Ansh rohatgi and 20csu169

Semester /Section: Fifth DS-B4

Link to Code:

Date:

Faculty Signature:

Marks:

Objective
• Demonstrate the data visualization using different charts.
• Demonstrate the ability to use technical skills in descriptive analytics to support
business decision-making.

Outcome: Students will be able to demonstrate different type of charts to visualize the data
analysis.

Problem Statement: Select the dataset Superstore and create a Tableau viz to draw a tree map,
pie chart, box plot, gantt chart using Tableau.

Background Study:
Pie Chart
Step 1: Connect to the sample Superstore data source.
Step 2: Drag and drop the required dimension onto Label on the Marks card.
Step 3: Open the formatting options.
Step 4: Check that the pie chart shows total sales.
Step 5: Analyse the pie chart.

Tree Map
Step 1: Drag and drop the measure Profit onto the Marks card twice: once to Size and
once to Color.
Step 2: Drag and drop the dimension Ship Mode to Label. Choose the tree map chart type
from Show Me.
Step 3: Add the dimension Region to the tree map. Drag and drop it twice: once to
Color and once to Label. The resulting chart shows four outer boxes for the
four regions, with the boxes for the ship modes nested inside them. Each region
now has a different color.
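
The size of each rectangle in the tree map is driven by an aggregated measure. The sketch below shows that aggregation in Python: it sums profit per region and ship mode and converts each total into a share of the whole. The order lines are hypothetical, and the sketch assumes all totals are positive.

```python
from collections import defaultdict

# Hypothetical order lines: (region, ship mode, profit).
orders = [
    ("East", "Standard Class", 120.0),
    ("East", "First Class", 40.0),
    ("West", "Standard Class", 200.0),
    ("West", "Second Class", 60.0),
    ("West", "Standard Class", 80.0),
]

# SUM(Profit) per (Region, Ship Mode): one rectangle per key.
totals = defaultdict(float)
for region, ship_mode, profit in orders:
    totals[(region, ship_mode)] += profit

grand_total = sum(totals.values())
# Each rectangle's area is proportional to its share of the grand total.
shares = {key: value / grand_total for key, value in totals.items()}

for (region, ship_mode), share in sorted(shares.items()):
    print(f"{region:5} {ship_mode:15} {share:.0%}")
```

The same shares are what a pie chart would show as slice angles, which is why the two chart types are often compared on the same data.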

Box Plot:
Step 1: Drag and drop the dimension Category to the Columns shelf and the measure Profit to the
Rows shelf. Also drag the dimension Ship Mode to the right of Category on the Columns shelf.
Step 2: Choose the box-and-whisker plot from Show Me. The resulting chart shows
the box plots; Tableau automatically reassigns Ship Mode to the Marks card.
Step 3: You can create box plots with two dimensions by adding another dimension to the Columns
shelf. In the chart above, add the Region dimension to the Columns shelf. This produces a chart
that shows the box plots for each region.
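
A box plot summarizes a distribution by its quartiles, with whiskers reaching to the most extreme points within 1.5 times the interquartile range. The following is a minimal sketch of that summary in Python. The profit values are made up, and the quartile method chosen here ("inclusive") is an assumption; Tableau's own quartile calculation may give slightly different numbers.

```python
import statistics

# Hypothetical profit values for one category.
profits = [-40, 5, 12, 18, 25, 31, 44, 60, 75, 300]

# Quartiles: the edges and the middle line of the box.
q1, median, q3 = statistics.quantiles(profits, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers extend to the most extreme data points inside the fences.
lower_whisker = min(p for p in profits if p >= lower_fence)
upper_whisker = max(p for p in profits if p <= upper_fence)

# Points beyond the fences are drawn individually as outliers.
outliers = [p for p in profits if p < lower_fence or p > upper_fence]

print("box:", q1, median, q3)
print("whiskers:", lower_whisker, upper_whisker)
print("outliers:", outliers)
```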

Gantt Chart
Step 1: Drag the dimension Order Date to the Columns shelf and Sub-Category to the Rows shelf.
Next, add Order Date to the Filters shelf. Right-click Order Date to convert it to exact date
values.
Step 2: Edit the filter condition to select a range of dates. This is needed because you want
individual date values and the data contains a very large number of dates.
Step 3: Drag the dimension Ship Mode to Color and the measure Quantity to Size
on the Marks card.

Question Bank:
Q1: What is the difference between a bar chart and a line chart?
a. A Line Chart is only used for discrete data
b. A Bar Chart is only used for continuous data
c. A Bar Chart is best used for discrete data while a Line Chart is best used for continuous
data.
Q2: Which of the scenarios below is best suited for a bar chart?
a. A chart showing a student's favourite after school activity
b. A chart showing trends in the stock market over time
c. A chart showing trends in weather patterns over time.

Q3: Which of the following correctly describes how you would overlay one chart on top of
another using Tableau?
a. You cannot overlay one chart on top of another in Tableau
b. Create both of your charts, then right click on the second measure pill on your Rows Shelf
and select "Dual Axis"
c. Create both of your charts, then right click on the first measure pill on your columns Shelf
and select "Dual Axis"

Q4: How do you change the sizing options on a bar chart?
a. You cannot change sizing options on a bar chart.
b. Drag a measure to the Sizing option on the Marks Card, click on the Sizing option and then
change the size of the marks.
c. Drag a measure to the Colors option on the Marks Card, click on the Colors option and
then change the size of the marks.

Q5: How do you edit the colors on a Tableau chart?


a. You cannot change the colors of the bars on a bar chart.
b. Drag a measure to the Sizing option on the Marks Card, click on the Sizing option and then
change the colors of the marks.
c. Drag a measure to the Colors option on the Marks Card, click on the Colors option and
then change the colors of the marks.

Q6. What is the purpose of the Tableau Tooltip? (Select all that apply.)
a. To provide the end user with additional charts
b. To provide the end user with additional information not shown in the worksheet or chart
c. To provide the end user with additional information on what information is shown in the
worksheet

Q7. How do you edit or customize the Tooltip in Tableau? (Select all that apply.)
a. Right click on the tooltip marks card and click on the Edit Tooltip.
b. Drag any measures you want in the tooltip to the tooltip on the marks card.
c. Click on the Tooltip on the Marks card and make your edits directly in the Tooltip dialog
box.

Q8. When you have continuous data what would be the recommended kind of chart to
use?
a. Line Chart
b. Pie Chart
c. Gantt Chart

Q9. How do you align the axis on a dual axis chart?


a. Right click on the left side axis and click on Dual Axis.
b. Right click the secondary axis and select Synchronize Axis.
c. Right click on the right-side axis and click on the Dual Axis.
d. There is no way to align axes on a dual axis chart because each axis is using a different
measure.

Student Work Area


Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 3

Student Name and Roll Number: Ansh rohatgi and 20csu169

Semester /Section: Fifth and DS-B4

Link to Code:

Date:

Faculty Signature:

Grade:

Objective
• Create a new hierarchy in the dataset
• Build a map view for geographical data variables
• Develop an introductory level of competency on the use of Tableau software for data
visualization.

Outcome: Students will be able to connect to and join geographic data; format that data in
Tableau; create location hierarchies; build and present a basic map view; and apply key mapping
features along the way.

Problem Statement: Select the dataset Superstore and create a Tableau viz to join geographic
data; format that data in Tableau; create location hierarchies; build and present a basic map view.

Background Study:

Step 1: Connect to the geographic data of the Superstore dataset.
Spatial files, such as a shapefile or GeoJSON file, contain actual geometries, whereas text files or
spreadsheets contain point locations as latitude and longitude coordinates, or named locations that
Tableau matches against its built-in geocoding when they are brought in.
Step 2: On the Data Source page, click the globe icon.
Step 3: Select Geographic Role, and then select the role that best matches your data.
Step 4: In the Data pane, right-click the geographic field Country, and then select Hierarchy > Create
Hierarchy.
Step 5: In the Create Hierarchy dialog box that opens, give the hierarchy a name, such as Mapping
Items, and then click OK.
Step 6: In the Data pane, drag the State field to the hierarchy and place it below the Country field.
Step 7: Repeat step 6 for the City and Postal Code fields.
Step 8: In the Data pane, double-click Country.
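
A location hierarchy lets the same measure be aggregated at successively finer levels as the user drills down from Country to Postal Code. The sketch below mimics that drill-down in Python. The three rows and the Sales values are hypothetical, and the function name drill is only illustrative.

```python
from collections import Counter

# Levels of the hierarchy, from coarsest to finest.
HIERARCHY = ("Country", "State", "City", "Postal Code")

# Hypothetical rows in the shape of the Superstore data.
rows = [
    {"Country": "United States", "State": "California", "City": "Los Angeles",
     "Postal Code": "90036", "Sales": 100.0},
    {"Country": "United States", "State": "California", "City": "San Francisco",
     "Postal Code": "94109", "Sales": 50.0},
    {"Country": "United States", "State": "Texas", "City": "Houston",
     "Postal Code": "77095", "Sales": 70.0},
]


def drill(rows, depth):
    """Sum Sales at the given hierarchy depth (1 = Country, 2 = State, ...)."""
    totals = Counter()
    for row in rows:
        key = tuple(row[level] for level in HIERARCHY[:depth])
        totals[key] += row["Sales"]
    return dict(totals)


for depth in range(1, len(HIERARCHY) + 1):
    print(HIERARCHY[depth - 1], drill(rows, depth))
```

Each deeper level splits the totals of the level above it, which is what happens on the map when the hierarchy is expanded.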

Question Bank

Q1. Which of the following are geography fields that Tableau can utilize? (Select all that
apply.)
a. Country Names
b. Telephone Numbers
c. State Names
d. Zip Codes

Q2. How do you change the size of elements on your map in Tableau?
a. Click on the text option on the marks card and increase or decrease the size of your
map elements
b. Click on the size option on the marks card and increase or decrease the size of your
map elements
c. Click on the detail option on the marks card and increase or decrease the size of your
map elements

Q3. How do you change the shape of your map elements in Tableau?
a. Click on the Shape detail option on the marks card and pick a new map element
b. Click on the Detail option on the marks card and pick a new map element
c. Click on the Colors option on the marks card and pick a new map element

Q4. How do you change the map layout to better fit your data?
a. Click on Map on the options bar and select Background Images
b. Click on Map on the options bar and select Map Layers
c. Click on Map on the options bar and select Geocoding

Q 5. How do you combine multiple maps onto one map in Tableau?


a. Drag either map pill on top of the other map pill
b. Right click on the Map Chart and click on Dual Axis
c. Right click on the second Map Pill and click on Dual Axis
Q 6. Which of the following are ways you can edit unknown geographic locations in
Tableau?
a. Click on the notice on the bottom right-hand corner of your Map and select Filter Data
b. Click on the notice on the bottom right-hand corner of your Map and select Edit
Locations
c. Click on the notice on the bottom right-hand corner of your Map and select Show Data
at Default Position

Q7. Which of the following are reasons you might use a Tooltip feature in Tableau for
your map? (Select all that apply.)
a. To add additional information that is not shown on the Map
b. To give the user tips on how to use the map
c. To summarize the same information that is shown in the Map

Q8. How do you change the colors of your map elements in Tableau?
a. Click on the Color option on the Marks card and change the color
b. Click on the Detail option on the Marks card and change the color
c. Click on the Size option on the Marks card and change the color

Q9. In what ways might you use a map in the context of data visualization?
a. To chart discrete data
b. To chart geographical data
c. To chart continuous data

Student Work Area


Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 4

Student Name and Roll Number: Ansh Rohatgi and 20csu169

Semester /Section: Fifth DS-B4

Link to Code:

Date:

Faculty Signature:

Grade:

Objective
• Blend homogeneous as well as heterogeneous datasets.
• Employ BI tools to load and visualize data to generate useful and informative reports from data.
Outcome: Students will be able to work with filters, sorting, table calculations, and calculated fields for feature engineering and feature extraction.

Problem Statement: Take two datasets, then blend and join them in Tableau.
Solution

Step 1: Ensure that the workbook has multiple data sources. The second data source should be added by going to Data > New data source.

Step 2: Drag a field to the view. Whichever data source this first field comes from will become the primary data source.

Step 3: Switch to another data source and make sure there is a blend relationship to the primary data source.

Step 4: Drag a field into the view from the secondary data source.
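
The difference between blending and joining can be sketched outside Tableau. The following is a minimal Python sketch, not Tableau's implementation; the rows, the linking field `region`, and the function names are hypothetical. It assumes a blend aggregates each source to the level of the linking field and then attaches the secondary result to the primary one like a left join, while a join matches individual rows.

```python
from collections import defaultdict

# Hypothetical primary source: one row per order.
orders = [
    {"region": "East", "sales": 100},
    {"region": "East", "sales": 50},
    {"region": "West", "sales": 70},
    {"region": "North", "sales": 30},
]

# Hypothetical secondary source: several target rows per region.
targets = [
    {"region": "East", "target": 80},
    {"region": "East", "target": 40},
    {"region": "West", "target": 90},
]


def blend(primary, secondary, key, p_measure, s_measure):
    """Aggregate both sources per key, then left-join secondary onto primary."""
    p_totals = defaultdict(int)
    for row in primary:
        p_totals[row[key]] += row[p_measure]
    s_totals = defaultdict(int)
    for row in secondary:
        s_totals[row[key]] += row[s_measure]
    # Keys missing from the secondary source get None (a Null in the view).
    return {k: (p_totals[k], s_totals.get(k)) for k in p_totals}


def inner_join(left, right, key):
    """Row-level inner join: every matching pair of rows yields one output row."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


blended = blend(orders, targets, "region", "sales", "target")
joined = inner_join(orders, targets, "region")
print(blended)
print(len(joined))
```

In this sketch the blend keeps one result per region from the primary source, including North, which has no target. The row-level join drops North and repeats the East rows once per matching target row, which is why summing a measure after a join can overcount.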

Question Bank
Q1. What is a table calculation?
a. Table calculations are not allowed to be performed in Tableau
b. Table calculations allow you to filter your data for a specific view
c. Table calculations address data in the cache table and allow you to perform calculations
on visible results

Q2. Which of the following are ways table calculations are used?
a. Table calculations are used to give fields new names
b. Table calculations are used to create new calculations
c. Table calculations are used to view summary data

Q3. What are Tableau parameters?


a. Parameters are dynamic values that can replace constant values in calculations, filters,
and reference lines
b. Parameters are filters
c. Parameters are constant values that can replace dynamic values in calculations, filters,
and reference lines

Q4. Which of the following best describes or defines a calculated field?


a. A calculated field is every field in your data set
b. A calculated field is a new field that is not in the current data set
c. A calculated field is a field that is present in your original data set

Q5. How do you create a calculated field in Tableau?


a. Right click anywhere in the data pane and select Create Parameter
b. Right click anywhere on your chart and select Create Calculated Field
c. Right click anywhere in the data pane and select Create Calculated Field

Q6. Which of the following is a quick table calculation? (Select all that apply.)
a. Running Total
b. Percent of Total
c. Month over Month Difference
d. Percent Difference
e. Difference
f. Rank

Q7. Why is a worksheet filter used?


a. To create a new calculation that is not in the current data set
b. To narrow down your data table or chart to specific fields and/or data
c. To provide additional data that may not be present in your chart

Q8. How do you apply a worksheet filter in Tableau?


a. Drag a dimension or measure to the filter data pane
b. Drag a dimension or measure to the color option on the Marks card
c. Drag a dimension or measure to the size option on the Marks card

Q9. How do you show the filter options on a worksheet?


a. Right click anywhere in the data pane and select show filter
b. Right click in your worksheet and click show filter
c. Right click on your filter pill and select show filter

Student Work Area


Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 5

Student Name and Roll Number: Ansh Rohatgi and 20csu169

Semester /Section: Fifth and DS-B4

Link to Code:

Date:

Faculty Signature:

Grade:

Objective
• Create simple filters in Tableau.
• Create context filters in Tableau.
• Employ BI tools to load and visualize data to perform table calculations.

Outcome: Students will be able to create simple filters, context filters, and table calculations.

Problem Statement: Take the Superstore dataset, create context and simple filters, and perform table calculations.

Background Study

Step 1: Select the measure on which the table calculation has to be applied and drag it to the Columns shelf.

Step 2: Right-click the measure and choose the option Quick Table Calculation.

Step 3: Choose one of the following options to be applied on the measure.

• Running Total
• Difference
• Percent Difference
• Percent of Total
• Rank
• Percentile
• Moving Average
• Year to Date (YTD) Total
• Compound Growth Rate
• Year over Year Growth
• Year to Date (YTD) Growth
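
To make the arithmetic behind some of these options concrete, here is a minimal Python sketch. It is an illustration only: the `sales` values are hypothetical, the calculations run along a single row of marks in view order, and the moving average assumes a window of the current value plus the two previous ones.

```python
from itertools import accumulate

# Hypothetical monthly sales values, in view order.
sales = [100, 150, 120, 180]

# Running Total: cumulative sum along the row.
running_total = list(accumulate(sales))

# Difference: change from the previous mark; the first mark has no predecessor.
difference = [None] + [b - a for a, b in zip(sales, sales[1:])]

# Percent Difference: the same change, relative to the previous mark.
percent_difference = [None] + [(b - a) / a * 100 for a, b in zip(sales, sales[1:])]

# Percent of Total: each mark's share of the overall sum.
total = sum(sales)
percent_of_total = [v / total * 100 for v in sales]

# Rank: 1 = largest value (competition ranking, ties share a rank).
rank = [1 + sum(other > v for other in sales) for v in sales]


def moving_average(values, window=3):
    """Average of the current value and up to window-1 previous values."""
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


print(running_total, difference, rank, moving_average(sales))
```

Each result depends only on the values already visible in the view, which is what distinguishes a table calculation from a calculation on the underlying rows.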

Question Bank

Q1. What is a table calculation?
a. Table calculations are not allowed to be performed in Tableau
b. Table calculations allow you to filter your data for a specific view
c. Table calculations address data in the cache table and allow you to perform calculations on visible results

Q2. Which of the following are ways table calculations are used?
a. Table calculations are used to give fields new names
b. Table calculations are used to create new calculations
c. Table calculations are used to view summary data

Q3. Which of the following are among the 10 table calculations we covered? (Select all that
apply.)
a. Moving Average
b. Percent Difference
c. Date Difference
d. Standardized Score
e. Quarter over Quarter Growth
f. Difference
g. YTD Difference
h. Running Total
i. Percent of Total
j. Percentile

Q4. Which of the following best describes or defines a calculated field?
a. A calculated field is every field in your data set
b. A calculated field is a new field that is not in the current data set
c. A calculated field is a field that is present in your original data set

Q5. How do you create a calculated field in Tableau?
a. Right click anywhere in the data pane and select Create Parameter
b. Right click anywhere on your chart and select Create Calculated Field
c. Right click anywhere in the data pane and select Create Calculated Field

Q6. Which of the following is a quick table calculation? (Select all that apply.)
a. Running Total
b. Percent of Total
c. Month over Month Difference
d. Percent Difference
e. Difference
f. Rank
g. MTD Total

Q7. Why is a worksheet filter used?
a. To create a new calculation that is not in the current data set
b. To narrow down your data table or chart to specific fields and/or data
c. To provide additional data that may not be present in your chart

Q8. How do you apply a worksheet filter in Tableau?
a. Drag a dimension or measure to the filter data pane
b. Drag a dimension or measure to the color option on the Marks card
c. Drag a dimension or measure to the size option on the Marks card

Q9. How do you show the filter options on a worksheet?
a. Right click anywhere in the data pane and select show filter
b. Right click in your worksheet and click show filter
c. Right click on your filter pill and select show filter

Q10. What are Tableau parameters?
a. Parameters are dynamic values that can replace constant values in calculations, filters, and reference lines
b. Parameters are filters
c. Parameters are constant values that can replace dynamic values in calculations, filters, and reference lines

Student Work Area


Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 6

Student Name and Roll Number: Ansh Rohatgi and 20csu169


Semester /Section: 5th Sem / DS-A
Link to Code:
Date:
Faculty Signature:
Grade:

Objective: Employ BI tools to load and visualize data to generate useful and informative reports from data.
Outcome: Students will be able to generate reports in Tableau using dashboards and storytelling.
Problem Statement: Take the Superstore dataset and create a dashboard. Present the same with storytelling in Tableau.
Background Study

Step 1: Create a blank worksheet by using the add worksheet icon located at the bottom of the workbook. Drag the dimension Segment to the Columns shelf and the dimension Sub-Category to the Rows shelf. Drag and drop the measure Sales to the Color shelf and the measure Profit to the Size shelf. This worksheet is referred to as the Master worksheet. Right-click and rename this worksheet as Sales_Profits.
Step 2: Create another sheet to hold the details of the Sales across the States. For this, drag the dimension State to the Rows shelf and the measure Sales to the Columns shelf. Next, sort the State field so that Sales appears in descending order. Right-click and rename this worksheet as Sales_state.
Step 3: Next, create a blank dashboard by clicking the Create New Dashboard link at the bottom of the workbook. Right-click and rename the dashboard as Profit_Dashboard.
Step 4: Drag the two worksheets to the dashboard. Near the top border line of the Sales_Profits worksheet, you can see three small icons. Click the middle one, which shows the prompt Use as Filter on hovering the mouse over it.
Step 5: Now in the dashboard, click the box representing the Sub-Category named Machines and the Segment named Consumer. The Sales_state worksheet then shows only the sales for that selection.
Step 6: Click the New Story tab.
Step 7: In the lower-left corner of the screen, choose a size for your story. Choose from one of the predefined sizes, or set a custom size, in pixels.
Step 8: By default, your story gets its title from the sheet name. To edit it, right-click the sheet tab, and choose Rename Sheet.
Step 9: To start building your story, double-click a sheet on the left to add it to a story point.
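
The effect of Use as Filter in Steps 4 and 5 can be sketched in Python. This is only an illustration of the idea, not how Tableau implements dashboard actions; the rows and the function name `sales_by_state` are hypothetical. A selection in the master sheet becomes a set of field values that every row must match before the detail sheet aggregates it.

```python
from collections import defaultdict

# Hypothetical rows shaped like the Superstore data.
rows = [
    {"segment": "Consumer", "sub_category": "Machines", "state": "Texas", "sales": 400},
    {"segment": "Consumer", "sub_category": "Machines", "state": "Ohio", "sales": 250},
    {"segment": "Corporate", "sub_category": "Machines", "state": "Texas", "sales": 300},
    {"segment": "Consumer", "sub_category": "Chairs", "state": "Ohio", "sales": 150},
]


def sales_by_state(data, selection=None):
    """Return (state, total sales) pairs sorted by sales, descending.

    `selection` mimics a mark clicked in the master sheet: a dict of
    field -> value that every contributing row must match.
    """
    selection = selection or {}
    totals = defaultdict(int)
    for row in data:
        if all(row[field] == value for field, value in selection.items()):
            totals[row["state"]] += row["sales"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Detail sheet before any click, then after clicking Machines / Consumer.
unfiltered = sales_by_state(rows)
filtered = sales_by_state(rows, {"segment": "Consumer", "sub_category": "Machines"})
print(unfiltered)
print(filtered)
```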
Question Bank

Q1. When building KPIs in Tableau, the following is the most fundamental skill:
a. Understanding the difference between a worksheet and a story.
b. An artistic flair to do beautiful KPIs.
c. A deep understanding of complex statistical techniques.
d. Being comfortable with using calculations.

Q2. Which of the following is a poor way to design a KPI?
a. Using the data that are available and not worrying about whether it is essential because just getting data out there is important.
b. Designing KPIs through evaluation of an organization's strategic plans.
c. Getting feedback from stakeholders on early drafts of the KPIs.
d. Through discussion with decision makers.

Q3. Which of the following Tableau functions is used to set thresholds in your visualization?
a. Story points
b. Totals
c. Parameters
d. Actions

Q4. Indicate the correct calculated field code for when you want to set a threshold in Tableau to indicate if profit is above or below a benchmark:
a. SUM([Profit field])/SUM([Sales field])
b. if [Profit Field] > 125000 then "Above benchmark" else "Below benchmark" end
c. if sum([Profit Field]) > 125000 then "Above benchmark" else "Below benchmark"
d. if sum([Profit Field]) > 125000 then "Above benchmark" else "Below benchmark" end

Q5. The way you set colors based on your KPI is by?
a. Creating a calculated field with an if...else...end statement.
b. Creating a trend line in the analytics tab.
c. Using the drop downs in Tableau and selecting “KPI category colors”
d. Using Tableau's highlighting feature.

Q6. Indicate which of the following would NOT be an appropriate KPI from the Sales Superstore dataset.
a. A map of the United States illustrating weak and strong profits by regions.
b. A table of values that show a sales forecast based on last year’s and this year’s sales.
c. A bar chart that shows how quickly products were sent.
d. A table of names and addresses used by staff to mail products.

Q7. A KPI can be used to evaluate ____. (Select all that apply.)
a. Performance based on a department within a corporation but not the corporation itself.
b. Large nonprofit organizations.
c. KPIs are important but they miss out on some qualitative information.
d. Fully qualitative information that is not expressed as data.

Q8. A Net Promoter Score is ____?
a. The percentage of defects in your manufacturing process.
b. A way to quickly see the profit ratio.
c. The amount it costs to acquire a new promoter.
d. A way to gauge loyalty to your products or company.

Q9. To get the “shapes” marks card to show up in Tableau, what do you need to do?
a. Use the drop-down, click on worksheets, select actions, then add a shapes "action".
b. Click on the down arrow under "Marks" and select "Shape".
c. Nothing. It should be there already.
d. There is no "shapes" marks card.

Q10. KPIs cannot be ____.
a. Used to measure customer loyalty.
b. For an individual to check one’s own progress.
c. The sole way to measure success.
d. Based on a set of measurable criteria.

Q11. What’s a poor way to choose KPIs?
a. Based on understandable, meaningful, and measurable criteria.
b. Chosen through examination of a firm's strategic plan.
c. Based on the SMART goal criteria.
d. Based on a vague notion of what a KPI is.
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs

EXPERIMENT NO. 7

Student Name and Roll Number: Ansh Rohatgi and 20csu169


Semester /Section: 5th Sem / DS-A
Link to Code:
Date:
Faculty Signature:
Grade:

Objective: Employ BI tools to load and visualize data to generate useful and informative reports from data.
Outcome: Students will be able to create advanced Level of Detail (LOD) views for better analysis.
Problem Statement: Take the Sample Superstore dataset and introduce a Level of Detail (LOD) expression in Tableau, aggregated or replicated in the view.
Background Study
Step 1: Set up the Visualization
• Open Tableau Desktop and connect to the Sample-Superstore saved data source.
• Navigate to a new worksheet.
• From the Data pane, under Dimensions, drag Region to the Columns Shelf.
• From the Data pane, under Measures, drag Sales to the Rows Shelf.
• A bar chart showing the sum of sales for each region appears.
Step 2: Create the LOD expression
• Instead of the sum of all sales per region, perhaps you want to also see the average sales
per customer for each region. You can use an LOD expression to do this.
• Select Analysis > Create Calculated Field.
• In the Calculation editor that opens, do the following:
• Name the calculation, Sales Per Customer.
• Enter the following LOD expression:

{ INCLUDE [Customer Name] : SUM([Sales]) }


• When finished, click OK.
• The newly created LOD expression is added to the Data pane, under Measures.
Step 3: Use the LOD expression in the visualization
• From the Data pane, under Measures, drag Sales Per Customer to the Rows shelf and
place it to the left of SUM(Sales).
• On the Rows shelf, right-click Sales Per Customer and select Measure (Sum) > Average.
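
The two-stage aggregation behind this INCLUDE expression can be sketched in Python. This is a rough illustration under assumed data, not Tableau's engine; the rows and variable names are hypothetical. Sales are first summed at the finer (region, customer) level, and those customer totals are then averaged at the level of the view (region).

```python
from collections import defaultdict

# Hypothetical order rows; field names mirror the Superstore columns.
orders = [
    {"region": "East", "customer": "Ann", "sales": 100},
    {"region": "East", "customer": "Ann", "sales": 50},
    {"region": "East", "customer": "Bob", "sales": 250},
    {"region": "West", "customer": "Cy", "sales": 80},
    {"region": "West", "customer": "Dee", "sales": 120},
]

# Inner step, { INCLUDE [Customer Name] : SUM([Sales]) }:
# sum sales at the finer (region, customer) level.
per_customer = defaultdict(int)
for row in orders:
    per_customer[(row["region"], row["customer"])] += row["sales"]

# Outer step, the Average aggregation in the view:
# average those customer totals within each region.
by_region = defaultdict(list)
for (region, _customer), total in per_customer.items():
    by_region[region].append(total)
avg_sales_per_customer = {r: sum(v) / len(v) for r, v in by_region.items()}

# Plain SUM(Sales) per region, for comparison with the first bar chart.
sum_sales = defaultdict(int)
for row in orders:
    sum_sales[row["region"]] += row["sales"]

print(avg_sales_per_customer)
print(dict(sum_sales))
```

Averaging the raw rows instead would give the average order value, not the average per customer, which is why the intermediate per-customer sum matters.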
Question Bank
Q1. Which of the following are examples of discrete dates? (Select all that apply.)
a. Days in the same month
b. Quarters in different years
c. Months in different quarters

Q2. Which of the following describes how Tableau handles a continuous date?
a. A continuous date will be colored Red when dragged to the Row or Column Shelf
b. A continuous date will be colored Blue when dragged to the Row or Column Shelf
c. A continuous date will be colored Green when dragged to the Row or Column Shelf

Q3. Which of the following describes how Tableau handles a discrete date?
a. A discrete date will be colored Green when dragged to the Row or Column Shelf
b. A discrete date will be colored Red when dragged to the Row or Column Shelf
c. A discrete date will be colored Blue when dragged to the Row or Column Shelf

Q4. Under which circumstances might it be advisable to manually change the date field? (Select all that apply.)
a. When you have a year date field and need to change it to months
b. When you have a discrete date and need to convert it to a continuous date
c. When you want to create a map

Q5. How do you convert a discrete date to a continuous date?
a. Drag your date field to the Rows or Columns shelf and right click on the blue pill and click on Continuous
b. When you drag your date field to the Rows or Columns shelf, Tableau will ask you if you want the date field to be discrete or continuous.
c. Drag your date field to the Rows or Columns shelf and click on the + sign on the blue pill

Q6. What is the date hierarchy used for?
a. To convert dates from discrete to continuous
b. To drill down into your data and show each date level (e.g. Years, Quarters, Months)
c. To change your chart options

Q7. Under which of the following scenarios would you use the date hierarchy?
a. If your date is showing Years and you need to show both Years and Months
b. If your date field is showing Years and you need to show the difference between two dates
c. If your date field is showing Years and you need to change the chart type

Q8. What is the first step to go from a date year to a date month?
a. You cannot change date variables
b. Create a new date month variable from a pre-existing date year variable
c. Right click on the Year date pill and select Month

Q9. Which chart is best for discrete dates?
a. Line chart
b. Pie chart
c. Bar chart

Q10. Which chart is best for continuous dates?
a. Map
b. Line chart
c. Bubble chart

Q11. Which of the following shows the correct steps in the correct sequence to convert a continuous date to a discrete date in Tableau?
a. Right click on your blue date pill and select discrete
b. Right click on your green date pill and select discrete
c. When you drag your date field to the Rows or Columns shelf, Tableau will ask you if you want the date field to be discrete or continuous

Q12. What table calculation function would you use to determine the difference between two dates?
a. DATEPARSE
b. DATEADD
c. DATEDIFF
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs

Business Intelligence and Data Visualization Lab Manual
CSL 232

KNIME Project Report

Faculty name: Dr. Poonam Chaudhary

Student name: Ansh and Rishabh

Roll No.: 20csu169 & 20csu373

Semester: 5th Group: DS B

Department of Computer Science and Engineering
The NorthCap University, Gurugram - 122001, India
Session 2022-23

Table of Contents

1. Project Description
2. Problem Statement
3. Analysis
3.1 Hardware Requirements
3.2 Software Requirements
4. Design
5. Implementation and Testing (stage/module wise)
6. Output (Screenshots)
7. Conclusion and Future Scope


1. Project Description
Diabetes is a common, chronic disease. Predicting diabetes at an early stage
can lead to improved treatment, and data mining techniques are widely used for
early prediction of disease. In this project, diabetes is predicted using
significant attributes, and the relationships among the attributes are also
characterized. Significant attributes were selected via the principal component
analysis method. Our findings indicate a strong association of diabetes with
body mass index (BMI) and with glucose level, which was extracted via the
Apriori method. K-nearest neighbors (KNN), random forest (RF) and K-means
clustering techniques were implemented for the prediction of diabetes. The
Decision Tree model provided the best accuracy of 75.7%, and may be useful to
assist medical professionals with treatment decisions.

About Dataset
The dataset contains 768 rows and 9 columns, including Glucose, Insulin,
Pregnancies, BMI and Outcome. Given these details, we have to predict whether
the patient is diabetic or not.

2. Problem Statement:
Predict whether a person is diabetic or not using supervised learning models
such as KNN and Decision Tree, aiming for the best accuracy.

3. Analysis
3.1. Hardware Requirements
A 64-bit operating system with a minimum of 32 GB RAM and 8 CPU cores.

3.2. Software Requirements
KNIME Analytics Platform

4. Design
The following steps were taken to get the best model accuracy:

• Importing the Excel dataset
• Removing unnecessary columns
• Removing duplicate rows
• Normalizing the dataset
• Splitting data into train and test data
• Using model learner
• Model prediction
• Checking model accuracy

5. Implementation and Testing (stage/module wise)

a) Excel Reader
Reading the excel file using this node.

b) Column Filter
Removing unnecessary columns

c) Normalizer
Normalizing the data using min-max normalization

d) Partitioning
Dividing the dataset into two parts: 80% of training data and 20% of test data

e) Logistic Learner (Regression)
Applying logistic regression on the training dataset to train the model.
The Outcome column is taken as the target variable.

f) Decision Tree Learner (Regression)
Applying model to the test data.

g) Numeric Scorer (applied to both models)
Finding the accuracy of the model
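
The preprocessing and scoring stages of this workflow can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the KNIME nodes themselves: the sample rows are hypothetical, the split is assumed to be a random shuffle with a fixed seed, and the learner and predictor stages are left out, so any classifier could be plugged in between `partition` and `accuracy`.

```python
import random


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def partition(features, labels, train_fraction=0.8, seed=42):
    """Shuffle row indices and split them into training and test sets."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[:cut]), pick(indices[cut:])


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual class labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical rows: [Glucose, BMI] with an Outcome label (1 = diabetic).
X = [[85, 26.6], [183, 23.3], [89, 28.1], [137, 43.1], [116, 25.6],
     [78, 31.0], [197, 30.5], [125, 35.3], [110, 37.6], [168, 38.0]]
y = [0, 1, 0, 1, 0, 1, 1, 1, 0, 1]

X_norm = min_max_normalize(X)
(train_X, train_y), (test_X, test_y) = partition(X_norm, y)
print(len(train_X), len(test_X))
```

Min-max normalization maps each value v in a column to (v - min) / (max - min), so every feature ends up on the same 0 to 1 scale before the model is trained.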

6. Output (Screenshots)
File Table

Normalized table

Partitioning

-test data

Statistics:

Random Forest Learner



7. Conclusion
First, we applied both techniques (Logistic Regression and Decision Tree) to
our dataset without normalization. The accuracy was:

After normalization, the accuracy changed to:

Logistic Learner: 76%
Decision Tree: 78%

The accuracy scores above show that the Decision Tree model performs better than Logistic Regression on this dataset.
