Business Analytics Assignment: Neha Singh
BUSINESS ANALYTICS ASSIGNMENT
NEHA SINGH
PGDM
Roll No. & Section: 10, Sec. 'A'
Q-1 How do you compute cumulative relative frequencies in an Excel sheet? Take an
example to support your answer and structure the process.
SOLUTION:
To find the cumulative relative frequency, we first have to find the cumulative frequency and
the relative frequency.
The data in the table represent the tuition for all 2-year community colleges in a region in
2009-2010.
Tuition (dollars)    Number of community colleges
775-799              20
800-824              67
825-849              15
850-874              5
875-899              0
900-924              0
925-949              0
950-974              2
Step 2: Add a new column to your frequency table. Title it “Cumulative Frequency.”
Step 3: Type the formula “=C2” (where C2 is the actual location of your first frequency
count) in the first row of your new column.
Step 4: Type the formula “=SUM(C$2:C3)” (where C2 is the location of your first frequency
count and C3 is the location of your second frequency count) in the second row of your new
column. The “$” keeps the start of the range fixed at row 2 when the formula is copied down.
Step 5: Click the cell in which you entered the formula in Step 4. Click and drag the little
black square in the bottom right-hand corner of the cell to the bottom of the column. Excel
will populate the cells with all of the remaining values.
Step 2: To find the relative frequency, type the formula =C2/C$12 in cell E2 (where C2 is the
first frequency count and C$12 is the cell holding the total of the frequencies), then press the
Enter key.
Step 3: Click the cell in which you entered the formula in Step 2. Click and drag the little
black square in the bottom right-hand corner of the cell to the bottom of the column. Excel
will populate the cells with all of the remaining values.
Step 3: To find the cumulative relative frequency, apply the same kind of formula as in the
previous cell: type =SUM(E2:E4) in cell F4 and press the Enter key, or type =SUM(, select
cells E2 to E4 and press the Enter key. Do the same for all remaining rows to find the
cumulative relative frequency.
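The same calculation can be cross-checked outside Excel. The following is a minimal Python sketch, not part of the Excel procedure itself: it uses the frequencies from the tuition table above, and the variable names are chosen only for illustration. The variable total plays the role of the total cell (C$12 in the formulas above).

```python
from itertools import accumulate

# Tuition classes and frequencies taken from the table above.
classes = ["775-799", "800-824", "825-849", "850-874",
           "875-899", "900-924", "925-949", "950-974"]
frequencies = [20, 67, 15, 5, 0, 0, 0, 2]

total = sum(frequencies)                      # total number of colleges
cum_freq = list(accumulate(frequencies))      # running total of the frequencies
rel_freq = [f / total for f in frequencies]   # frequency divided by total
cum_rel_freq = [c / total for c in cum_freq]  # cumulative frequency divided by total

# Columns: class, frequency, cumulative frequency,
# relative frequency, cumulative relative frequency.
for row in zip(classes, frequencies, cum_freq, rel_freq, cum_rel_freq):
    print("{:<8} {:>4} {:>4} {:>7.4f} {:>7.4f}".format(*row))
```

Dividing each cumulative frequency by the total gives the same result as adding up the relative frequencies row by row, and the last cumulative relative frequency should always equal 1.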
Q-2 You are asked to analyze the impact of Covid 19 (Novel Coronavirus) on economy
of India for which you need to conduct a survey on individuals in Delhi city. You are
required to ask the following:
• Gender
• Education
• Ethnicity
• Name
• Age
• Length of residency
• Factors affecting Economy (using a scale of 1–5, going from poor to excellent)
SOLUTION:
1) What type of data (categorical, ordinal, interval, or ratio) would each of the survey
items represent, and why?
Gender
It is categorical (nominal) data. Nominal scales are often called qualitative
scales, and measurements made on qualitative scales are called qualitative data.
Education
It is best treated as ordinal data when it is recorded as the highest level of education
completed (for example school, graduate, postgraduate): the levels have a natural order, but
the gaps between them are not equal. It would be ratio data only if it were recorded as the
number of years of schooling.
Ethnicity
It falls under the nominal scale. The nominal type differentiates between items or
subjects based only on their names or (meta-)categories and other qualitative classifications
they belong to; thus dichotomous data involves the construction of classifications as well as
the classification of items.
Name
It also falls under the nominal scale, because nominal data are used to label
variables without any quantitative value.
Age
It is ratio data. Age has a true zero and ratios of ages are meaningful (a
40-year-old is twice as old as a 20-year-old). In this survey age is not grouped into
categories, so it stays on the ratio scale.
Length of residency
It is also on the ratio scale. The ratio scale is the level of measurement in which the
attributes of a variable are measured as numerical values with a true zero, and length of
residency measured in years fits this description.
2) Which analytical tools would you use to analyze this data?
When we talk about analysis tools, three names come to mind: MS-Excel, SAS and SPSS.
A comparison of Excel, SPSS and SAS is given below.
By Job Function
Let's look at job market data and study the popularity of these tools by job function.
SAS
SAS is mostly used in the business analytics and intelligence industry. The IT software
functional area comes second in the list, followed by KPO/BPO, clinical research, finance, etc.
SPSS
SPSS is mostly used in the business analytics and intelligence industry. The IT software
functional area comes second in the list, followed by KPO/BPO, finance and others. Others
include Sales, Retail, Marketing, Logistics, etc.
Excel
Excel is mostly used in the finance industry. The KPO/BPO industry comes second in the list,
followed by IT software, business analytics and others. Others include HR, Retail, Marketing,
Supply Chain, Logistics, etc.
Conclusion
For the Covid-19 survey we are working as market researchers. The best tool for market
research is SPSS, which is the most popular tool in the market research industry and is
also one of the three whose major job function is analytics. It has many advanced
functions and is also cheaper than SAS.
3) What precautions will you take as an analyst before analyzing the data?
Before analyzing the data, the analyst should remove errors from the data. Here are 7 criteria
you should consider:
1. Respondents who answer only part of the survey
Respondents who answer just a fraction of your required questions can bias your overall
results for reasons such as these:
• It can be a sign that they weren’t qualified to take your survey to begin with (leading
them to leave).
• It can indicate that they weren’t as engaged and considerate in their responses as those
who were willing to complete it.
2. Respondents who do not match your target audience
Say you want to survey women between the ages of 18 and 29.
You wouldn’t want the responses of a 50-year-old influencing your overall findings, would
you?
Whatever audience specifications you land on, you can ignore respondents who don’t match
them by filtering them out.
3. Respondents who speed through the survey
If respondents take only a few seconds to complete the survey, they’re likely speeding through
the questions, which means they aren’t reading them carefully and answering them thoughtfully.
So how do you go about deciding who’s a speeder and who isn’t? The answer can vary,
depending on the subject of your survey and the types of questions you ask.
4. Respondents who straight-line
Straight-lining is when a respondent chooses the same answer choice over and over again
(e.g. the first answer option). Straight-liners are often speeders as well, as they race through
the survey by answering each question with little to no thought.
5. Respondents who provide unrealistic answers
Imagine asking respondents how much TV they watch per week, on average. If a respondent
writes in 165 hours, they’re likely exaggerating (Hint: there are only 168 hours in a week).
We call this type of response an outlier, because it falls beyond the range of answers from our
other respondents, and is, quite frankly, unrealistic.
6. Respondents who give inconsistent answers
When a respondent’s answer contradicts their response to another question, it’s clear that
they’re either being dishonest or careless (or even both!)
7. Respondents who give nonsensical answers to open-ended questions
Having a response like: “Fdsklj” might make you smile, but it isn’t going to get you far in
your analysis.
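Several of these checks can be automated before the analysis starts. The following Python sketch is only an illustration: the response records, the field names (answered, seconds, ratings) and the thresholds are all hypothetical assumptions, and only the 1–5 rating scale comes from the survey described above. It covers partial responses, off-target respondents, speeders, straight-lining and unrealistic values; inconsistent and nonsensical answers usually need question-specific rules.

```python
# Hypothetical responses: the field names, sample values and thresholds
# below are illustrative assumptions, not part of the original survey.
responses = [
    {"id": 1, "age": 24, "seconds": 310, "answered": 7, "ratings": [4, 3, 5, 2]},
    {"id": 2, "age": 50, "seconds": 295, "answered": 7, "ratings": [3, 4, 2, 5]},
    {"id": 3, "age": 22, "seconds": 12, "answered": 7, "ratings": [1, 1, 1, 1]},
    {"id": 4, "age": 27, "seconds": 280, "answered": 2, "ratings": [5, 4]},
    {"id": 5, "age": 19, "seconds": 240, "answered": 7, "ratings": [2, 9, 3, 4]},
]

REQUIRED_QUESTIONS = 7   # assumed number of required questions
AGE_RANGE = (18, 29)     # assumed target audience
MIN_SECONDS = 60         # assumed cut-off for "speeders"
SCALE = (1, 5)           # rating scale used in the survey


def problems(r):
    """Return the list of data-quality checks that a response fails."""
    found = []
    if r["answered"] < REQUIRED_QUESTIONS:
        found.append("partial")
    if not AGE_RANGE[0] <= r["age"] <= AGE_RANGE[1]:
        found.append("off-target")
    if r["seconds"] < MIN_SECONDS:
        found.append("speeder")
    if len(r["ratings"]) > 1 and len(set(r["ratings"])) == 1:
        found.append("straight-lining")
    if any(not SCALE[0] <= x <= SCALE[1] for x in r["ratings"]):
        found.append("unrealistic")
    return found


# Keep only the responses that pass every check.
clean = [r for r in responses if not problems(r)]
for r in responses:
    print(r["id"], problems(r) or "ok")
```

Flagging the reason for each rejected response, rather than silently dropping it, makes it easier to review borderline cases before they are removed.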
Q-3 Discuss the concept of contingency tables and analyze the dashboard
given below on the following parameters
DATA SET
Cust ID   Region   Payment   Transaction Code   Source   Amount   Product   Time Of Day
10001     East     Paypal    93816545           Web      $20.19   DVD       22:19
OUTPUT OF DASHBOARD
SOLUTION:
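A contingency table (cross-tabulation) shows the number of observations for each combination
of the subcategories of two categorical variables. The subcategories of each variable must be
mutually exclusive and exhaustive, meaning that each observation can be classified into only
one subcategory, and taken together over all subcategories, they must constitute the complete
data set.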
Additionally, a pivot table is one of the possible ways of creating a contingency table. A
typical pivot table has the visual form of a contingency table, and a pivot table can be used to
quickly create a cross-tabulation and to drill down into a large set of data in numerous ways.
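As an illustration of the idea, a contingency table of region against payment mode can be built by counting each (region, payment) combination. The Python sketch below uses a small set of made-up records, not the data behind the dashboard; only the column values East, West, North, South, Paypal and Credit are taken from this assignment.

```python
from collections import Counter

# Hypothetical (region, payment mode) pairs for illustration only;
# they are not the figures behind the dashboard.
records = [
    ("East", "Paypal"), ("East", "Credit"), ("East", "Credit"),
    ("West", "Paypal"), ("West", "Paypal"), ("West", "Credit"),
    ("North", "Credit"), ("South", "Credit"), ("South", "Paypal"),
]

counts = Counter(records)  # one cell per (region, payment) combination
regions = sorted({region for region, _ in records})
payments = sorted({payment for _, payment in records})

# Header row, then one row per region with its row total.
header = "{:<8}".format("Region") + "".join("{:>8}".format(p) for p in payments)
print(header + "{:>8}".format("Total"))
for region in regions:
    row = [counts[(region, p)] for p in payments]
    cells = "".join("{:>8}".format(n) for n in row)
    print("{:<8}".format(region) + cells + "{:>8}".format(sum(row)))

# Column totals and the grand total.
col_totals = [sum(counts[(r, p)] for r in regions) for p in payments]
print("{:<8}".format("Total") + "".join("{:>8}".format(n) for n in col_totals)
      + "{:>8}".format(sum(col_totals)))
```

Because the subcategories are mutually exclusive and exhaustive, every record lands in exactly one cell, so the grand total equals the number of records.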
Payment Mode
In the East, North and South, credit is used more than Paypal, but in the West Paypal
is more common.
In total, most of the payments are made and accepted by credit.
Sale Amount
The highest sale amount is in the West, because the number of products ordered in the West
is the highest.
The number of products sold in the East and in the South does not differ much. Similarly, the
sale amounts do not differ much (East has $251.67 more than South).
North has the lowest sale amount.
REFERENCES
https://www.youtube.com/watch?v=DrtChy0dBuk
https://en.wikipedia.org/wiki/Level_of_measurement
https://stats.stackexchange.com/questions/240363/is-age-interval-scale
https://www.surveymonkey.com/curiosity/survey-data-cleaning-7-things-to-check-before-you-start-your-analysis/
https://www.listendata.com/2013/04/data-analysis-tools-excel-spss-or-sas.html
https://stats.stackexchange.com/questions/44834/what-is-the-difference-between-the-pivot-table-and-contingency-table
Michael.hahsler.net