
Question: You have been assigned as a business analyst for a leading online retail
company specializing in furniture and office supplies. A dataset of 51290
transactions between 2013 and 2016 has been given to you. The dataset contains the
columns Row ID, Order ID, Order Date, Ship Date, Ship Mode, Customer ID,
Customer Name, Segment, Postal Code, City, State, Country,
Region, Market, Product ID, Category, Sub-Category, Product Name, Sales, Quantity,
Discount, Profit, Shipping Cost, Order Priority. Use the dataset and make a 5-page
report (Times New Roman, 12pt, 1.5 Spacing) by answering the following questions:
1. Use appropriate visualization techniques to summarize each of the variables and
to construct hypothesis statements. Develop at least five hypothesis statements.
[10 marks]
2. Use appropriate analytic techniques to test the hypothesis statements. Explain
and interpret your results in detail. [15 marks]
3. Provide recommendations to the company relating to marketing strategies that it
should adopt based on the results of the analysis. [15 marks]

Ans :

1. Use appropriate visualization techniques to summarize each of the variables and
to construct hypothesis statements. Develop at least five hypothesis statements

Below is the categorization of the variables based on their corresponding scales of
measurement (Nominal, Ordinal, Interval or Ratio).

Row ID - Nominal
Order ID - Nominal
Order Date - Interval
Ship Date - Interval
Ship Mode - Ordinal
Customer ID - Nominal
Customer Name - Nominal
Segment - Nominal
Postal Code - Nominal
City - Nominal
State - Nominal
Country - Nominal
Region - Nominal
Market - Nominal
Product ID - Nominal
Category - Nominal
Sub-Category - Nominal
Product Name - Nominal
Sales - Ratio
Quantity - Ratio
Discount - Ratio
Profit - Ratio
Shipping Cost - Ratio
Order Priority - Ordinal
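
The scales above decide how each column should be stored before plotting or testing. A minimal R sketch of that mapping follows. The rows are synthetic, the column names are hypothetical, and the level order given for Order Priority is an assumption, not something taken from the dataset.

```r
# Synthetic rows for illustration only; values are NOT from the real dataset.
orders <- data.frame(
  order_date     = c("2013-01-05", "2014-06-17", "2016-11-30"),
  segment        = c("Consumer", "Corporate", "Home Office"),
  order_priority = c("Low", "Critical", "Medium"),
  sales          = c(120.5, 980.0, 45.2),
  stringsAsFactors = FALSE
)

# Interval scale: dates support differences but have no true zero.
orders$order_date <- as.Date(orders$order_date)

# Nominal scale: unordered factor.
orders$segment <- factor(orders$segment)

# Ordinal scale: ordered factor. The level order here is an assumption.
orders$order_priority <- factor(
  orders$order_priority,
  levels  = c("Low", "Medium", "High", "Critical"),
  ordered = TRUE
)

# Ratio scale: plain numeric, so sales needs no conversion.
str(orders)
```

Storing ordinal columns as ordered factors keeps the level order in plots and lets comparisons such as `orders$order_priority > "Low"` work.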

Hypothesis statements:

I will use R to visualize the data with scatter plots, histograms, bar and stacked
bar charts, box plots, area charts, heat maps and a correlogram, and will test the
hypothesis statements with ANOVA and multiple regression.

1.) Mean Sales across all Categories is equal (using ANOVA)
2.) Mean Sales across all Segments is equal (using ANOVA)
3.) Mean Sales across all Order Priority levels is equal (using ANOVA)
4.) Mean Sales across all Discount levels is equal (using ANOVA)
5.) Mean Profit across all Order Priority levels is equal (using ANOVA)
6.) Store Sales does not affect store Profit (using multiple regression)
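
As a rough illustration of how such plots could be produced in base R, the sketch below draws a histogram, a box plot by group and a bar chart. The data are synthetic and the category labels are assumed for the example. None of it reproduces the real dataset or its results.

```r
# Synthetic data for illustration only; category labels are assumptions.
set.seed(1)
orders <- data.frame(
  category = factor(rep(c("Furniture", "Office Supplies", "Technology"), each = 50)),
  sales    = rlnorm(150, meanlog = 5, sdlog = 1)
)

# When run as a script, draw to a null device instead of writing Rplots.pdf.
if (!interactive()) pdf(NULL)

# Histogram: distribution of a ratio-scale variable.
hist(orders$sales, main = "Distribution of Sales", xlab = "Sales")

# Box plot: compare a ratio-scale variable across a nominal grouping.
# boxplot() also returns the per-group statistics it drew.
bp <- boxplot(sales ~ category, data = orders,
              main = "Sales by Category", ylab = "Sales")

# Bar chart: counts of a nominal variable.
barplot(table(orders$category), main = "Orders per Category")
```

A box plot whose groups overlap heavily suggests a null hypothesis of equal means, which ANOVA can then test formally.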

2. Use appropriate analytic techniques to test the hypothesis statements. Explain
and interpret your results in detail.

a) ANOVA
b) Multiple regression

>> Hypothesis test using ANOVA: Mean Sales across all Categories is equal
Ho: Mean Sales across all Categories is equal (Null Hypothesis)
Ha: At least one pair of Mean Sales across all Categories is not equal (Alternate
Hypothesis)
P value is 0.894, which is more than 0.05, hence we fail to reject the Null
Hypothesis.
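
A minimal sketch of how a one-way ANOVA of this kind could be run in R with `aov()` is shown here. It uses synthetic data with assumed column names and category labels, so its numbers have nothing to do with the P values reported in this answer.

```r
# Synthetic data for illustration only; it does not reproduce the reported results.
set.seed(42)
orders <- data.frame(
  category = factor(rep(c("Furniture", "Office Supplies", "Technology"), each = 100)),
  sales    = rlnorm(300, meanlog = 5, sdlog = 1)
)

# One-way ANOVA: does mean sales differ between categories?
fit <- aov(sales ~ category, data = orders)

# summary() returns a list; its first element is the ANOVA table.
anova_table <- summary(fit)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

# Decision rule at the 5% significance level.
alpha <- 0.05
decision <- if (p_value < alpha) "reject Ho" else "fail to reject Ho"
cat(sprintf("F = %.3f, p = %.3f: %s\n", f_value, p_value, decision))
```

The same call tests the other hypotheses by swapping the grouping column (Segment, Order Priority, Discount level) or the response (Profit).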

>> Hypothesis test using ANOVA: Mean Sales across all Segments is equal
Ho: Mean Sales across all Segments is equal (Null Hypothesis)
Ha: At least one pair of Mean Sales across all Segments is not equal (Alternate
Hypothesis)
P value is 0.528, which is more than 0.05, hence we fail to reject the Null
Hypothesis.

>> Hypothesis test using ANOVA: Mean Sales across all Order Priority levels is equal
Ho: Mean Sales across all Order Priority levels is equal (Null Hypothesis)
Ha: At least one pair of Mean Sales across all Order Priority levels is not equal
(Alternate Hypothesis)
P value is 0.849, which is more than 0.05, hence we fail to reject the Null
Hypothesis.

>> Hypothesis test using ANOVA: Mean Sales across all Discount levels is equal
Ho: Mean Sales across all Discount levels is equal (Null Hypothesis)
Ha: At least one pair of Mean Sales across all Discount levels is not equal
(Alternate Hypothesis)
P value is 0.719, which is more than 0.05, hence we fail to reject the Null
Hypothesis.

>> Hypothesis test using ANOVA: Mean Profit across all Order Priority levels is equal
Ho: Mean Profit across all Order Priority levels is equal (Null Hypothesis)
Ha: At least one pair of Mean Profit across all Order Priority levels is not equal
(Alternate Hypothesis)
P value is 0.325, which is more than 0.05, hence we fail to reject the Null
Hypothesis.

>> Hypothesis test using multiple regression: Store Sales does not influence store
Profit
Ho: Store Sales does not influence store Profit (Null Hypothesis)
Ha: Store Sales influences store Profit (Alternate Hypothesis)

P value is <2e-16, which is less than 0.05, hence we reject the Null Hypothesis:
store Sales influences store Profit.
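
A regression test of this kind could be sketched in R with `lm()` as follows. The data are synthetic, with a relationship between sales and profit deliberately built in, and the column names are assumptions. The output only illustrates the mechanics and is not the result reported in this answer.

```r
# Synthetic data for illustration only: profit is constructed to depend on sales.
set.seed(7)
sales  <- runif(200, min = 10, max = 1000)
profit <- 0.15 * sales + rnorm(200, sd = 40)
orders <- data.frame(sales, profit)

# Regress profit on sales.
model <- lm(profit ~ sales, data = orders)

# The coefficient table holds the estimate and the p-value for each term.
coefs   <- summary(model)$coefficients
slope   <- coefs["sales", "Estimate"]
slope_p <- coefs["sales", "Pr(>|t|)"]

# Ho: the sales coefficient is zero (sales does not influence profit).
alpha <- 0.05
decision <- if (slope_p < alpha) "reject Ho" else "fail to reject Ho"
cat(sprintf("slope = %.3f, p = %.3g: %s\n", slope, slope_p, decision))
```

A p-value below the significance level is evidence against Ho, so the null hypothesis is rejected, not accepted.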

3. Provide recommendations to the company relating to marketing strategies that it
should adopt based on the results of the analysis.

>> Store Profit and Discount are highly negatively correlated with each other.
>> Shipping Cost and Quantity are positively correlated with each other.
>> Multiple regression with Profit, Quantity, Discount, Shipping Cost and Sales did
not return a high P value. I swapped the dependent variable among them; keeping
Profit as the dependent variable with Quantity, Discount, Shipping Cost and Sales as
independent variables gives an R square and Adjusted R square of only 42%.

Hence we could not find conclusive evidence using the given data for Profit,
Quantity, Discount, Shipping Cost and Sales.
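
For reference, the correlations and the R square figures discussed above could be obtained in R roughly as sketched below. The data are synthetic, and the column names and the relationships built into them are assumptions made for the example. The printed values will not match the 42% reported here.

```r
# Synthetic data for illustration only; relationships are invented for the example.
set.seed(11)
n <- 300
quantity      <- sample(1:14, n, replace = TRUE)
discount      <- runif(n, min = 0, max = 0.8)
sales         <- quantity * runif(n, min = 5, max = 120)
shipping_cost <- 0.1 * sales + rnorm(n, sd = 5)
profit        <- 0.3 * sales - 150 * discount + rnorm(n, sd = 40)
orders <- data.frame(profit, quantity, discount, shipping_cost, sales)

# Pairwise Pearson correlations between all numeric columns.
corr <- cor(orders)
print(round(corr, 2))

# Multiple regression with Profit as the dependent variable.
model <- lm(profit ~ quantity + discount + shipping_cost + sales, data = orders)
fit_stats <- summary(model)

# Share of the variance in profit explained by the model.
r2     <- fit_stats$r.squared
adj_r2 <- fit_stats$adj.r.squared
cat(sprintf("R square = %.3f, Adjusted R square = %.3f\n", r2, adj_r2))
```

Adjusted R square penalizes additional predictors, so it is the safer figure to compare when trying different sets of independent variables.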
