Ans :
Row ID - Nominal
Order ID - Nominal
Order Date - Interval
Ship Date - Interval
Ship Mode - Ordinal
Customer ID - Nominal
Customer Name - Nominal
Segment - Nominal
Postal Code - Nominal
City - Nominal
State - Nominal
Country - Nominal
Region - Nominal
Market - Nominal
Product ID - Nominal
Category - Nominal
Sub-Category - Nominal
Product Name - Nominal
Sales - Ratio
Quantity - Ratio
Discount - Ratio
Profit - Ratio
Shipping Cost - Ratio
Order Priority - Ordinal
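The measurement scales listed above map onto R column types. This is a minimal sketch on a hypothetical three-row sample: the values, the column spellings (for example `Order.Date`) and the level order of Ship Mode and Order Priority are assumptions for illustration, not taken from the data set.

```r
# Hypothetical three-row sample; the values are invented for illustration only.
orders <- data.frame(
  Order.ID       = c("A-1", "A-2", "A-3"),
  Order.Date     = c("2014-01-03", "2014-02-10", "2014-02-11"),
  Ship.Mode      = c("Standard Class", "First Class", "Same Day"),
  Order.Priority = c("Low", "Critical", "Medium"),
  Sales          = c(120.5, 48.0, 310.2),
  stringsAsFactors = FALSE
)

# Nominal: unordered factor
orders$Order.ID <- factor(orders$Order.ID)

# Interval: Date class (differences are meaningful, there is no true zero)
orders$Order.Date <- as.Date(orders$Order.Date, format = "%Y-%m-%d")

# Ordinal: ordered factor; the level names and their order are assumptions
orders$Ship.Mode <- factor(
  orders$Ship.Mode,
  levels  = c("Standard Class", "Second Class", "First Class", "Same Day"),
  ordered = TRUE
)
orders$Order.Priority <- factor(
  orders$Order.Priority,
  levels  = c("Low", "Medium", "High", "Critical"),
  ordered = TRUE
)

# Ratio: plain numeric (Sales is already numeric)
str(orders)
```

With ordered factors, comparisons such as `orders$Order.Priority >= "High"` work directly, which is not possible for nominal columns.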
I will be using the R language to visualize the data with Scatter Plot, Histogram,
Bar & Stacked Bar Chart, Box Plot, Area Chart, Heat Map and Correlogram, and to test
the hypothesis statements with ANOVA and multiple regression.
a) ANOVA
b) Multiple regression
>> Hypothesis test using ANOVA: is Mean Sales equal across all Categories?
Ho: Mean Sales is equal across all Categories (Null Hypothesis)
Ha: Mean Sales differs for at least one pair of Categories (Alternate Hypothesis)
P value is 0.894, which is more than 0.05, hence we fail to reject the Null Hypothesis.
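A one-way ANOVA of this kind can be sketched in base R as follows. The data frame `store` is simulated and the three category names are assumptions, so the p-value it prints is not the 0.894 reported above. Only the test procedure is illustrated.

```r
set.seed(1)

# Simulated stand-in for the store data (values are not from the report)
store <- data.frame(
  Category = factor(rep(c("Furniture", "Office Supplies", "Technology"), each = 50)),
  Sales    = rexp(150, rate = 1 / 250)
)

# One-way ANOVA: Ho says all category means of Sales are equal
fit <- aov(Sales ~ Category, data = store)
tab <- summary(fit)[[1]]          # ANOVA table as a data frame
p_value <- tab[["Pr(>F)"]][1]     # p-value of the Category row

alpha <- 0.05
decision <- if (p_value < alpha) "reject Ho" else "fail to reject Ho"

print(tab)
cat("p =", signif(p_value, 3), "->", decision, "\n")
```

The same call with `Segment` or `Order.Priority` on the right-hand side of the formula covers the other ANOVA statements.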
>> Hypothesis test using ANOVA: is Mean Sales equal across all Segments?
Ho: Mean Sales is equal across all Segments (Null Hypothesis)
Ha: Mean Sales differs for at least one pair of Segments (Alternate Hypothesis)
P value is 0.528, which is more than 0.05, hence we fail to reject the Null Hypothesis.
>> Hypothesis test using ANOVA: is Mean Sales equal across all Order Priority levels?
Ho: Mean Sales is equal across all Order Priority levels (Null Hypothesis)
Ha: Mean Sales differs for at least one pair of Order Priority levels
(Alternate Hypothesis)
P value is 0.849, which is more than 0.05, hence we fail to reject the Null Hypothesis.
>> Hypothesis test using ANOVA: is Mean Sales equal across all Discount levels?
Ho: Mean Sales is equal across all Discount levels (Null Hypothesis)
Ha: Mean Sales differs for at least one pair of Discount levels (Alternate
Hypothesis)
P value is 0.719, which is more than 0.05, hence we fail to reject the Null Hypothesis.
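Discount is a ratio variable, so comparing mean Sales "across all Discount" only works as an ANOVA of group means if Discount is treated as a grouping factor. How it was coded for the test above is not stated. The sketch below, on simulated data with assumed discount levels, shows the difference between the two codings.

```r
set.seed(2)

# Simulated data; the four discount levels are assumptions
store <- data.frame(
  Discount = sample(c(0, 0.1, 0.2, 0.4), 200, replace = TRUE),
  Sales    = rexp(200, rate = 1 / 250)
)

# Numeric predictor: a single slope is fitted (1 degree of freedom)
fit_numeric <- aov(Sales ~ Discount, data = store)

# Factor predictor: one mean per discount level (levels - 1 degrees of freedom)
fit_factor <- aov(Sales ~ factor(Discount), data = store)

df_numeric <- summary(fit_numeric)[[1]][["Df"]][1]
df_factor  <- summary(fit_factor)[[1]][["Df"]][1]

cat("Df with numeric Discount:", df_numeric, "\n")
cat("Df with factor(Discount):", df_factor, "\n")
```

Only the `factor(Discount)` model tests the hypothesis that mean Sales is equal across discount levels. The numeric model tests for a linear trend instead.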
>> Hypothesis test using ANOVA: is Mean Profit equal across all Order Priority levels?
Ho: Mean Profit is equal across all Order Priority levels (Null Hypothesis)
Ha: Mean Profit differs for at least one pair of Order Priority levels
(Alternate Hypothesis)
P value is 0.325, which is more than 0.05, hence we fail to reject the Null Hypothesis.
>> Hypothesis test using multiple regression: does store Sales influence store Profit?
Ho: Store Sales does not influence store Profit (Null Hypothesis)
Ha: Store Sales influences store Profit (Alternate Hypothesis)
P value is <2e-16, which is less than 0.05, hence we reject the Null Hypothesis:
store Sales influences store Profit.
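The regression test above can be sketched with `lm()`. The data are simulated with a linear link between Sales and Profit built in on purpose, so the coefficients are illustrative only. A p-value below 0.05 for the Sales coefficient leads to rejecting Ho.

```r
set.seed(3)
n <- 200

# Simulated data with a built-in linear effect of Sales on Profit (assumption)
Sales  <- rexp(n, rate = 1 / 250)
Profit <- 0.1 * Sales + rnorm(n, sd = 20)
store  <- data.frame(Sales, Profit)

fit   <- lm(Profit ~ Sales, data = store)
coefs <- summary(fit)$coefficients   # estimate, std. error, t value, p-value

# Ho: the Sales coefficient is zero (Sales does not influence Profit)
p_sales  <- coefs["Sales", "Pr(>|t|)"]
decision <- if (p_sales < 0.05) "reject Ho" else "fail to reject Ho"

print(coefs)
cat("p =", format.pval(p_sales), "->", decision, "\n")
```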
>> Store Profit and Discount are highly negatively correlated with each other
>> Shipping Cost and Quantity are positively correlated with each other
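Correlations like these can be read from a Pearson correlation matrix, which is also the input a correlogram is drawn from. A minimal sketch on simulated data, where the negative Profit/Discount link and the positive Shipping Cost/Quantity link are built in as assumptions:

```r
set.seed(4)
n <- 200

Discount <- runif(n, 0, 0.8)
Quantity <- rpois(n, 4) + 1

# Simulated data; the signs of the relationships are built in on purpose
store <- data.frame(
  Discount      = Discount,
  Quantity      = Quantity,
  Profit        = 50 - 120 * Discount + rnorm(n, sd = 15),
  Shipping.Cost = 5 + 3 * Quantity + rnorm(n, sd = 4)
)

# Pearson correlation matrix of all numeric columns
corr <- cor(store)
print(round(corr, 2))

# Significance test for a single pair
ct <- cor.test(store$Profit, store$Discount)
cat("Profit vs Discount: r =", round(unname(ct$estimate), 2),
    ", p =", format.pval(ct$p.value), "\n")
```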
>> Multiple regression with Profit, Quantity, Discount, Shipping Cost and Sales did
not return a high P value. I tried each variable in turn as the dependent variable;
keeping Profit as the dependent variable with Quantity, Discount, Shipping Cost and
Sales as independent variables gives an R square and Adjusted R square of only 42%.
Hence we could not find conclusive evidence using the given data for Profit,
Quantity, Discount, Shipping Cost and Sales.
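The R square and Adjusted R square quoted above come from the model summary. A minimal sketch on simulated data, so the values will not match the 42% in the report:

```r
set.seed(5)
n <- 300

# Simulated predictors (values are not from the report)
store <- data.frame(
  Quantity      = rpois(n, 4) + 1,
  Discount      = runif(n, 0, 0.8),
  Shipping.Cost = rexp(n, rate = 1 / 25),
  Sales         = rexp(n, rate = 1 / 250)
)
# Assumed data-generating process, for illustration only
store$Profit <- 0.1 * store$Sales - 100 * store$Discount + rnorm(n, sd = 40)

fit <- lm(Profit ~ Quantity + Discount + Shipping.Cost + Sales, data = store)
s   <- summary(fit)

r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# Adjusted R square by hand: 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where p is the number of predictors
p <- length(coef(fit)) - 1
adj_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

cat("R square:", round(r2, 3), " Adjusted R square:", round(adj_r2, 3), "\n")
```

An R square of 42% means the four predictors explain 42% of the variance in Profit, leaving more than half unexplained.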