
Submitted to: Dr. Saurabh Mittal

Submitted by: Manisha Mittimani

Roll no: 20PGDM083

Title: Home Assignment (EDA Report for HATCO data)


Initial Commands:

library(haven)   # read_sav() for SPSS .sav files
library(dplyr)

Hatco_data <- read_sav("D:/Users/MANISHA MITTIMANI/Downloads/Hatco data.sav")
View(Hatco_data)

Hatco <- Hatco_data
View(Hatco)
colSums(is.na(Hatco))   # number of missing values in each column

# Convert the categorical variables to factors before splitting the data
Hatco$x8  <- as.factor(Hatco$x8)
Hatco$x11 <- as.factor(Hatco$x11)
Hatco$x12 <- as.factor(Hatco$x12)
Hatco$x13 <- as.factor(Hatco$x13)
Hatco$x14 <- as.factor(Hatco$x14)

# Columns 9, 12, 13, 14 and 15 hold the factor variables x8, x11, x12, x13 and x14
factor.variable <- Hatco[c(9,12,13,14,15)]
scale.variable <- Hatco[-c(9,12,13,14,15)]

str(Hatco)
str(scale.variable)
str(factor.variable)
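
The missing-value check and the factor conversion above can be tried on a small data frame first. The sketch below is only an illustration in base R: the data frame `toy` and all of its values are invented stand-ins, not the HATCO file.

```r
# Hypothetical stand-in for the HATCO data: x1 is a scale rating,
# x8 and x11 are 0/1 coded categories (all values are invented).
toy <- data.frame(
  x1  = c(4.1, 1.8, NA, 2.7, 6.0),
  x8  = c(0, 1, 1, 0, 0),
  x11 = c(1, 0, 1, 1, 0)
)

# One missing-value count per column instead of a full TRUE/FALSE matrix.
na.count <- colSums(is.na(toy))
print(na.count)

# Convert all categorical columns to factors in a single step.
factor.cols <- c("x8", "x11")
toy[factor.cols] <- lapply(toy[factor.cols], as.factor)
str(toy)
```

Converting the columns through a vector of names keeps the list of factor variables in one place, which may be easier to maintain than one `as.factor()` line per column.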

### MANISHA MITTIMANI 20PGDM083


# ***** Univariate Analysis *****
# Descriptive Univariate Analysis
## 1. SCALE DATA :
> library(psych)
> scale.describe <- data.frame(describe(scale.variable))
> View(scale.describe)
> View(summary(scale.variable))
> View(apply(scale.variable,2,IQR))
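
For readers without the psych package, a rough base R equivalent of the descriptive table is sketched below. It is not the output of `describe()`; the data frame `toy` and its values are invented for illustration.

```r
# Hypothetical scale ratings (invented values, not the HATCO data).
toy <- data.frame(
  x1 = c(4.1, 1.8, 3.4, 2.7, 6.0),
  x2 = c(0.6, 3.0, 5.2, 1.0, 0.9)
)

# One row per variable, one column per statistic.
summary.table <- data.frame(
  mean   = sapply(toy, mean),
  median = sapply(toy, median),
  sd     = sapply(toy, sd),
  iqr    = sapply(toy, IQR)
)
print(round(summary.table, 2))
```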

Visualization for Univariate analysis

## 1. Scale Data :
### 1.1 Histogram
### 1.2 Boxplots
## 2. FACTOR DATA : Barcharts
> library(ggplot2)
library(GGally)
> ggpairs(scale.variable)
> t1 <- table(Hatco$x8,Hatco$x12)
> View(t1)
Univariate Analysis

Histograms

Title: Services
Histogram:

Code:

hist(scale.variable$x5, main = "Services", xlab = "Services", ylab = "", col = "maroon")
rug(scale.variable$x5, col = "orange")

Interpretation:
Service satisfaction with HATCO mostly ranged between 2.5 and 3.5, which means that the purchasers
were not very satisfied.

Title: Satisfaction Level

Histogram:

Code:

hist(scale.variable$x10, main = "Satisfaction Level", xlab = "Satisfaction Level",
     ylab = "", col = "green")
rug(scale.variable$x10, col = "blue")

Interpretation:
Purchasers were somewhat satisfied with their past purchases from HATCO. Most satisfaction ratings
ranged between 4 and 5.5.
Title: Salesforce Image
Histogram:

Code:

hist(scale.variable$x6, main = "Salesforce Image", xlab = "Salesforce Image",
     ylab = "", col = "maroon")
rug(scale.variable$x6, col = "orange")

Interpretation:
HATCO’s salesforce was perceived to be below average.

Title: Product Quality


Histogram:

Code:

hist(scale.variable$x7, main = "Product Quality", xlab = "Product Quality",
     ylab = "", col = "brown")
rug(scale.variable$x7, col = "orange")

Interpretation:
The quality of the products offered by HATCO was perceived to be above average, as the ratings were on the
higher side.
Title: Price Flexibility
Histogram:

Code:

hist(scale.variable$x3, main = "Price Flexibility", xlab = "Price Flexibility",
     ylab = "", col = "maroon")
rug(scale.variable$x3, col = "orange")

Interpretation:
The HATCO representatives were willing to negotiate on prices for all types of products most of the time, as
the ratings are skewed towards the higher side.
Title: Manufacturer’s Image
Histogram:

Code:

hist(scale.variable$x4, main = "Manufacturer Image", xlab = "Manufacturer Image",
     ylab = "", col = "yellow")
rug(scale.variable$x4, col = "orange")

Interpretation:
The overall image of HATCO was not very positive: most of the ratings received were between 4 and 6.
Title: Level of Price
Histogram:

Code:

hist(scale.variable$x2, main = "Level of Price", xlab = "Levels", ylab = "", col = "maroon")
rug(scale.variable$x2, col = "orange")

Interpretation:
The level of price charged by HATCO was perceived as unsatisfactory according to the
ratings provided by the customers.
Title: Delivery Speed
Histogram:

Code:

hist(scale.variable$x1, main = "Delivery Speed", xlab = "Speed", ylab = "", col = "blue")
rug(scale.variable$x1, col = "red")

Interpretation:
The delivery speed of HATCO ranged from below average to average, which means the delivery
speed was considered slow in comparison with other suppliers in the market.
Title: Level of Usage
Histogram:

Code:

hist(scale.variable$x9, main = "Level of Usage", xlab = "Usage Level", ylab = "", col = "maroon")
rug(scale.variable$x9, col = "orange")

Interpretation:
Customers do not purchase a large share of their products from HATCO. The level of usage mostly ranges
between 35 and 50 on a 100-point scale, so the preference for purchasing from HATCO is below average to average.
Boxplots

Title: Level of Usage

Boxplot:

Code:

boxplot(scale.variable$x9, main = "Level of Usage", ylab = "Usage Level", col = "maroon")
rug(scale.variable$x9, col = "red", side = 2)

Interpretation:
Customers do not purchase a large share of their products from HATCO. The level of usage mostly ranges
between 35 and 50 on a 100-point scale, so the preference for purchasing from HATCO is below average to average.
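
The range read off a box plot can also be checked numerically. The sketch below uses `boxplot.stats()` from base R on an invented vector `toy.x9`; the numbers are placeholders, not HATCO usage levels.

```r
# Hypothetical usage levels on a 100-point scale (invented values).
toy.x9 <- c(32, 35, 38, 41, 43, 45, 47, 50, 54)

# boxplot.stats() returns the five values that a box plot draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
five <- boxplot.stats(toy.x9)$stats
names(five) <- c("lower whisker", "lower hinge", "median",
                 "upper hinge", "upper whisker")
print(five)
```

The two hinges bound the box, so they give the interval that holds the middle half of the ratings.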
Title: Service
Boxplot:

Code:

boxplot(scale.variable$x5, main = "Service", ylab = "Service", col = "red")
rug(scale.variable$x5, col = "blue", side = 2)

Interpretation:
Service satisfaction with HATCO mostly ranged between 2.5 and 3.5, which means that the purchasers
were not very satisfied.
Title: Level of Satisfaction
Boxplot:

Code:

boxplot(scale.variable$x10, main = "Level of Satisfaction", ylab = "Satisfaction Level",
        col = "maroon")
rug(scale.variable$x10, col = "red", side = 2)

Interpretation:
Purchasers were somewhat satisfied with their past purchases from HATCO. Most satisfaction ratings
ranged between 4 and 5.5.
Title: Salesforce Image
Boxplot:

Code:

boxplot(scale.variable$x6, main = "Salesforce Image", ylab = "Salesforce Image", col = "grey")
rug(scale.variable$x6, col = "blue", side = 2)

Interpretation:
HATCO’s salesforce was perceived to be below average.


Title: Product Quality
Boxplot:

Code:

boxplot(scale.variable$x7, main = "Product Quality", ylab = "Product Quality", col = "maroon")
rug(scale.variable$x7, col = "red", side = 2)

Interpretation:
The quality of the products offered by HATCO was perceived to be above average, as the ratings were on the
higher side.
Title: Price Level
Boxplot:

Code:

boxplot(scale.variable$x2, main = "Price Level", ylab = "Price Level", col = "yellow")
rug(scale.variable$x2, col = "green", side = 2)

Interpretation:
The level of price charged by HATCO was perceived as unsatisfactory according to the
ratings provided by the customers.
Title: Price Flexibility
Boxplot:

Code:

boxplot(scale.variable$x3, main = "Price Flexibility", ylab = "Price Flexibility", col = "maroon")
rug(scale.variable$x3, col = "red", side = 2)

Interpretation:
The HATCO representatives were willing to negotiate on prices for all types of products most of the time, as
the ratings are skewed towards the higher side.
Title: Manufacturer Image
Boxplot:

Code:

boxplot(scale.variable$x4, main = "Manufacturer Image", ylab = "Manufacturer Image", col = "orange")
rug(scale.variable$x4, col = "blue", side = 2)

Title: ID
Boxplot:

Code:

boxplot(scale.variable$id, main = "ID", ylab = "ID", col = "maroon")
rug(scale.variable$id, col = "red", side = 2)

Title: Delivery Speed
Boxplot:

Code:

boxplot(scale.variable$x1, main = "Delivery Speed", ylab = "Speed", col = "pink")
rug(scale.variable$x1, col = "purple", side = 2)
Title: Bar plot (factor variables)
Bar plot:

Code:

par(mfrow = c(2,3))   # arrange the bar plots in a 2 x 3 grid
barplot(table(factor.variable[1]), main = "Size of the Firm", xlab = "Size", col = "red")
barplot(table(factor.variable[2]), main = "Buying Specification",
        xlab = "Buying Specification", col = "red")
barplot(table(factor.variable[3]), main = "Structure of Procurement",
        xlab = "Structure", col = "red")
barplot(table(factor.variable[4]), main = "Type of Industry", xlab = "Type", col = "red")
barplot(table(factor.variable[5]), main = "Buying Situation", xlab = "Situation", col = "red")

Interpretation:
Most of the firms purchasing from HATCO are small.
Most purchasing companies use the total value analysis approach, which means they evaluate each
purchase separately before buying from the supplier.
50% of the purchasing firms have a centralized structure of procurement, while the other 50% have a
decentralized structure of procurement.
50% of the purchasers belong to A class industries, while the other 50% belong to other industries.
Most of the purchasers are either buying for a new task or making a straight rebuy of the products they
order, while some purchasers modify their purchases in each order.
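
Percentages such as the 50% splits quoted above can be computed directly from a frequency table rather than read off the bars. The sketch below assumes an invented factor `toy.x12`; the level labels are chosen for illustration only.

```r
# Hypothetical procurement structure for eight firms (invented values).
toy.x12 <- factor(c(0, 1, 1, 0, 1, 0, 0, 1),
                  levels = c(0, 1),
                  labels = c("decentralized", "centralized"))

# Counts per category, then each category's share in percent.
counts <- table(toy.x12)
shares <- round(100 * prop.table(counts), 1)
print(shares)
```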
Title: Correlation (scale variables)
Graph:

Code:

correlation <- cor(scale.variable, method = "pearson")
View(correlation)

Interpretation:
The graph shows the correlation of all the scale variables with each other. A positive correlation means
that the two variables move in the same direction: when one increases, the other increases as well. A negative
correlation means that when one variable increases the other decreases, and vice versa.
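
A small worked example may make the sign of the coefficient concrete. The data frame below is invented: `service` is constructed to rise with `speed`, and `price` to fall as `speed` rises.

```r
# Hypothetical ratings (invented values, not the HATCO data).
toy <- data.frame(
  speed   = c(1, 2, 3, 4, 5),
  service = c(2.1, 3.9, 6.2, 8.0, 9.8),   # rises with speed
  price   = c(9.5, 8.1, 6.0, 4.2, 1.9)    # falls as speed rises
)

# Pearson correlation matrix of all pairs of columns.
r <- cor(toy, method = "pearson")
print(round(r, 2))
```

The entry for `speed` and `service` comes out close to +1, the entry for `speed` and `price` close to -1, and the diagonal is exactly 1 because every variable is perfectly correlated with itself.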
Bivariate Analysis

Title: Relationship between Speed and Size


Graph:

X1: Delivery speed (scale)
X8: Size of the firm (factor)

Code:

ggplot(Hatco) +
  aes(x = x8, y = x1) +
  geom_boxplot(fill = "#fb6a4a") +
  theme_minimal()

Interpretation:
From the above figure, we can see that small-firm purchasers feel that HATCO’s delivery speed is slow,
while large-firm purchasers find the delivery speed somewhat satisfactory, although it is still below the
average on a 10-point scale.

Title: Relationship between Salesforce’s image and Type of buying situation


Graph:

X6: Salesforce’s image
X14: Type of buying situation

Code:

ggplot(Hatco) +
  aes(x = x14, y = x6) +
  geom_boxplot(fill = "#0c4c8a") +
  theme_minimal()

Interpretation:
Purchasers buying for a new task feel that HATCO’s salesforce is not really helpful and do not
have a positive image of it.
For straight rebuyers, there is great variability in the ratings of HATCO’s salesforce; the average rating lies
between 2 and 3, which is below average.
Title: Relationship between Price Level and Structure of Procurement
Variables: Price Level (x2) and Structure of Procurement (x12)
Graph:

X2: Price level
X12: Structure of procurement

Code:

ggplot(Hatco) +
  aes(x = x12, y = x2) +
  geom_boxplot(fill = "#6baed6") +
  theme_minimal()

Interpretation:
According to the above figure, companies with a decentralized procurement structure do not view the
price level of HATCO’s products favourably, while companies with a centralized procurement structure show
great variability in their ratings, from a minimum of around 2.2 to a maximum of 4, which means they
perceive the prices of the supplier’s products to be somewhat satisfactory.
Title: Relationship between Usage level and type of industry
Graph:

X9: Usage level
X13: Type of industry

Code:

ggplot(Hatco) +
  aes(x = x13, y = x9) +
  geom_boxplot(fill = "#b4de2c") +
  theme_minimal()

Interpretation:
Companies from A class industries purchase less than half of their products from HATCO on average.
Companies from other industries also purchase less than 50% of their products from HATCO on average, but
the variability in usage level is higher for them.

Title: Relationship between Satisfaction level and Specification Buying


Graph:

X10: Satisfaction level
X11: Specification buying

Code:

ggplot(Hatco) +
  aes(x = x11, y = x10) +
  geom_boxplot(fill = "#0c4c8a") +
  theme_minimal()

Interpretation:
Purchasers who use specification buying gave an average rating of 4.2 on a 7-point scale, which
means they are satisfied with HATCO’s products, but those employing total value analysis are more
satisfied, with an average rating of 5.2.
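
Group averages like the two quoted above can be computed with `tapply()` instead of being read off a plot. The sketch below uses an invented data frame `toy`; its values were chosen only so that the group means echo the figures in the interpretation and are not taken from the HATCO file.

```r
# Hypothetical satisfaction ratings by buying approach (invented values).
toy <- data.frame(
  x10 = c(4.0, 4.4, 4.2, 5.0, 5.4, 5.2),
  x11 = factor(c(0, 0, 0, 1, 1, 1), levels = c(0, 1),
               labels = c("specification buying", "total value analysis"))
)

# Mean satisfaction within each level of the factor.
group.means <- tapply(toy$x10, toy$x11, mean)
print(round(group.means, 2))
```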

Multivariate Analysis
Title: Relationship between the price flexibility, levels of price and Size of the firm
Variables:
Scale variables: Price Flexibility (x3) and Price Level (x2)
Factor variables: Size of firm (x8)

Graphical Representation:

X8
0- Small Firms
1- Large Firms

Code

ggplot(Hatco) +
  aes(x = x3, y = x2, colour = x8) +
  geom_point(size = 1L) +
  scale_color_viridis_d(option = "cividis") +
  theme_minimal()

Interpretation:
The analysis shows that small firms rate the willingness of Hatco’s sales representatives to negotiate on
prices higher than large-firm purchasers do. In the future, small-firm purchasers are therefore likely to be
more willing to deal with Hatco.

Title: Relationship between the type of Industry, Structure of Procurement and Price level
Variables:
Factor Variables: Type of Industry (x13), Structure of Procurement(x12)
Scale Variables: Price Level(x2)

X13
0- Other industries
1- Industry A classification

Code

ggplot(Hatco) +
  aes(x = x12, y = x2, colour = x13) +
  geom_boxplot(fill = "#e3c1ea") +
  scale_color_viridis_d(option = "cividis") +
  theme_minimal()

Interpretation:
The boxes are split by colour according to the type of industry.
Among companies with decentralised procurement, firms in both industry categories
perceive the level of price charged by Hatco as below average.
Among companies with centralised procurement, firms in both industry categories
perceive the level of price charged by Hatco as average or above average.
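
A comparison across two factors at once can be backed by a table of cell means. The sketch below uses `aggregate()` on an invented data frame `toy`; the values and level labels are placeholders, not HATCO results.

```r
# Hypothetical price-level ratings for eight firms (invented values).
toy <- data.frame(
  x2  = c(1.0, 1.4, 1.2, 1.6, 3.0, 3.4, 3.2, 3.6),
  x12 = factor(rep(c("decentralized", "centralized"), each = 4)),
  x13 = factor(rep(c("other", "industry A"), times = 4))
)

# Mean price level for every combination of procurement structure and industry.
cell.means <- aggregate(x2 ~ x12 + x13, data = toy, FUN = mean)
print(cell.means)
```

Each row of `cell.means` corresponds to one box in a plot that uses one factor on the x axis and the other as the colour.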

Title: Relationship between Delivery speed, price, usage and the size of the firm
Variables:
Scale variables: Delivery Speed (x1), Price level(x2) and Usage Level (x9)
Factor: Size (x8)
Graphical Analysis:

Code

ggplot(Hatco) +
  aes(x = x2, y = x1, colour = x8, size = x9) +
  geom_point() +
  scale_color_viridis_d(option = "viridis") +
  theme_minimal()

Interpretation:
Small firms believe that Hatco has a lower delivery speed and price level, whereas large firms believe
that Hatco has a better delivery speed and a higher price level.
Small firms have a lower usage of Hatco products, whereas large firms have a higher usage of Hatco
products.
Recommendation:
For improved customer satisfaction, Hatco should focus on increasing its delivery speed, since most of its
purchasers are small firms.

Title: Relationship between manufacturer’s image, Salesforce image and type of buying situation
Variables:
Scale Variables: Manufacturer’s image (x4) and Salesforce’s image (x6)
Factor Variables: Type of buying situation (x14)
Graphical Representation:

X14
1= new task, 2= modified rebuy, and 3= straight rebuy

Code

ggplot(Hatco) +
  aes(x = x4, y = x6, colour = x14) +
  geom_point(size = 1L) +
  scale_color_hue() +
  theme_minimal()

Interpretation:
The above analysis shows that straight rebuys are associated with low ratings of the manufacturer’s image
and the salesforce’s image, whereas new tasks and modified rebuys become more common as both image
ratings move higher on the scale.
Title: Relationship between Specification buying (x11), Type of buying situation (x14) and Satisfaction
Level (x10)
Variables:
Scale Variables: Satisfaction Level (x10)
Factor Variables: Specification buying (x11) and Type of buying situation (x14)
Graphical Representation:

X11
0- Use of specification buying
1- Employs total value analysis approach, evaluating each purchase separately

X14
1= new task, 2= modified rebuy, and 3= straight rebuy

Code

ggplot(Hatco) +
  aes(x = x11, y = x10, colour = x14) +
  geom_boxplot(fill = "#0c4c8a") +
  scale_color_hue() +
  theme_minimal()

Interpretation:
Use of specification buying led to modified rebuys from Hatco and no straight rebuys. New buys were lower
under specification buying than when customers follow the total value analysis approach, while
modified rebuys were higher under specification buying.
Title: Relationship between Delivery speed (x1), Service (x5), Size of firm (x8) and Satisfaction level (x10)
Variables:
Scale Variables: Delivery speed, Service and Satisfaction level
Factor Variables: Size of firm
Graphical Representation:

Code

ggplot(Hatco) +
  aes(x = x1, y = x5, colour = x8, size = x10) +
  geom_point() +
  scale_color_hue() +
  theme_minimal()

Interpretation:
The above graphical analysis shows that satisfaction is greater when delivery speed increases, and
since larger firms rated Hatco’s delivery speed as better, larger firms
experience more satisfaction from Hatco. It was also observed that smaller firms believe that Hatco
provides better services to its customers, whereas larger firms disagree.

Recommendation:
Hatco should focus on improving its services to large firms in order to expand its client base.
