
January 8th, 2021

Final project report to be submitted after 7 days of completion of subject

Primary research project, consumer centric | Psycho-graphic study

Stages involved in Business Research

1. Problem Statement
a. Exploration <> Objective / Research questions
b. Methods of Validation / Basic Research
c. Questionnaire formulation
d. Sampling plan
e. Methods of data collection
Above are all steps in Desk research/planning
2. Field visit
3. Editing and cleaning of data
4. Data analysis & interpretation > Leads to solutions for step (a)

January 9th, 2021

Research Design: the blueprint for the planning phase of research. It is always made in such a fashion that it
answers the 6 W's (What, Why, How, Who, When, Where)

U (disturbances/errors) are called extraneous variables – their influence is minimized in causal designs more than in descriptive designs

Exploratory Design Ways:


1. Self and Peer experience
2. Literature studies, internet
3. Secondary data analysis, market available data
4. Expert opinion
a. Expert interview
b. panel discussion
5. Qualitative analysis on end consumer
a. Direct
i. Focus Groups
ii. in-depth interviews
b. Indirect – projective techniques
i. Word association test
ii. Sentence completion test
iii. Thematic apperception test
iv. Role play

RD and Dupont to be read/completed - Done

Week of Jan 17th to schedule 2 hr sessions with Rastogi sir……

January 14th

FGDs should have a common ground of discussion for coming to a conclusion

FGDs should be homogeneous

FGDs should have a moderator


January 16th

Comparison of the four qualitative methods:

Focus Group Discussion: end consumers; 8-12 participants; homogeneous group; aim is the minimum common thought/set of variables; inexpensive; moderator is one who knows the problem; 35-40

Panel Discussion: experts; 3-4 participants; heterogeneous group; aim is to get as much diverse information as possible; expensive; moderator keeps the discussion on track and everyone in check; 25-30

In-depth interview: consumers; unstructured, free flow; loosely prepared questions; time consuming; less expensive; uses laddering, hidden-issue & symbolic analysis; views are expressed after a thought process

Expert interview: experts; structured, pointed questions; fixed set of questions; fixed time; expensive; direct answers with logic and a point of view; views are direct

January 23rd:

Projective Techniques – word association (top-of-the-mind recall), sentence completion, thematic
apperception, role play

DuPont Discussion (Jan 22):

a. To study the factors influencing the buying decision in the residential segment

b. To study the market share vs competitors

c. To understand the demographics

d. Would the launch in the residential segment influence the commercial segment?

i. Whether feel/touch influences the buying decision

ii. Whether the visual / design / pattern of the carpet influences…

iii. Does quality influence

iv. Does durability influence

v. Does color influence

vi. Does the purpose of use influence

vii. Does price influence

Causal/Experimental design: Impact of cause on variable, keeping everything else constant


3 elements of Experimental design: Randomization, Replication & local control

Experimental Design: Pre-experimental, true experimental, quasi-experimental, statistical

Scale and scaling design – Causal design and DuPont case discussion

February 4th:

Causal Design: 3 basic principles of Randomization, Replication and Local Control

Pre-Experimental design: least efficient

Extraneous variables are those we are not interested in studying but which can still affect the dependent
variable – the design should minimize the influence of all such disturbances/errors

Static Design – experimental group vs control group. The division is not based on any scientific principle but
on the arbitrary decision of the researcher

Causal design types and discussion

Demographic variables are nominal or ratio

Read Scales and Scaling techniques > Psychographic variables > Comparative (ordinal) and Non-
comparative (interval)

Paired comparison: if the row item is preferred over the column item, we give 1, else 0. Diagonal elements are
ignored (no comparison between the same item). The two triangles should be complementary, and the rows are
summed up to find the parameter with maximum weightage.
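The paired-comparison tally can be sketched in plain Python; the attribute names and the single respondent's preferences below are made up for illustration.

```python
# Paired-comparison tally: preference[i][j] = 1 if the row attribute is
# preferred over the column attribute, else 0; the diagonal is ignored.
# Attribute names and responses are hypothetical.
attributes = ["price", "quality", "design"]
preference = [
    [0, 1, 1],  # price preferred over quality and over design
    [0, 0, 1],  # quality preferred over design
    [0, 0, 0],
]

# Consistency check: for each off-diagonal pair, exactly one cell is 1,
# so the two triangles are complementary.
for i in range(len(attributes)):
    for j in range(i + 1, len(attributes)):
        assert preference[i][j] + preference[j][i] == 1

# Row sums give each attribute's weightage; the largest sum wins.
row_totals = {a: sum(row) for a, row in zip(attributes, preference)}
ranking = sorted(row_totals, key=row_totals.get, reverse=True)
print(row_totals)  # {'price': 2, 'quality': 1, 'design': 0}
print(ranking)     # ['price', 'quality', 'design']
```

With more respondents, the same tally matrices would simply be summed before ranking.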

Read Questionnaire & pg 360 Hospital questionnaire | pros & cons of design

Sem-2: April 15th – 20th

Exploration > Objectives to be ascertained > Questionnaire to be made (should start with the purpose of the
survey) > Sampling plan (size, target and method of selecting the sample) > Method of collecting the data >
Analysis of the data (hypothesis testing)
April 2nd: Cosmopolitan survey on sexual behavior of US women

To cover: Nike case, US women survey

Q3. Non probabilistic convenient sampling

Q4.

Notes on simple regression – revise

Regression equation: Y = α + B1X + Ei

Ei = random error

B1 = regression slope

Multiple regression or multi-variate regression: assumes the variables are linear in nature and the
X's are independent.

Either convert non-linear into linear or apply using log linear

Y = Dependent, regressed, study variable

α = constant, fixed effect


B1, B2, B3 = partial regression co-efficients

X1, X2, X3 = based on exploratory research(if correct variables are surveyed, U would be less impactful)

U = disturbance, brings randomness in equation

Y = α + B1X1 + B2X2 + B3X3 + U (with variance σ2)

If X1 is changed by 1 unit, keeping the other X's constant, Y will change by B1 units = partial regression

Assumptions for multi-variate:

1. Linear relationship b/w X & Y


2. Y is normally distributed (in case of violation, Y should be measured on interval scale)
3. Multiple regression cannot be applied for ordinal or nominal case variables

Logistic regression: Used for categories and assessment of risks

Discriminant regression: discrimination b/w categories

Scatterplots are drawn assuming linearity, b/w Y & Y’(predicted value)

Steps for regression:


1. Test the assumptions
2. Fit the regression equation & interpret the coefficients
Y^ = α^ + B1^X1 + B2^X2 + B3^X3 (estimated best-fit line)
3. Interpret the efficiency of the forecast using R2
4. Test for the significance of the regression equation using ANOVA
5. Test for the significance of the independent variables using ‘t’ test
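Steps 2 and 3 above can be sketched with numpy in place of SPSS; the data below are made up and follow an exact linear relation, so R2 comes out as 1. The significance tests of steps 4 and 5 would additionally need F and t critical values, which are omitted here.

```python
import numpy as np

# Made-up data with an exact linear relation Y = 4 + 0.3*X1 + 0.7*X2
# (U = 0), so the fitted coefficients are recovered exactly.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = 4.0 + 0.3 * X1 + 0.7 * X2

# Step 2: fit the regression equation (design matrix with an
# intercept column) by ordinary least squares.
X = np.column_stack([np.ones_like(X1), X1, X2])
coef, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ coef

# Step 3: efficiency of the forecast, R2 = 1 - SSE/SST.
sse = float(np.sum((Y - Y_hat) ** 2))
sst = float(np.sum((Y - Y.mean()) ** 2))
r2 = 1.0 - sse / sst

print(np.round(coef, 3))  # coefficients [4, 0.3, 0.7] recovered
print(round(r2, 3))       # 1.0
```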

Case study - Sales as a function of other variables

1. Test for normality of "Sales" using the Kolmogorov-Smirnov test

H0: Sales follows a normal distribution

H1: Sales does not follow a normal distribution

Z = 0.244, p = 0.000, α = 0.05

Since p < α, reject H0: sales does not follow a normal distribution, but sales is measured on a ratio scale,
hence we proceed with regression

2. Fit the model for predicted values | Run Analyze>Regression>Linear


a. All are positively influencing other than “index”
b. If market potential is increased by 1 unit (1 lakh), sales will increase by 0.312 lakhs,
keeping other variables constant
c. If dealers are increased by 1, sales increase by 0.309 lakhs
d. If the number of salespersons is increased by 1, sales will increase by 0.696 lakhs
e. If competition increases by 1 unit, sales will decrease by 3.219 lakhs
3. For comparing relative importance between different influencers, we look at standardized β
coefficients
4. Efficiency R2 = 0.9737, the remainder is U | Adding more variables can only increase R2, never reduce it
a. Adjusted R2 corrects for the number of independent variables added
5. Construct scatter plot to check linearity
6. To test for the significance of R2/goodness of fit | The highest value of R2 decides which model is
the best fit
a. H0: The model insignificantly explains sales
b. H1: The model significantly explains sales
Since F = 265.339, p = 0 & α = 0.05, p < α, reject H0
7. Significance of the independent variables (all) | the t-stat indicates significance
a. H0: Market potential insignificantly influences sales (B1 = 0)
b. H1: Market potential significantly influences sales (B1 ≠ 0)
Since p < α, reject H0
8. To test the model | Best Linear Unbiased Estimator (BLUE) model | Gauss-Markov conditions
9. Assumptions for BLUE are:
a. The mean of residuals is 0
b. The residuals are normally distributed
c. Variance of residuals is constant(homogeneity of variance)
d. Successive residuals are uncorrelated/independent among themselves | Violation of
this assumption is called auto-correlation | All time series are auto-regressive
e. The independent variables X are independent among themselves(violation of this will
become multi collinearity)

Identification of multi-collinearity:

1. Variance Inflating Factor = 1/(1 - R2) | VIF is always ≥ 1
a. If VIF = 1, R2 = 0 => MC is absent
b. If VIF = 1-6 => MC is insignificant
c. If VIF = 6-10, MC is significant (researcher's decision zone)
d. If VIF > 10, severe MC => remedial action is required
2. Condition indices:
a. If CI = 1, R2 = 0 => MC is absent
b. If CI <= 15 => MC is insignificant
c. If CI = 15-25, MC is significant (researcher's decision zone)
d. If CI > 25, severe MC => remedial action is required
3. Generally VIF and CI will give similar results. In case of conflicts, CI is given priority.
4. For reducing multi-collinearity, we drop the variables one by one. The variable having the highest VIF is
dropped first, and so on.
5. Collinearity stats: Tolerance is the unexplained proportion of variance, the reciprocal of VIF | we look at the
dimensional CI and then start removing the variables with the highest VIF
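The VIF computation above can be sketched with numpy by regressing each independent variable on the others. The three made-up variables below include one (X3) deliberately built as a near-copy of X1, so the VIFs of X1 and X3 come out far above 10 while X2 stays near 1.

```python
import numpy as np

# Made-up independent variables; X3 is almost collinear with X1.
rng = np.random.default_rng(0)
X1 = rng.normal(size=50)
X2 = rng.normal(size=50)
X3 = X1 + 0.01 * rng.normal(size=50)

def vif(target, others):
    """VIF = 1/(1 - R2) from regressing `target` on the other X's."""
    A = np.column_stack([np.ones_like(target)] + others)
    coef, _, _, _ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r2)

vifs = {
    "X1": vif(X1, [X2, X3]),
    "X2": vif(X2, [X1, X3]),
    "X3": vif(X3, [X1, X2]),
}
for name, v in vifs.items():
    print(name, round(v, 1))  # X1 and X3 far above 10; X2 near 1
```

Following the dropping rule above, the variable with the highest VIF would be removed first and the VIFs recomputed.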

May 21st – EOD submission for project

Successive residuals should be independent among themselves – if this condition is violated, it becomes auto-
correlation/serial correlation.

The most common form of detection is the Durbin-Watson test: D ≈ 2(1 - ρ), where ρ is the correlation
coefficient between two successive residuals. Since ρ lies b/w -1 and 1, D lies b/w 0 and 4.

H0: AC is insignificant (ρ = 0)

H1: AC is significant (ρ ≠ 0)

 D = 0, ρ = 1, perfect positive auto-correlation
 D = 2, ρ = 0, auto-correlation absent
 D = 4, ρ = -1, perfect negative auto-correlation
 D = 1.5-2 & 2-2.85, auto-correlation insignificant (positive & negative respectively)
 D = 0-1.5, significant positive auto-correlation
 D = 2.85-4, significant negative auto-correlation

Remedial action: In case AC is present, we apply Generalized least squares/weighted least squares
method for estimation in place of OLS.
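The Durbin-Watson statistic itself is simple to compute; a sketch with two made-up residual series, one alternating in sign (negative auto-correlation, D well above 2) and one drifting smoothly (positive auto-correlation, D near 0):

```python
import numpy as np

# Durbin-Watson: D = sum((e_t - e_{t-1})^2) / sum(e_t^2), which is
# approximately 2 * (1 - rho) for the lag-1 correlation rho.

def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

alternating = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5]  # sign flips
smooth = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]           # slow drift

print(round(durbin_watson(alternating), 2))  # 3.35: negative AC
print(round(durbin_watson(smooth), 2))       # near 0: positive AC
```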

3) Variance of residuals = constant(homoscedasticity)

Homoscedasticity, or homogeneity of variances, is an assumption of equal or similar variances in


different groups being compared. This is an important assumption of parametric statistical tests
because they are sensitive to any dissimilarities. Uneven variances in samples result in biased and
skewed test results.

If the variance is not constant, there will be a high degree of loss in predictive ability/forecast accuracy

Heteroscedasticity is the biggest deterrent, as it causes high variance in the observations.

(i) BPG (Breusch-Pagan-Godfrey) test


(ii) Goldfeld-Quandt test
(iii) White test – the best, an assumption-free test based on the chi-square distribution

Parametric statistics are based on assumptions about the distribution of population from which the
sample was taken. Nonparametric statistics are not based on assumptions, that is, the data can be
collected from a sample that does not follow a specific distribution.

These tests are not on SPSS, but on E-Views and R | Instead of an inferential test, we can use a diagrammatic
test: plot a scatter b/w the standardized predicted values and the standardized residuals. If there is no evident
pattern, the data is homoscedastic; an evident pattern indicates heteroscedasticity.

Remedial actions: we apply Generalized least squares/weighted least squares method for estimation in
place of OLS.
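A minimal weighted-least-squares sketch for the remedial action above, assuming the later observations have higher residual variance and so receive smaller weights; the data and weights are made up, with an underlying relation of roughly Y = 2X.

```python
import numpy as np

# WLS via OLS: scale each row of the design matrix and Y by sqrt(w),
# then reuse ordinary least squares on the transformed data.
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
w = np.array([1.0, 1.0, 1.0, 0.25, 0.25])  # later points are noisier

sw = np.sqrt(w)[:, None]
coef, _, _, _ = np.linalg.lstsq(X * sw, Y * np.sqrt(w), rcond=None)
print(np.round(coef, 2))  # intercept near 0, slope near 2
```

Generalized least squares extends the same idea to correlated residuals, which is why it also serves as the remedy for auto-correlation.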

2) Mean of residuals = 0 | Sum of deviations about mean is always 0

1) Residuals are normally distributed | If the residuals are not normally distributed, this indicates a mis-
specification error (e.g., non-linearity is present)

Analyze>Regression>Linear> Plots + Stats + Save

To check for normality diagrammatically, histogram or NPP can be selected

Purchase intention case study:


1. Ran regression and checked for normality and linearity – SPSS
R2 = 43.6% - the model explains only ~44% of the buying behaviour
Durbin = 1.9, very close to 2

Using ANOVA model significant, null hypo rejected – F and p-values

Looking at the std. beta coefficients – the highest is "like to travel", > 0.5

Unstd. coefficients are used to build predictive models only

This is a fit case of multi-collinear data: all variables are individually insignificant

The 2nd approach to eliminate multi-collinearity is factor analysis

It eliminates MC

It reduces the number of independent variables, making the model more logical

Factors are latent constructs, which are not directly observed but derived
Factor analysis uses:
1. It is used for data reduction

2. It helps to eliminate multi-collinearity

Pre-requisites are:

1. The variables should be on the metric scale – interval or ratio scale


2. The variables should have significant correlation b/w themselves
H0: Correlation is insignificant (R2 = 0)
H1: Correlation is significant (R2 > 0)
We use Bartlett's test of sphericity (χ2) – it is a chi-square test
3. The sample size should be sufficiently large/adequate – tested by the Kaiser-Meyer-Olkin
(KMO) value > 0.5. It is called the adequacy test

Assumptions 1 & 2 are necessary, while 3 is a sufficient condition (the sample should be at least 5 times the
number of observed variables)
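Bartlett's test statistic can be sketched from the determinant of the correlation matrix using the standard formula χ2 = -[(n - 1) - (2p + 5)/6] · ln|R| with df = p(p - 1)/2; the matrix and sample size below are made up, and the chi-square p-value lookup is omitted.

```python
import numpy as np
from math import log

# If R were an identity matrix (no correlation at all), det(R) = 1
# and chi2 = 0; the further det(R) falls below 1, the larger chi2.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
n, p = 200, R.shape[0]

chi2 = -((n - 1) - (2 * p + 5) / 6) * log(np.linalg.det(R))
df = p * (p - 1) // 2
print(round(chi2, 1), df)  # a large chi2 on 3 df: reject H0
```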

Differences b/w FA and regression:

1. In FA the lambdas are called factor loadings, vs the regression coefficients in regression


2. Regression has a fixed effect called alpha; FA does not
3. FA has no disturbance term U

The factor derives its name from its dominating variables. The dominating variables mostly have a high
correlation amongst themselves. FA is also called segmentation of variables.

Since the study was conducted on a 9-point Likert scale, the 1st assumption is fulfilled

H0: Correlation is insignificant

H1: Correlation is significant

χ2 = 7280.059, p = 0, α = 0.05 => Since p < α, the null is rejected

KMO = 0.750, which is > 0.5, therefore the 2nd condition is satisfied

The determinant of the correlation matrix R has to be looked at.

Factor Analysis:

1. Is an interdependence technique
2. Used to reduce the no. of observed variables
3. Scores/data for the latent variables/constructs are generated
4. Helps to remove multi-collinearity

F = λ1X1 + λ2X2 + … + λnXn

Theoretically, no of factors = no of variables

Conditions:

1. Variables should be measured on metric scale(interval or ratio)


2. Variables should be significantly correlated, R>0
3. Sample should be adequate

Exploratory (Purchase intention) vs confirmatory (Dell): basically knowing whether the factors and groupings
are known or not

Steps:

1. Test assumptions
2. Identify no. of factors to be extracted
a. Exploratory: the no. of factors to be extracted is not known
i. Eigenvalue > 1 approach
ii. Scree plot approach: a diagrammatic approach plotting the eigenvalues vis-a-vis
the factors on the x-axis. We try to find the elbow in the graph, and the
corresponding number of factors is extracted. This is an indicative, subjective approach. It
generally gives 1 factor more than the eigenvalue approach.
b. Confirmatory: no. of factors to be extracted are known
3. Method of extraction: if the factor axes intersect at right angles, there is no correlation b/w the factors
a. PCA (Principal Component Analysis) – factors are extracted at 90 degrees (orthogonal extraction)
b. Principal Axis Factoring – factors are extracted at any angle other than 90 degrees; MC will
still remain
4. Redistribution of variability or rotation:
a. Varimax – variance maximizing within each factor, to have optimal variance across all
factors. Orthogonal, at 90 degrees
b. Quartimax – at 45 degrees
c. Equamax – at 60 degrees
d. Promax –
e. (There should be more than 15 iterations to allow for variability) | The purpose is to
stabilize and converge.
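The eigenvalue > 1 approach can be sketched with numpy; the made-up correlation matrix below contains two blocks of correlated variables, so two eigenvalues exceed 1 and two factors are extracted. It also illustrates that the eigenvalues always sum to the number of variables.

```python
import numpy as np

# Made-up correlation matrix: variables 1-2 form one correlated block
# and variables 3-4 another, with weak cross-block correlation.
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

eigenvalues = np.linalg.eigvalsh(R)[::-1]  # sorted, largest first
n_factors = int(np.sum(eigenvalues > 1))

print(np.round(eigenvalues, 2))            # two values above 1, two below
print(n_factors)                           # 2
print(round(float(eigenvalues.sum()), 6))  # 4.0 = number of variables
```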

Analyze > Dimension reduction > Factor

Descriptives > KMO & coefficients | Extraction > PCA with scree plot & the exploratory or confirmatory choice

H0: Correlation is insignificant

H1: Correlation is significant

χ2 = 7280.059, p = 0, α = 0.05 => Since p < α, the null is rejected. Hence the 2nd assumption is fulfilled

Since the study was conducted on a 9-point Likert scale, the 1st assumption is fulfilled

KMO = 0.750, which is > 0.5, therefore the 3rd condition is satisfied

The determinant of the correlation matrix R has to be looked at.

P.S: Sum of eigen values will be equal to no of variables

The higher the communality, the more important the variable.


The higher the factor loading, the more important the variable is to the given factor

P.S: If the sample correlation, r = 0.43, the variable will be insignificant

Factor loadings will lie b/w -1 to +1

5. Name the factors on the basis of dominating variables


6. Calculate the factor scores(all individual scores of each factors)

Purchase intention:

a. 1st factor having 6 variables(monetary) – F1 – Financial behaviour


b. 2nd factor having 4 variables(style) – F2 – Style behaviour
c. 3rd factor having 3 variables(Future/optimism) – F3 – Future prediction
d. 4th factor having 3 variables(Confidence) – F4 – Confidence
e. 5th factor having 3 variables(travel) – F5 – Travel
f. 6th factor having 2 variables(home) – F6 – Family oriented

>Scores>Save as variables + regression

Then build regression model using factors extracted

Collinearity diagnostics, residuals and predicted values, standardized and unstandardized coefficients

Durbin is insignificant

Auto-correlation is positive

Heteroscedasticity is present

All are positively influencing the intention to buy
