Model 1 -

import pandas as pd
import statsmodels.api as sm

# Load the data into a pandas DataFrame
data = pd.read_csv('data.csv')

# Define the dependent and independent variables
X = data[['Volume', 'VixClose']]  # Independent variables
y = data['NiftyClose']  # Dependent variable

# Add a constant term to the independent variables
X = sm.add_constant(X)

# Create the multiple linear regression model
model = sm.OLS(y, X).fit()

# Print the summary of the model
print(model.summary())

The summary provides us with the following information:

1. The R-squared value is 0.613, which indicates that the independent variables
(Volume and VixClose) explain about 61.3% of the variation in the dependent
variable (NiftyClose).
2. The coefficients for Volume and VixClose are 0.0107 and 99.9138, respectively.
This means that for every unit increase in Volume, NiftyClose is expected to
increase by 0.0107 units, holding VixClose constant. Similarly, for every unit
increase in VixClose, NiftyClose is expected to increase by 99.9138 units, holding
Volume constant.
3. The p-values for both Volume and VixClose are very small (close to zero),
indicating that the coefficients are statistically significant.
4. The F-statistic and its associated p-value indicate that the model as a whole is
statistically significant.
5. The Durbin-Watson statistic is 0.127, far below the value of 2 that corresponds to
no autocorrelation, which points to strong positive autocorrelation in the residuals.
6. The Jarque-Bera statistic and its p-value indicate that the residuals are not
normally distributed.

Overall, while the model appears to be statistically significant, the autocorrelation and
non-normality of the residuals violate the assumptions behind the reported standard
errors, so the p-values above should be read with caution. Further diagnostics and
potential remedies (e.g., transformations, alternative regression techniques) may be
required to improve the model's fit and reliability.
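
To make the Durbin-Watson reading concrete, the statistic can be computed directly from a residual series. The sketch below is a plain-Python illustration of the textbook formula, not the statsmodels implementation. The function names durbin_watson and implied_lag1_autocorr and the toy residuals are assumptions made for this example, and the approximation d = 2(1 - r) is only reasonable for fairly long series.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def implied_lag1_autocorr(dw):
    """Approximate lag-1 autocorrelation r from d ~= 2 * (1 - r)."""
    return 1.0 - dw / 2.0


# Toy residuals, made up for illustration (not taken from data.csv)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating signs: above 2
print(durbin_watson([1.0, 1.1, 1.2, 1.1]))    # slowly drifting: close to 0
print(implied_lag1_autocorr(0.127))           # value reported in the summary
```

Under this approximation, the reported value of 0.127 corresponds to a lag-1 residual autocorrelation of roughly 0.94. For a fitted model, the same statistic should be obtainable by passing model.resid to durbin_watson from statsmodels.stats.stattools.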

Model 2 -

# Define the dependent and independent variables
X = data[['Volume']]  # Independent variable
y = data['VixClose']  # Dependent variable

# Add a constant term to the independent variable
X = sm.add_constant(X)

# Create the linear regression model
model = sm.OLS(y, X).fit()

# Print the summary of the model
print(model.summary())

The summary provides us with the following information:

1. The R-squared value is 0.250, which indicates that the independent variable
(Volume) explains about 25% of the variation in the dependent variable
(VixClose).
2. The coefficient for Volume is 2.03e-06, which means that for every unit increase
in Volume, VixClose is expected to increase by 2.03e-06 units. Because Volume is
the only regressor, no other variables are held constant in this model.
3. The p-value for Volume is very small (close to zero), indicating that the coefficient
is statistically significant.
4. The F-statistic and its associated p-value indicate that the model as a whole is
statistically significant.
5. The Durbin-Watson statistic is 0.227, far below the value of 2 that corresponds to
no autocorrelation, which points to strong positive autocorrelation in the residuals.
6. The Jarque-Bera statistic and its p-value indicate that the residuals are not
normally distributed.
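
A coefficient as small as 2.03e-06 is easier to read once it is expressed per a larger step in Volume. The snippet below is a small arithmetic sketch; the helper name rescale_coefficient and the choice of a 1,000,000-unit step are assumptions made for illustration, since the source does not state the units of Volume.

```python
def rescale_coefficient(coef, units_per_step):
    """Expected change in the dependent variable for a change of
    `units_per_step` in the regressor, given the per-unit coefficient."""
    return coef * units_per_step


per_unit = 2.03e-06  # Volume coefficient reported in the summary
per_million = rescale_coefficient(per_unit, 1_000_000)
print(f"Change in VixClose per 1,000,000 units of Volume: {per_million:.2f}")
```

Equivalently, dividing the Volume column by 1,000,000 before fitting would scale the coefficient by the same factor while leaving R-squared and the p-value unchanged.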

While the model is statistically significant, the R-squared value is relatively low,
indicating that Volume alone does not explain a large portion of the variation in
VixClose. Additionally, there are concerns regarding the assumption violations
(autocorrelation and non-normality of residuals).
To improve the model's fit and reliability, further diagnostics and potential remedies
(e.g., transformations, alternative regression techniques, including additional
independent variables) may be required.
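
The Jarque-Bera statistic mentioned above combines the skewness and kurtosis of the residuals. The following is a minimal plain-Python sketch of the standard formula JB = n/6 * (S^2 + (K - 3)^2 / 4), not the statsmodels code. The function name jarque_bera and the toy residuals are assumptions made for this example.

```python
def jarque_bera(residuals):
    """Return (JB, skewness, kurtosis) using population moments.

    JB = n/6 * (S^2 + (K - 3)^2 / 4); a normal sample has S near 0
    and K near 3, so JB is near 0.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Toy residuals with one large outlier, made up for illustration
jb, skew, kurt = jarque_bera([-1.0, -0.5, 0.0, 0.5, 6.0])
print(f"JB={jb:.3f}, skew={skew:.3f}, kurtosis={kurt:.3f}")
```

Under normality the statistic is asymptotically chi-squared with 2 degrees of freedom, so large values lead to rejection. For a fitted model, statsmodels provides jarque_bera in statsmodels.stats.stattools, which can be applied to model.resid.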

Model 3 -

# Define the dependent and independent variables
X = data[['NiftyClose', 'Volume']]  # Independent variables
y = data['VixClose']  # Dependent variable

# Add a constant term to the independent variables
X = sm.add_constant(X)

# Create the multiple linear regression model
model = sm.OLS(y, X).fit()

# Print the summary of the model
print(model.summary())

The summary provides us with the following information:

1. The R-squared value is 0.695, which indicates that the independent variables
(NiftyClose and Volume) explain about 69.5% of the variation in the dependent
variable (VixClose).
2. The coefficient for NiftyClose is -0.0016, which means that for every unit increase
in NiftyClose, VixClose is expected to decrease by 0.0016 units, holding Volume
constant. This negative relationship is expected, as higher NiftyClose values (i.e.,
higher market levels) are generally associated with lower levels of volatility
(VixClose).
3. The coefficient for Volume is 1.88e-06, which means that for every unit increase
in Volume, VixClose is expected to increase by 1.88e-06 units, holding NiftyClose
constant. This positive relationship is expected, as higher trading volume is often
associated with higher market volatility.
4. The p-values for both NiftyClose and Volume are very small (close to zero),
indicating that the coefficients are statistically significant.
5. The F-statistic and its associated p-value indicate that the model as a whole is
statistically significant.
6. The Durbin-Watson statistic is 0.118, far below the value of 2 that corresponds to
no autocorrelation, which points to strong positive autocorrelation in the residuals.
7. The Jarque-Bera statistic and its p-value indicate that the residuals are not
normally distributed.

By including both NiftyClose and Volume as independent variables, the model's
explanatory power (R-squared) has increased substantially, from 0.250 in the previous
model with Volume alone to 0.695. However, there are still concerns regarding the
assumption violations (autocorrelation and non-normality of residuals).
Further diagnostics and potential remedies (e.g., transformations, alternative regression
techniques) may be required to improve the model's fit and reliability, especially if the
assumption violations are severe.
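
One transformation commonly tried when level series produce strongly autocorrelated residuals is to regress period-to-period changes instead of levels. The sketch below only illustrates the transformation itself in plain Python. The helper names and the sample closing levels are hypothetical, and whether differencing resolves the problem for this dataset cannot be known without refitting the model.

```python
def first_difference(series):
    """Return period-to-period changes: x[t] - x[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def pct_change(series):
    """Return simple period-to-period returns: x[t] / x[t-1] - 1."""
    return [series[t] / series[t - 1] - 1.0 for t in range(1, len(series))]


# Hypothetical closing levels, made up for illustration (not from data.csv)
nifty_close = [100.0, 102.0, 101.0, 104.0]
print(first_difference(nifty_close))  # [2.0, -1.0, 3.0]
print(pct_change(nifty_close))
```

With pandas, the same transformations are available as DataFrame.diff() and DataFrame.pct_change(), followed by dropna() to discard the first row. The differenced columns could then be passed to sm.OLS as before and the Durbin-Watson statistic rechecked.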