100% certainty = confidence + error

Because we work with samples, we can never be certain; we can only state with some level of confidence that something will happen.

The minimum confidence that we can accept will be 90%, so the maximum error level (α) will
be 10%.

The maximum confidence that we normally use is 99%, so the minimum error level (α) will be 1%.

Hypothesis testing

H0=Null hypothesis

H1=Alternative hypothesis

At the end of the testing procedure we get a probability (the p-value), which we compare with α to decide whether we accept H0.

If p-value > α → H0

If p-value < α → H1

Example: p-value = 0.23 (23%) > 5%, so I accept H0.
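The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the default α of 5% are my own choices, not something given in class.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the hypothesis kept by the rule in the notes."""
    if not 0.0 <= p_value <= 1.0 or not 0.0 < alpha < 1.0:
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # p-value > alpha: keep H0; otherwise reject H0 in favour of H1.
    return "H0" if p_value > alpha else "H1"


print(decide(0.23))          # the example from the notes: 23% > 5%
print(decide(0.053, 0.05))   # 5.3% against a 5% error level
print(decide(0.053, 0.10))   # the same p-value against a 10% error level
```

The same p-value can lead to different decisions depending on the error level we fixed beforehand, which is why α has to be chosen before looking at the output.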

Golden rule: always look at the p-value first.

H0: no relationship (coefficient approximately 0)

H1: significant relationship

In the homework we will have to explain the outputs. We should write as if the teacher knew nothing about econometrics or statistics.

1.-The simple linear regression model

Y(x) = a + bx + ε

a + bx is the confidence (the part explained by the model).

ε is the error:

ε = Y_real - Y_estimated (ŷ)

PRACTICE
Search on Google for how to interpret the values in a simple regression model. The conclusions have to be explained.

If the standard deviation is greater than the mean, the variable is very dispersed, so we have to take the log of assets.

F is the Fisher test; it is the test related to the ANOVA procedure. If there is a relationship between the variables, the explained variance will be high; otherwise, the residual variance will be high.

H1: relationship

H0: no relationship

If Prob > F is higher than 5% (or whatever critical level we set), we accept H0 (there is no significant relationship).
In this case we accept H1. Also, Prob > F is the same as P>|t|, because there is only one explanatory variable in the model.

R-squared is 0.7117, i.e. about 71% of the variation in the dependent variable is explained, so we are already in the upper part of the range.

If the coefficient is significant, 0 is not inside its confidence interval. Here the whole interval is positive, which shows a positive impact of revenues on assets.

The coefficient 0.85 is an elasticity: if revenues doubled, assets would increase by roughly 80% (2^0.85 ≈ 1.80).

Adjusted R-squared is used for comparing models. We have 118 companies with observations of revenues and assets. The larger the sample the better, because the results are more stable.
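As a rough illustration of why adjusted R-squared is the one to use for comparison, here is the usual textbook formula in Python. The formula and the function name are my addition (they were not written in class); n is the number of observations and k the number of explanatory variables.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: penalises R-squared for each extra regressor."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Values from the notes: R-squared = 0.7117, 118 companies, one regressor.
print(round(adjusted_r2(0.7117, 118, 1), 4))
# Same R-squared with three regressors: the adjusted value is lower.
print(round(adjusted_r2(0.7117, 118, 3), 4))
```

With a sample this large the penalty is small, which fits the remark that bigger samples give more stable results.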

lassets = 4.33 + 0.85·lrev + ε (epsilon; the equation is an estimate)

ε = lassets_real - lassets_estimated (lassets^)

Homoskedastic means that I have similar variation all along the plot.

On the contrary, heteroskedasticity is NOT OK, because it shows that I have problems with the data that I didn't solve.
As I can't draw a rectangle around the points in my plot but I can draw a funnel, the data is heteroskedastic and needs to be treated.

TEST FOR HETEROSKEDASTICITY

As the probability is lower than 5%, we reject H0 (constant variance/homoskedasticity) and accept H1: the data is heteroskedastic.
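To see what such a test does with the residuals, here is a minimal Python sketch. It assumes a Breusch-Pagan style test in Koenker's n·R² form with a single regressor; the notes do not say which variant the software runs, so treat the choice of test, the function names and the made-up funnel data as assumptions.

```python
import math


def ols(x, y):
    """Closed-form simple OLS: returns intercept a and slope b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y):
    """R-squared of the simple regression of y on x."""
    a, b = ols(x, y)
    my = sum(y) / len(y)
    sst = sum((yi - my) ** 2 for yi in y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / sst


def heteroskedasticity_test(x, y):
    """LM = n * R-squared from regressing the squared residuals on x."""
    a, b = ols(x, y)
    squared_residuals = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(0.0, len(x) * r_squared(x, squared_residuals))
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Hypothetical funnel-shaped data: the spread grows with x.
x = list(range(1, 21))
funnel = [2 + 0.5 * xi + (-1) ** xi * 0.2 * xi for xi in x]
lm, p = heteroskedasticity_test(x, funnel)
print(f"LM = {lm:.2f}, p-value = {p:.4f}")
print("reject H0 (heteroskedastic)" if p < 0.05 else "accept H0 (homoskedastic)")
```

H0 is constant variance, so a small p-value is the bad outcome here: it is the numerical version of seeing a funnel in the plot.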

TEST RAMSEY REGRESSION SPECIFICATION…

Are there omitted variables that affect assets and would therefore change the regression?
If probability > error level, accept H0. If it is lower, reject H0 and accept H1.

In this case the probability is 5.3%, so at a 5% significance level you accept H0, but at 10% you reject it.

INFORMATION CRITERIA

In OLS we minimize the squared errors (the squared differences between reality and estimation):

a = M(y) - b·M(x), where M is the mean

b = cov(x, y) / var(x), where var(x) is the variance of x (its standard deviation squared)
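The two formulas can be checked by hand with a short Python sketch. The data below is made up for illustration and the function name `ols_fit` is my own.

```python
def ols_fit(x, y):
    """Simple OLS using b = cov(x, y) / var(x) and a = M(y) - b * M(x)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    # Residual = real value minus estimated value.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, residuals = ols_fit(x, y)
print(f"a = {a:.2f}, b = {b:.2f}")
print("sum of residuals:", round(sum(residuals), 10))
```

The residuals of an OLS fit with an intercept add up to zero, which is a quick sanity check on the calculation.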

The tests are run on the residuals.

If the correlation coefficient between two explanatory variables is above 0.5, you should not put them both in the same regression.
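A small Python sketch of this screening rule follows. The 0.5 threshold comes from the notes; comparing the absolute value of the correlation (so that strong negative correlations are also flagged) is my assumption, and the data is invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def too_correlated(x1, x2, threshold=0.5):
    """True if the two regressors should not share a regression."""
    return abs(pearson_r(x1, x2)) > threshold


x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]     # moves together with x1
x3 = [1, -1, 0, -1, 1]   # unrelated to x1
print(round(pearson_r(x1, x2), 4), too_correlated(x1, x2))
print(round(pearson_r(x1, x3), 4), too_correlated(x1, x3))
```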
HOMEWORK

In the homework we have a list of variables and must pick the ones that go into the model. We may not have normality, in which case we will have to use the log likelihood and the information criteria instead of R-squared.
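For reference, this is how the two most common information criteria are built from the log likelihood, using the textbook definitions. Software may rescale them (for example dividing by the number of observations), so compare models only within the same program. The log likelihood values below are hypothetical.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian (Schwarz) criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical models on 118 observations; k counts estimated parameters.
small = {"ll": -150.0, "k": 2}
large = {"ll": -148.5, "k": 4}
for name, m in (("small", small), ("large", large)):
    print(name, aic(m["ll"], m["k"]), round(bic(m["ll"], m["k"], 118), 2))
```

In this made-up case the larger model has a slightly higher log likelihood, but not enough to pay for its two extra parameters, so both criteria prefer the smaller model.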

Thursday

With a 1% increase in revenues, we estimate an average increase of 0.85% in assets.
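The "1% gives 0.85%" reading is an approximation that works for small changes. A quick Python check of the exact change implied by a log-log coefficient (the function name is my own):

```python
def pct_change_in_y(elasticity: float, pct_change_in_x: float) -> float:
    """Exact % change in y in a log-log model when x changes by a given %."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100


print(round(pct_change_in_y(0.85, 1), 3))    # small change: close to 0.85
print(round(pct_change_in_y(0.85, 100), 2))  # doubling revenues
```

For large changes such as a doubling, the exact figure is noticeably different from simply multiplying the coefficient by the percentage change.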

To add more variables to the regression (making it a multiple regression), we click the arrows in front of the variable names on the right-hand side of the screen to send them to the command.

Between the simple and the multiple regression, Prob > F is still 0.

DI is not significant because its p-value is 0.324; the same goes for IIRFMention.


R-squared tells us that the variables we have considered explain 71% of assets (the dependent variable); however, looking at P>|t| for each variable individually shows us the importance of each one.

With the standardized beta coefficients, the variables become comparable.

Now in the right column we have the beta coefficients. For revenues I have a coefficient of 0.84, for DI 0.057, and for IIRF -0.04. A positive coefficient shows a direct relationship, and a negative one an inverse relationship.

The variable with the largest beta in absolute value is the one with the most impact (Rev).
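A sketch of what standardizing does, in Python. It uses the usual definition (raw coefficient times the standard deviation of x over the standard deviation of y), which I am assuming is what the software reports; the function name is illustrative.

```python
import statistics


def standardized_beta(b: float, x, y) -> float:
    """Rescale a raw coefficient b into standard-deviation units."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Toy check: y = 2x exactly, so the standardized beta is 1.
print(standardized_beta(2.0, [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))

# Ranking the betas reported in the notes by absolute size.
betas = {"Rev": 0.84, "DI": 0.057, "IIRF": -0.04}
most_impacting = max(betas, key=lambda name: abs(betas[name]))
print(most_impacting)
```

Ranking by absolute value matters because a strongly negative beta is also a big impact, just in the opposite direction.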

OPEN A FILE

File, Log, Begin (this is for the output, to keep a record of what you have done). After that, you import the Excel sheet.

To save the data set (the new variables we add, etc.), we click Save As.

TAKING THE LOG OF A VARIABLE

gen l(name) = ln(variable)
If we have a per capita value, it is better to use it as the dependent variable.

As the graph shows, there is a strong relationship between the variables, since the points clearly follow a common direction. We can expect a strong linear regression.

Graphs are saved from their own menu.

In this case, when we use investments instead of exports, we see that the relationship between the variables is weaker.

CV: coefficient of variation ((standard deviation / mean) x 100). It shows how far we deviate from the mean; if it is below 30%, the dispersion is low and the variable is therefore homogeneous.

In the case of exp, the standard deviation is greater than the mean, which is why the coefficient of variation is above 100%.
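The coefficient of variation is easy to reproduce by hand. A minimal Python version, using the sample standard deviation (an assumption on my part) and invented numbers:

```python
import statistics


def coefficient_of_variation(values) -> float:
    """(standard deviation / mean) x 100, using the sample standard deviation."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for mean 0")
    return statistics.stdev(values) / abs(mean) * 100


homogeneous = [10, 11, 9, 10, 10]
dispersed = [1, 2, 1, 3, 50]   # one huge value, like a few very large exporters
print(round(coefficient_of_variation(homogeneous), 1))
print(round(coefficient_of_variation(dispersed), 1))
```

The first series falls well under the 30% rule of thumb; in the second the standard deviation exceeds the mean, so the coefficient is above 100%.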
The higher the R-squared, the more of the dependent variable you explain, so the better the model.

If prob > F is below 5%, we accept H1 (significant GLOBAL relationship between the variables).

To check the significance of each variable, we look at P>|t|.

Lgdp_capital = 0.57 + 0.51·ldom - 0.05·lfdi_abs + 0.83·lexp - 0.97·lSAV_abs

• A 1% increase in domestic credit leads on average to a 0.51% increase in GDP per capita.
• A 1% increase in foreign investment leads on average to a 0.05% drop in GDP per capita.
• A 1% increase in savings leads on average to a 0.97% decrease in GDP per capita.

Countries that receive more foreign investment tend to have lower GDP per capita (because of the negative relationship between the variables).

To find which variable has the greatest effect on GDP, we compute the beta coefficients. We can see that savings has the greatest impact on GDP.

If VIF > 10, we have serious problems. The VIF tells us that these 4 variables are highly correlated (multicollinearity), so I should put each of them in its own simple regression.
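For intuition on the VIF > 10 rule: the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other explanatory variables. This standard definition is my addition to the notes, and the R² values below are hypothetical.

```python
def vif(r2_auxiliary: float) -> float:
    """Variance inflation factor from the auxiliary regression's R-squared."""
    if not 0.0 <= r2_auxiliary < 1.0:
        raise ValueError("the auxiliary R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r2_auxiliary)


for r2 in (0.0, 0.5, 0.9, 0.95):
    value = vif(r2)
    flag = "problem" if value > 10 else "ok"
    print(f"R2 = {r2:.2f} -> VIF = {value:.1f} ({flag})")
```

So a VIF above 10 means that more than 90% of the variation in that variable is already explained by the other regressors.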

In this case we are going from the general to the specific.

The change in signs means that the inverse relationship that savings and lfdi had with GDP appeared only because the credit and export variables were also in the model.

EVIEWS
The first variable you select is the dependent one.

Most of the procedures are under View. Go to Representations, and it will show us how to write the equations.

View - Actual, Fitted, Residual.


Now we run the heteroskedasticity test.

We accept H0 because the p-value is higher than 5%.

TO SAVE IN EVIEWS:


------------------------------------------------------------------------------------------------------------------------------

TIME SERIES
This is about following a variable over time; it arises from the need to forecast. We always look back to identify the trend and then make estimates for the future.

In order to forecast you need a history, and you assume that whatever happened in the past persists into the future.

Stationarity: a variable is stationary if its mean and standard deviation are stable over time.

Non-stationarity: the past impacts the present. I can remove that impact by simply subtracting from each value its previous one. This removes the variable's correlation with its previous value. Non-stationary variables have an unstable mean and standard deviation.
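The "subtract the previous value" step is called first differencing. A tiny Python sketch with an invented trending series (the function name is my own):

```python
import statistics


def first_difference(series):
    """Subtract from each value its previous one (the result is one item shorter)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


trend = [100, 103, 106, 109, 112, 115, 118, 121]
diff = first_difference(trend)
print(diff)

# The mean of the original series drifts between the two halves...
print(statistics.mean(trend[:4]), statistics.mean(trend[4:]))
# ...while the mean of the differenced series stays put.
print(statistics.mean(diff[:3]), statistics.mean(diff[3:]))
```

In real data one difference is not always enough, so the differenced series should be checked for stationarity again.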

When we have two variables we have to preserve the time ordering, because what we are interested in is the evolution.
