Because we work with samples, we can never be certain of a result; we can only state it with a given level of confidence.
The minimum confidence that we can accept is 90%, so the maximum error level (α) is 10%.
The maximum confidence that we can accept is 99%, so the minimum error level is 1%.
Hypothesis testing
H0 = null hypothesis
H1 = alternative hypothesis
If p-value > α, we accept (fail to reject) H0.
If p-value < α, we reject H0 and accept H1.
In the homework we will have to explain the outputs. We should write as if the teacher had no knowledge of econometrics or statistics.
Y(x) = a + bx + ε
a + bx is the part explained by the model (the estimate, ŷ).
ε is the error:
ε = Y_real - Y_estimated (ŷ)
PRACTICE
Search Google for how to interpret the values in a simple regression model. The conclusions have to be explained.
If the standard deviation is greater than the mean, we have a huge variation problem, so we have to take the log of assets.
F is the Fisher test; it is the test related to the ANOVA procedure. If there is a relationship between the variables, the explained variance will be high; otherwise, the residual variance will be high.
H1: relationship
H0: no relationship
If Prob > F is higher than 5% (or whatever critical level we set), we accept H0 (we do not have a significant relationship).
In this case we accept H1. Also, Prob > F is the same as P>|t|, because we did not add more factors (there is only one explanatory variable).
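The equality of Prob > F and P>|t| in a one-variable regression follows from F = t². A minimal sketch that checks this numerically, assuming made-up data (the x and y values below are illustrative, not from the class data set):

```python
import math

def simple_ols_f_and_t(x, y):
    """Return (F, t) for the simple regression y = a + b*x + e."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    fitted = [a + b * xi for xi in x]
    ess = sum((fi - my) ** 2 for fi in fitted)              # explained sum of squares
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    f_stat = (ess / 1) / (rss / (n - 2))                    # 1 regressor, n - 2 residual df
    se_b = math.sqrt((rss / (n - 2)) / sxx)                 # standard error of the slope
    t_stat = b / se_b
    return f_stat, t_stat

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]
f_stat, t_stat = simple_ols_f_and_t(x, y)
print(f_stat, t_stat ** 2)  # the two numbers agree up to rounding
```

Once a second explanatory variable is added, F tests all slopes jointly and no longer matches any single t test.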
R-squared is 0.7117, meaning about 71% of the variation is explained, so the model is already in the upper part of the range.
If the coefficient is significant, its confidence interval does not contain 0. Here the interval is entirely positive, which shows a positive impact of revenues on assets.
Adjusted R-squared is used for comparing models. We have 118 companies, each with observations of revenues and assets. The larger the sample the better, because the results are more stable.
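As a rough illustration of how adjusted R-squared penalizes extra variables, here is a sketch using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). It assumes that the 0.7117 quoted in these notes is the R-squared and that the model has k = 1 explanatory variable (revenues); both are assumptions, not taken from the output itself.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Assumed inputs: R-squared = 0.7117, n = 118 companies, k = 1 regressor.
adj = adjusted_r_squared(0.7117, n=118, k=1)
print(round(adj, 4))  # slightly below 0.7117
```

With 118 observations and one regressor the penalty is tiny; it grows as more variables are added to the same sample.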
Homoskedastic means that I have similar variation all along the plot.
On the contrary, heteroskedasticity is NOT OK, because it shows that I have problems with the data that I did not solve.
If I cannot draw a rectangle around the points in my plot but I can draw a funnel, the data is heteroskedastic and needs to be treated.
Are there any other variables that would impact assets and therefore the regression?
If the probability (p-value) is greater than the error level (α), accept H0. If it is lower, reject H0 and accept H1.
In this case the probability is 5.3%, so if the error level is 5% you accept H0, but if it is 10% you reject it.
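The decision rule can be written as a small function. This is only a sketch of the rule as stated in these notes; the function name `decide` is made up for the example.

```python
def decide(p_value, alpha):
    """Apply the rule: p-value below alpha rejects H0, otherwise H0 is kept."""
    return "reject H0 (accept H1)" if p_value < alpha else "do not reject H0"

p = 0.053  # the 5.3% probability from the example
for alpha in (0.01, 0.05, 0.10):
    print(alpha, decide(p, alpha))
```

The same p-value leads to different conclusions at 5% and at 10%, which is why the error level has to be fixed before looking at the output.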
INFORMATION CRITERIA
In OLS we minimize the sum of squared errors (we minimize the difference between reality and estimation):
a = M(y) - b M(x), where M is the mean
b = cov(x,y) / var(x), where var(x) is the standard deviation of x squared
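A minimal sketch of these two formulas, together with the error as real minus estimated y. The data is made up for illustration, and the helper name `ols_simple` is hypothetical.

```python
def ols_simple(x, y):
    """Return (a, b, residuals) for y = a + b*x using b = cov(x, y) / var(x)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    b = cov_xy / var_x
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # real minus estimated
    return a, b, residuals

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
a, b, residuals = ols_simple(x, y)
print(a, b)            # intercept and slope
print(sum(residuals))  # OLS residuals sum to (numerically) zero
```

Dividing both the covariance and the variance by n - 1 or by n gives the same b, because the divisor cancels.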
If the correlation coefficient between two explanatory variables is above 0.5, you should not put them both in the same regression.
HOMEWORK
In the homework we have a list of variables and must pick which ones go into the model. We may not have normality, in which case we will have to use the log likelihood and information criteria instead of R-squared.
Thursday
To add more variables to the regression (making it a multiple regression), click the arrows in front of the variable names on the right-hand side of the screen to insert them into the command.
Now in the right-hand column we have the beta coefficients. For revenues the coefficient is 0.84, for DI 0.057, and for IIRF -0.04. A positive coefficient shows a direct relationship, and a negative one an inverse relationship.
The variable with the highest beta (in absolute value) has the most impact: here, Rev.
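Beta coefficients can be compared across variables because they are measured in standard deviations rather than in each variable's own units. A sketch of the usual rescaling, beta = b x sd(x) / sd(y), on made-up data (for simplicity it uses a one-variable regression; the function names are hypothetical):

```python
import statistics

def slope(x, y):
    """Raw OLS slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return num / sum((xi - mx) ** 2 for xi in x)

def standardized_beta(b, x, y):
    """Rescale the raw slope b into standard-deviation units."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical data: the same x expressed in two different units.
x_units = [1.0, 2.0, 3.0, 4.0, 5.0]
x_thousands = [v / 1000 for v in x_units]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

b1 = slope(x_units, y)
b2 = slope(x_thousands, y)  # raw slope changes with the unit of x
beta1 = standardized_beta(b1, x_units, y)
beta2 = standardized_beta(b2, x_thousands, y)
print(b1, b2)        # very different raw coefficients
print(beta1, beta2)  # same beta, whatever the unit
```

The raw coefficient depends on the unit of measurement, while the beta does not, which is why the ranking of impact is read from the beta column.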
OPEN A FILE
File, Log, Begin (this is for the output, to keep a record of what you have done). After that, you import the Excel sheet.
To save the data set (the new variables we add, etc.), use Save As.
gen l(name)=ln(variable)
If we have a per capita value, it is better to use it as the dependent variable.
As we can see in the plot, there is a strong relationship between the variables, since the points clearly follow a common direction. We can expect a strong linear regression.
In this case, when we use investments instead of exports, we see that the relationship between the variables is weaker.
VC: coefficient of variation ((standard deviation / mean) x 100). It shows how far we deviate from the mean. If it is below 30%, the dispersion is low and the variable is therefore homogeneous.
In the case of exp, the standard deviation is greater than the mean, which is why the coefficient of variation is above 100%.
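A short sketch of the coefficient of variation and the 30% rule. The two data sets are invented for illustration, and the use of the sample standard deviation is an assumption.

```python
import statistics

def coefficient_of_variation(values):
    """(standard deviation / mean) x 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical data, for illustration only.
homogeneous = [48, 50, 52, 49, 51]
dispersed = [1, 2, 3, 4, 90]

cv_h = coefficient_of_variation(homogeneous)
cv_d = coefficient_of_variation(dispersed)
print(round(cv_h, 1))  # well below 30: homogeneous
print(round(cv_d, 1))  # above 100: the standard deviation exceeds the mean
```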
The higher the R-squared, the more of the dependent variable you explain, so the better the model.
If Prob > F is below 5%, we accept H1 (significant GLOBAL relationship between the variables).
Lgdp_capital = 0.57 + 0.51 ldom - 0.05 lfdi_abs + 0.83 lexp - 0.97 lSAV_abs
Countries that receive more foreign investment have lower GDP (because of the negative relationship between the variables).
To find which variable has the largest effect on GDP, we compute the beta coefficients.
We can see that savings is what has the most impact on GDP.
If VIF > 10, we have serious problems. Here the VIF tells us that these 4 variables are highly correlated with each other (multicollinearity), so each of them should be entered in its own simple regression.
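For reference, the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other explanatory variables. A minimal sketch with invented R² values (not taken from the class output):

```python
def vif(r_squared_j):
    """Variance inflation factor from the R-squared of x_j on the other regressors."""
    return 1.0 / (1.0 - r_squared_j)

# Hypothetical R-squared values, for illustration only.
for r2 in (0.50, 0.90, 0.95):
    print(r2, round(vif(r2), 2))
```

The VIF > 10 rule therefore corresponds to a variable that is more than 90% explained by the others.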
EVIEWS
The first variable you select is the dependent one.
Most of the procedures are under View. Go to Representations, and it shows how to write the equations.
TIME SERIES
This is about comparing a variable over time; it arises from the need to forecast. We always look back to find the trend and then make estimates for the future.
In order to forecast you need a history, and you assume that what happened in the past persists in the future.
Non-stationarity: the past impacts the present. I can see the impact by simply subtracting from each value its previous one. In this way you remove the variable's correlation with its previous value. Non-stationary variables have an unstable mean and standard deviation.
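Subtracting from each value its previous one is called first differencing. A minimal sketch on an invented trending series (the numbers are illustrative only):

```python
def first_difference(series):
    """Return the list of changes: each value minus the one before it."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical series with a steady upward trend: its mean keeps drifting.
trend = [100, 103, 106, 109, 112, 115]
print(first_difference(trend))  # [3, 3, 3, 3, 3]
```

The differenced series has one observation fewer than the original, and in this example it is constant, so its mean no longer changes over time.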
When we have two variables we have to preserve the time ordering, because what interests us is their evolution.