# Yingda Zhao

OPRE 3360
December 8, 2017
Final Project: Regression Analysis

Introduction
Sex, age, education, life, and news are selected as independent variables to explain the variation
in the dependent variable, income. The regression analysis shows how much of that variation this
model explains.
Justification
Life and news are selected as independent variables in addition to sex, age, and education. I
believe a person's mental state is vital to his or her income. People who find life exciting,
because they learn something new every day or really enjoy their work, could earn a much
higher income than those who feel bored at work. For example, Bill Gates has always found
programming interesting, and he was so devoted to it that he kept learning and started Microsoft.
Many ordinary employees, by contrast, do not really like programming and treat it as just a
routine part of the job. The income gap between these two kinds of people can be huge.
Reading the news can also contribute to income, especially for people who major in business
or engineering. We live in the information era, in which information plays a vital role in decision
making and learning, and the newspaper is a significant way to get information. For example, a
CEO could mistakenly spend a huge amount of money buying a company caught up in a scandal
because he or she did not read the newspaper that morning; a human resources manager could hire a
murderer by mistake because he or she did not see the latest newspaper; an investor could invest in
the wrong stock because he or she does not read the newspaper. Newspapers not only tell us what to
do, they also tell us what to learn. For example, a newspaper reported that Java is becoming
the most popular programming language. Some students, however, stick to C and C++, which
makes it hard for them to find a job after graduation.
Descriptive Data Analysis
The sex variable records the gender of each respondent in the sample. In this research, 0 means
male and 1 means female. The mean of sex is 0.49, which means that 49% of the sample is female,
so males slightly outnumber females.
The age data show that the sample covers people of different ages. The minimum is 18, the
maximum is 79, and the standard deviation (SD) is 13.74. The wide range and high SD tell us that
ages are spread widely between 18 and 79.
Not all the respondents are well educated. The average number of years of school completed by
the respondents is 14, and the skewness is almost 0, so the distribution is close to a normal
distribution. The minimum is 4, which means that this respondent did not graduate from
primary school. The maximum is 20, which suggests that this respondent pursued a master's or
doctoral degree.
The mean of “Often Read News” is 0.44. It means that 44% of respondents often read the
news, and 56% of respondents do not. (Respondents who read a newspaper every day or a few times
per week are classified as “Often Read News”.)
Similarly, the mean of “Exciting” is 0.55, which means 55% of respondents find their life
exciting. (Respondents who find their life exciting are classified as “Exciting”.)
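The reading of a dummy variable's mean as a percentage can be checked with a short sketch. The answers below are hypothetical 0/1 values made up for illustration, not the survey data, and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical 0/1 answers (1 = often reads news, 0 = does not).
# These values are invented for illustration; they are not the survey data.
often_read_news = [1, 0, 0, 1, 0, 1, 0, 0, 1, 1,
                   0, 0, 1, 0, 1, 0, 0, 1, 0, 1]

# The mean of a 0/1 variable equals the share of respondents coded 1.
share_yes = mean(often_read_news)
share_no = 1 - share_yes

print(f"Often read news: {share_yes:.0%}, do not: {share_no:.0%}")
```

The same reasoning applies to “SEX” and “Exciting”, since both are coded 0 or 1.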
The income data are very interesting because the skewness is as high as 1.88, so the distribution
is strongly right-skewed. The mean is 39,254, much higher than the median of 27,500. The maximum
is 175,000, but the minimum is only 500. The degree of dispersion is also high because the SD is
37,055.
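The link between positive skewness and a mean above the median can be illustrated with a small sketch. The incomes below are hypothetical values chosen only to include one very high earner, and `moment_skewness` is an illustrative helper using the simple moment-based formula, which may differ slightly from the adjusted skewness a spreadsheet reports.

```python
from statistics import mean, median

# Hypothetical incomes with one very high earner; not the survey data.
incomes = [500, 12000, 20000, 27500, 30000, 45000, 175000]

def moment_skewness(values):
    """Moment-based skewness: third central moment / (second central moment)^1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

avg = mean(incomes)
mid = median(incomes)
skew = moment_skewness(incomes)

# A few very high incomes pull the mean above the median and make skewness positive.
print(f"mean={avg:.0f}, median={mid:.0f}, skewness={skew:.2f}")
```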
Regression Analysis
The model explains income to some extent, but not every variable is as significant as expected.
The R-squared for this model is 0.28, so the model explains 28% of the variation in income. The
Significance F value is 0.00, which is less than 0.05, so the model as a whole is significant; it
helps explain the variation in income. The p-values of “SEX”, “AGE”, “EDUC”, and “Exciting” are
all 0.00, so they are significant variables in this model.
The coefficients of the intercept, “SEX”, “AGE”, “EDUC”, “Often Read News”, and “Exciting”
are -49,729, -16,199, 414, 5,011, 4,774, and 12,903 respectively. The coefficient of “SEX” is
-16,199, which means that a female is predicted to earn 16,199 less than a male, keeping other
factors the same. The coefficient of “AGE” is 414, which means that a person who is one year older
than another is predicted to earn 414 more, keeping other factors the same. The coefficient of
“EDUC” is 5,011, which means that a person with one more year of education than another is
predicted to earn 5,011 more, keeping other factors the same. Often reading the newspaper is
associated with an annual income that is 4,774 higher, and finding life exciting is associated
with an income that is 12,903 higher, keeping the other factors the same. The equation is:
predicted income = -49,729 - 16,199(SEX) + 414(AGE) + 5,011(EDUC) + 4,774(Often Read News)
+ 12,903(Exciting)
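As a rough illustration of how the equation is used, the sketch below plugs values into it. The intercept is taken from the coefficient list stated in the text (-49,729), the function name `predict_income` is hypothetical, and the two example respondents are invented for illustration.

```python
# Coefficients as listed in the text; intercept assumed to be -49,729.
INTERCEPT = -49729
COEF_SEX = -16199      # 0 = male, 1 = female
COEF_AGE = 414         # per year of age
COEF_EDUC = 5011       # per year of school completed
COEF_NEWS = 4774       # 1 = often reads news
COEF_EXCITING = 12903  # 1 = finds life exciting

def predict_income(sex, age, educ, news, exciting):
    """Return the predicted annual income from the fitted linear equation."""
    return (INTERCEPT
            + COEF_SEX * sex
            + COEF_AGE * age
            + COEF_EDUC * educ
            + COEF_NEWS * news
            + COEF_EXCITING * exciting)

# Two hypothetical respondents who differ only in sex.
male = predict_income(sex=0, age=40, educ=16, news=1, exciting=1)
female = predict_income(sex=1, age=40, educ=16, news=1, exciting=1)

print(male, female, male - female)
```

Because the two respondents are identical except for sex, the difference between their predictions equals the size of the “SEX” coefficient.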
Interpretations of The Regression Analysis Outcome
The results suggest that gender discrimination still exists in the job market. It is surprising
that a person is predicted to earn 16,199 less only because she is female, which is unfair.
Income rises with age. I think this is related to experience and company policies.
More and more companies want to hire employees with experience rather than invest in training
students with no experience at all. Some companies also have policies under which the longer an
employee stays with the company, the higher the salary he or she gets.
Education is associated with significantly higher personal income. Education is increasingly
valued by employers, and business owners may struggle to compete if they do not have a
higher-level education.
Whether a person often reads newspapers is not significant in this model. One possible reason is
that people are getting information from elsewhere. Nowadays, social media and websites are much
more popular than newspapers, and they are easier to access.
Passion appears to be the most important positive factor in this model: “Exciting” has the largest
positive coefficient (12,903). As the saying goes, “Where there is a will, there is a way.” Those
who find life exciting can really make a difference.
Appendix: