2. Correlation Analysis
medexp 1.0000
Note: Correlation tests: *** significant at 1%, ** at 5%, * at 10% significance level.
Inc
The correlation coefficient (0.0780) indicates a weak positive correlation between the two
variables, and it is statistically significant at the 5% level. Hence, the null hypothesis of no
correlation can be rejected in favor of the alternative hypothesis. It suggests that as annual
income (inc) increases, medical expenditure (medexp) tends to increase as well, and that as
income decreases, medical expenditure tends to decrease.
Age
The correlation coefficient (0.6650) indicates a strong positive linear correlation between the
two variables, and it is statistically significant at the 1% level. Hence, the null hypothesis of no
correlation can be rejected in favor of the alternative hypothesis. It suggests that as age
increases, medical expenditure (medexp) tends to increase as well, and that as age decreases,
medical expenditure tends to decrease.
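The coefficients discussed above are Pearson correlations. As a minimal sketch of how such a coefficient and its test statistic are computed, the Python below uses made-up values rather than the report's dataset; the function names `pearson_r` and `t_statistic` are illustrative assumptions, not part of the original analysis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up illustrative values, NOT the report's data.
age = [25, 32, 41, 48, 55, 63]
medexp = [1.2, 2.0, 2.1, 3.9, 4.4, 6.0]

r = pearson_r(age, medexp)
t = t_statistic(r, len(age))
print(f"r = {r:.4f}, t = {t:.2f}")
```

The t statistic is then compared with a t distribution on n - 2 degrees of freedom to obtain the p-value behind the significance stars.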
3. Regression analysis
(a) Multiple Linear Regression Analysis
Prob>F - 0.0000
Since Prob > F (0.0000) is less than the 0.01 significance level, the model as a whole is
statistically significant at the 1% level: the explanatory variables are jointly related to
medical expenditure.
R-squared - 0.5015
The R-squared of 0.5015 means that approximately 50.15% of the variability in the dependent
variable (medexp) is explained by the age, inc and insur variables included in the regression
model. The remaining 49.85% is not explained by the model.
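As a rough illustration of what R-squared measures, the sketch below computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The observed and predicted values are made up for the example, and the function name `r_squared` is an assumption.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation around the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Made-up illustrative values, NOT the report's data.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

r2 = r_squared(observed, predicted)
print(f"R-squared = {r2:.4f}, unexplained share = {1 - r2:.4f}")
```

Applied to the report's figure, an R-squared of 0.5015 leaves 1 - 0.5015 = 0.4985 of the variation unexplained.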
Parameter Estimates
The P-value is 0.000 (less than 0.01). This means that age in years (age) is a statistically
significant predictor of medical expenditure (medexp) at the 1% significance level. Hence, the
null hypothesis can be rejected in favor of the alternative hypothesis.
The coefficient is 0.0090412 (positively related). It indicates that, holding age and insurance
status constant, higher annual income (inc) is associated with higher medical expenditure.
Individuals with higher income might be more willing to spend on healthcare services.
The P-value is 0.009 (less than 0.01). This means that annual income is a statistically
significant predictor of medical expenditure at the 1% significance level. Hence, the null
hypothesis can be rejected in favor of the alternative hypothesis.
The coefficient is 1.38326 (positively related). It suggests that, holding age and income
constant, individuals who have insurance have higher medical expenditure than those who
don’t.
The P-value is 0.000 (less than 0.01). This means that insurance status is a statistically
significant predictor of medical expenditure at the 1% significance level. Hence, the null
hypothesis can be rejected in favor of the alternative hypothesis.
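The decision rule applied to each coefficient can be sketched as a comparison of the reported p-value with the 1%, 5% and 10% thresholds from the note in the correlation section. The helper name `significance_stars` is a hypothetical one chosen for illustration; the p-values are those reported above.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation: *** 1%, ** 5%, * 10%."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""  # not significant at the 10% level


# P-values as reported in the parameter estimates.
p_values = {"age": 0.000, "inc": 0.009, "insur": 0.000}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f} {significance_stars(p)}")
```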
(c) The prediction of the annual medical expenditure by the regression model
B0 (Constant) = -2.622888
B1 (Age) = 0.132764
B2 (Inc) = 0.0090412
B3 (Insur) = 1.38326
This means that, when there is no insurance (X3 = 0), the predicted annual medical expenditure
is approximately $5,230.00. With X3 = 0, the insurance term B3 × X3 contributes nothing to Y.
If X3 = 1 (Insurance)
This indicates that, when there is insurance (X3 = 1), the predicted annual medical expenditure
increases to approximately $6,613.91.
The multiple regression function shows that the predicted annual medical expenditure for a
55-year-old patient with an annual income of $61,000 depends on age, annual income, and
insurance status. Without insurance, the predicted expenditure is around $5,230.00; with
insurance, it rises to approximately $6,613.91.
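The prediction can be reproduced with a short sketch using the coefficients listed above. It rests on an assumption inferred from the reported dollar figures rather than stated in the output: income enters the model in thousands of dollars ($61,000 becomes 61) and the result is read in thousands of dollars. The function name `predict_medexp` is illustrative.

```python
# Coefficients as listed in the report.
B0 = -2.622888       # constant
B_AGE = 0.132764     # age in years
B_INC = 0.0090412    # annual income
B_INSUR = 1.38326    # insurance status (0 = no policy, 1 = has policy)


def predict_medexp(age, inc_thousands, insured):
    """Predicted medexp in model units.

    ASSUMPTION: income is expressed in thousands of dollars, which is the
    scaling that reproduces the report's figures.
    """
    return B0 + B_AGE * age + B_INC * inc_thousands + B_INSUR * insured


uninsured = predict_medexp(55, 61, 0)
insured = predict_medexp(55, 61, 1)

print(f"no insurance:   {uninsured:.4f}")
print(f"with insurance: {insured:.4f}")
print(f"difference:     {insured - uninsured:.4f}")
```

The gap between the two predictions equals the insurance coefficient, since every other term is identical in both cases.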
4.(a) Summary of the findings from the descriptive statistics,
correlation analysis, and regression analysis.
1. Descriptive statistics
For Annual Medical Expenditure (Medexp) in hundreds of dollars
Based on the histogram, insurance status takes only two values: 0 if the individual has no
insurance policy, and 1 if the individual has an insurance policy. Hence, the histogram bars
are far from each other.
The kurtosis, 1.008482 (less than 3), indicates that the distribution is flatter than a
normal distribution and has short, light tails, which suggests that there are no outliers.
The standard deviation of annual income (inc) is the highest of the three explanatory
variables (age, insurance status, and annual income), indicating substantial variability
in the income levels of the individuals within the dataset. The positive correlation
suggests that higher income is associated with higher medical expenditure and lower
income with lower medical expenditure. Hence, considering the significant variability in
annual income and its positive correlation with medical expenditure, targeted
interventions to address healthcare affordability for individuals at various income
levels are recommended. For example, by targeting interventions towards lower-income
populations, the researcher can address the financial barriers they face in accessing
healthcare services. This recommendation would reduce inequity in healthcare access
and improve health outcomes for individuals with limited financial assets. Therefore,
focusing the research on annual income for these interventions would be a good
decision.
For insurance,
According to the case scenario of the patient aged 55 with an annual income of $61,000,
the absence of insurance is associated with a predicted expenditure of around
$5,230.00, while having insurance increases the predicted expenditure to approximately
$6,613.91. The difference between patients with and without a policy is around $1,400,
which is the estimated effect of insurance on medical expenditure (medexp). In the real
world, insurance policies are expected to make medical expenditure more affordable.
However, according to the regression analysis, the coefficients for annual income and
insurance are both positive, which means that individuals with higher income who
purchase an insurance policy might spend more on healthcare services. Hence, we
recommend that the research analyze why individuals who purchase insurance might
have higher expenditure than individuals who don’t have it, in order to improve
healthcare affordability.
For age,
The ages of individuals within the dataset vary substantially, as is evident from the
histogram and the standard deviation of age in years. Together with the positive
coefficient on age in the regression analysis, this suggests that as the age of individuals
increases, medical expenditure tends to rise. Consequently, given the considerable
variability in age and its association with medical expenditure, targeted interventions
to enhance healthcare affordability and accessibility for individuals across different
age groups are recommended. This approach aims to address the diverse healthcare
needs associated with varying ages and income levels, facilitating more effective and
equitable healthcare interventions.