10190598_SFM_A1.1
Contents
I. INTRODUCTION
II. MAJOR FINDINGS
Part A. Data Sources & Method Collection
1. Data Sources in Business & Economics
2. Data collection method in Business and Economics
3. Methods of data analysis
Part B
1. Evaluate each variable
2. Relationship between variables
3. Evaluation on the results of summary statistics
4. Methods of communication
5. Utilization of different types of charts
Part C. Analyse and evaluate business data
1. T-test to compare variables
2. Z-test to compare variables
3. Regression Model
4. Comparison of Summary statistics in Part B & Hypothesis Testing in Part C
5. Comparison using Regression Analysis in Part C and correlation coefficients in Part B
III. CONCLUSION
IV. REFERENCES
I. INTRODUCTION
This coursework paper reads and analyzes 158 observations taken from a web survey of
students in the United States. In addition to simple data description, the paper also
demonstrates relationships between different variables and evaluates the communication
methods used to describe data and draw inferences from it.
II. MAJOR FINDINGS
Part A. Data Sources & Method Collection
1. Data Sources in Business & Economics
Primary data is data that directly represents the research topic or purpose and is produced
through the researcher's interaction with the research subject via surveys, interviews,
questionnaires, observations, experiments, etc. (Kabir, 2016).
Secondary data is information drawn from results that other researchers have already
collected, used, and evaluated.
2. Data collection method in Business and Economics
Open-Ended Surveys: Respondents can answer freely and flexibly, giving as many responses as
they wish.
Interview: It can be one-on-one or a focus group, depending on the purpose of the study and
the researcher's access to participants. The researcher asks the participants questions and
records their responses.
Direct observation: A passive method of data collection in which researchers observe the
context in which participants behave and cannot regulate the variables.
Quantitative Method: Data are provided as numbers and can be analysed mathematically. Random
sampling and structured data collection tools are used to fit different experiences into
preset answer categories. This makes it easy to compare and evaluate the magnitude of
different variables, especially in studies with many variables.
Table 1 Advantages & Drawbacks of 2 types of sources (Kwantlen Polytechnic University, 2020)
Primary data, Pros:
Closely related to the research topic and meets the researcher's demands.
More up-to-date and timely.
High level of reliability, since the source is in a raw state and unaltered by other people.
Primary data, Cons:
Expensive to implement (including the preparation phase) when reaching many individuals; the level of expenditure depends on the main approach and geographic scope, printing costs (if not conducted online), and the cost of finding participants.
Time-consuming, depending on the sample size needed, and more so with a mainstream approach like interviews. The data also requires cleaning and entry into a database.
The degree of trust is uncertain in some cases, since responses sometimes show a person's thoughts and beliefs but not their behaviours.
Secondary data, Pros:
Economical when many quality resources are published and easily accessible.
Little time is needed to make effective use of the resources because they have already been collected and analysed.
Larger sample size, since many studies combine the resources of a government or organization, allowing them to reach thousands or millions of people.
Secondary data, Cons:
Relevant data may be outdated or non-existent.
Answers may not be specific to the question the researcher needs to address.
Accuracy and reliability are sometimes lower because not all sources are reputable or because, for personal reasons, some information is exaggerated.
3. Methods of data analysis
Table 2 Descriptive statistics and Inferential statistics (HILLIER, 2021)
Cons bias.
Research is limited, since it does not explain
the cause and effect of a research topic.
Part B.
1. Evaluate each variable
a. For qualitative variables
Nominal
Q1: University
University
Frequency Percent Valid Percent Cumulative Percent
Valid Colorado 97 61.4 61.4 61.4
Oakland 61 38.6 38.6 100.0
Total 158 100.0 100.0
Q2: Accommodation
Accommodation
Frequency Percent Valid Percent Cumulative Percent
Valid Dorm 79 50.0 50.0 50.0
Other 2 1.3 1.3 51.3
agreeing (50%). The next most common accommodation options are Parents, Shared Apartment, and
Solo Apartment (27.2%, 15.2%, and 6.3% respectively). Other places of residence are
insignificant at only 1.3%.
Q3: CellPhone
CellPhone
Frequency Percent Valid Percent Cumulative Percent
Valid Alltel 4 2.5 2.5 2.5
Cingular 47 29.7 29.7 32.3
Nextel 5 3.2 3.2 35.4
Other 7 4.4 4.4 39.9
Sprint 14 8.9 8.9 48.7
T-Mobile 21 13.3 13.3 62.0
Verizon 59 37.3 37.3 99.4
Virgin 1 .6 .6 100.0
Total 158 100.0 100.0
Overall, Verizon leads the student phone market with 37.3%, followed by Cingular with
29.7%. After that, the most popular carriers are T-Mobile, Sprint, Other, Nextel,
and Alltel respectively. Virgin has the lowest share of participants at 0.6%.
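The Percent and Cumulative Percent columns, and the ranking described above, can be reproduced from the raw counts. The following is a minimal sketch, assuming the counts are typed in from the CellPhone table; the function name `frequency_table` is illustrative and not part of the original analysis.

```python
from itertools import accumulate

# Carrier counts copied from the CellPhone frequency table.
counts = {
    "Alltel": 4, "Cingular": 47, "Nextel": 5, "Other": 7,
    "Sprint": 14, "T-Mobile": 21, "Verizon": 59, "Virgin": 1,
}


def frequency_table(counts):
    """Return (label, count, percent, cumulative percent) rows in input order."""
    total = sum(counts.values())
    running = accumulate(counts.values())
    return [
        (label, n, round(100 * n / total, 1), round(100 * cum / total, 1))
        for (label, n), cum in zip(counts.items(), running)
    ]


table = frequency_table(counts)
for label, n, pct, cum_pct in table:
    print(f"{label:<10}{n:>5}{pct:>8.1f}{cum_pct:>8.1f}")

# Rank carriers by share, largest first.
ranking = sorted(counts, key=counts.get, reverse=True)
print(ranking)
```

Sorting by count rather than reading the alphabetical table makes the market ranking explicit and avoids transcription slips in the percentages.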
Q5: Estimated Minutes
Estimated minutes
Frequency Percent Valid Percent Cumulative Percent
Valid 0 17 10.8 10.8 10.8
1 141 89.2 89.2 100.0
Total 158 100.0 100.0
CreditCard
Frequency Percent Valid Percent Cumulative Percent
Valid Amex 7 4.4 4.4 4.4
Discover 3 1.9 1.9 6.3
Mastercard 28 17.7 17.7 24.1
None 3 1.9 1.9 25.9
Other 1 .6 .6 26.6
Visa 116 73.4 73.4 100.0
Total 158 100.0 100.0
Visa dominates the ranking of credit cards used by students, accounting for approximately ¾
of all answers (73.4%). Although its share is roughly 4 times lower than Visa's, Mastercard
(17.7%) greatly outweighs Amex, Discover, and Other (4.4%, 1.9%, and 0.6% respectively).
A further 1.9% of participants said that they do not use credit cards.
Estimated balance
Q19: Textbook
Textbook
never used an electronic textbook as the primary source in their college courses, while only
23.4% of students have used one.
Operating System
Using Linux
Frequency Percent Valid Percent Cumulative Percent
Valid No 84 53.2 53.2 53.2
What's Linux? 52 32.9 32.9 86.1
Yes 22 13.9 13.9 100.0
Total 158 100.0 100.0
A majority of students (53.2%) have never used Linux, while nearly one-third of participants
(32.9%) do not know what Linux is. Only 13.9% answered “Yes” when asked whether they had
used Linux.
Game Cube
Frequency Percent Valid Percent Cumulative Percent
Valid No 140 88.6 88.6 88.6
Yes 18 11.4 11.4 100.0
Total 158 100.0 100.0
Q27: PS2
PS2
Frequency Percent Valid Percent Cumulative Percent
Valid No 97 61.4 61.4 61.4
Yes 61 38.6 38.6 100.0
Total 158 100.0 100.0
Q28: PS3
PS3
Frequency Percent Valid Percent Cumulative Percent
Valid No 152 96.2 96.2 96.2
Yes 6 3.8 3.8 100.0
Total 158 100.0 100.0
Nintendo Wii
Frequency Percent Valid Percent Cumulative Percent
Valid No 145 91.8 91.8 91.8
Yes 13 8.2 8.2 100.0
Total 158 100.0 100.0
Q17: Language skill
Language Skill
Frequency Percent Valid Percent Cumulative Percent
Valid Fluent 25 15.8 15.8 15.8
Moderate 46 29.1 29.1 44.9
None 22 13.9 13.9 58.9
Slight 65 41.1 41.1 100.0
Total 158 100.0 100.0
“Slight” is the most common language skill level among students, at 41.1%. “Moderate” is the
second most common level, at 29.1%. The share of participants with a “Fluent” level (15.8%)
is much lower, only about half that of the “Moderate” level, and is approximately equal to
the share of students with “None” (13.9%).
Frequency of Reading
Valid Never 26 16.5 16.5 16.5
Occasionally 110 69.6 69.6 86.1
Regularly 22 13.9 13.9 100.0
Total 158 100.0 100.0
Q22: PC Access
PC Access
Nearly two-thirds of students “Always” have access to a laptop PC to bring to class, making
this the most common response. The number of students who “Never” have access to a laptop PC
is about 3 times lower than “Always”, but more than 2 times higher than “Often”. Only 5.7% of
the sample “Rarely” have access to a PC at school.
Descriptive Statistics
                              N    Minimum  Maximum  Mean    Std. Deviation  Skewness  Std. Error of Skewness
Time using Cellphone Minutes  158  1        15000    730.22  1299.292        8.768     .193
Q7: Balance
The degree of variation is extremely large in this variable: the standard deviation is
1507.63976, more than double the mean balance of 706.2920. The maximum balance of a student
is 12000 and the lowest is 0.00. Skewness is 4.980 (positive and higher than 1.3), which
means the data are highly skewed with a long right tail. This demonstrates that most students
at both universities have balances well below the mean.
Q9: GPA
The highest GPA among students of the 2 universities is 4.0 and the lowest is 1.6. The
average GPA recorded is 3.1748. The skewness of -0.503 (negative and lower than -0.3)
indicates a moderately skewed distribution with a left tail, which means many students have
a GPA higher than 3.1748.
Q10: Working Hours
The maximum number of working hours per week is 60 hours, and the minimum is 0 hours. On
average, a student spends 12.193 hours a week doing a paid job. The skewness of 0.88
(positive and higher than 0.3) indicates a moderately skewed distribution with a right tail,
so many students work fewer hours than the average.
occupation is highly appreciated with a mean value of 5.57278 and a moderately high standard
deviation of 0.973221. Skewness is -1.535 (negative and lower than -1.3), which means the
data for this variable are highly skewed with a long left tail. Many students believe that
the job market for their intended occupation is promising, and most of the assessment scores
are higher than the average score.
Q15: Politics
Students' self-assessed political orientation has a maximum value of 6.9 and a minimum value
of 1.0 on a scale of 1 to 7. On average, a student's self-rated political orientation is
3.93692. The skewness of -0.268 (negative and higher than -0.3) indicates a fairly
symmetrical, slightly left-skewed distribution.
Q16: Religious
Each student attended an average of 11.36 religious services over the past year. Skewness is
2.102 (positive and higher than 1.3), so the data are highly skewed with a long right tail.
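The skewness readings in this section all follow the same rule, so they can be expressed as a small helper. The sketch below is an illustration under stated assumptions: it uses the cut-offs quoted in this report (0.3 and 1.3), assumes the Skewness column is the adjusted Fisher-Pearson coefficient, and uses the usual large-sample formula for its standard error. The function names are hypothetical.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the Skewness column)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


def skewness_std_error(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def classify_skewness(g1, moderate=0.3, high=1.3):
    """Label skewness using this report's cut-offs (0.3 and 1.3)."""
    if g1 == 0:
        return "symmetrical"
    side = "right" if g1 > 0 else "left"
    size = abs(g1)
    if size < moderate:
        return f"fairly symmetrical, slightly {side}-skewed"
    if size < high:
        return f"moderately {side}-skewed"
    return f"highly {side}-skewed"


print(round(skewness_std_error(158), 3))   # standard error for n = 158
for name, g1 in [("Balance", 4.980), ("GPA", -0.503), ("Politics", -0.268)]:
    print(name, classify_skewness(g1))
```

For n = 158 the standard error evaluates to the .193 shown in the Descriptive Statistics table, which suggests the assumed formula is the one behind that column.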
2. Relationship between variables
a. For a couple of qualitative variables – Q1 & Q2, Q17 & Q18
Q1 and Q2: Relationship between Students’ University and Accommodation
                University: Colorado                University: Oakland
Accommodation   Count  Column N %  Table N %        Count  Column N %  Table N %
Dorm            76     78.4%       48.1%            3      4.9%        1.9%
Other           2      2.1%        1.3%             0      0.0%        0.0%
Parents         2      2.1%        1.3%             41     67.2%       25.9%
Share Apt       14     14.4%       8.9%             10     16.4%       6.3%
Solo Apt        3      3.1%        1.9%             7      11.5%       4.4%
As the cross table and graph show, Colorado students mostly stay in a “Dorm” (about 78.4%
of Colorado students), while only 3 of the 79 students living in a dorm are from Oakland.
In Oakland, most students live with their “Parents” (about 67.2% of Oakland students),
whereas the rate of staying with parents among Colorado students is extremely low (2.1%).
“Share Apartment” is the second most common choice at both schools. “Solo Apartment” is
the least popular of the named options, accounting for only about 6.3% of all students
across the 2 universities (1.9% from Colorado and 4.4% from Oakland). Because the
accommodation profile differs so sharply between the two universities, the cross table
and clustered bar chart indicate that University and students' Accommodation are related
rather than independent.
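Whether University and Accommodation are related can be checked formally with a chi-square test of independence on the cross table. This test is not part of the original report; the sketch below is an illustration using the counts from the table, with the 5% critical value for 4 degrees of freedom (9.488) hard-coded because the standard library has no chi-square distribution.

```python
import math

# (Colorado, Oakland) counts copied from the cross table.
observed = {
    "Dorm": (76, 3),
    "Other": (2, 0),
    "Parents": (2, 41),
    "Share Apt": (14, 10),
    "Solo Apt": (3, 7),
}


def chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


rows = list(observed.values())
stat, df = chi_square(rows)
n = sum(sum(row) for row in rows)
# Cramer's V rescales the statistic to 0..1 as an effect size.
cramers_v = math.sqrt(stat / (n * (min(len(rows), len(rows[0])) - 1)))

CRITICAL_5PCT_DF4 = 9.488  # chi-square critical value, alpha = 0.05, df = 4
print(f"chi2 = {stat:.1f}, df = {df}, V = {cramers_v:.2f}")
print("reject independence:", stat > CRITICAL_5PCT_DF4)
```

The statistic comes out at roughly 104, far above the critical value, so the counts are not consistent with independence. The “Other” row has expected counts below 5, so the exact figure should be read with some caution.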
Q17 and Q18: Relationship between Students’ Language Skill and Frequency of Reading
Each Language Skill column shows Count, Column N %, Table N %.
Frequency of Reading  Fluent              Moderate            None                Slight
Never                 5   20.0%   3.2%    6   13.0%   3.8%    4   18.2%  2.5%     11  16.9%   7.0%
Occasionally          19  76.0%  12.0%    33  71.7%  20.9%    13  59.1%  8.2%     45  69.2%  28.5%
Regularly             1    4.0%   0.6%    7   15.2%   4.4%    5   22.7%  3.2%     9   13.8%   5.7%
At every language level, participants are most inclined to read books “Occasionally”.
From the “None” level upwards, the share of students who read “Occasionally” rises
steadily: 59.1%, 69.2%, 71.7%, and 76.0% respectively. Students with “None” language
skills have the highest share of “Regularly” reading at 22.7%, and this share is
generally lower at higher language skill levels. In particular, only 4% of “Fluent”
students read “Regularly”, far fewer than at other levels. The “Never” category gives
further evidence of no correlation between the two variables: “Fluent” students have the
highest percentage of “Never” reading (20.0%), compared with 13.0% for “Moderate”
students and 16.9% for “Slight” students. Overall, students' level of language skill
does not appear to be related to frequency of reading.
Q4 and Q9: Relationship between Students’ GPA and Time Using Cellphone
Correlations
Cell Minutes GPA
Cell Minutes Pearson Correlation 1 -.145
Sig. (2-tailed) .070
N 158 158
GPA Pearson Correlation -.145 1
Sig. (2-tailed) .070
N 158 158
With a correlation coefficient of R = -0.145, the relationship between “GPA” and “Time using
cellphone” is evaluated as a “Negligible Correlation”. The significance value (p = 0.07) is
also higher than 0.05, so the correlation is not statistically significant. It is easy to see
that, whether students achieve high or low scores, their phone use mostly ranges from 0 to
2000 minutes.
Q9 and Q10: Relationship between Students’ GPA and Time Working at a paid job
Correlations
GPA Working Hours
GPA Pearson Correlation 1 .002
Sig. (2-tailed) .985
N 158 158
Working Hours Pearson Correlation .002 1
Sig. (2-tailed) .985
N 158 158
With a correlation coefficient of R = 0.002, the variables “GPA” and “Working hours” also
show a “Negligible Correlation”. In addition, the significance value p = 0.985 is higher
than 0.05, so the correlation is not statistically significant. The points in the
scatterplot are randomly distributed with no discernible trend, so students' GPAs do not
vary positively or negatively with their hours worked.
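The Sig. (2-tailed) values in the two correlation tables come from a t-statistic built from R and the sample size. The following minimal sketch shows that step for both pairs of variables. The critical value of about 1.975 for 156 degrees of freedom is an approximation taken from standard t tables, and the function name is illustrative.

```python
import math


def correlation_t(r, n):
    """t-statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


T_CRITICAL = 1.975  # approximate two-tailed 5% critical value for df = 156

for label, r in [("GPA vs Cell Minutes", -0.145), ("GPA vs Working Hours", 0.002)]:
    t = correlation_t(r, 158)
    print(f"{label}: t = {t:.3f}, significant = {abs(t) > T_CRITICAL}")
```

Both statistics fall inside the critical range, which agrees with the reported p-values of .070 and .985 being above 0.05.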
3. Evaluation on the results of summary statistics
GPA
According to the table of data processed above, for students with “Fluent” language
skills the average GPA is Mean=3.233 with a moderate standard deviation of SD=0.499, and
the most common score is Mode=3.0. Students with “Moderate” language skills have the
highest GPA of all groups, with Mean=3.283 and the lowest standard deviation, SD=0.45,
meaning the scores are not widely dispersed around the mean. Students with “Slight”
language skills achieved the lowest GPA, with Mean=3.091, a most common score of
Mode=2.8, and the highest standard deviation of all groups, SD=0.509. Finally, students
with no language skill (“None”) have a moderate GPA, Mean=3.130, which is not much lower
than that of students with better language skills, and a relatively low standard
deviation, SD=0.483. Among them, the score Mode=3.6 appeared the most.
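A quick consistency check on these group means is to weight them by group size and compare the result with the overall mean GPA of 3.1748 reported in Part B. The sketch below assumes the group sizes from the Language Skill frequency table (25, 46, 65, 22) and takes 3.283 as the “Moderate” mean.

```python
# (group size, mean GPA) per language skill level.
groups = {
    "Fluent": (25, 3.233),
    "Moderate": (46, 3.283),
    "Slight": (65, 3.091),
    "None": (22, 3.130),
}


def weighted_mean(groups):
    """Overall mean recovered from per-group sizes and means."""
    total_n = sum(n for n, _ in groups.values())
    return sum(n * mean for n, mean in groups.values()) / total_n


overall = weighted_mean(groups)
print(round(overall, 4))
```

The weighted mean agrees with the overall GPA to about three decimal places, which supports reading the “Moderate” mean as 3.283.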
Working Hours
Mean Mode Standard Deviation
University Colorado 5.3 .0 9.4
a
Oakland 23.2 25.0 13.2
a. Multiple modes exist. The smallest value is shown
Colorado's students appear largely uninterested in paid jobs: their average working time
(Mean=5.3) is about 4 times lower than that of Oakland's students (Mean=23.2). Looking
at the Mode, many Oakland students spend 25 hours a week (Mode=25) on their part-time
jobs, whereas the most common value for Colorado students is no paid work at all
(Mode=0). The Standard Deviation of hours worked is also lower for Colorado students
than for Oakland students (9.4 vs. 13.2), which means that the time Colorado students
spend working is less variable.
4. Methods of communication
Descriptive Statistics: A basic tool that allows researchers to readily interpret sample data
through: (1) Measures of location (Mean, Median, etc.) to determine where data is centred or
where a trend exists, (2) Measures of Variability (skewness, interquartile range) to determine
the spread or diversity of a particular data collection, (3) Measures of Frequency (Kushwaha, 2020).
The characteristic measures used most in this work are: (1) The sample mean is an
excellent basis for conclusions about the population mean, since it is an unbiased
estimator of it. (2) When outliers make the mean a poor measure, the median is a
superior measure because it is little affected by them. (3) A larger standard deviation
signifies a more heterogeneous or dispersed distribution of raw data on a scale. (4)
Skewness is useful for determining whether a distribution is symmetric and whether one
side of the data has a long or short tail.
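The contrast between the mean and the median under outliers can be illustrated with a few lines of Python. The balances below are hypothetical values invented for illustration only (they are not survey data), chosen so that one extreme value resembles the 12000 maximum seen in the Balance variable.

```python
import statistics

# HYPOTHETICAL balances, invented for illustration; the last value is an outlier.
balances = [0, 50, 120, 200, 300, 450, 12000]
without_outlier = balances[:-1]

mean_all = statistics.mean(balances)
median_all = statistics.median(balances)
mean_trimmed = statistics.mean(without_outlier)
median_trimmed = statistics.median(without_outlier)

print(f"with outlier:    mean = {mean_all:.1f}, median = {median_all}")
print(f"without outlier: mean = {mean_trimmed:.1f}, median = {median_trimmed}")
print(f"standard deviation with outlier: {statistics.stdev(balances):.1f}")
```

A single extreme value pulls the mean far from the bulk of the data while the median barely moves, which is why the median is preferred for heavily skewed variables.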
Pearson’s Correlation Coefficient: It analyzes the linear relationship between two variables
and the degree of association between them (Strong or Weak), that is, it describes the
movement of one variable in relation to another (FERNANDO, 2021).
Descriptive Table: This is mainly used in this article to interpret quantitative variables
such as GPA, working hours, time using cell phone, etc. In contrast to qualitative variables,
quantitative data take numeric values within the same variable, allowing measures of location
or dispersion to be calculated for each variable. Therefore, a descriptive table makes the
characteristics of the data (mean, minimum or maximum value, skewness, etc.) clearly
accessible.
Charts:
Bar Chart provides a far more comprehensive overview of data than a table alone. This is
not to say that the table is worthless, since it is needed to draw the bar graph.
Furthermore, bar graphs are useful for comparing multiple categories in one variable
when the author wants to gauge trend movement.
Pie Charts are useful for showing the relative frequency of a small number of categories,
but they are not appropriate for a variable with a large number of categories. Thus, the
author presents some variables with 2 or 3 categories, such as Q25, Q26, and Q27, as pie
charts for greater clarity.
Clustered bar chart: It shows how the second categorical variable varies with each level
of the first, dividing data points across two categorical variables rather than one.
Consistent colours and arrangement for each variable's values within each group make it
well suited to studying relationships.
Scatter Plot: It can display large amounts of data and reveal correlations between two
variables, such as clustering effects. Data points are not labelled in this chart, so
exact values are difficult to determine; the author therefore uses the Pearson
Correlation coefficient alongside it.
Cross tabulation: As shown in the task above, a cross tab helps the author examine the
relationship between two variables, and it makes tendencies and probabilities in the data
easier to demonstrate. With a large data set like the one in the brief, raw data is generally
overwhelming and can lead to a myriad of conflicting outcomes. Cross tabulation distributes
the entire data set into representative subgroups, which simplifies the data, makes it easier
to manage, and lessens the probability of assessment mistakes.
Correlation table: It is useful for studying the relationship between two variables, as
analyzed above. However, it merely shows how one variable moves with the other, without
providing a reason for the relationship or indicating which variable affects the other.
Part C. Analyse and evaluate business data
1. T-test to compare variables
Compare quantitative variables Q9, Q10 classified by qualitative variable Q1
Group Statistics
University N Mean Std. Deviation Std. Error Mean
GPA Oakland 61 3.15525 .468048 .059927
Colorado 97 3.18710 .505444 .051320
Working Hours Oakland 61 23.230 13.1889 1.6887
Colorado 97 5.253 9.3648 .9508
All tests are conducted at the 95% confidence level, so the α value is defined as 0.05.
GPA Variable
In Levene’s Test, F=0.27 and Sig(F)=0.869 > α, so the conclusion is “Do not reject H0”
(equal variances), and the “Equal variances assumed” row is used.
With Sig(2-tailed)=0.692 > α, the conclusion is “Do not reject H0”: there is no significant
difference in means. In other words, there is no evidence of a difference in GPA between the
two universities. In the Group Statistics table, although Colorado has a higher mean GPA than
Oakland, the T-test results show that this difference is not statistically significant at the
5% level.
Working Hours Variable
In Levene’s Test, F=7.687 and Sig(F)=0.006 < α, so the conclusion is “Reject H0” (unequal
variances), and the “Equal variances not assumed” row is used.
With Sig(2-tailed)=0.000 < α, the conclusion is “Reject the null hypothesis”, meaning that the
time Oakland's students spend on paid jobs is not equal to that of Colorado's students. The
sample means above already differ markedly, and the T-test confirms that the population means
also differ significantly. Therefore, it can be concluded that working hours at the two
universities are unequal.
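The two t-tests above can be reproduced from the Group Statistics table alone. The sketch below is an illustration rather than the SPSS procedure itself: it assumes the pooled-variance formula for the “Equal variances assumed” row (GPA) and Welch's formula for the “Equal variances not assumed” row (Working Hours). The function names are hypothetical.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t-statistic assuming equal variances; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2)), df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t-statistic without assuming equal variances; returns (t, df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Values copied from the Group Statistics table (Oakland first, then Colorado).
t_gpa, df_gpa = pooled_t(3.15525, 0.468048, 61, 3.18710, 0.505444, 97)
t_hours, df_hours = welch_t(23.230, 13.1889, 61, 5.253, 9.3648, 97)

print(f"GPA:           t = {t_gpa:.3f}, df = {df_gpa}")
print(f"Working Hours: t = {t_hours:.3f}, df = {df_hours:.1f}")
```

The GPA statistic is small in absolute value, while the Working Hours statistic is far beyond any conventional critical value, matching the two conclusions above.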
2. Z-test to compare variables
H0: p1 = p2
H1: p1 ≠ p2
Here p1 is the proportion observed in “Working Students of Oakland” with sample size n1, and
p2 is the proportion observed in “Working Students of Colorado” with sample size n2.
At the 95% confidence level (α = 5%), the critical z-score is Z_{α/2} = 1.96. Solving the
Z-test formula gives Zstat = 6.4957. By the rejection rule for a two-tailed test,
|Zstat| > Z_{α/2}, so we reject the null hypothesis and accept the alternative: the
proportions of working students at Colorado and Oakland are not equal.
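The rejection rule above can be sketched in code. The report does not list the number of working students in each university, so the counts below are hypothetical placeholders chosen only to show the mechanics; they will not reproduce Zstat = 6.4957. The pooled two-proportion formula is also an assumption, since the original formula is not shown.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z-statistic for H0: p1 = p2 (assumed form of the Z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

# HYPOTHETICAL counts of working students: 50 of 61 (Oakland), 30 of 97 (Colorado).
z_stat = two_proportion_z(50, 61, 30, 97)
reject_h0 = abs(z_stat) > z_critical

print(f"critical value = {z_critical:.2f}")
print("reject H0:", reject_h0)
```

Deriving the critical value from α instead of hard-coding 1.96 keeps the rule reusable at other significance levels.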
3. Regression Model
Coefficientsa
Standardized
Unstandardized Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) 3.220 .060 53.932 .000
Working Hours .000 .003 .003 .043 .966
Time using Cellphone -5.498E-5 .000 -.146 -1.827 .070
Using Electronic Textbook -.030 .092 -.026 -.322 .748
Regression Line:
GPA = 3.220 + 0.000 × Working Hours + (-5.498E-5) × Time Using Cellphone + (-0.030) × Using Electronic Textbook + ε
This table describes the degree of influence of each of the 3 independent variables below
on the dependent variable GPA:
(1) Working Hours: With the other variables held constant, if a student's working hours
increase by 1 hour, the average GPA does not change (the coefficient rounds to 0.000).
This agrees with the conclusion from the Correlation Coefficient in Part B that Working
Hours has no effect on students' GPA.
(2) Time Using Cellphone: With the other variables held constant, if a student's time using
the cell phone increases by 1 minute, the average GPA decreases by 5.498E-5 points.
(3) Using Electronic Textbook: With the other variables held constant, the average GPA of
those who have used an Electronic Textbook is 0.030 points lower than that of those
who have never used one.
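The three interpretations above amount to evaluating the fitted equation at different inputs. The following minimal sketch uses the unstandardized B coefficients from the Coefficients table. The function and key names are illustrative, and the electronic textbook variable is assumed to be coded 1 for “used” and 0 for “never used”.

```python
# Unstandardized B coefficients copied from the Coefficients table.
COEFFICIENTS = {
    "constant": 3.220,
    "working_hours": 0.000,
    "cell_minutes": -5.498e-5,
    "used_etextbook": -0.030,  # assumed coding: 1 = has used, 0 = never used
}


def predict_gpa(working_hours, cell_minutes, used_etextbook):
    """Predicted GPA from the fitted regression line (error term omitted)."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["working_hours"] * working_hours
        + COEFFICIENTS["cell_minutes"] * cell_minutes
        + COEFFICIENTS["used_etextbook"] * int(used_etextbook)
    )


baseline = predict_gpa(working_hours=10, cell_minutes=1000, used_etextbook=False)
one_more_minute = predict_gpa(working_hours=10, cell_minutes=1001, used_etextbook=False)
with_etextbook = predict_gpa(working_hours=10, cell_minutes=1000, used_etextbook=True)

print(f"baseline prediction:     {baseline:.5f}")
print(f"effect of +1 minute:     {one_more_minute - baseline:.3e}")
print(f"effect of e-textbook:    {with_etextbook - baseline:.3f}")
```

Changing one input while holding the others fixed isolates each coefficient, which is exactly the “other variables held constant” reading used in the list above.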
The column labeled Standardized Coefficients Beta illustrates the extent of each
independent variable's impact on the dependent variable (GPA). Sorted from strongest to
weakest by absolute value, the influence of the independent variables on GPA is: Time
using cell phone (0.146), Using Electronic Textbook (0.026), Working Hours (0.003).
To generalize to the population, the significance of the values calculated above must be
tested. A T-test for significance is used, with the hypotheses
H0: βi = 0
H1: βi ≠ 0
As shown in the Sig. column, the P-values of all 3 independent variables, Working Hours
(0.966), Time Using Cellphone (0.070), and Using Electronic Textbook (0.748), are higher
than the α value (0.05), so H0 is not rejected. That is, none of the included independent
variables has a statistically significant impact on a student's GPA.
Model Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
a
1 .147 .022 .002 .489468
a. Predictors: (Constant), Using Electronic Textbook, Time using Cellphone Minutes, Working Hours
The regression equation based on the independent variables explains 2.2% of the variation in
the dependent variable (GPA), as represented by R Square. Adjusted R Square, which corrects
for the number of predictors, is 0.2%, which means that in reality only about 0.2% of the
variation in GPA is explained by the independent variables.
ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression .812 3 .271 1.130 .339b
Residual 36.895 154 .240
Total 37.708 157
a. Dependent Variable: GPA
b. Predictors: (Constant), Using Electronic Textbook, Time using Cellphone Minutes, Working Hours
H0: β1 = β2 = ...= 0
H1: One or more parameters is unequal to 0
The Sig. column shows a p-value (0.339) > α (0.05), so H0 is not rejected. This shows that
the model as a whole has no significant predictive capability, regardless of the F-value.
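The Model Summary and ANOVA figures are linked: R Square, Adjusted R Square, the F-value, and the Std. Error of the Estimate can all be derived from the two sums of squares. The sketch below shows that derivation with n = 158 observations and k = 3 predictors; the function name is illustrative.

```python
import math


def regression_fit_summary(ss_regression, ss_residual, n, k):
    """Derive fit statistics from ANOVA sums of squares (n cases, k predictors)."""
    ss_total = ss_regression + ss_residual
    r_square = ss_regression / ss_total
    adjusted = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
    mean_square_error = ss_residual / (n - k - 1)
    f_value = (ss_regression / k) / mean_square_error
    return {
        "r_square": r_square,
        "adjusted_r_square": adjusted,
        "f_value": f_value,
        "std_error_of_estimate": math.sqrt(mean_square_error),
    }


# Sums of squares copied from the ANOVA table.
fit = regression_fit_summary(ss_regression=0.812, ss_residual=36.895, n=158, k=3)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

The results round to the table values (.022, .002, 1.130, and about .489), up to the rounding already present in the printed sums of squares.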
4. Comparison of Summary statistics in Part B & Hypothesis Testing in Part C
Specifically, in this coursework, the T-test and Z-test are applied with an accepted
significance level, in most cases 5%. In the summary statistics, the mean GPAs of students
at the two schools are recorded as unequal. However, the T-test at α = 0.05 shows no
statistically significant difference in GPA between Colorado and Oakland.
5. Comparison using Regression Analysis in Part C and correlation coefficients in Part B
Correlation is used to show the relationship between any two variables. When a more in-depth
examination is required of how an independent variable affects the dependent one, Regression
analysis is a more effective and reasonable method. In correlation, X and Y can be
interchanged and give the same outcome; this is not valid in regression analysis. Regression
Analysis also provides more detailed information that Correlation does not reflect.
(1) Regression analysis models a directional relationship from the independent variable
to the dependent variable. (2) Regression analysis is a foundation for generating
predictions and selecting an appropriate optimization method. Whereas Correlation analysis
merely summarizes the distribution of data on a scatter plot, Regression analysis is depicted
as a line with equation Y = a + bX, which allows studying how the dependent variable Y
responds when the independent variable X increases or decreases by 1 unit.
Specifically, as analyzed in Part B, correlation shows the relationship between GPA and Time
using cellphone only as a “Negligible Correlation”, without indicating which variable impacts
which. The regression analysis in Part C specifies the direction, although the impact of Time
using cellphone on GPA is extremely low (standardized Beta of -0.146) and not statistically
significant (p = 0.070): if students' time using cellphone increases by 1 minute, the average
GPA decreases by 5.498E-5 points.
III. CONCLUSION
In summary, numerous computation and evaluation methods have been applied in this research
work to assess a range of data on students from Colorado and Oakland. These methods and tools
also help to evaluate the relationships between variables in the data table.
IV. REFERENCES