
Table of Contents

INTRODUCTION
I. PART A
1) Data sources in Business and Economics
2) Data collection and method
3) Critical evaluation of different data sources
4) Method of data analysis
II. PART B
1. Summary statistics, tables, and charts
a. For qualitative variables
b. For quantitative variables
c. Chart, table, and correlation calculation
d. Summarize different variables
III. PART C
1. T-test
2. Regression model
CONCLUSION
REFERENCES

INTRODUCTION
This assignment analyzes survey data from 158 students at two schools in the United
States to explore student behaviors and characteristics. It applies descriptive and
inferential statistical methods to the survey responses.

I. PART A
1) Data sources in Business and Economics
Primary data: Data collected first-hand by the researchers themselves through methods
such as interviews, observations, and surveys.

Secondary data: Data collected and analyzed by others rather than by the researchers
directly (University of Minnesota, n.d.).

2) Data collection and method


Primary data source

Interview: Individuals communicate directly with each other about the issue, and the
data can be recorded in a variety of ways.

Questionnaire/Survey: A series of questions is created and sent to a group of people
or an organization; the answers come back in a uniform format, which makes the data
consistent (Ainsworth, 2020).

Observation: Data is collected by observing participants; the observers record and
process that data.

Secondary data source

Secondary data is collected from existing, available sources such as books,
journals, reports, or previous research (Kabir, 2016).
3) Critical evaluation of different data sources
Both primary and secondary data sources have benefits and drawbacks, and the
advantages of primary data sources are often the shortcomings of secondary data
sources, and vice versa.

Collection method: Because primary data sources require researchers to gather
data on their own, the data collection procedure may be constrained by the
complexity of the data or the number of participants (Formpl.us, 2020).
Secondary data, on the other hand, is readily available on the internet, in books,
or in reports, making this type of data more accessible (Valcheva, 2020).

Cost and time of collecting data: Primary data takes a lot of time and money to
obtain; secondary data, on the other hand, can be collected almost
instantaneously, and some secondary data sources are free (Valcheva, 2020).

Fit for the research purpose: Primary data can be tailored to the researchers' own
purposes, since they collect it themselves and know what type of data they want.
With secondary data, it can be difficult to locate data that meets the researchers'
needs, and its relevance may be fairly low (Toxplanet.com, 2019).

Data trust level: Because primary data is obtained directly from a specific
population and does not include personal opinion, it is typically more accurate
than secondary data (Formpl.us, 2020).

Timeliness: Secondary data is often less up to date than primary data (Formpl.us, 2020).
4) Method of data analysis
Frost (2020) distinguishes between two types of data analysis: descriptive statistics
and inferential statistics. Descriptive statistics use sample data to construct
charts and tables that convey the most important features of the observed sample.
Inferential statistics also use data from a sample, but they draw conclusions about
the larger population from which the sample was taken, using a variety of research
methods. To summarize, descriptive statistics are used to describe a sample,
whereas inferential statistics are used to draw conclusions about a population.

Descriptive statistics are applied to existing sample data to compute and present
essential statistics that are directly relevant to that sample, with a high level of
accuracy. Inferential statistics, on the other hand, provide information about the
population based on sample results, with a lower level of precision. Furthermore,
the data collection process can be influenced by a variety of factors, including
the collector and external events, all of which can lower the quality of the
findings about the population. Finally, inferential statistics are far more
sophisticated than descriptive statistics (Indeed.com, 2020).
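
The difference between the two approaches can be illustrated with a minimal Python
sketch. The sample values are hypothetical (they are not taken from the survey), and
the confidence interval uses the normal critical value 1.96 as an approximation
rather than an exact t value.

```python
import math
import statistics

# Hypothetical sample of GPA values (illustration only, not survey data).
sample = [3.1, 2.8, 3.6, 3.3, 2.9, 3.8, 3.0, 3.4]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean (assumes z = 1.96).
se = sd / math.sqrt(len(sample))
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se

print(f"sample mean = {mean:.3f}, sample sd = {sd:.3f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.3f}, {ci_high:.3f})")
```

The first two values describe only the eight observations; the interval is a
statement about the unobserved population and is therefore less certain.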
II. PART B

1. Summary statistics, tables, and charts

a. For qualitative variables


University

[Pie chart: University. Colorado 61.4%, Oakland 38.6%]

The pie chart shows the share of surveyed students from each of the two schools:
Colorado (61.4%) has about 1.6 times as many students as Oakland (38.6%).

Accommodation

[Pie chart: Accommodation. Dorm 50%, Parents 27%, Share Apt 15.2%, Solo Apt 6.3%]

Across both schools, 50% of students live in a dorm, 27% live with their parents,
and 15.2% share an apartment, more than double the share of students living alone
in an apartment (6.3%). Accommodation can say a lot about a student's finances and
can therefore shape different habits and characteristics.

Language skill

[Pie chart: Language skill. Fluent 15.8%, Slight 41.1%, Moderate 29.1%, None 13.9%]

Few students are fluent in English, only about 15.8%. The largest group, 41.1%, has
slight language skill, followed by 29.1% with moderate skill, while 13.9% of
students have no language ability.
b. For quantitative variables

Time using Cellphone:

Cellphone use ranges from a minimum of 1 minute to a maximum of 15000 minutes, so
every student uses a cellphone; on average, each student used 730.22 minutes. The
skewness is 8.768 (positive and higher than 1.3), so the data is right long tailed
and the skewness is significant.

Credit card balance:

The largest monthly credit card balance is 12000, and some students have no credit
card balance; on average, each student's credit card balance is 8563. The skewness
is 8.768 (positive and higher than 1.3), so the data is right long tailed and the
skewness is significant.

GPA:

Every student has a GPA; the lowest is 1.600 and the highest is 4.000. On average, a
student's GPA is 3.17480. The skewness is -0.503 (negative, with a magnitude lower
than 1.3), so the data is left short tailed and the skewness is significant.

Work hours

The maximum working time is 60 hours, and some students do not work (0 working
hours). On average, each student works 12.193 hours. The skewness is 0.880
(positive and lower than 1.3), so the data is right short tailed and the skewness
is significant.

CarAge:

The standard deviation is very large, so the values are widely spread and the mean
does not reflect the actual values, because the variable contains outliers.

JobMkt

On a scale of 1-7, the mean rating of the current state of the job market is 5.5728.
The skewness is -1.535 (negative, with a magnitude higher than 1.3), so the data is
left long tailed and, by the same criterion applied to the other variables, the
skewness is significant.

Politics

Each student reported a political orientation; the lowest value is 1 and the highest
is 6.9, with a mean of 3.9369. The skewness is -0.268 (negative, with a magnitude
lower than 1.3), so the data is left short tailed and the skewness is significant.

Religious

The statistics show that some students attend religious services and others do not:
the number of services attended ranges from 0 to 100, and on average a student
attends 11.36 religious services. The skewness is 2.102 (positive and higher than
1.3), so the data is right long tailed and the skewness is significant.

Prefer Electronic:
The statistics show that some students like the electronic version of the textbook
and others do not. Preference is rated on a scale of 1-7; the smallest rating is 1
and the largest is 7, and the mean rating is 3.4508. The skewness is 0.427 (positive
and lower than 1.3), so the data is right short tailed and the skewness is
significant.
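
The skewness readings above follow a simple rule: the sign gives the direction of
the tail, and the magnitude is compared with 1.3. A minimal Python sketch of that
rule is given below. The skewness formula is the adjusted Fisher-Pearson
coefficient, which is assumed here to match what the statistics package reported;
the function names and the sample data are hypothetical.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness; needs at least 3 non-identical values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

def describe_tail(skew, threshold=1.3):
    """Apply the report's rule: sign gives direction, |skew| vs 1.3 gives length."""
    if skew == 0:
        return "symmetric"
    direction = "right" if skew > 0 else "left"
    length = "long" if abs(skew) > threshold else "short"
    return f"{direction} {length} tailed"

# Skewness values reported in this section.
for name, skew in [("Cellphone", 8.768), ("GPA", -0.503), ("JobMkt", -1.535)]:
    print(name, describe_tail(skew))

# Hypothetical data with one large outlier: the skewness comes out positive.
print(sample_skewness([1, 2, 3, 4, 100]))
```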

c. Chart, table, and correlation calculation


For qualitative data

Q1-Q2: Accommodation by University (Table N % and Count)

Accommodation    Colorado            Oakland
                 Table N %   Count   Table N %   Count
Dorm             48.1%       76      1.9%        3
Other            1.3%        2       0.0%        0
Parents          1.3%        2       25.9%       41
Share Apt        8.9%        14      6.3%        10
Solo Apt         1.9%        3       4.4%        7

According to the table above, Q1 and Q2 (university and accommodation) are clearly
related: living arrangements differ sharply between the two schools. In Colorado,
76 students (48.1% of the sample) live in a dorm, compared with only 3 Oakland
students (1.9%). By contrast, only 2 Colorado students (1.3%) live with their
parents, compared with 41 Oakland students (25.9%). The reported correlation is
negative and significant at the 0.01 level; because both variables are nominal, the
sign reflects only how the categories are coded, not the strength of the
relationship.
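
For two nominal variables such as these, a chi-square test of independence is a
common alternative to a correlation coefficient. The sketch below is not the
calculation used in this report; it applies the standard chi-square statistic and
Cramér's V to the counts in the Q1-Q2 table. Two expected counts in the "Other" row
are below 5, so the result should be read as indicative only.

```python
import math

# Counts from the Q1-Q2 table: accommodation -> (Colorado, Oakland).
observed = {
    "Dorm": (76, 3),
    "Other": (2, 0),
    "Parents": (2, 41),
    "Share Apt": (14, 10),
    "Solo Apt": (3, 7),
}

def chi_square(table):
    """Return the chi-square statistic and the total count of a contingency table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n  # count expected under independence
            stat += (obs - expected) ** 2 / expected
    return stat, n

stat, n = chi_square(observed)
df = (len(observed) - 1) * (2 - 1)       # (rows - 1) * (columns - 1) = 4
k = min(len(observed), 2) - 1            # min(rows, columns) - 1 = 1
cramers_v = math.sqrt(stat / (n * k))    # effect size between 0 and 1

CRITICAL_0_01_DF4 = 13.277               # chi-square critical value, alpha = 0.01, df = 4
print(f"chi-square = {stat:.2f}, df = {df}, Cramér's V = {cramers_v:.2f}")
print("related at the 0.01 level:", stat > CRITICAL_0_01_DF4)
```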

Q17 and Q18


Frequency of Reading by Language Skill (Table N % and Count)

Frequency of    Fluent         Moderate       None           Slight
Reading         N %    Count   N %    Count   N %    Count   N %    Count
Never           3.2%   5       3.8%   6       2.5%   4       7.0%   11
Occasionally    12.0%  19      20.9%  33      8.2%   13      28.5%  45
Regularly       0.6%   1       4.4%   7       3.2%   5       5.7%   9

According to the table above, the correlation is positive and moderate, and it is
significant at the 0.01 level. For every 5 students (3.2%) with fluent language
skill who never read newspapers, there are 6 with moderate language skill.
Likewise, for every 1 student (0.6%) with fluent language skill who reads
newspapers regularly, there are as many as 7 (4.4%) with moderate language skill.
In general, the more regularly students read newspapers, the higher their language
skill level.

For quantitative data

Q4 vs Q9
The correlation between time using cellphone (Q4) and GPA (Q9) is negative
(r(Q4, Q9) = -0.415) and not significant: Sig. (2-tailed) > alpha (0.07 > 0.05), so
students with different amounts of cellphone time have quite similar GPAs. In
summary, there is no statistically significant correlation between these two
variables.

Q9 vs Q10
Correlations

                                      GPA     Working Hours
GPA             Pearson Correlation   1       .002
                Sig. (2-tailed)               .985
                N                     158     158
Working Hours   Pearson Correlation   .002    1
                Sig. (2-tailed)       .985
                N                     158     158

The correlation in this case is positive (r(Q9, Q10) = 0.002) and it is not
significant: Sig. (2-tailed) > alpha (0.985 > 0.05), so GPA and working hours are
not correlated.
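
The reading of a correlation table can be sketched in Python as follows. The
Pearson formula and the t statistic for testing r = 0 are standard; the helper
names and the small data set are hypothetical. Turning t into a p-value needs the
t distribution, which the standard library does not provide, so the decision step
uses the Sig. value reported in the table.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def is_significant(sig_two_tailed, alpha=0.05):
    """A correlation is significant only when Sig. (2-tailed) is below alpha."""
    return sig_two_tailed < alpha

# Values reported for GPA vs Working Hours: r = .002, N = 158, Sig. = .985.
print(t_statistic(0.002, 158))   # very close to 0
print(is_significant(0.985))     # False: the variables are not correlated

# Hypothetical data to show pearson_r on a perfectly linear relationship.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```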

d. Summarize different variables

GPA by Language Skill

Language Skill   Mode    Mean    Median   Variance   Standard Deviation
Fluent           3.000   3.233   3.200    .249       .499
Moderate         3.300   3.283   3.376    .202       .450
None             3.600   3.130   3.150    .233       .483
Slight           2.800   3.091   3.160    .259       .509

For students with moderate language skill, the mean GPA was 3.283, with a low
standard deviation (0.450). For students with slight language skill, the mean GPA
is 3.091 and the standard deviation (0.509) is higher than that of students with
moderate language skill.

GPA by University

University   Standard Deviation   Variance   Mode    Mean    Median
Colorado     .505                 .255       3.300   3.187   3.244
Oakland      .468                 .219       3.000   3.155   3.160

For Colorado students, the mean GPA was 3.187 with a low standard deviation (0.505),
while for Oakland students, the mean GPA was 3.155 with a low standard deviation
(0.468).
Thus, GPA differs somewhat between students with different language skill levels,
but it differs very little between the two universities.
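
Grouped summaries like the two tables above can be produced with Python's
standard library. This is a minimal sketch: the records are hypothetical (they are
not the survey data) and the function name is an assumption made for illustration.

```python
import statistics
from collections import defaultdict

# Hypothetical (group, GPA) records; the real survey data is not reproduced here.
records = [
    ("Fluent", 3.0), ("Fluent", 3.4), ("Fluent", 3.0),
    ("Slight", 2.8), ("Slight", 3.2), ("Slight", 2.8),
]

def summarize_by_group(rows):
    """Return mode, mean, median, variance and standard deviation per group."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        group: {
            "mode": statistics.mode(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "variance": statistics.variance(values),  # sample variance (n - 1)
            "sd": statistics.stdev(values),           # sample standard deviation
        }
        for group, values in groups.items()
    }

summary = summarize_by_group(records)
for group, stats in summary.items():
    print(group, {name: round(value, 3) for name, value in stats.items()})
```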

III. PART C

1. T-test
Independent Samples Test: Working Hours
(F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns
belong to the t-test for Equality of Means)

                                                                Sig.         Mean         Std. Error   95% CI of the Difference
                              F       Sig.   t        df        (2-tailed)   Difference   Difference   Lower       Upper
Equal variances assumed       7.687   .006   10.006   156       .000         17.9769      1.7965       14.4282     21.5256
Equal variances not assumed                  9.276    97.926    .000         17.9769      1.9380       14.1311     21.8228

H0: Average working time (Students working) = Average working time (Students not
working)

H1: Average working time (Students working) ≠ Average working time (Students not
working)

F=7.687, SIG(F)=0.006

For the variable working hours, the Sig. value in Levene's test is Sig(F) = 0.006 <
α = 0.05, which means that the variances of the two populations are not equal, so
the results in the row Equal variances not assumed will be used. The Sig. value of
the t-test is 0.000 < α = 0.05 => reject H0, which means that the average working
time of students who work is not equal to the average working time of students who
do not work.

Independent Samples Test: GPA
(F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns
belong to the t-test for Equality of Means)

                                                               Sig.         Mean         Std. Error   95% CI of the Difference
                              F      Sig.   t       df         (2-tailed)   Difference   Difference   Lower       Upper
Equal variances assumed       .027   .869   -.397   156        .692         -.031857     .080299      -.190471    .126757
Equal variances not assumed                 -.404   134.921    .687         -.031857     .078899      -.187896    .124181

H0: Average GPA (Students with work) = Average GPA (Students not working)

H1: Average GPA (Students with work) ≠ Average GPA (Students not working)

For the GPA variable, the Sig. value in Levene's test is 0.869 > α = 0.05 => the
variances of the two populations are equal, so the results in the row Equal
variances assumed will be used. The Sig. value of the t-test is 0.692 > α = 0.05 =>
do not reject H0, which means there is no significant difference between the mean
GPA of working students and the mean GPA of non-working students.
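
The two-step reading of an independent-samples table (Levene's test first, then
the matching t-test row) can be written as a short Python sketch. The function
names are hypothetical; the numbers are the ones reported in the two tables above.
The sketch also checks that each reported t equals the mean difference divided by
its standard error.

```python
def choose_row(levene_sig, alpha=0.05):
    """Step 1: Levene's Sig. decides which row of the output to read."""
    if levene_sig > alpha:
        return "Equal variances assumed"
    return "Equal variances not assumed"

def reject_h0(t_sig, alpha=0.05):
    """Step 2: reject H0 (equal means) only when the t-test Sig. is below alpha."""
    return t_sig < alpha

def t_value(mean_difference, std_error_difference):
    """The t statistic is the mean difference divided by its standard error."""
    return mean_difference / std_error_difference

# Working Hours: Levene Sig. = .006, so the second row applies (t Sig. = .000).
print(choose_row(0.006), reject_h0(0.000), round(t_value(17.9769, 1.9380), 3))

# GPA: Levene Sig. = .869, so the first row applies (t Sig. = .692).
print(choose_row(0.869), reject_h0(0.692), round(t_value(-0.031857, 0.080299), 3))
```

For working hours the procedure rejects H0; for GPA it does not.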

2. Regression model

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .147a   .022       .002                .489468

a. Predictors: (Constant), Using Electronic Textbook, Time using Cellphone Minutes,
Working Hours

Coefficients (a)

                               Unstandardized Coefficients   Standardized Coefficients
Model 1                        B            Std. Error       Beta                        t        Sig.
(Constant)                     3.220        .060                                         53.932   .000
Time using Cellphone Minutes   -5.498E-5    .000             -.146                       -1.827   .070
Working Hours                  .000         .003             .003                        .043     .966
Using Electronic Textbook      -.030        .092             -.026                       -.322    .748

a. Dependent Variable: GPA

Regression line:
GPA = β0 + β1 × Working Hours + β2 × Cellphone Minutes + β3 × E-Textbook + ε
GPA = 3.220 + 0.000 × Working Hours + (-5.498E-5) × Cellphone Minutes + (-0.030) × E-Textbook + ε
Based on the table above, the effect of each independent variable on the dependent
variable (GPA), holding the other variables constant, is as follows:
- Working hours: if a student works 1 additional hour, GPA does not change
(B = .000). This agrees with the correlation in Part B: working hours have no effect
on GPA.
- Cellphone: if a student's phone time increases by 1 minute, mean GPA decreases by
about 0.000055 points (B = -5.498E-5).
- Electronic textbooks: e-textbook users have a mean GPA 0.030 points lower than
non-users.
Ranked by the absolute value of the standardized coefficient (Beta column), the
independent variables influence GPA in the order: cellphone time (0.146),
e-textbook (0.026), and finally working hours (0.003).
The p-values of Working Hours (0.966), Time using Cellphone (0.070), and Using
Electronic Textbook (0.748) are all greater than α (0.05) => do not reject H0 for
any coefficient: none of the three variables has a statistically significant effect
on GPA. This is consistent with R Square = .022, which means the model explains only
about 2.2% of the variation in GPA.
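
The fitted regression line can be turned into a small prediction function. This is
a sketch under one assumption that the report does not state explicitly: Using
Electronic Textbook is treated as a 0/1 variable (1 = user). The function name is
hypothetical; the coefficients are the B values from the Coefficients table.

```python
# Unstandardized coefficients (B) from the Coefficients table.
INTERCEPT = 3.220
B_CELLPHONE_MINUTES = -5.498e-5
B_WORKING_HOURS = 0.000
B_ELECTRONIC_TEXTBOOK = -0.030

def predict_gpa(cellphone_minutes, working_hours, uses_etextbook):
    """Predicted GPA from the fitted line; uses_etextbook is assumed to be 0/1."""
    return (
        INTERCEPT
        + B_CELLPHONE_MINUTES * cellphone_minutes
        + B_WORKING_HOURS * working_hours
        + B_ELECTRONIC_TEXTBOOK * (1 if uses_etextbook else 0)
    )

# Effect of one extra minute of cellphone use, other variables held constant.
one_minute_effect = predict_gpa(101, 10, False) - predict_gpa(100, 10, False)
print(f"{one_minute_effect:.8f}")

# Hypothetical student: 1000 cellphone minutes, 10 working hours, e-textbook user.
print(round(predict_gpa(1000, 10, True), 3))
```

Because none of the coefficients is significant and R Square is only .022, such
predictions stay very close to the intercept and carry little practical
information.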

CONCLUSION
In summary, the essay employed a number of calculation and assessment approaches to
examine the disparities in student characteristics depending on their habits. Furthermore,
these approaches and tools assist the author in evaluating the relationship between the
variables as well as the influence of the independent variable on the dependent variable.

REFERENCES

1) University of Minnesota, n.d. Data Sources | CYFAR. [online] Cyfar.org. Available
at: <https://cyfar.org/data-sources> [Accessed 10 December 2020].

2) Formpl.us, 2020. 7 Data Collection Methods & Tools For Research. [online]
Formplus Blog. Available at: <https://www.formpl.us/blog/data-collection-method>
[Accessed 11 December 2020].

3) Ainsworth, Q., 2020. Data Collection Methods. [online] JotForm. Available at:
<https://www.jotform.com/data-collection-methods/> [Accessed 11 December 2020].

4) Kabir, S., 2016. Basic Guidelines For Research: An Introductory Approach For All
Disciplines. 1st ed. Chittagong-4203, Bangladesh: Book Zone Publication, pp.201-275.

5) Formpl.us, 2020. Primary Vs Secondary Data: 15 Key Differences & Similarities.
[online] Formplus Blog. Available at:
<https://www.formpl.us/blog/primary-secondary-data> [Accessed 13 December 2020].

6) Frost, J., 2020. Difference Between Descriptive And Inferential Statistics -
Statistics By Jim. [online] Statistics By Jim. Available at:
<https://statisticsbyjim.com/basics/descriptive-inferential-statistics/> [Accessed 15
December 2020].

7) Indeed.com, 2020. Descriptive Vs. Inferential Statistics: What's The Difference?
[online] Indeed Career Guide. Available at:
<https://www.indeed.com/career-advice/career-development/descriptive-vs-inferential-statistics>
[Accessed 15 December 2020].

8) Young, J., 2020. Frequency Distribution. [online] Investopedia. Available at:
<https://www.investopedia.com/terms/f/frequencydistribution.asp> [Accessed 14
December 2020].
