
Introduction
Major findings
Part A. Evaluating business and economic data obtained
  I. Data collection methods
    1. Quantitative data
    2. Qualitative data
  II. Source of data
  III. Benefits and drawbacks of data sources
  IV. Methods for analyzing data
Part B. Communicating findings using appropriate charts, or tables
  I. Cleaning the data
    1. Qualitative data
    2. Quantitative data
  II. Summary statistics, tables, charts to explore each variable
    1. Qualitative variables
      1.1. Type
      1.2. Furnished
      1.3. Level_group
    2. Quantitative variables
      2.1. Price
      2.2. Bedrooms
      2.3. Bathrooms
      2.4. Area
    3. The relationship between “Price” and the remaining variables
      3.1. The dependence of price on Quantitative variables
      3.2. The dependence of price on Qualitative variables
    4. Evaluation of various types of tables and charts
Part C. Analyzing and evaluating “House Price Data Project 1” data
  I. T-test
  II. Regression analysis
  III. Evaluate the use of summary
  IV. Differences between regression analysis and correlation coefficients
References
Introduction
This study has three parts. Part A reviews 'House Price Data Project 1' together with
business and economic data. Part B presents the information using the proper charts and
tables for each variable. Part C analyzes and evaluates the business data using various
statistical techniques.
Major findings
Part A. Evaluating business and economic data obtained
I. Data collection methods
Information gathered for the purposes of measurement, observation, or description is
known as data. Data is important to every business. Data collection occurs when a
company collects and analyzes valuable information (names, email addresses, consumer
feedback, and website analytics) from various sources to build strategies, for example to
create enticing marketing campaigns, find out more about its consumers, or make a budget.
There are a variety of methods for collecting data, but it is crucial that whichever
method is used yields trustworthy, consistent results. Quantitative and qualitative
data are the two most common forms of data collected.
1. Quantitative data
Quantitative data is information that can be expressed as a count or a number, in which
each data point has a specific numerical value. Researchers can use it for data analysis
and mathematical computations and base real-world decisions on the resulting figures.
The price of an item is an example of quantitative data. Because quantitative data can be
analyzed numerically with ease, it makes the assessment of numerous parameters more
manageable. In order to collect information for statistical purposes, researchers
often conduct polls, surveys, or questionnaires directed at a specific subset of the
population. Such studies can confirm outcomes for the group studied. There are a variety
of collection methods for quantitative data, including surveys, one-on-one interviews, and
observations.
• Surveys and questionnaires: Online methods have progressively replaced the paper-based
methods that were once used to collect data. These surveys typically include
closed-ended questions, since they are better at gathering comparable information:
respondents choose the response they believe best fits each question. Surveys are a
crucial way to get input from larger-than-usual audiences. A key property of
questionnaires is that the replies can be generalized to the full population without
substantial disparities. For instance, participants may be asked to rate their
feelings on a scale of 0 to 5.
• One-on-one interviews: Although it used to be done in person, this quantitative data
collection approach has shifted to phone and internet platforms. Marketers have the
chance to collect detailed data from individuals during conversations. Quantitative
interviews are an important way of acquiring data, and they are highly structured.
Currently there are three types of interviews: face-to-face interviews,
online interviews, and computer-assisted personal interviews.
• Observations: This is a simple and straightforward collection method in which the
observer just keeps an eye on the participants and events in their natural
environment. Researchers can also observe participants' own judgements and reactions
towards issues in a structured setting such as a lab or focus group.
2. Qualitative data
Qualitative data approximates and describes. It can be observed and recorded, but it
does not take a numerical form. Focus groups, individual interviews, observation, and
other methods are used to collect this kind of data. In analytics, data that can be sorted
into groups based on their features and qualities are called categorical variables or
qualitative data. Qualitative data helps to determine how prevalent particular features
or qualities are. It lets researchers and statisticians set up categories for monitoring
large data sets, and it helps observers evaluate their environment. The four most
common qualitative data collection methods are:
• One-on-one interviews: This is among the most often used data gathering instruments
for descriptive studies. The interviewer or researcher gathers information from the
interviewee directly, one-on-one. The interview might be conversational, casual, and
unstructured. Most questions are open-ended and are posed on the spot by the
interviewer, who lets the natural flow of the interview dictate which ones are asked.
• Focus group: This method takes the form of a group conversation. Each group has six to
ten people at most, with a moderator in charge of guiding the conversation. The
members of a group may share a trait, depending on the data being collected.
• Record keeping: This technique uses existing, trustworthy files and relevant
information sources as its data, which can be useful in new studies. It is comparable
to going to the library, where one might consult the books and other resources to
obtain pertinent information for a study.
• Process of observation: By immersing themselves in the environment where respondents
are, the lead researcher can observe the individuals and take notes while
gathering qualitative information. Other forms of documentation can be used in
addition to taking notes, such as recording sound and video, taking photos, and the
like.
II. Source of data
There is no way to do statistical analysis of a business without first collecting relevant
data. In the world of data collection, there are two major types: primary data and
secondary data. Data gathered for the first time by a researcher is called primary data,
whereas secondary data is data that has already been collected by someone else. Existing
sources, observational studies, and experiments are the three kinds of data sources.
• Existing sources: The information in this kind of source has already been researched,
examined, or tested to ensure its validity. Every company has a database of its
own, so each one may keep data as well as private details, such as client
information, sales, earnings, incentives, and salaries. Nowadays, big businesses gather
and keep data in order to sell or rent it. Additionally, the data collected from
organizations and groups is quite diverse, and websites are also a good resource for
data.
• Observational studies: This is an important source of data in which the observer
records what happens in its natural setting. One or more variables are gathered by
observers and analyzed, and the analytical findings are then put to use. To stock the
appropriate items, for instance, supermarkets must gather customer data to determine
purchasing patterns. Using this information, managers can build successful business
plans by looking at client preferences and the timing of purchases.
• Experiment: Experiments take place in a controlled environment, as opposed to
observational studies. Because the setting is controlled, as in a lab, the data
collected can be more accurate and trustworthy than data collected by observation. The
experimenter chooses a variable of interest and assesses how the other factors
affect it. In business, information can thus be gathered and analyzed prior to mass
manufacturing.
III. Benefits and drawbacks of data sources
Source                  Advantages                          Disadvantages
Observational studies   - More accurate                     - Time-consuming process
                        - More control over data            - Requires more labor
                        - Privacy is maintained             - Feedback may be found
Experiment              - Extends beyond a single industry  - Human mistakes happen
                        - Clearly presented information     - Significant time, money, and effort
                        - High regularity and reliability     are needed
                                                            - Subjective outcomes
Existing sources        - Timesaving                        - It may not be tailored to your
                        - Analysis over time                  requirements
                        - The information may be gathered   - The quality of the data is out of
                          by anybody                          your hands
                                                            - You are not the owner of the
                                                              information
IV. Methods for analyzing data
To analyze data in statistics, the two most common approaches are descriptive analysis
and inferential analysis.
Descriptive analysis explains, presents, or summarizes data points in a useful fashion so
that patterns in the data can emerge. Raw data becomes easier to understand and interpret
once it is transformed into a form that lets one reorganize, order, and manipulate it to
gain insights into the information presented. This is one of the most crucial phases in a
statistical analysis. From it you can draw inferences about the distribution of your
data, find errors and outliers, and compare the patterns of different variables to get a
better idea of what to do next in a more in-depth statistical examination.
Inferential analysis is the other category of statistics. This method is analytically more
difficult since it deals with estimates and predictions drawn from a sample. The
investigator relies on estimation and a significance level (typically α = 0.05).
Estimation can be separated into two categories: point estimates, which present a single
value, and interval estimates, which present a lower and an upper bound. Additionally,
hypothesis testing makes it possible to forecast and draw conclusions about trends in
large populations for both qualitative and quantitative data. Using this method, the
analyst can ascertain the relationship between the variables. As a result, organizations
may rely on their statistical findings to develop hypotheses and workable business
solutions.

The differences between descriptive analysis and inferential analysis

Descriptive analysis
- Researchers use descriptive statistics on the raw population data.
- Descriptive statistics are used when a closer look isn't necessary.
- Mean, median, and mode are characteristics of a sample population that are measured
  using a statistical technique called descriptive statistics.
- There are limitations to this kind of statistical analysis. It may be used, perhaps,
  with really estimated data.
- Since no assumptions are being made about the reliability of the raw population data,
  descriptive statistics are typically completely reliable.
- However, this method does not establish any solutions or forecasts.

Inferential analysis
- When researchers need to analyze data from a large, unstructured population, they often
  resort to inferential statistics.
- Here, a testing procedure is necessary since the analysis is conditional on the test
  limits.
- There are no clearly defined statistical bounds for the properties of the data under
  investigation in inferential statistics.
- When the data under scrutiny is a representative sample of the population at large,
  this technique is often used to analyze the whole population.
- Conversely, inferential statistics rely on sample-dependent theories or inferences.

Part B. Communicating findings using appropriate charts, or tables


I. Cleaning the data
1. Qualitative data

Statistics

Type Furnished Level

N Valid 501 501 501

Missing 0 0 0

According to the statistics table, the three variables Type, Furnished, and Level have no
missing values, so no fixing or cleaning of these data is needed.
2. Quantitative data
Statistics
             The price of   Number of   Number of   The area of
             property       bedrooms    bathrooms   property by m2
N  Valid     501            501         501         500
   Missing   0              0           0           1

Like the qualitative variables, each quantitative variable has 501 valid values, except
“Area”, which has 500 valid values and one missing value.

Based on the boxplot of "Price", many values lie above the upper whisker, including three
extreme outliers. Therefore, to obtain better data, I will delete all Price outliers using
the IQR method.
On the two boxplots of bedrooms and bathrooms, there are not many outliers: about six
in the bedrooms boxplot and three in the bathrooms boxplot. I will keep them because
they have little impact on my data.
As for the variable "Area", I decided to keep its outliers and extreme outliers, as the
"Price" variable is used more in later analyses.
Level_group
              Frequency   Percent   Valid Percent   Cumulative Percent
Valid  1.00   271         54.1      54.1            54.1
       2.00   230         45.9      45.9            100.0
       Total  501         100.0     100.0

The variable “Level” was recoded into two groups in SPSS: group 1 includes the Ground,
1st, and 2nd levels, and group 2 includes all other levels. Group 1 has 271 valid
observations (54.1%), and group 2 has 230 (45.9%).
Statistics
                      The price of   Number of   Number of   The Area of the
                      property       bedrooms    bathrooms   property by m2
N            Valid    501            501         501         500
             Missing  0              0           0           1
Percentiles  25 = Q1  650000         2.00        2.00        123.250
             50 = Q2  1600000        3.00        2.00        150.000
             75 = Q3  2500000        3.00        3.00        190.000
Q3 + 1.5IQR           5275000                                290.125

Using the IQR method for Price, Q3 is 2500000 and Q1 is 650000. IQR equals Q3 minus Q1,
that is 1850000, so “Q3 + 1.5IQR” = 2500000 + 1.5 × 1850000 = 5275000. I keep only the
values that are smaller than this upper bound.
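The upper bound used above can be reproduced in a few lines. The following is a minimal Python sketch, not the SPSS procedure itself: the function names `upper_fence` and `drop_high_outliers` and the list `sample_prices` are hypothetical and serve only as illustration, while the quartiles are the ones reported in the Statistics table.

```python
def upper_fence(q1, q3, k=1.5):
    """Return Q3 + k * IQR, the upper bound used to flag high outliers."""
    iqr = q3 - q1
    return q3 + k * iqr


def drop_high_outliers(values, q1, q3):
    """Keep only the values that are smaller than the upper fence."""
    fence = upper_fence(q1, q3)
    return [v for v in values if v < fence]


# Quartiles reported in the Statistics table
price_fence = upper_fence(650000, 2500000)   # 5275000.0
area_fence = upper_fence(123.250, 190.000)   # 290.125

# Hypothetical prices, for illustration only (not taken from the data set)
sample_prices = [91000, 1600000, 5250000, 6000000]
kept = drop_high_outliers(sample_prices, 650000, 2500000)

print(price_fence, area_fence, kept)
```

Only the last hypothetical price (6000000) exceeds the bound of 5275000 and is dropped.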
Statistics
                                  N Valid   N Missing
The type of property              475       0
The price of property             475       0
Number of bedrooms                475       0
Number of bathrooms               475       0
The Area of the property by m2    475       0
In what floor the property is     475       0
Level_group                       475       0

In conclusion, after applying the IQR method, 475 observations remain for each variable
to analyse in the following parts.
II. Summary statistics, tables, charts to explore each variable
1. Qualitative variables
1.1. Type

The type of property
                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Apartment  431         90.7      90.7            90.7
       Duplex     20          4.2       4.2             94.9
       Penthouse  14          2.9       2.9             97.9
       Studio     10          2.1       2.1             100.0
       Total      475         100.0     100.0
In general, there is a clear difference between "Apartment" and the other three types.
There are 431 apartments for sale, accounting for 90.7% of the data. Meanwhile, Duplex,
Penthouse, and Studio account for only 4.2%, 2.9%, and 2.1%
respectively.
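The percent and cumulative percent columns of a frequency table follow directly from the counts. As a hedged illustration, the Python sketch below recomputes them from the reported frequencies; the helper name `frequency_table` is hypothetical, and rounding to one decimal is an assumption chosen to match the table's display.

```python
def frequency_table(counts):
    """Return (label, frequency, percent, cumulative percent) for each category."""
    total = sum(counts.values())
    rows = []
    running = 0
    for label, n in counts.items():
        running += n
        percent = round(100 * n / total, 1)
        cumulative = round(100 * running / total, 1)
        rows.append((label, n, percent, cumulative))
    return rows


# Frequencies reported for "The type of property"
type_counts = {"Apartment": 431, "Duplex": 20, "Penthouse": 14, "Studio": 10}
table = frequency_table(type_counts)

for row in table:
    print(row)
```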
1.2. Furnished

Is the property Furnished or not
              Frequency   Percent   Valid Percent   Cumulative Percent
Valid  No     325         68.4      68.4            68.4
       Yes    150         31.6      31.6            100.0
       Total  475         100.0     100.0
Overall, there is a clear difference between furnished and unfurnished houses. There are
150 furnished houses in total, accounting for only 31.6%. This is slightly less than half
the number of unfurnished houses (325, or 68.4%).
1.3. Level_group

Level_group
              Frequency   Percent   Valid Percent   Cumulative Percent
Valid  1.00   260         54.7      54.7            54.7
       2.00   215         45.3      45.3            100.0
       Total  475         100.0     100.0
After combining the Ground, 1st, and 2nd levels into "Group 1" and the remaining levels
into "Group 2", there is not much difference between the two groups. "Group 1" has
260 observations, accounting for 54.7%, while "Group 2" has only 45 fewer (215) and
accounts for 45.3%.
2. Quantitative variables
2.1. Price

Descriptive Statistics: The price of property
N                       475
Range                   5159000
Minimum                 91000
Maximum                 5250000
Mean                    1673760.52
Std. Deviation          1233539.484
Variance                1521619657583.811
Skewness (Statistic)    .848
Skewness (Std. Error)   .112
Valid N (listwise)      475
The price of each house depends mainly on its type, area, and number of bedrooms and
bathrooms. The range of this variable is quite wide at 5159000, with a maximum value of
5250000 and a minimum value of 91000. The average price is about 1673760, well below the
midpoint of the range, which means that most houses are priced relatively low. The
standard deviation is large, about 1233539.484, and the skewness is 0.848, so the
histogram is skewed to the right.

2.2. Bedrooms

Descriptive Statistics: Number of bedrooms
N                       475
Range                   4
Minimum                 1
Maximum                 5
Mean                    2.74
Std. Deviation          .668
Variance                .446
Skewness (Statistic)    -.112
Skewness (Std. Error)   .112
Valid N (listwise)      475
For the variable "Bedrooms", the largest number of bedrooms among the 475 houses is 5,
and the smallest is 1. The range of this variable is 4, and the average number of
bedrooms is about 2.74. The range is quite small and the standard deviation is low at
0.668. The skewness is -0.112, so the histogram is skewed slightly to the left.
2.3. Bathrooms

Descriptive Statistics: Number of bathrooms
N                       475
Range                   4
Minimum                 1
Maximum                 5
Mean                    2.10
Std. Deviation          .805
Variance                .649
Skewness (Statistic)    .285
Skewness (Std. Error)   .112
Valid N (listwise)      475
The histogram of the "Bathrooms" variable is quite like the "Bedrooms" histogram: both
are close to symmetric. The largest number of bathrooms in a house is 5, and the smallest
is 1. The range of this variable is 4, and the average number of bathrooms across the
475 houses is about 2.10. The range is also quite small and the standard deviation is low
at 0.805. The skewness is 0.285, so the histogram is skewed slightly to the right.
2.4. Area

Descriptive Statistics: The Area of the property by m2
N                       475
Range                   231.0
Minimum                 56.0
Maximum                 287.0
Mean                    154.073
Std. Deviation          49.0784
Variance                2408.690
Skewness (Statistic)    .464
Skewness (Std. Error)   .112
Valid N (listwise)      475
The average area of the houses is 154.073 m², which lies below the midpoint of the
observed range. The smallest area is 56 m², and the largest is 287 m², so the range of
the variable "Area" is quite wide at 231 m². The skewness is 0.464, so the histogram is
skewed to the right.
3. The relationship between “Price” and the remaining variables.
3.1. The dependence of price on Quantitative variables
Correlations (Pearson Correlation; N = 475 and Sig. (2-tailed) = .000 for every pair)
                                 The price of   Number of   Number of   The Area of the
                                 property       bedrooms    bathrooms   property by m2
The price of property            1              .222**      .486**      .434**
Number of bedrooms               .222**         1           .564**      .659**
Number of bathrooms              .486**         .564**      1           .688**
The Area of the property by m2   .434**         .659**      .688**      1
**. Correlation is significant at the 0.01 level (2-tailed).

• R = 0.222: This value lies between 0 and 0.3, so the positive correlation between
price and number of bedrooms is weak enough to be called negligible.
• R = 0.486: Lying between 0.3 and 0.5, this represents a moderate positive correlation
between "Price" and "Bathrooms". Houses with more bathrooms tend to have somewhat
higher prices.
• R = 0.434: Also between 0.3 and 0.5, this shows a moderate positive correlation
between "Price" and "Area". Houses with a larger area tend to have somewhat higher prices.
• Sig. (2-tailed) = P value = 0.000 < α = 0.05: Price is significantly correlated with
area, number of bedrooms, and number of bathrooms, at the 1% significance level.
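To make the coefficient concrete, the sketch below computes Pearson's r by hand and labels its strength with the bands used above (0 to 0.3, then 0.3 to 0.5). It is an illustrative Python sketch only: `pearson_r` and `strength` are hypothetical helper names, the "strong" label for values of 0.5 and above is an assumption, and the small `areas`/`prices` lists are made-up numbers, not the project data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def strength(r):
    """Label |r| with the rule-of-thumb bands used in this report."""
    size = abs(r)
    if size < 0.3:
        return "negligible"
    if size < 0.5:
        return "moderate"
    return "strong"  # assumption: the report gives no label for 0.5 and above


# Hypothetical data, for illustration only
areas = [60, 90, 120, 150, 200]
prices = [400000, 900000, 1500000, 1400000, 2600000]
r = pearson_r(areas, prices)

print(round(r, 3), strength(r))
print(strength(0.222), strength(0.486), strength(0.434))
```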
This scatter plot describes the relationship between price and area: as the area changes,
the price also changes. The points are quite scattered, but they still show an
increasing trend. This demonstrates that the correlation between price and area is
statistically significant but only moderate. The uneven distribution of some points also
shows that price depends only partly on area. In general, larger homes cost more, while
smaller homes cost less.
These two scatter plots depict the relationship between price and the number of bedrooms
and bathrooms respectively. Prices vary widely among houses with the same number of
rooms, so the number of rooms alone does not explain price well. Looking at the two
charts, the more bedrooms and bathrooms a house has, the higher its price tends to be.
However, there are many cases where the price is still very high even though the number
of bedrooms and bathrooms is small, because of type, area, and floor level. Meanwhile, a
few houses with more rooms are even cheaper.
3.2. The dependence of price on Qualitative variables
The price of property
                                  Count   Mean      Median    Mode       Standard Deviation
The type of property  Apartment   431     1576769   1450000   1600000    1160413
                      Duplex      20      2680125   3087500   2000000a   1492732
                      Penthouse   14      3161637   3150000   2000000a   1567976
                      Studio      10      1758342   1750000   91000a     1181125
a. Multiple modes exist. The smallest value is shown

Looking at the table, most of the houses sold are in the "Apartment" category (431
units), with an average selling price of 1576769, a most common price (mode) of 1600000,
and a standard deviation of 1160413. "Duplex" houses, which are more luxurious than
apartments, sold only 20 units, with an average price of 2680125, a mode of 2000000 (the
smallest of several modes), and a standard deviation of 1492732. "Penthouse" sold 14
units at an average price of 3161637, about 480000 more than Duplex, also with a smallest
mode of 2000000 and a standard deviation of 1567976. "Studio" sold only 10 units at an
average price of 1758342, with a smallest mode of 91000 and a standard deviation of
1181125.
The price of property
                                        Count   Mean      Median    Mode      Standard Deviation
Is the property Furnished or not  No    325     1767532   1600000   1600000   1225715
                                  Yes   150     1470589   1116500   400000a   1230016
Level_group                       1.00  260     1710762   1502800   1850000   1273929
                                  2.00  215     1629014   1600000   1600000   1184265
a. Multiple modes exist. The smallest value is shown

According to the data, the average price of unfurnished houses (1767532) is higher than
the average price of furnished houses (1470589). Moreover, 325 unfurnished houses were
sold, 175 more than furnished houses (150). The most common price is 1600000 for
unfurnished houses and 400000 (the smallest of several modes) for furnished houses. The
standard deviations are almost identical: 1230016 for furnished houses and 1225715 for
unfurnished houses.
For the dependence of price on Level, the average selling prices of the two groups are
not too different. The average price is 1710762 for "Group 1" (Ground, 1st, and 2nd
levels), while it is 1629014 for "Group 2" (higher levels). "Group 1" contains 260
houses, most commonly sold for 1850000, and "Group 2" contains 215 houses, most commonly
sold for 1600000. The standard deviation of "Group 1" is 1273929, and the
standard deviation of "Group 2" is 1184265.
4. Evaluation of various types of tables and charts.
Qualitative variables: Frequency tables, bar charts, and pie charts are the most
effective tools for evaluating qualitative variables such as house type, level, and
furnished status. The frequency table shows the number of observations in each category
of a variable. Viewers can quickly grasp the distribution of a variable from bar and pie
charts. They can also compare several categories, or illustrate the trend of each
variable.
Quantitative variables: Summary measures such as minimum, maximum, median, mode, mean,
standard deviation, and variance are used to describe quantitative variables such as
price, area, and the number of bathrooms and bedrooms. Histograms are
often used to represent quantitative variables, allowing the viewer to easily identify
the shape and spread of the data.
Bivariate qualitative variables: A cross table is an effective way to display the
relationship between qualitative characteristics. In this report, tables of price
statistics by category (type, furnished, level group) let readers see how price depends
on each qualitative variable. A clustered bar chart is often used to draw attention to
connections among such variables.
Bivariate quantitative variables: The scatter diagram graphically represents the
direction and strength of the relationship between independent and dependent variables.
The relationship is also summarized as a number, the correlation coefficient between
the two variables.
Part C. Analyzing and evaluating “House Price Data Project 1” data
I. T-test

Independent Samples Test: The price of property
                                             Equal variances   Equal variances
                                             assumed           not assumed
Levene's Test for Equality of Variances
  F                                          .330
  Sig.                                       .566
t-test for Equality of Means
  t                                          -2.452            -2.448
  df                                         473               288.963
  Sig. (2-tailed)                            .015              .015
  Mean Difference                            -296943.289       -296943.289
  Std. Error Difference                      121123.760        121280.485
  95% Confidence Interval of the Difference
    Lower                                    -534950.508       -535648.448
    Upper                                    -58936.070        -58238.130

Based on the data in the table, the following hypotheses about the variances are put forth:
- H0: σ²Furnished = σ²Unfurnished
- H1: σ²Furnished ≠ σ²Unfurnished
F = 0.330
Sig. (F) = 0.566
And α = 0.05
According to the outcome, Sig. (F) = 0.566 > 0.05 => do not reject H0
⇒ H0: σ²Furnished = σ²Unfurnished
⇒ The variances of the two groups are not significantly different, so the "Equal
variances assumed" row is used
The hypotheses about the means are:
- H0: µFurnished = µUnfurnished
- H1: µFurnished ≠ µUnfurnished
Sig. (2-tailed) = 0.015
And α = 0.05
According to the outcome, Sig. (2-tailed) = 0.015 < α = 0.05 => reject H0
⇒ H1: µFurnished ≠ µUnfurnished
⇒ This shows that there is a statistically significant difference between the mean
prices of furnished and unfurnished houses
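The t statistic in the "Equal variances assumed" row can be reproduced from the group counts, means, and standard deviations reported in section 3.2. The Python sketch below uses the standard pooled-variance formula; the function name `pooled_t` is hypothetical, and because the inputs are the rounded values from the table, the result should only be expected to agree with SPSS to about three decimals.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic assuming equal variances.

    Returns (t, degrees of freedom, standard error of the mean difference).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_error
    return t, df, std_error


# Group 1: furnished (n = 150), group 2: unfurnished (n = 325),
# using the rounded means and standard deviations from section 3.2
t, df, se = pooled_t(1470589, 1230016, 150, 1767532, 1225715, 325)

print(round(t, 3), df, round(se, 1))
```

The sign of t is negative because the furnished mean is subtracted from first, matching the negative mean difference in the table.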
Independent Samples Test: The price of property
                                             Equal variances   Equal variances
                                             assumed           not assumed
Levene's Test for Equality of Variances
  F                                          3.722
  Sig.                                       .054
t-test for Equality of Means
  t                                          .719              .724
  df                                         473               466.536
  Sig. (2-tailed)                            .473              .470
  Mean Difference                            81747.982         81747.982
  Std. Error Difference                      113766.828        112982.673
  95% Confidence Interval of the Difference
    Lower                                    -141802.924       -140269.957
    Upper                                    305298.888        303765.921

Based on the table, two hypotheses about the variances are put forward:
- H0: σ²group 1 = σ²group 2
- H1: σ²group 1 ≠ σ²group 2
F = 3.722
Sig. (F) = 0.054
α = 0.05
According to the outcome, Sig. (F) = 0.054 > 0.05 => do not reject H0
⇒ H0: σ²group 1 = σ²group 2
⇒ The variances of the two groups are not significantly different
The hypotheses about the means are:
- H0: µgroup 1 = µgroup 2
- H1: µgroup 1 ≠ µgroup 2
Sig. (2-tailed) = 0.473
α = 0.05
According to the outcome, Sig. (2-tailed) = 0.473 > α = 0.05 => do not reject H0
⇒ H0: µgroup 1 = µgroup 2
⇒ This shows that there is no statistically significant difference between the mean
prices of group 1 and group 2 (of the variable Level_group)
II. Regress analysis
Model Summary(b)
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  Durbin-Watson
1      .539(a)  .291      .283               1044386.614                 1.763
a. Predictors: (Constant), Level_dummy, Furnished_dummy, Number of bedrooms, Number of bathrooms, The Area of the property by m2
b. Dependent Variable: The price of property
Five independent variables are used to explain changes in the dependent variable
(price): the number of bedrooms, the number of bathrooms, the area, the level, and
whether the house is furnished. R² = 0.291, so the independent variables included in
the regression analysis account for only 29.1% of the total variance in the dependent
variable (Price). The remaining 70.9% is attributable to factors beyond the scope of the
study, as well as to random error. The multiple correlation coefficient R = 0.539
indicates a moderate positive correlation between the observed prices and the prices
predicted by the model; it is not itself a share of explained variation.
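As a quick check on how the Model Summary figures relate to each other, the sketch below derives the unexplained share, R, and the adjusted R² from the reported R² = 0.291. It assumes n = 475 observations (the total df of 474 plus one) and k = 5 predictors; the variable names are illustrative.

```python
import math

# Values taken from the report's output; n is inferred from total df = 474.
r_squared = 0.291
n = 475  # number of observations (assumption: total df + 1)
k = 5    # number of independent variables

# Share of variance in price that the model does not explain.
unexplained = 1 - r_squared

# Multiple correlation coefficient: the square root of R squared.
multiple_r = math.sqrt(r_squared)

# Adjusted R squared penalizes the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"Unexplained share: {unexplained:.1%}")
print(f"R = {multiple_r:.3f}")
print(f"Adjusted R squared = {adjusted_r_squared:.3f}")
```

The results agree with the R and Adjusted R Square columns of the Model Summary to three decimals.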

ANOVA(a)
Model 1     Sum of Squares       df   Mean Square         F       Sig.
Regression  209689063344791.800  5    41937812668958.360  38.449  .000(b)
Residual    511558654349934.700  469  1090743399466.812
Total       721247717694726.500  474
a. Dependent Variable: The price of property
b. Predictors: (Constant), Level_dummy, Furnished_dummy, Number of bedrooms, Number of bathrooms, The Area of the property by m2
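The F statistic in the ANOVA table is the ratio of the regression mean square to the residual mean square. The sketch below recomputes it from the sums of squares and degrees of freedom as printed in the table (with the wrapped digits rejoined); the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 209689063344791.8
ss_residual = 511558654349934.7
df_regression = 5
df_residual = 469

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic for the overall significance of the model.
f_statistic = ms_regression / ms_residual

# R squared and the standard error of the estimate follow from the same table.
r_squared = ss_regression / (ss_regression + ss_residual)
std_error_estimate = math.sqrt(ms_residual)

print(f"F = {f_statistic:.3f}")
print(f"R squared = {r_squared:.3f}")
print(f"Std. error of the estimate = {std_error_estimate:.3f}")
```

These values match the F column of the ANOVA table and the R Square and Std. Error of the Estimate columns of the Model Summary.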
- H0: R² = 0 (overall, the model is not significant).
- H1: R² ≠ 0 (overall, the model is significant).
=> Sig. < 0.05: reject H0; R² differs significantly from 0 and the
regression model is appropriate.
=> Sig. > 0.05: do not reject H0; R² is not significantly different from 0 and
the regression model is not appropriate.
In this case, Sig. = P value = 0.000 < α = 0.05 => reject H0
=> Overall, the model is appropriate.
Coefficients(a)
                                 Unstandardized Coefficients  Standardized Coefficients
Model 1                          B            Std. Error      Beta                       t       Sig.
(Constant)                       22169.441    220501.965                                 .101    .920
Number of bedrooms               -367725.580  97997.562       -.199                      -3.752  .000
Number of bathrooms              606607.587   83865.509       .396                       7.233   .000
The Area of the property by m2   7453.628     1524.383        .297                       4.890   .000
Furnished_dummy                  327864.133   103173.011      .124                       3.178   .002
Level_dummy                      31114.835    97263.941       .013                       .320    .749
a. Dependent Variable: The price of property
Estimated regression equation:
Predicted Price = 22169.441 − 367725.580 × Bedrooms + 606607.587 × Bathrooms + 7453.628 × Area + 327864.133 × Furnished + 31114.835 × Level
• Number of bedrooms
β1 = −367725.580: The relationship is negative. Holding the other independent variables
constant, each additional bedroom is predicted to decrease the price by 367725.580.
• Number of bathrooms
β2 = 606607.587: Holding the other independent variables constant, each additional
bathroom is predicted to increase the price by 606607.587.
• The area of the property by m2
β3 = 7453.628: Holding the other independent variables constant, each additional square
meter of total area is predicted to increase the house price by 7453.628.
• Whether the property is furnished or not
β4 = 327864.133: Holding the other variables constant, the price of furnished houses
is on average about 327864.133 units higher than the price of unfurnished houses.
• Which floor the property is on
β5 = 31114.835: Holding the other variables constant, the price of ground, 1st, and 2nd
level houses is on average about 31114.835 units higher than the price of houses on the
remaining levels.
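To make these interpretations concrete, the estimated coefficients can be put into a small prediction function. This is a sketch: the function name, the dictionary keys, and the example house are hypothetical, and the two dummies are assumed to be coded 1 for furnished and for ground/1st/2nd level, as the interpretations above imply.

```python
# Unstandardized coefficients (B) from the Coefficients table.
COEFFICIENTS = {
    "constant": 22169.441,
    "bedrooms": -367725.580,
    "bathrooms": 606607.587,
    "area_m2": 7453.628,
    "furnished": 327864.133,  # dummy: 1 = furnished, 0 = unfurnished (assumed coding)
    "level": 31114.835,       # dummy: 1 = ground/1st/2nd level, 0 = other (assumed coding)
}

def predict_price(bedrooms, bathrooms, area_m2, furnished, level):
    """Predicted price from the estimated regression equation."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["bedrooms"] * bedrooms
        + COEFFICIENTS["bathrooms"] * bathrooms
        + COEFFICIENTS["area_m2"] * area_m2
        + COEFFICIENTS["furnished"] * furnished
        + COEFFICIENTS["level"] * level
    )

# Hypothetical house used only to illustrate the "other variables fixed" idea.
base = predict_price(bedrooms=3, bathrooms=1, area_m2=120, furnished=0, level=0)
one_more_bathroom = predict_price(bedrooms=3, bathrooms=2, area_m2=120, furnished=0, level=0)
furnished_version = predict_price(bedrooms=3, bathrooms=1, area_m2=120, furnished=1, level=0)

print(f"Effect of one more bathroom: {one_more_bathroom - base:.3f}")
print(f"Effect of being furnished:   {furnished_version - base:.3f}")
```

Changing one input while keeping the others fixed reproduces the corresponding coefficient, which is exactly what each β describes.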
Test for the significance of each β:
H0: β = 0
H1: β ≠ 0
β1: t = -3.752, P value = 0.000 < α = 0.05 => reject H0. β1 is statistically
significant: the number of bedrooms has a negative effect on Price.

β2: t = 7.233, P value = 0.000 < α = 0.05 => reject H0. β2 is statistically
significant: the number of bathrooms is significant in explaining selling prices and
has a positive effect on Price.
β3: t = 4.890, P value = 0.000 < α = 0.05 => reject H0. β3 is statistically
significant: the area variable is significant in explaining selling prices and has a
positive effect on Price.
β4: t = 3.178, P value = 0.002 < α = 0.05 => reject H0. β4 is statistically
significant: the furnished variable is significant in explaining selling prices and has
a positive effect on Price.
β5: t = 0.320, P value = 0.749 > α = 0.05 => do not reject H0. This coefficient is not
statistically significant, meaning there is no evidence that the Level variable affects
the dependent variable Price.
In summary, the P values of four variables (number of bedrooms, number of
bathrooms, area, and furnished) are all less than α = 0.05, so their coefficients are
significant. The P value of the Level variable is greater than 0.05, so the Level
variable can be dropped to simplify the model.
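Each t value in the Coefficients table is the coefficient divided by its standard error, and the decision for each β compares the reported P value with α. The sketch below reproduces both steps; the tuple layout and variable names are illustrative, and the P values are copied from the table rather than computed.

```python
ALPHA = 0.05

# (variable, B, standard error, reported Sig.) from the Coefficients table.
coefficient_rows = [
    ("Number of bedrooms", -367725.580, 97997.562, 0.000),
    ("Number of bathrooms", 606607.587, 83865.509, 0.000),
    ("Area (m2)", 7453.628, 1524.383, 0.000),
    ("Furnished_dummy", 327864.133, 103173.011, 0.002),
    ("Level_dummy", 31114.835, 97263.941, 0.749),
]

t_values = {}
significant = {}
for name, b, std_error, sig in coefficient_rows:
    # t statistic for H0: beta = 0.
    t_values[name] = b / std_error
    # Reject H0 when the reported P value is below alpha.
    significant[name] = sig < ALPHA
    decision = "reject H0" if significant[name] else "do not reject H0"
    print(f"{name}: t = {t_values[name]:.3f}, Sig. = {sig:.3f} => {decision}")

# Variables that could be dropped when simplifying the model.
candidates_to_drop = [name for name, keep in significant.items() if not keep]
print("Not significant:", candidates_to_drop)
```

Only Level_dummy ends up in the list of non-significant variables, in line with the summary above.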

III. Evaluate the use of summary


Descriptive statistics are used to summarize and analyze variables such as property type,
furnished status, and floor. Presenting the data concisely in tables and charts
helps the reader understand the information. However, descriptive statistics
alone do not produce solutions or forecasts.
In addition, the data analyst can use inferential statistics to test hypotheses
against the data. However, this approach requires considerable analytical skill.
Because conclusions about a population are inferred from a sample, inferential
statistics can never be 100% accurate.
IV. Differences between regression analysis and correlation coefficients
The correlation coefficient is used to examine the relationship between two variables, such
as price and dwelling area, and to test whether that relationship is positive or
negative. With regression analysis, researchers learn more: how
the dependent variable responds to a change in an independent variable. For instance,
in this model a home with two bathrooms instead of one is predicted to cost about
606607.587 more, with the other variables held constant.
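The contrast can be illustrated with a small sketch. The five area and price values below are hypothetical and are not taken from the data set; the helper functions are written out by hand so that the example needs only the standard library.

```python
import math

# Hypothetical data for illustration only (not from the report's data set).
area = [50, 80, 100, 120, 150]      # m2
price = [400, 620, 790, 950, 1210]  # arbitrary currency units

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Correlation: strength and direction of a linear relationship."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_line(x, y):
    """Simple regression of y on x: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

r = pearson_r(area, price)
slope, intercept = ols_line(area, price)

# Correlation is symmetric: swapping the variables gives the same r.
# The regression slope is not: it states how much price changes per extra m2.
print(f"r = {r:.3f}")
print(f"price = {intercept:.2f} + {slope:.2f} * area")
```

The correlation only reports that the two variables move together, while the regression line attaches a unit change in price to each additional square meter.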
References:
1. Anderson, D.R., Sweeney, D.J., Williams, T.A., Camm, J.D. and Cochran, J.J.
(2018). Statistics for business & economics. Boston, Ma: Cengage Learning.
2. Bhandari, P., 2020. An introduction to qualitative research. [online] Scribbr.
Available at: https://www.scribbr.com/methodology/qualitative-research/ [Accessed
15 November 2022].
3. Bhandari, P., 2020. An introduction to quantitative research. [online] Scribbr.
Available at: https://www.scribbr.com/methodology/quantitative-research/ [Accessed
15 November 2022].
4. Kalish, C. and Thevenow-Harrison, J., 2014. Descriptive and Inferential Problems of
Induction. Psychology of Learning and Motivation, pp.1-39.
5. Kaur, P., Stoltzfus, J. and Yellapu, V., 2018. Descriptive statistics. International
Journal of Academic Medicine, 4(1), p.60.
