
ASSIGNMENT

TECHNOLOGY PARK MALAYSIA

CT127-3-2-PFDA

PROGRAMMING FOR DATA ANALYSIS

TYPE INTAKE CODE

HAND OUT DATE: 10 OCTOBER 2022

HAND IN DATE: 28 NOVEMBER 2022

WEIGHTAGE: 50%    

INSTRUCTIONS TO CANDIDATES:

1 Submit your assignment at the administrative counter.

2 Students are advised to underpin their answers with the use of references
(cited using the American Psychological Association (APA) Referencing).

3 Late submission will be awarded zero (0) unless Extenuating
Circumstances (EC) are upheld.

4 Cases of plagiarism will be penalized.

5 The assignment should be bound in an appropriate style (comb bound or
stapled).

6 Where the assignment should be submitted in both hardcopy and
softcopy, the softcopy of the written assignment and source code (where
appropriate) should be on a CD in an envelope / CD cover and attached to
the hardcopy.

7 You must obtain 50% overall to pass this module.

        

Name: YUTOISHIUCHI

Version 3.1 EC 2019 -Oct


TP Number: TP061169

Table of Contents
1.0 Introduction
2.0 Data Import
3.0 Data Exploration
4.0 Pre-Processing
5.0 Data Visualization
Question 1: How much does the rent vary depending on the condition of the house?
Analysis 1.1 - Find a diagram showing the relationship between the number of BHKs and rent
Analysis 1.2 - Find a relationship chart between Area Type and Rent
Analysis 1.3 - Examining the relationship between rent and City
Question 2: How do house requirements change depending on the prospective tenant (family or single)?
Analysis 2.1 - Find out the relationship chart between prospective tenants and size
Analysis 2.2 - Examine the relationship between prospective tenants and the number of bathrooms
Analysis 2.3 - Examination of Relationship between Prospective Residents and Currently Occupied Floors
Question 3: How do room sizes, rents, or other needs change with the number of residential floors?
Analysis 3.1 - Finding the Relationship between Number of Residential Floors and Size
Analysis 3.2 - Finding the Relationship between Number of Residential Floors and Rent
Analysis 3.3 - Finding the Relationship between Number of Residential Floors and Point Of Contact
Question 4: How needs change with area location
Analysis 4.1 – Find out the relationship between area and floor level when the rent is over 600000
Analysis 4.2 – To find the relationship between BHK and size of Electronic City Area
Analysis 4.3 – To find the relationship between Bathroom and House floors in K R Puram Area
5.0 Extra Features
1: facet_wrap() function in analysis 3.1
2: Analysis by Pie Chart 2.2
6.0 Conclusion

1.0 Introduction
This dataset contains details on rents for a variety of housing types and is used to determine
what influences people's choice of rental housing based on lifestyle. The recorded factors
include BHK, Rent, Size, No. of Floors, Area Type, Area Locality, and many others. This
project will identify how people choose rental housing based on family structure, locality,
and lifestyle, and provide meaningful insights for decision-making.

2.0 Data Import

Figure 1: Data Import Source Code

A .csv file is imported and stored in the student variable. The argument header = TRUE
indicates that the first row of the file holds the label for each column; if it is set to FALSE,
the first row is read as ordinary data and the columns receive default names.
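
The effect of the header argument can be illustrated with a minimal sketch. The temporary file, the three made-up rows, and the variable names below are assumptions for illustration only; they are not the assignment's actual file or code.

```r
# Write a tiny illustrative CSV to a temporary file (made-up rows, not the real data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("BHK,Rent,Size", "2,10000,1100", "3,25000,1800"), csv_path)

# header = TRUE: the first line supplies the column names.
with_header <- read.csv(csv_path, header = TRUE)

# header = FALSE: the first line is read as an ordinary data row,
# and R assigns the default names V1, V2, V3.
without_header <- read.csv(csv_path, header = FALSE)

print(names(with_header))
print(nrow(with_header))
print(names(without_header))
print(nrow(without_header))
```

With header = FALSE the label row is counted as data, so the second data frame has one more row than the first.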



3.0 Data Exploration

Figure 1.1 and Figure 1.2: Data types for all columns

This data set contains 12 columns and over 4700 rows. The str() function is used to identify
the data type and the values of each column.



Figure 2: Attributes and details of each column

Based on the data types above, summary() is used to show the attributes and details of each
column.
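
As a rough sketch of what these two exploration calls report, the example below runs str() and summary() on a small made-up data frame. The name rent_data, its columns, and its values are assumptions for illustration, not the real data set.

```r
# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  BHK  = c(1L, 2L, 3L, 2L),
  Rent = c(8000, 15000, 40000, 12000),
  City = c("Kolkata", "Mumbai", "Mumbai", "Delhi"),
  stringsAsFactors = FALSE
)

# str() prints one line per column: its name, type, and first few values.
str(rent_data)

# summary() prints min, quartiles, median, mean, and max for numeric columns,
# and length/class/mode for character columns.
summary(rent_data)

# The summary of a single numeric column can be stored and indexed by name.
rent_summary <- summary(rent_data$Rent)
print(rent_summary[["Median"]])
```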

4.0 Pre-Processing
Pre-processing is applied to the data set: checking for missing values and deciding whether
to omit them or replace them with another value, checking for outliers, checking the number
of rows and columns, and analyzing the overall structure of the data frame.
・The dim() function is used to determine the number of rows and columns



Figure 3: Determining number of rows and columns

The output of the function:



Figure 4: Number of rows and columns
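
The checks described for this section (dimensions, missing values, outliers) could look roughly like the sketch below. The toy data frame is made up, and the 1.5 x IQR rule applied by boxplot.stats() is only one possible outlier rule chosen here for illustration; the report does not state which rule was actually used.

```r
# Toy data with one missing value and one extreme rent (made-up values).
rent_data <- data.frame(
  BHK  = c(1, 2, NA, 2, 3),
  Rent = c(8000, 15000, 12000, 11000, 900000)
)

# dim() returns the number of rows followed by the number of columns.
data_dim <- dim(rent_data)
print(data_dim)

# Count missing values per column.
missing_per_column <- colSums(is.na(rent_data))
print(missing_per_column)

# One option for handling them: drop every row that contains an NA.
complete_rows <- na.omit(rent_data)

# Assumed outlier rule: values beyond 1.5 x IQR from the hinges,
# as reported by boxplot.stats().
rent_outliers <- boxplot.stats(rent_data$Rent)$out
print(rent_outliers)
```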


・The names() function is used to determine the names of the columns

Figure 5: Determining the names of columns


The output of the function:



Figure 6: Names of columns

・The str() function is used to display the compact structure of the data frame

Figure 7: Determining the compact structure

The output of the function:

Figure 8: Summary of structure


・The View() function is used to display all the data in a table

Figure 9: Source code of viewing house rent data in a table

Figure 10: House rent data in a table
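
The column-name and table-view steps can be sketched as follows, again on made-up data. View() opens an interactive viewer (for example in RStudio), so it is left commented out here; head() is shown as a non-interactive alternative and is an addition for illustration, not something the report uses.

```r
# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  BHK  = c(1L, 2L, 3L),
  Rent = c(8000, 15000, 40000),
  Size = c(500, 900, 1500)
)

# names() returns the column names as a character vector.
column_names <- names(rent_data)
print(column_names)

# head() prints the first rows in the console.
first_rows <- head(rent_data, n = 2)
print(first_rows)

# View(rent_data)  # interactive only: opens the data in a spreadsheet-style viewer
```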

5.0 Data Visualization


Question 1: How much does the rent vary depending on the condition of the house?
First, create a graph of rent to see how many listings there are at each rent level.



Figure 1.1/1.2 - Line graph on rent and source code



Analysis 1.1 - Find a diagram showing the relationship between the number of 
BHKs and rent

Figure 1.3 - Diagram of the relationship between the number of BHKs and rent

As shown in Figure 1.3, when there are 3, 4, or 5 BHKs, the rent tends to be higher than for
other BHK counts. On the other hand, when the number of BHKs is 6, the rent is not so high.
Also, a single listing with 3 BHKs has a much higher rent than all the others. The bubble
chart above was plotted using geom_point(), and the source code is as follows.

Figure 1.4 – Source code for bubble plot
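
A bubble chart of this kind can be sketched with ggplot2 as below. This is not the report's original code: the toy values, the column names, and the choice to map Size to bubble size are all assumptions for illustration.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  BHK  = c(1, 2, 3, 3, 4, 5, 6),
  Rent = c(8000, 15000, 40000, 350000, 90000, 120000, 60000),
  Size = c(450, 900, 1500, 2500, 3000, 4000, 3500)
)

# geom_point() draws one point per listing; mapping a variable to the
# size aesthetic turns the scatter plot into a bubble chart.
bhk_rent_plot <- ggplot(rent_data, aes(x = BHK, y = Rent, size = Size)) +
  geom_point(alpha = 0.5, colour = "steelblue") +
  labs(title = "Rent by number of BHKs", x = "BHK", y = "Rent")

print(bhk_rent_plot)
```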



Analysis 1.2 - Find a relationship chart between Area Type and Rent

Figure 1.5 - Relationship Chart between Area Type and Rent

Analysis 1.1 found that the closer the number of BHKs is to four, the more expensive the
rent is. Therefore, this analysis was used to estimate which of the Super Area, Carpet Area
(CA), and Built Area types have more BHKs, and to search for the area types with higher
rents. Using the bubble chart, we can infer that Carpet Area, Super Area, and Built Area have
the most BHKs, in that order. In other words, Carpet Area listings have the highest rents.
The bubble chart is plotted using the geom_point() function, and the source code is as
follows.

Figure 1.6 – Source code for the above bubble chart



Analysis 1.3- Examining the relationship between rent and City

Figure 1.7- Relationship to Rent and City

Figure 1.7 shows the relationship between City and rent. Rents tend to be higher in Mumbai
and lower in Kolkata than in other cities, except for one listing in Bangalore where the rent
is considerably higher. The box-and-whisker chart was drawn with geom_boxplot(), and the
source code is shown in Figure 1.8 below.

Figure 1.8 – Source code for box plot above
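
A box plot of rent per city might be built along the following lines. The toy rents and the column names City and Rent are assumptions; the snippet only shows how geom_boxplot() groups a numeric variable by a categorical one.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  City = rep(c("Mumbai", "Kolkata", "Bangalore"), each = 4),
  Rent = c(40000, 55000, 70000, 90000,
           8000, 10000, 12000, 15000,
           15000, 20000, 25000, 300000)
)

# geom_boxplot() draws one box per city: median, hinges, whiskers,
# and individual points for values beyond the whiskers.
city_rent_plot <- ggplot(rent_data, aes(x = City, y = Rent)) +
  geom_boxplot() +
  labs(title = "Rent by city", x = "City", y = "Rent")

print(city_rent_plot)
```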

Question 2: How do house requirements change depending on the prospective
tenant (family or single)?
Before conducting the analysis, I need to know how many listings there are for each type of
prospective tenant.



Figure 2.1/2.2 - Bar graph on prospective tenant and source code
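
Counting listings per tenant type and drawing the matching bar graph could be sketched as follows. The column name Tenant.Preferred and the toy values are assumptions; with no y aesthetic, geom_bar() counts rows per category by default.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column name).
rent_data <- data.frame(
  Tenant.Preferred = c("Bachelors", "Family", "Bachelors/Family",
                       "Bachelors/Family", "Family", "Bachelors/Family")
)

# table() gives the number of listings for each tenant type.
tenant_counts <- table(rent_data$Tenant.Preferred)
print(tenant_counts)

# geom_bar() with its default stat draws the same counts as bars.
tenant_bar <- ggplot(rent_data, aes(x = Tenant.Preferred)) +
  geom_bar(fill = "steelblue") +
  labs(x = "Tenant preferred", y = "Number of listings")

print(tenant_bar)
```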

Analysis 2.1 - Find out the relationship chart between prospective tenants and size



Figure 2.3- A Graph of the Relationship between Prospective Tenants and Size

From the graphs, it can be seen that the house sizes for Bachelors and Family are not much
different; it can also be seen that houses for Bachelors/Family are slightly larger than those
in the other two groups. The source code is shown below, and the function used is
geom_point().

Figure 2.4- Source code of above graph

Analysis 2.2 - Examine the relationship between prospective tenants and the number
of bathrooms

Figure 2.5- Relationship between Prospective Residents and the Number of Bathrooms

The pie chart above shows the share of bathrooms by type of prospective tenant. Contrary to
expectations, we can see that the number of bathrooms is higher for bachelors than for
families. The pie chart is plotted using geom_bar() and then converted to a pie chart, as
shown in the source code below.



Figure 2.6 – Source code for the above pie chart
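
The bar-to-pie conversion can be sketched as below. ggplot2 has no dedicated pie geom, so a single stacked bar is drawn with geom_bar() and wrapped into a circle with coord_polar(). The toy data, the column names, and the decision to plot each tenant type's share of the total bathroom count are assumptions; the report's exact aggregation is not shown in the text.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  Tenant.Preferred = c("Bachelors", "Bachelors", "Family", "Family",
                       "Bachelors/Family"),
  Bathroom = c(1, 2, 2, 3, 2)
)

# Total bathrooms per tenant type, then each type's share of the total.
bathroom_totals <- aggregate(Bathroom ~ Tenant.Preferred, data = rent_data, FUN = sum)
bathroom_totals$Share <- bathroom_totals$Bathroom / sum(bathroom_totals$Bathroom)

# One stacked bar (x = "") whose segments are the shares;
# coord_polar(theta = "y") bends the bar into a pie.
bathroom_pie <- ggplot(bathroom_totals,
                       aes(x = "", y = Share, fill = Tenant.Preferred)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y") +
  theme_void()

print(bathroom_pie)
```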

Analysis 2.3 - Examination of Relationship between Prospective Residents and
Currently Occupied Floors

Figure 2.7 – Prospective Residents and Currently Occupied Floors

The graph above shows the relationship between prospective tenants and the floor they
currently reside on; Lo represents Lower Basement, Up represents Upper Basement, and Gr
represents Ground. It can be seen that families do not live on the middle floors of the
building, between the 30th and 40th floors. I assume that this decision is made in
consideration of the inconvenience to other residents.



Figure 2.8 –Source code for the above plot

Question 3: How do room sizes, rents, or other needs change with the number of
residential floors?
Before performing the analysis, the following code is used to check how many floors there
are in the dwellings.

Figure 3.1/3.2 - Bar graph on number of residential floors and source code

Analysis 3.1 - Finding the Relationship between Number of Residential Floors and
Size

Figure 3.3- The Relationship between Number of Residential Floors and Size



This plot shows that the average size of the houses on the 24th, 36th and 53rd floors is over
20. It also shows that some of the houses in Gr are large in size. The bubble chart is plotted
using the geom_point() function, and the source code is as follows.

Figure 3.4 –Source code for the above plot

Analysis 3.2 - Finding the Relationship between Number of Residential Floors and
Rent

Figure 3.5- The Relationship between Number of Residential Floors and Rent

The graph shows that rents on the 2nd, 7th, and 18th floors are relatively high compared to
houses on other floors. It also shows that some of the houses on the 4th floor have
exceptionally high rents. The graph was plotted using the source code shown below.

Figure 3.6- Source code of above graph



Analysis 3.3 - Finding the Relationship between Number of Residential Floors and
Point Of Contact

Figure 3.7- The Relationship between Number of Residential Floors and Point Of Contact

This plot shows that Contact Builder appears only once, on the first floor, that Contact Agent
is common overall, and that there are few Contact Owner (CO) listings until around the 30th
to 40th floors.
This graph was plotted using the source code as follows.

Figure 3.8 –Source code for the above plot



Question 4: How needs change with area location
Analysis 4.1 – Find out the relationship between area and floor level when the rent is
over 600000.

Figure 4.1- Relationship between area and floor level when rent is 600000 or more.

This analysis looked at the relationship between area and floor level for rents over
600000. The assumption was that all areas would have many high-rise floors, such as the
50th and 60th floors. In reality, however, the areas all differ, and the highest floor among
them is the 24th. This graph was plotted using the source code as follows.

Figure 4.2 –Source code for the above plot
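
The filtering step behind this analysis can be sketched as follows. The toy data, the area names, and the numeric Floor.Level column are assumptions; the threshold is written as strictly greater than 600000 to match the analysis heading, and would become >= if "600000 or more" is intended.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  Area.Locality = c("Area A", "Area B", "Area C", "Area D"),
  Rent          = c(650000, 20000, 1000000, 700000),
  Floor.Level   = c(4, 2, 24, 10)
)

# Keep only the listings whose rent is over 600000.
high_rent <- subset(rent_data, Rent > 600000)
print(high_rent)

# Plot the floor level of each remaining listing by area.
high_rent_plot <- ggplot(high_rent, aes(x = Area.Locality, y = Floor.Level)) +
  geom_point(size = 3) +
  labs(x = "Area locality", y = "Floor level")

print(high_rent_plot)
```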



Analysis 4.2 – To find the relationship between BHK and size of Electronic City
Area

Figure 4.3 – Relationship between BHK and size of Electronic City Area

In this analysis, the source code (table()) was used to count how many listings there are per
area location. The results showed that Electronic City (24) was the most common, so this
area was investigated further. The number and size of BHKs were examined as part of the
analysis for this area. As a result, we found that on average the listings have a size of 1000
and 2 BHKs. The following was plotted using the source code.

Figure 4.4 –Source code for the above plot
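
Counting listings per locality and then narrowing the data to the most common one might look like the sketch below. The toy rows and column names are assumptions, and the counts here are made up; they do not reproduce the 24 listings reported above.

```r
# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  Area.Locality = c("Electronic City", "Electronic City", "Electronic City",
                    "K R Puram", "K R Puram", "Other Locality"),
  BHK  = c(2, 2, 2, 1, 2, 3),
  Size = c(900, 1000, 1100, 600, 800, 1500)
)

# table() counts listings per locality; sort() puts the most common first.
locality_counts <- sort(table(rent_data$Area.Locality), decreasing = TRUE)
print(locality_counts)

# Take the most common locality and keep only its listings.
top_locality <- names(locality_counts)[1]
top_listings <- rent_data[rent_data$Area.Locality == top_locality, ]

# Average size and BHK within that locality.
print(mean(top_listings$Size))
print(mean(top_listings$BHK))
```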



Analysis 4.3 – To find the relationship between Bathroom and House floors in K R
Puram Area

Figure 4.5 – Relationship between Bathroom and House floors in K R Puram Area

In this analysis, the code (table()) used in Figures 4-6 was used to count the number of units
per area location. We found that K R Puram (19) had the second highest number, so we
looked at this area and the number of bathrooms and residential floors. As a result, K R
Puram was found to have houses only on floors Gr, 1, 2, and 3. We also found that the
houses on the third floor have fewer bathrooms and those on Gr have more. Using the source
code of the pie chart, we plotted the following.

Figure 4.6 – Source code for the above pie chart

5.0 Extra Features


1: facet_wrap() function in analysis 3.1

The facet_wrap() function was used in Analysis 3.1, where the relationship between
housing floors, size, and occupancy is displayed in a bubble graph. This function was used
to separate the results for the Built Area, CA (Carpet Area), and Super Area into their own
panels, which facilitates data visualization and partitions the graph as follows.
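
A minimal sketch of this kind of faceting is given below. The toy data and the column names Floor.Level, Size, and Area.Type are assumptions; the point is only that facet_wrap(~ Area.Type) draws one panel per area type.

```r
library(ggplot2)

# Toy stand-in for the rent data (made-up values and column names).
rent_data <- data.frame(
  Floor.Level = c(0, 1, 2, 5, 10, 24),
  Size        = c(800, 1000, 1200, 900, 1500, 2500),
  Area.Type   = c("Built Area", "Carpet Area", "Super Area",
                  "Carpet Area", "Super Area", "Carpet Area")
)

# facet_wrap() splits the same scatter plot into one panel per Area.Type,
# so the three area types can be compared side by side.
facet_plot <- ggplot(rent_data, aes(x = Floor.Level, y = Size)) +
  geom_point(colour = "steelblue", size = 3) +
  facet_wrap(~ Area.Type) +
  labs(x = "Floor level", y = "Size")

print(facet_plot)
```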

2: Analysis by Pie Chart 2.2

A pie chart was created in Analysis 2.2 to show the relationship between TenantPreferred
and Bathroom, and what each type of tenant wants in a house. To create the pie chart, it was
necessary to first create a bar chart using geom_bar() and then convert it to a pie chart, as
shown in the source code below.

6.0 Conclusion
This study investigated what influences people's choice of rental housing based on lifestyle.
We also created graphs to analyze the relationships between the factors (room size, BHK,
etc.) to determine how people's needs change. We found that most houses were aimed at
Bachelors/Family tenants, and that for this group the number of bathrooms, BHK, size, etc.
were higher on average than for the others. The floor occupied also varied from person to
person, but we found that families tended to choose lower floors more often.
