
Introduction

Hong Kong is an international city that plays an important role in both the economy and the
politics of Asia. Out of personal interest, having spent four years there as an undergraduate, I
want to dig into the food culture of its neighborhoods. Because there is no official division
of neighborhoods, and because the MTR public transport network, especially the subway,
connects people in the city to the places of their daily activities, we simply use subway
stations to represent neighborhoods. In this study, we look at the restaurant categories within
1.5 km of each subway station and use the different food types as features to build an
unsupervised machine learning model that groups the stations into 3 clusters. We then study
how the neighborhoods differ from each other and make recommendations on which types of
restaurant are especially popular in each cluster.

Data

The original list of Hong Kong MTR stations was downloaded from JohoMaps [1]. The
geographic coordinates (longitude and latitude) of the stations were retrieved from the
OpenCage Geocoding API [2]. The venues around the stations were retrieved through the
Foursquare Places API [3]. The data were then merged, cleaned and preprocessed so that they
are easy to explain and understand and meet the business requirements.
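
As an illustration of the 1.5 km neighborhood used in this study, the sketch below checks which venues fall within a given radius of a station using the haversine formula. It is only a minimal sketch: in practice the venue search API can apply the radius itself, and the function names, dictionary keys and coordinates here are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def venues_within(station, venues, radius_km=1.5):
    """Keep only the venues that lie within radius_km of the station."""
    return [
        v for v in venues
        if haversine_km(station["lat"], station["lon"], v["lat"], v["lon"]) <= radius_km
    ]


# Hypothetical coordinates, not taken from the real data set.
station = {"name": "Station A", "lat": 22.30, "lon": 114.17}
venues = [
    {"name": "Cafe X", "category": "Café", "lat": 22.305, "lon": 114.172},
    {"name": "Diner Y", "category": "Fast Food Restaurant", "lat": 22.40, "lon": 114.20},
]

nearby = venues_within(station, venues)
print([v["name"] for v in nearby])
```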

Methodology
The methods used here are descriptive statistics, and the algorithm used for the unsupervised
clustering is KMeans.

1. The MTR Station Data

After combining the list of MTR stations with the geographic data looked up by station name,
we get a table listing each station’s English name, Chinese name, geographic coordinates and
administrative divisions (region and county).
The distribution of stations across regions and counties is:

region            county                        count
Hong Kong Island  Central and Western District  4
Hong Kong Island  Eastern District              8
Hong Kong Island  Wan Chai District             3
Kowloon           Kowloon City District         2
Kowloon           Kwun Tong District            6
Kowloon           Sham Shui Po District         6
Kowloon           Wong Tai Sin District         3
Kowloon           Yau Tsim Mong District        10
New Territories   Islands District              4
New Territories   Kwai Tsing District           5
New Territories   North District                2
New Territories   Sai Kung District             5
New Territories   Sha Tin District              11
New Territories   Tai Po District               3
New Territories   Tsuen Wan District            4
New Territories   Tuen Mun District             2
New Territories   Yuen Long District            5
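
A count table of this kind can be produced with a pandas group-by. The sketch below assumes the station table has columns named `region` and `county`; the column names and the few rows shown are hypothetical stand-ins for the real data.

```python
import pandas as pd

# Hypothetical rows; the real table has one row per MTR station.
stations = pd.DataFrame({
    "station": ["A", "B", "C", "D", "E"],
    "region": ["Hong Kong Island", "Hong Kong Island", "Kowloon", "Kowloon", "Kowloon"],
    "county": [
        "Eastern District", "Eastern District",
        "Kwun Tong District", "Kwun Tong District", "Sham Shui Po District",
    ],
})

# Number of stations in each (region, county) pair.
counts = stations.groupby(["region", "county"]).size().reset_index(name="count")
print(counts)
```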

An interactive Folium map of the data (available on GitHub) looks like this:

2. The venues whose first-level venue category is “Food” are selected.
The distribution of venues across the top 20 categories is:
Venue Category                 count
Café                           391
Cantonese Restaurant           314
Fast Food Restaurant           159
Sushi Restaurant               56
Seafood Restaurant             48
Bakery                         47
Dumpling Restaurant            43
Snack Place                    39
Vegetarian / Vegan Restaurant  31
Pizza Place                    29
Ramen Restaurant               29
Shanghai Restaurant            29
Taiwanese Restaurant           25
BBQ Joint                      23
Burger Joint                   21
Buffet                         19
Food Court                     17
Steakhouse                     17
Sandwich Place                 15
American Restaurant            13

A visualization of the food venues’ locations is:


3. The variable ‘Venue Category’ is then one-hot encoded into a dummy matrix in which
each unique category becomes a single attribute. The total number of occurrences
of each category within each MTR station neighborhood is then calculated.

The top 8 most frequent venue categories for each station are then shown, for instance:
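
The encoding and ranking described in step 3 can be sketched with pandas as follows. This is a minimal sketch on made-up rows, not the project's actual code: the `Station` column name and the helper `top_categories` are assumptions, and only the top 2 categories are shown because the toy data has just three.

```python
import pandas as pd

# Hypothetical venue rows: one row per venue found near a station.
venues = pd.DataFrame({
    "Station": ["A", "A", "A", "B", "B", "B"],
    "Venue Category": [
        "Café", "Café", "Bakery",
        "Cantonese Restaurant", "Cantonese Restaurant", "Café",
    ],
})

# One-hot encode the category: one 0/1 column per unique category.
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)
onehot.insert(0, "Station", venues["Station"])

# Sum the dummy columns per station to get category counts per neighborhood.
station_counts = onehot.groupby("Station").sum()


def top_categories(row, n=8):
    """Return the n most frequent categories for one station (hypothetical helper)."""
    return row.sort_values(ascending=False).head(n).index.tolist()


top = {station: top_categories(row, n=2) for station, row in station_counts.iterrows()}
print(station_counts)
print(top)
```

The resulting `station_counts` table, with one row per station and one column per category, is the kind of feature matrix that the clustering step takes as input.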
4. KMeans

K-means clustering is a type of unsupervised learning, which is used when you have
unlabeled data (i.e., data without defined categories or groups). The goal of this algorithm is
to find groups in the data, with the number of groups represented by the variable K. The
algorithm works iteratively to assign each data point to one of K groups based on the features
that are provided. Data points are clustered based on feature similarity. In this case, we want
to cluster MTR stations together and find similarities in their restaurant distributions.

To look for the best number of clusters K, we used the KElbowVisualizer from the yellowbrick
package:

KMeans is then performed with K = 3 clusters.
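
The elbow search and the final fit can be sketched with scikit-learn alone. Note the substitution: instead of yellowbrick's KElbowVisualizer, this sketch simply records the KMeans inertia (within-cluster sum of squares) for each K, which is the quantity an elbow plot is typically drawn from. The feature matrix is synthetic, and the range of K and the random seed are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for the station-by-category matrix:
# 30 "stations" x 4 "categories", drawn around three made-up centres.
centres = np.array([[10, 1, 1, 0], [1, 10, 1, 0], [1, 1, 10, 5]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(10, 4)) for c in centres])

# Elbow search: fit KMeans for several K and record the inertia of each fit.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))

# Final model with K = 3, as chosen in the study.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_  # one cluster label per station
```

The elbow is the K after which the inertia stops dropping sharply; on this synthetic data the drop flattens after K = 3 by construction.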

Results
Clustering the MTR stations into 3 groups gives the distribution shown below, with blue dots
representing cluster 0, red dots cluster 1 and green dots cluster 2.
The number of stations in each cluster is:

Cluster Labels  count
0               25
1               35
2               22

A view of the top 5 venue categories in each cluster is:
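
A per-cluster summary of this kind can be sketched with pandas. The counts and labels below are made up, and ranking categories by their mean count per station within a cluster is an assumption, since the aggregation behind the chart is not stated. The toy data has only four categories, so the top 3 are shown instead of the top 5.

```python
import pandas as pd

# Hypothetical category counts per station (rows) and cluster labels.
station_counts = pd.DataFrame(
    {
        "Café": [9, 8, 3, 2],
        "Fast Food Restaurant": [6, 5, 0, 1],
        "Cantonese Restaurant": [2, 3, 7, 8],
        "Bakery": [0, 1, 4, 3],
    },
    index=["A", "B", "C", "D"],
)
labels = pd.Series([0, 0, 1, 1], index=station_counts.index, name="Cluster Labels")

# Number of stations in each cluster.
cluster_sizes = labels.value_counts().sort_index()

# Mean count of each category per cluster, then the highest-ranked categories.
cluster_means = station_counts.groupby(labels).mean()
top = {cluster: row.nlargest(3).index.tolist() for cluster, row in cluster_means.iterrows()}

print(cluster_sizes)
print(top)
```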


Discussion

From the bar chart above, we can see that tastes vary between clusters.
cluster 0: a great deal of coffee, a great deal of fast food, a lot of Cantonese food
cluster 1: a lot of coffee, a lot of Cantonese food, some fast food, dumplings and bakeries
cluster 2: a great deal of coffee, a great deal of Cantonese food, some snacks, vegan food and
sushi, no fast food
We can see that Hongkongers drink a lot of coffee wherever they are. Cantonese food, the
local cuisine, is of course the second must-have. You can hardly go wrong running either of
these two businesses near a station; put differently, you can always find these food services
nearby. What makes the findings interesting is what comes next.
Stations in cluster 0 include, for instance, University Station and Choi Hung Station, which
serve the Chinese University of Hong Kong and the Hong Kong University of Science and
Technology respectively. Students may be under a lot of pressure, with limited time and few
places for good food, so a lot of fast food is consumed.
Stations in cluster 1, for instance Hong Kong Station and Tsim Sha Tsui Station, are
surrounded by shopping malls and even shopping streets. People there, mostly tourists and
office workers, have a wider selection of food and higher spending power, so coffee and fast
food are consumed less. Yet bakeries, a distinctive feature of Hong Kong food culture with
popular brand names such as “Mei Xin” and “Tai Chang”, take little time and so become a
popular choice.
Cluster 2 seems healthier and more local, with no fast food, a great deal of local food and
even some vegan food and sushi. Interestingly, stations in this cluster such as Kowloon Tong
Station and Mong Kok East Station are where Hong Kong Baptist University, City University
of Hong Kong and the Hong Kong Polytechnic University are located. (Not saying that
students there seek out food more than they study XD)

Conclusion
This study looks into how food businesses differ across MTR station neighborhoods. The
results give 3 categories of food selection, or rather lifestyle, among MTR stations all
around Hong Kong. Hopefully, this short study provides some insight into where to find
food and business opportunities around Hong Kong, and entertains you a little with some
cute visualizations.

References
[1] http://www.johomaps.com/as/hongkong/metroatlas/list_station.htm
[2] https://opencagedata.com
[3] https://foursquare.com/developers/apps
