
Retail data analysis in Istanbul

University of Oulu
Business Intelligence: Applications and Projects


Contents
List of figures
1. Introduction
2. Description of the Dataset
2.1 Dataset Selection
2.2 Data Source
2.3 Data description
3. Project objectives
3.1 Importance and relevance
3.2 Questions to be answered
4. Analytics Process
4.1 Pre-processing
4.2 Visualizations and Analysis
4.3 Insights and decision-making
6. Reference

List of figures

Figure 1: Sample of dataset
Figure 2: Number of Customers vs. Shopping Mall
Figure 3: Customer gender Vs shopping malls
Figure 4: Total Number of customers in each age group vs. Shopping mall
Figure 5: Age-Group and Gender wise Distribution of Customers
Figure 6: Category and Number of Items Sold (Overall)
Figure 7: Best-selling Category (Shopping Mall)
Figure 8: Age Group vs. Category
Figure 9: Category wise Sales for Each Shopping Mall
Figure 10: Annual Sales Distribution
Figure 11: Monthly Sales Distribution of all three years
Figure 12: Quarterly Sales by Category for the year with Highest Sales (2022)
Figure 13: Quarterly Sales Distribution
Figure 14: Payment Method Popularity in Shopping Mall
Figure 15: Payment method preferences vs age category

1. Introduction
This report provides a comprehensive overview of the group project completed as part of
the Business Intelligence: Applications and Projects course. The project uses customer
data from ten existing shopping malls in Istanbul to explore the most popular product
types, the shopping malls with the highest sales performance, customer preferences for
payment methods, and seasonal sales patterns. In the project scenario, we are a
consulting firm hired by an investor to guide the opening of a new shopping mall.

To analyze the data, the group made use of data preprocessing techniques to ensure the
accuracy and consistency of the data. Data visualization in Tableau was used to gain a
better understanding of the data and identify patterns and trends. The insights obtained
from the analysis can greatly aid in decision making regarding the opening of a new
shopping mall. For instance, the most popular product types and customer preferences for
payment methods can guide the selection of tenants and payment options for the new
mall. Furthermore, the mall with the highest sales performance can provide insight into
the most successful business models and marketing strategies for the new mall.

In conclusion, this project demonstrates how to utilize business intelligence tools to
support decision making in a real business case. Through the analysis of customer data,
the group was able to extract valuable insights that can guide the opening of a new
shopping mall. By applying data preprocessing and analysis techniques, businesses can
gain a competitive edge and make informed decisions to drive growth and success.

The rest of this report is structured as follows: Section 2 describes the dataset we chose.
Section 3 introduces the importance and objectives of this project, and the research
questions. Section 4 describes the analysis process, visualizations and results. Section 5
summarizes the group members’ contributions.

2. Description of the Dataset

This section provides an in-depth description of the dataset that was utilized for the group
project. It will also explain the rationale behind the selection of this dataset and what the
project team intends to achieve from its analysis.

2.1 Dataset Selection

The decision to select this particular dataset for the group project was based on several
key factors. Firstly, the dataset contains a well-defined and understandable set of data,
making it straightforward to process and analyse. Secondly, the dataset aligns with the
project's objectives, which are described in detail in Section 3 of the project report.
Finally, the selection of this dataset was also influenced by the shared interest among the
project team in the topic of retail data. By selecting a dataset that aligned with the team's
interests, the project team was able to maintain a high level of engagement and enthusiasm
throughout the project.

2.2 Data Source

The dataset was sourced from Kaggle (Customer Shopping Dataset - Retail Sales Data,
2023), a publicly available online platform for datasets and data analysis. The dataset was
provided as a comma-separated values (CSV) file, a sample of which is shown in
Figure 1.

Figure 1: Sample of dataset

2.3 Data description

The selected dataset includes details of 10 different shopping malls in Istanbul between
2021 and 2023. It contains 99457 records with 10 data labels/columns: invoice_no,
customer_id, gender, age, category, quantity, price, payment_method, invoice_date and
shopping_mall. Below is a detailed description of the data available in the dataset.

▪ invoice_no: Invoice number. Nominal. A combination of the letter 'I' and a 6-digit
integer uniquely assigned to each operation.

▪ customer_id: Customer number. Nominal. A combination of the letter ‘C’ and a
6-digit integer assigned to each operation.

▪ gender: String variable of the customer’s gender.

▪ age: Positive integer variable of the customer's age.

▪ category: String variable of the category of the purchased product.

▪ quantity: The quantity of each product (item) per transaction. Numeric.

▪ price: Unit price. Numeric. Product price per unit in Turkish Liras (TL).

▪ payment_method: String variable of the payment method (cash, credit card or
debit card) used for the transaction.

▪ invoice_date: Invoice date. The day when a transaction was generated.

▪ shopping_mall: String variable of the name of the shopping mall where the
transaction was made.
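
The column descriptions above can be sketched as a typed record. The following is a minimal Python illustration, not the project's actual workflow (the project used Tableau). The sample rows, the Transaction class and the load_transactions function are hypothetical; only the column names come from the dataset description. The invoice date is kept as text because its format is not stated in this report.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical rows for illustration; only the column names come from the report.
SAMPLE_CSV = """\
invoice_no,customer_id,gender,age,category,quantity,price,payment_method,invoice_date,shopping_mall
I000001,C000001,Female,28,Clothing,5,300.08,Credit Card,2022-08-05,Kanyon
I000002,C000002,Male,21,Shoes,3,600.17,Debit Card,2021-12-12,Mall of Istanbul
"""


@dataclass
class Transaction:
    invoice_no: str
    customer_id: str
    gender: str
    age: int
    category: str
    quantity: int
    price: float         # unit price in Turkish Liras (TL)
    payment_method: str
    invoice_date: str    # kept as text; the date format is an open assumption
    shopping_mall: str


def load_transactions(text: str) -> list[Transaction]:
    """Parse CSV text into typed records, converting the numeric columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append(Transaction(
            invoice_no=raw["invoice_no"],
            customer_id=raw["customer_id"],
            gender=raw["gender"],
            age=int(raw["age"]),
            category=raw["category"],
            quantity=int(raw["quantity"]),
            price=float(raw["price"]),
            payment_method=raw["payment_method"],
            invoice_date=raw["invoice_date"],
            shopping_mall=raw["shopping_mall"],
        ))
    return rows


transactions = load_transactions(SAMPLE_CSV)
print(len(transactions), "records loaded")
```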

3. Project objectives

3.1 Importance and relevance

This project aims to provide insights into consumer behavior and sales performance
across ten major shopping malls in Istanbul. The project is commissioned by an investing
company that plans to open a new mall next to a shopping mall that already has good
sales performance. By analyzing customer data, we aim to identify the product categories
with the best sales and the shopping malls that are performing better than others. This
information can help the investing company make strategic decisions about product
placement, inventory management, and marketing campaigns to increase sales and

In addition, this project intends to investigate the payment preferences of different
genders and age groups, which can provide valuable insights into consumer behavior. By
examining the payment methods used by customers in each shopping mall, we aim to
identify any patterns or trends that may exist. This information can help mall management
optimize payment processing systems and improve customer satisfaction.

To effectively communicate these insights and make data-driven decisions, we will use a
visualization tool to develop a dashboard that presents the data in a clear and
understandable format. Tableau has been chosen as the visualization tool to easily filter
data and customize our analysis. The dashboard will include charts and graphs that
highlight key metrics such as total sales, product category sales, and payment method
distribution. These visualizations can help decision-makers quickly identify trends and
make informed decisions.

3.2 Research questions to be answered

Research Question 1: Where shall we set up the new shopping mall? Where would be
the best location?

This research question focuses on identifying the optimal location for a new shopping
mall in Istanbul. Since the dataset we chose contains no geographic data, the analysis
will not consider that perspective and will instead examine the following: Firstly, the
shopping mall with the highest sales revenue indicates the area with the highest sales
potential. Secondly, the shopping mall with the highest number of customers will be
regarded as the area that can attract the most potential customers. Customer behaviour
data, such as purchasing preferences across different product categories and age
groups, will also be examined to identify potential target markets and to understand
how best to cater to their needs.

Meanwhile, competitor analysis among the ten shopping malls will be conducted to
investigate the strengths and weaknesses of existing shopping malls in Istanbul. The
location of the new shopping mall will consider the sales performance and purchasing
behaviour of customers in the surrounding areas based on the dataset. The optimal
location will be the one that has the potential to achieve the biggest potential customer
base with the most popular product categories.

Research Question 2: What would be the best categories to sell in the new shopping
mall?

This research question aims to decide what product categories can be chosen for the new
shopping mall. The decision is based on the most popular product categories and
corresponding performance trends from the existing malls. The objective of this question
is to identify product categories that have consistent and prominent levels of demand and
that are likely to perform well in the new shopping mall. Besides processing sales data,
the analysis will also consider demographic data such as customer gender and age.

Based on the answer to this research question, the investor can prepare inventory
management plans and negotiate sensible warehouse rental contracts to increase sales and
revenue and decrease costs. By selecting the right product categories, the investing
company can increase the sales revenue and overall profitability of the new shopping mall.

Research question 3: When is the right time for marketing or changing products and
starting advertising?

This research question aims to ascertain the optimal timing for marketing and to identify
seasonal sales peaks based on the ten existing malls. The investigation focuses on
exploring the annual, seasonal, and quarterly sales distribution across different product
categories. Time-related patterns in sales performance, as well as in gender and age, if
they exist, will help the new shopping mall to run timely marketing campaigns that
attract the corresponding customers.

Based on these timing patterns, the new shopping mall can run efficient marketing events
that boost sales performance. Moreover, if there are apparent peak seasons, identifying
them will help the new shopping mall to prepare sufficient product inventory and
avoid losing purchase orders due to tight inventory. This analysis will therefore
also be helpful for inventory management.

4. Analysis Process
Data analysis is a systematic process for deriving insights and making data-driven
decisions by analyzing raw data (Davenport & Kim, 2013). The process includes
several key steps. According to Davenport and Kim (2013), the data analytics process
involves data collection, data preparation, data exploration, data modelling, evaluation,
deployment, and monitoring.

In this project, we followed the steps below to complete our analysis.

1. Data Collection: We used a data set from Kaggle for analysis. We have
considered some factors such as data quality, reviews of the dataset and the
relevant domain when selecting the data set. We believe using a good data set will
provide better results with our analysis.

2. Define Problem: Since this is a student group project, we first looked for a
suitable dataset and then, based on the selected dataset, defined our business
problem and what the data analysis was meant to address. Our dataset provides data
about shopping malls in Istanbul; hence, we defined our research questions based on
that scenario. Details are provided in Section 3.

3. Data Preparation: This step involves preparing data for analysing because most
of the available data sets need cleaning and transforming. Details can be found in
section 4.1.

4. Data exploration: In this step, we used statistical and visual analysis techniques
to identify patterns, find relationships within the dataset, and derive insights that
can be used to answer our questions. We used Tableau to visualise and understand
the data and to perform advanced analysis to answer our defined questions.

5. Insights and Decision generation: As the final step, we designed a dashboard
using Tableau and included multiple visualizations in the dashboard to provide a
comprehensive view of the data. With advanced analytics, we provided insights
into how a new shopping mall can enter the shopping mall market in Istanbul.

4.1 Pre-processing
Data pre-processing is an important step in the data analysis process. It helps to
ensure that the data are of sufficient quality for analysis and enables the generation of
more informative visualizations. However, the dataset used for this project, which was
originally sourced from Kaggle, did not have missing or null values. Therefore, we
did not need to treat missing or null values during pre-processing. All data fields,
both categorical and numerical, were used, and there was no need for normalization.

Since the original dataset was a comma-separated CSV file, we converted a copy of the
dataset to .xlsx format to check for data issues, and imported the original CSV file
into the Tableau workspace for analysis, where the comma-separated values were split
into distinct fields. We also used calculated fields for further analysis in Tableau. For
example, the original dataset had the fields Price and Quantity, and we created a new
calculated column for Sales by multiplying the values of Price and Quantity.
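
The Sales calculated field described above was created in Tableau. As a minimal sketch of the same calculation outside Tableau, the Python snippet below derives a sales value per record as price times quantity. The sample records and the add_sales function are hypothetical, and Decimal is used here only as a design choice to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

# Hypothetical transactions; the field names follow the dataset columns.
transactions = [
    {"invoice_no": "I000001", "price": "300.08", "quantity": "5"},
    {"invoice_no": "I000002", "price": "600.17", "quantity": "3"},
]


def add_sales(rows):
    """Return copies of the rows with a derived 'sales' field (price x quantity)."""
    result = []
    for row in rows:
        sales = Decimal(row["price"]) * int(row["quantity"])
        result.append({**row, "sales": sales})
    return result


with_sales = add_sales(transactions)
for row in with_sales:
    print(row["invoice_no"], row["sales"])
```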

4.2 Visualizations and Analysis

All the visualizations were created to address the questions defined for analysing the
dataset and to provide suggestions for opening a new shopping mall in Istanbul. The
questions cover topics such as the popularity of the 10 shopping malls among customers
and how their business developed over the years covered by the data. With the
visualizations, we aim to tell a story of how a new shopping mall could enter this
existing market, identify the requirements and behaviours of the target market and
establish itself as a business.

The Tableau dashboard includes 4 main sections: customers, categories, sales
distribution of the year, and payment methods. We provide detailed information
about the data used for the visualizations and analysis while answering the questions that
were elaborated in Section 3.2.


1. Number of Customers vs. Shopping Mall - Number of customers who visited the
shopping malls

Figure 2 shows the number of customers who purchased from the shopping malls
during the period in which the data were collected. The bar graph shows the number
of customers as a label, and different colours represent each shopping mall.

In this graph, the distinct count of Customer Id is plotted against shopping malls to
find the most popular shopping mall. The size of the bars corresponds to the
number of customers of that shopping mall.

Figure 2: Number of Customers vs. Shopping Mall

Observation 1: According to Figure 2, the most frequently visited shopping mall among
the 10 malls is Mall of Istanbul. Kanyon and Metrocity are ranked second and third,
respectively, as the most visited shopping malls by customers.
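
The distinct count of Customer Id per shopping mall was computed in Tableau. A rough equivalent can be sketched in Python as follows; the sample records and the function name are hypothetical, and the snippet only illustrates the counting and ranking logic, not the project's results.

```python
from collections import defaultdict

# Hypothetical records; the last two rows share one customer on purpose.
transactions = [
    {"customer_id": "C000001", "shopping_mall": "Kanyon"},
    {"customer_id": "C000002", "shopping_mall": "Mall of Istanbul"},
    {"customer_id": "C000003", "shopping_mall": "Mall of Istanbul"},
    {"customer_id": "C000003", "shopping_mall": "Mall of Istanbul"},
]


def distinct_customers_per_mall(rows):
    """Count unique customer ids per mall and rank malls from most to least."""
    customers = defaultdict(set)
    for row in rows:
        customers[row["shopping_mall"]].add(row["customer_id"])
    counts = {mall: len(ids) for mall, ids in customers.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = distinct_customers_per_mall(transactions)
print(ranking)
```

Using a set per mall means a customer who appears in several records is counted once, which mirrors a distinct count rather than a record count.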

2. Count of Gender grouped by shopping mall - Customer categorization based on their
gender


Figure 3 presents the number of customers who visited the shopping malls,
categorised by gender. The stacked bar graph is useful for comparing
the total number of each gender and the relative contribution of each category.

The count of each Gender is plotted against each of the shopping malls. Different
colors are used to indicate male and female customers. The labels show the counts
of gender. The length of the bars corresponds to the number of customers
belonging to that particular gender amongst the customers of that shopping mall.

Figure 3: Customer gender Vs shopping malls

Observation 2: Based on Figure 3, it can be observed that female customers have visited
the malls more frequently than male customers. The ratio of male to female customers is
approximately 1:1.5 for Mall of Istanbul, Kanyon, Metrocity, Metropol AVM, and
Istinye Park.

3. Total Number of customers in each age group vs. Shopping mall – customers based
on their age groups

Figure 4 presents the number of customers who visited the shopping malls,
categorized by age group. The stacked bar graph is useful for
comparing the total number of each age group and the relative contribution of
each category.

The count of customer ids is plotted against shopping malls. A third variable that
is considered in this case is the age group. The age groups have different colors
associated with them. The labels show the total count of customers for that
shopping mall belonging to a particular age group. The length of the bars reflects
the number of customers for that shopping mall in that age group.

Figure 4: Total Number of customers in each age group vs. Shopping mall

Observation 3: Based on Figure 4, it can be concluded that most customers belong to the
16-30 age group, followed by the 31-45 age group. The same pattern holds for the Mall of
Istanbul, Kanyon and Metrocity.
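
Grouping customers into age brackets and counting them per mall, as in Figure 4, can be sketched in Python as below. Only the 16-30 and 31-45 brackets are named in this report; the remaining brackets in AGE_BINS, the sample records and the function names are assumptions made for illustration.

```python
from collections import Counter

# The first two brackets appear in the report; the last two are assumed.
AGE_BINS = [(16, 30), (31, 45), (46, 60), (61, 75)]


def age_group(age):
    """Map an age to its bracket label, or 'other' if no bracket matches."""
    for low, high in AGE_BINS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "other"


def customers_by_mall_and_age_group(rows):
    """Count records for each (shopping mall, age group) pair."""
    return Counter((row["shopping_mall"], age_group(row["age"])) for row in rows)


# Hypothetical records for illustration.
transactions = [
    {"shopping_mall": "Kanyon", "age": 22},
    {"shopping_mall": "Kanyon", "age": 30},
    {"shopping_mall": "Kanyon", "age": 31},
    {"shopping_mall": "Metrocity", "age": 52},
]

counts = customers_by_mall_and_age_group(transactions)
print(counts[("Kanyon", "16-30")])
```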

4. Age-Group and Gender wise Distribution of Customers – Customer distribution with
age and gender

Figure 5 shows the customer distribution of all shopping malls based on their age
group and gender. A bubble graph is used to represent the age and gender
groups against the shopping malls, so that data for three variables can be displayed.

The count of Customer Ids in each age group and gender is displayed. The labels
show the number of customers belonging to that gender and age group. Different
colors are used to indicate male and female. The size corresponds to the number
of customers in that gender and age group.

Figure 5: Age-Group and Gender wise Distribution of Customers

Observation 4: Based on the analysis of Figure 5, it can be observed that among female
customers, the age group of 16-30 has the highest frequency of visits, followed by the age
group of 31-45. Among male customers, the age groups of 31-45 and 16-30 have the
highest number of customers.

Based on different aspects of the customer demographics in visiting the shopping malls
we can conclude the following observation:

Mall of Istanbul, Kanyon, Metrocity, Metropol AVM, Istinye Park are the top 5 shopping
malls where customers visit the most.


5. Category and Number of Items Sold (Overall) - Quantity sold for each category

Figure 6 shows the number of items that have been sold by categories. Due to the
ease of visibility and identification of the most selling category, a bubble graph
was used to visualize this data.

Here, the Quantity is plotted against Categories. Different shades of the same
color are used to indicate the number of items sold for each category. The size of
the bubbles and the labels correspond to the number of items sold.

Figure 6: Category and Number of Items Sold (Overall)

Observation 5: As per the analysis of Figure 6, Clothing, Cosmetics, Food & Beverage,
Toys and Shoes are the most popular categories among the customers.

6. Best-selling Category (Shopping Mall) - Highest sales category for each shopping
mall

Figure 7 shows the best-selling item category among the 10 shopping malls. The
bubble graph has been selected to represent this data since bubble graphs can be
particularly useful for comparing data across multiple categories.

Here, the top-selling category is filtered for each shopping mall based on
sales. The labels show the sales of that category. First, the sum of sales is plotted
against categories for each shopping mall. Then, after sorting, a filter is applied to
include only the top-selling category for each shopping mall. The size corresponds
to the sales of that category. The colors are based on the range of sales.

Figure 7: Best-selling Category (Shopping Mall)

Observation 6: According to the analysis of Figure 7, the best-selling category across all
10 shopping malls is Clothing. Mall of Istanbul is the best-selling shopping mall, followed
by Kanyon and Metrocity in second and third place, respectively.
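
The filtering behind Figure 7 (sum the sales per category within each mall, then keep only the top category) was done with Tableau's sorting and filters. The Python sketch below shows one way to express the same logic; the sample figures and the function name are hypothetical and do not reproduce the project's numbers.

```python
from collections import defaultdict

# Hypothetical records; 'sales' stands for price multiplied by quantity.
transactions = [
    {"shopping_mall": "Kanyon", "category": "Clothing", "sales": 1500.0},
    {"shopping_mall": "Kanyon", "category": "Shoes", "sales": 1800.0},
    {"shopping_mall": "Kanyon", "category": "Clothing", "sales": 900.0},
    {"shopping_mall": "Metrocity", "category": "Toys", "sales": 100.0},
    {"shopping_mall": "Metrocity", "category": "Technology", "sales": 1050.0},
]


def top_category_per_mall(rows):
    """Sum sales per (mall, category) and keep the best category for each mall."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        totals[row["shopping_mall"]][row["category"]] += row["sales"]
    return {
        mall: max(categories.items(), key=lambda item: item[1])
        for mall, categories in totals.items()
    }


best = top_category_per_mall(transactions)
print(best)
```

Note that a single large transaction does not decide the result: in the sample, Shoes has the largest single record in Kanyon, but Clothing wins on the summed total.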

7. Age Group vs. Category – Sales distribution of categories for different age groups

Figure 8 shows the sales by customers’ age groups and item categories. An area
chart has been used to visualize the data with three categories. It can also be used
to show how different categories contribute to a whole, such as the proportion of
each category.

Sales for each category are plotted against age groups. The deeper the color, the
higher the sales. The area corresponds to the total sales for that category and age
group.

Figure 8: Age Group vs. Category

Observation 7: As per the analysis of Figure 8, clothing is the best-selling category. The
highest sales come from people in the age ranges 16-30 and 31-45 buying clothes. The
top categories for each age group are Clothing, Shoes and Technology.

8. Category wise Sales for Each Shopping Mall

Figure 9 shows the total sales of the shopping malls for different categories. The stacked
bar chart has been used to visualize the data because of the multiple variables.

Here, the Sum of Sales for each category in each shopping mall is displayed. The height
of the bar corresponds to the relative sum of sales for each category in that shopping mall.
Different colors correspond to different categories.

Figure 9: Category wise Sales for Each Shopping Mall

Observation 8: Clothing, Technology, Shoes, Cosmetics and Toys are the top 5 product
categories customers have purchased in each shopping mall.

Based on different aspects of the customer buying patterns in categories, we can conclude
the following observation:

Clothing, Technology, Shoes, Cosmetics and Toys – these are the sales-wise top 5 product
categories.

Clothing, Cosmetics, Food & Beverage, Toys and Shoes – these are the quantity-wise top
5 product categories.

To summarize our category analysis, clothing is the best-selling category from both the
sales-wise and the quantity-wise perspective.

Sales distribution of the year

9. Annual Sales Distribution

Figure 10 shows the annual sales distribution throughout the three years 2021, 2022 and
2023. Even though a line chart is commonly used for visualizing time series,
a bar chart has been used to display the sales distribution here as it provides more

In the graph, Sales is plotted against Year of the invoice date. Different colors correspond
to different years and the height is relative to the total sales for that year.

Figure 10: Annual Sales Distribution

Observation 9: The year 2022 had the highest sales. The difference in sales between
2021 and 2022 is 56521 TL (Turkish Liras).

Note: Data is not available for the whole year 2023.

10. Monthly Sales Distribution of all three years

Figure 11 shows the sales distribution for all shopping malls, monthly. The stacked bar
chart has been used to visualize the total sales price for three years, monthly.

Sales is plotted against Month of the invoice date. Different colors are used to indicate
the year for that month. The height of the bars corresponds to the relative sales for that
month of the year with different colors indicating the year that month belongs to.

Figure 11: Monthly Sales Distribution of all three years

Observation 10: As per Figure 11, there was negligible variation in sales throughout the

Note: Data is not available for the whole year 2023.

11. Quarterly Sales by Category for the Highest Sales year (2022)

Figure 12 shows the sales distribution quarterly by categories for the year 2022. The
grouped bar graph has bars that are grouped together to represent sales for quarters for
each category. It is useful for comparing the values of sales for each category.

Sales is plotted against Quarter of the year from Invoice Date for each category. Different
colors indicate different Quarters and height of the bars indicate the Sales in that Quarter
for each category.

Figure 12: Quarterly Sales by Category for the year with Highest Sales (2022)

Observation 11: As per Figure 12, the Sales were similar in every quarter for each
category.

12. Quarterly Sales Distribution


Figure 13 shows the quarterly sales distribution in a bubble chart. Quarters are
differentiated by colors.

Sales is plotted against Quarter of Invoice Date. The bubble sizes indicate the relative
Sales for that Quarter of the year. Sales is also included as labels.

Figure 13: Quarterly Sales Distribution

Observation 12: As per Figure 13, the highest sales were in Q3 of each year. Sales in the
other quarters of every year are slightly lower than the Q3 sales.
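
The quarterly figures above rely on deriving the quarter from the invoice date, which Tableau does automatically for date fields. A minimal Python sketch of that derivation and the per-quarter totals is given below. The sample records are hypothetical, and the ISO date format (YYYY-MM-DD) is an assumption; the source file may use a different format that would need its own parsing.

```python
from collections import defaultdict
from datetime import date


def quarter_of(day: date) -> int:
    """Return the calendar quarter (1 to 4) of a date."""
    return (day.month - 1) // 3 + 1


def quarterly_sales(rows):
    """Sum price x quantity per (year, quarter) of the invoice date."""
    totals = defaultdict(float)
    for row in rows:
        day = date.fromisoformat(row["invoice_date"])  # assumes ISO dates
        totals[(day.year, quarter_of(day))] += row["price"] * row["quantity"]
    return dict(totals)


# Hypothetical records for illustration.
transactions = [
    {"invoice_date": "2022-01-15", "price": 100.0, "quantity": 2},
    {"invoice_date": "2022-03-31", "price": 50.0, "quantity": 1},
    {"invoice_date": "2022-07-01", "price": 200.0, "quantity": 3},
    {"invoice_date": "2021-12-31", "price": 10.0, "quantity": 1},
]

totals = quarterly_sales(transactions)
print(totals)
```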

Based on the sales distribution throughout the year 2021, 2022 and 2023 for the shopping
malls, we can conclude the following observation:

Sales were roughly similar in each quarter. Q3 had the highest sales, but the sales of
the other quarters were only marginally lower.

Payment methods

13. Payment Method Popularity in Shopping Mall


Figure 14 shows the payment methods that were used by customers when purchasing
from the shopping malls. The visualization includes a bar graph and an area chart.

In this visualization, a bar graph shows the number of invoices with different payment
methods for each shopping mall. The labels indicate the number of instances of that
payment method’s usage. Different colors are used for different payment methods. The
length of the bars corresponds to the relative number of usages of that payment method.

In the area chart, the count of payment method is plotted against shopping malls. Then
the results are grouped by the payment methods to show each method separately. The
area corresponds to the relative number of usages of that payment method. Different
colors are used for different payment methods.

Figure 14: Payment Method Popularity in Shopping Mall

Observation 13: As per Figure 14, all 3 payment methods are used in every shopping
mall. But Cash is the most common of them all.
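
Counting invoices per payment method within each mall, as plotted in Figure 14, can be sketched in Python as follows. The sample records and function names are hypothetical; the snippet only illustrates the grouping and does not reproduce the project's counts.

```python
from collections import Counter, defaultdict

# Hypothetical records; each row stands for one invoice.
transactions = [
    {"shopping_mall": "Kanyon", "payment_method": "Cash"},
    {"shopping_mall": "Kanyon", "payment_method": "Cash"},
    {"shopping_mall": "Kanyon", "payment_method": "Credit Card"},
    {"shopping_mall": "Metrocity", "payment_method": "Debit Card"},
]


def payment_counts_per_mall(rows):
    """Count invoices per payment method within each shopping mall."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["shopping_mall"]][row["payment_method"]] += 1
    return counts


def most_common_method(counts):
    """Return the most frequently used payment method for each mall."""
    return {mall: methods.most_common(1)[0][0] for mall, methods in counts.items()}


counts = payment_counts_per_mall(transactions)
print(most_common_method(counts))
```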

14. Payment method preferences vs age category

Figure 15 shows the payment methods that have been used by each age group. The bar
chart has been used to visualize the data for payment method.

The count of Invoice No for each type of payment method, grouped by age bracket, is
plotted in the graph. The height of the bars indicates the relative usage of that payment
method. The labels display the number of times each payment method was used. The shade
of the color indicates the age group: the darker the blue bar, the older the group.

Figure 15: Payment method preferences vs age category

Observation 14: As per the observations of Figure 15, cash payment is popular among
every age group. Debit cards are the least popular payment method.

Based on the payment methods that were used by the customers we can conclude that:

Cash payment was the most popular irrespective of the age group.

4.3 Insights and decision-making

After reviewing the data and the results of the analysis, and obtaining sufficient
knowledge, it is time to decide and take timely action. Sometimes decisions are made
based on the results of data analysis; at other times a decision is made first, and
data analysis is then used to find the best possible way to implement it. In both
situations, the decisions will be fruitful if adopted and implemented at the right
time. The following decisions can be made based on our analysis.

Below, we answer the questions posed in Section 3.2.

1) What would be the best location for a new shopping mall in Istanbul?

Finding a suitable location is one of the most important management decisions in
building a new shopping mall. Ideally, this would take into account the population
living in the area, the number of visitors and people who travel there for any reason
and, more importantly, how easily visitors can reach the shopping centre in terms of
traffic and navigation. It should also be kept in mind which shopping areas are most
popular with buyers and which centre attracts the largest share of income. Based on our
dataset and analysis, Kanyon and Mall of Istanbul have the highest number of purchases,
sales value, and number of buyers (Figures 2 and 9), so their locations are the best
possible places to build a new centre.

2) What would be the best categories to sell in the new shopping mall?

It should be kept in mind that the new centre is a complement and not a competitor, so
it should be checked which needs of buyers are not met in the existing centres. For
example, after shopping, is it possible for customers to rest and eat properly? If not,
the new centre should, in addition to its shopping floors, consider dedicating large
floors to facilities such as a large car park, a cinema, an amusement park, and
restaurants for the entertainment and relaxation of customers and families. The
existence of entertainment facilities can also make the centre more appealing to
buyers and attract more people to it.

The classification and arrangement of categories in a shopping centre significantly
increase the time visitors stay there. Investigating the number of items purchased,
the amount of sales, and the number of customers in the shopping centres shows that
clothing stores are the flagships of sales, both in terms of quantities and income
(Figures 6 and 7). Considering the higher sales in the clothing category, the first
floor is best assigned to clothing shops. An increase in clothing sales is accompanied
by an increase in sales of other items. Therefore, well-stocked clothing stores with a
variety of brands, prices, and quality levels, catering to different tastes, will have
a significant impact on the sales of other items. In this way, and with a suitable
ratio, other shops are divided into the following categories.

3) When is the right time for marketing or changing products and starting advertising?

By analyzing the data over time, we can identify trends and patterns in customer
behavior. For instance, the analysis of the number of sales over two years shows that
January has the highest sales (Figure 11), which may be due to end-of-season
discounts. Alternatively, people may tend to spend more time in stores at this time of
the year, which leads to more sales. The larger number of visitors therefore makes this
month well suited to product introduction campaigns, visual advertisements, and
discount campaigns in stores.
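Finding the month with the most sales can be sketched as below. This is a minimal example under assumptions: the invoice dates are invented sample values, and all years are pooled into calendar months, which may differ from how Figure 11 was produced.

```python
from collections import Counter
from datetime import date

# Hypothetical invoice dates; illustrative values only.
invoice_dates = [
    date(2021, 1, 5),
    date(2021, 1, 20),
    date(2021, 4, 3),
    date(2022, 1, 11),
    date(2022, 7, 9),
]


def sales_per_month(dates):
    """Count transactions per calendar month, pooling all years together."""
    return Counter(d.month for d in dates)


monthly = sales_per_month(invoice_dates)
# Month number (1 = January) with the highest transaction count.
peak_month = max(monthly, key=monthly.get)
```

Counting by `(d.year, d.month)` instead would keep the two years separate, which helps to check whether the January peak appears in both.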

Discounts are among the most important factors that encourage customers to buy from
shopping centers. The offers we consider should make customers feel that they will
benefit from shopping at our centers. Customers can also be encouraged, based on their
preferences, to visit our shopping centers at the times we choose.

Another way to attract customers is to offer special discounts on different occasions
during the months of customer decline. It should not be forgotten that checking the
effect of discount offers on the amount of income, number of sales, and number of
buyers periodically at different times of the year will significantly impact the subsequent

Based on our analysis, the sharp drop in sales continues into April, and for the rest of
the year, in-store sales remain at the same level. Running marketing campaigns until the
beginning of the slow season and announcing special offers for April and later will
bring customers back to the stores to take advantage of the promised special
conditions. This kind of data can also help create sales forecasts for future periods,
which support decisions about inventory management and the amount of
staffing to ensure that there are enough employees to handle increased customer

As seasons and trends change, so too should the products being sold. Considering that
some products have met with little acceptance from buyers, we must check the reason
for their poor performance. For example, the number of sales of books and technology
is the same (Figure 6), but because technology products cost more, the total income
from the sale of technology is higher. The weaker performance of such categories may
mean that they need more promotion in the marketing programs of stores, publishers,
and authors. How can customers be encouraged to buy less popular products? For
example, a book donation campaign in which donors receive a discount of a few percent
on new books can be an excellent way to bring back previous customers and attract
new customers.
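The difference between units sold and income per category, as in the books and technology comparison, can be illustrated with a short sketch. The rows, prices, and field layout are hypothetical, and income is assumed to be quantity times unit price.

```python
from collections import defaultdict

# Hypothetical rows: (category, quantity, unit_price); illustrative values only.
sales = [
    ("Clothing", 5, 300.0),
    ("Books", 3, 15.0),
    ("Technology", 3, 1050.0),
    ("Clothing", 4, 300.0),
]


def totals_by_category(rows):
    """Return (units sold, income) dictionaries keyed by category."""
    units = defaultdict(int)
    income = defaultdict(float)
    for category, quantity, unit_price in rows:
        units[category] += quantity
        # Assumption: income of a row is quantity * unit price.
        income[category] += quantity * unit_price
    return dict(units), dict(income)


units, income = totals_by_category(sales)
```

In this sample, books and technology sell the same number of units, but technology brings in far more income because of its higher unit price.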

Other insights were gathered from the dataset but were not completely related to the
research questions. These findings are laid out below:

1- Determination of target groups

By analyzing the gender and age of the customers who visit these malls, we can
understand which age and gender groups are more likely to visit them. The
investigation of different age groups among customers shows that age affects the
purchase rate of both men and women: most buyers are between 16 and 45 years old
(Figure 8), and women buy 50% more than men (Figures 3 and 5).
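Grouping customers by gender and age band can be sketched as follows. The customer rows are invented, and the age bands other than 16-45 are hypothetical cut-offs chosen only for illustration.

```python
from collections import Counter

# Hypothetical customers: (gender, age); illustrative values only.
customers = [
    ("Female", 23),
    ("Female", 37),
    ("Male", 41),
    ("Female", 52),
    ("Male", 19),
    ("Female", 68),
]

# Assumed age bands (inclusive); only 16-45 comes from the text above.
AGE_BANDS = [(16, 45), (46, 65), (66, 120)]


def age_band(age):
    """Map an age to its band label, or 'other' if it falls outside all bands."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "other"


# Number of customers per (gender, age band) pair.
counts = Counter((gender, age_band(age)) for gender, age in customers)
```

The resulting counts can be read as a small gender-by-age table, which is the kind of breakdown needed to choose target groups for a campaign.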

Based on this, we can target specific customer groups and tailor marketing campaigns
accordingly. For example, we can encourage men to buy more by making special offers
at the time of purchase and giving them lottery tickets with their subsequent
purchases, on the assumption that men are more interested in games and in watching
matches. Moreover, commercial complexes often advertise effectively by inviting
famous figures such as popular actors and singers, holding ceremonies and festivals,
holding national and cultural events, and holding raffles and competitions among
customers and visitors. These measures could be considered to attract families to the
shopping malls.

2- Understanding payment preferences

By analyzing the payment methods used by customers, it is possible to determine which
payment methods they prefer. This information can be used to tailor payment options
to customers' preferences. Cash is becoming increasingly obsolete as a payment
method in developed countries, especially among younger people, so it can be a good
idea to consider whether certain stores still need to accept cash. This does not seem
to be the case in Istanbul, where cash appears to be a very popular payment method
(Figures 14 and 15), but the question may still be relevant for malls elsewhere.
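The share of each payment method can be computed with a short sketch like the one below. The list of payments is a made-up sample, and the method labels are assumptions about how the dataset names them.

```python
from collections import Counter

# Hypothetical payment methods, one entry per transaction; illustrative only.
payments = ["Cash", "Credit Card", "Cash", "Debit Card", "Cash", "Credit Card"]


def payment_shares(methods):
    """Return each payment method's share of all transactions (0.0 to 1.0)."""
    counts = Counter(methods)
    total = sum(counts.values())
    return {method: count / total for method, count in counts.items()}


shares = payment_shares(payments)
```

Computing the same shares separately per age group would show whether younger customers really use cash less, as the paragraph above suggests for developed countries.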

3- Analyzing mall performance

By analyzing the transaction date and the shopping mall's name, it is possible to estimate
the performance of each mall in the dataset. This information can be used to improve
the performance of each mall by investigating the issue further and identifying possible
needs for improvement. For example, if the dataset shows that one mall has lower sales
than another, one could investigate further to find the exact reason why the mall is
falling behind, e.g., its location or a lack of variety in its shops.
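One simple way to spot malls that fall behind is to compare each mall's monthly revenue with the average across malls for that month. The sketch below assumes revenue has already been aggregated per mall and month; the mall names, months, and figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical aggregated rows: (shopping_mall, "YYYY-MM", revenue).
monthly_revenue = [
    ("Mall A", "2021-01", 900.0),
    ("Mall B", "2021-01", 400.0),
    ("Mall C", "2021-01", 500.0),
    ("Mall A", "2021-02", 700.0),
    ("Mall B", "2021-02", 300.0),
    ("Mall C", "2021-02", 800.0),
]


def below_monthly_average(rows):
    """For each month, list the malls whose revenue is below that month's mean."""
    by_month = defaultdict(dict)
    for mall, month, revenue in rows:
        by_month[month][mall] = by_month[month].get(mall, 0.0) + revenue
    flagged = {}
    for month, per_mall in by_month.items():
        average = mean(per_mall.values())
        flagged[month] = sorted(m for m, r in per_mall.items() if r < average)
    return flagged


flagged = below_monthly_average(monthly_revenue)
```

A mall that is flagged in most months is a candidate for the closer investigation described above, for example of its location or shop variety.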

5. Reference

Customer Shopping Dataset - Retail Sales Data. (2023c, March 9). Kaggle.

Davenport, T. H., & Kim, J. (2013). Keeping up with the quants. Harvard Business
Review, 91(5), 80-88.
