
Founded in 1947, Hennes & Mauritz AB (H&M Group) is a Swedish multinational clothing company

with 53 online markets and approximately 4,850 stores. The company disrupted the apparel sector with the idea of fast fashion, offering style and quality at the lowest feasible cost, and with its wide product range it tries to capture the fashion needs of each customer. In 2021 the company earned a net turnover of 19 billion euros, a 6 percent increase over the previous year, while its online stores showed year-on-year growth of 30 percent and 6 billion euros of revenue. Although these numbers might seem promising, the company still has considerable scope for expansion and may struggle to meet its costs. It is therefore clear that retaining its customer base, which has been shifting online since the pandemic, is of topmost importance for the company to sustain itself in the long run.

In recent years, H&M has been investing heavily in data analytics and artificial intelligence, hiring over two hundred data scientists to understand purchasing patterns and trends for every item in each of its stores. H&M relies heavily on outsourced design and production, with over 800 partner companies in more than 30 countries, and has recently partnered with Google Cloud for data analytics development. Yet despite spending hefty amounts on customer analysis and recommendation systems, the company is currently incurring losses and is cutting 1,500 jobs across its global operations as part of efforts to save SEK 2bn (£158m) a year, amid slowing sales and rising costs for clothing retailers. According to recent news, the company's profits fell by 30% in the nine months to the end of August, partly as a result of winding down its operations in Russia in light of the war in Ukraine. Moreover, the company had to shut down over 56 stores because of losses incurred during the pandemic.

In a situation like this, with a 30-day return policy and a wide range of products that might not make the "trend", the company's revenue is extremely dynamic, making it difficult to improve its almost entirely outsourced processes to any great extent. The company can only grow if it constantly scales up and makes relevant decisions in a timely manner with appropriate funds in hand. Cutting costs by manufacturing goods according to accurate demand forecasts, and growing sales by providing a better customer shopping experience, are now necessities. In addition, many competitors are leveraging technology, such as robots that make online order pick-up quicker, and richer market research data. To maintain its standing in this highly competitive market and increase revenue, it is essential to forecast demand so that inventory can be managed in time and product suggestions can be optimised: the right product, at the right place, at the right time, for the right customer. The first step towards any of this, however, is to make the most of current knowledge by filtering unnecessary data out of the massive datasets and learning which aspects have a significant impact on the company's decisions.

According to recent studies, the company has employed data analysis to gain insights such as:
- Most customers are women who favour fashion-focused clothing, so clothes are designed accordingly
- Shoppers preferred higher-priced items, so $118 leather bags are now displayed next to the usual $6 T-shirts
- The in-store shopping experience is enhanced by selling flowers and operating a coffee shop
- Behavioural analysis matches products to markets
- Future trend analysis helps produce pieces that will sell
- Algorithms track currency fluctuations to manage worldwide pricing and costs

While the current solutions work along similar lines, reducing costs by understanding customers' needs through surveys, shopping activity and so on, there is great potential for the company to build on existing internal knowledge that is not too costly to generate. To step up its game, the company needs to develop product recommendations and forecasts based on data from previous transactions, as well as on customer and product metadata. The available metadata spans simple attributes, such as garment type and customer age, text data from product descriptions, and image data from garment images. Broad categories of region-specific data are used, since different areas have different preferences. Data in the fashion industry is mainly collected through CRM systems, social media listening tools, app behaviour analytics and other smart connected products and services. For the purposes of this study, Kaggle datasets will be used as the primary source of all the data mentioned above, to analyse the various methods that can be used to achieve these goals. 

Model 1: Decision Trees for predictions

Decision Trees are a non-parametric supervised learning method used for classification and
regression. The goal is to create a model that predicts the value of a target variable by learning
simple decision rules inferred from the data features. Data splitting is among the most critical steps and should happen before pre-processing: splitting the data first ensures consistent model performance, because unseen data is then processed in the same manner as the data the model was fitted on.
The process starts by cleaning the data and removing unnecessary columns, since the dataset is far too large for a machine-learning model to evaluate in full. Next, the customer training and customer test data frames are merged to create properly formatted train and test DataFrames for fitting and prediction. Categorical columns are then encoded as integers or floats so that the model can be trained and its accuracy measured. 
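The steps above (drop columns, split, encode categoricals, fit a tree) can be sketched as follows; the toy DataFrame, column names and target are invented stand-ins for the much larger H&M transaction data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# hypothetical toy stand-in for the cleaned transaction/customer table
df = pd.DataFrame({
    "age": [22, 35, 47, 29, 53, 41, 19, 60],
    "garment_type": ["dress", "trousers", "dress", "shirt",
                     "trousers", "shirt", "dress", "trousers"],
    "repurchased": [1, 0, 1, 0, 0, 1, 1, 0],   # invented target
})

# encode the categorical column as integers so the tree can use it
df["garment_type"] = LabelEncoder().fit_transform(df["garment_type"])

X = df[["age", "garment_type"]]
y = df["repurchased"]

# split before any further processing so the test rows stay unseen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
preds = tree.predict(X_test)
```

The fitted tree can then be rendered with `sklearn.tree.plot_tree` to obtain the full visualisation mentioned below.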
As the model can be completely visualised, it is easy to understand and interpret, and it can handle multi-output problems. Such a model supports predictive analysis for timely decisions that reduce the company's costs.
 
Model 2: Clustering for customer grouping

Customer segmentation is the practice of dividing a customer base into groups of individuals that
are similar in specific ways relevant to marketing, such as age, gender, interests and spending
habits. In order for sales to occur, items must be recommended based on the interests of
customers. To do this, it is important to decide what kind of customer information to look at, and to
group similar customers together. We will segment customers and identify the characteristics of each group using machine-learning methods. To define a customer representation, the related data must first be brought together: based on a selection of transactions, the relevant fields are combined into a single table.
There are many ways to define the characteristics of a customer. Here, each customer is represented by numerical values based on the items they purchased; this representation is the customer's feature vector, on which clustering is then performed. Because distance-based machine-learning models are used, feature scaling must be applied first. 
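A minimal sketch of this pipeline, scaling hypothetical per-customer feature vectors before k-means; the two synthetic groups (low spenders and high spenders) and their feature columns are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# hypothetical per-customer feature vectors: [total spend, n_purchases]
rng = np.random.default_rng(42)
budget = rng.normal([30, 5], [5, 1], size=(50, 2))       # low spenders
premium = rng.normal([300, 20], [30, 3], size=(50, 2))   # high spenders
X = np.vstack([budget, premium])

# k-means is distance-based, so scale the features first
X_scaled = StandardScaler().fit_transform(X)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)
labels = km.labels_   # cluster assignment per customer
```

In practice the number of clusters would be chosen with a diagnostic such as the elbow method or silhouette score rather than fixed in advance.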
Clustering, an unsupervised learning method, reveals which group each customer belongs to, supporting descriptive analysis and group-specific decisions.

Model 3: Training deep autoencoders for collaborative filtering

Autoencoders are a specific type of feedforward neural networks where the input is the same as the
output. They compress the input into a lower-dimensional code and then reconstruct the output from
this representation. The code is a compact “summary” or “compression” of the input, also called the
latent-space representation. The autoencoder consists of two neural networks, encoder and
decoder, fused together. The Keras functional API is used.
The model has the following characteristics:
- It is data-specific: the model can only compress data similar to what it was trained on, which reduces the scope of assumptions.
- Autoencoders are lossy: the decompressed outputs are degraded compared to the original inputs.
- It learns automatically, and it is easy to train specialised instances that perform well on a specific type of input; this requires no new engineering, only appropriate training data.

To build an autoencoder, three things are required: an encoding function, a decoding function, and a distance function that measures the information lost between the original data and its decompressed reconstruction (i.e. a "loss" function). The encoder and decoder are chosen to be parametric functions (typically neural networks) that are differentiable with respect to the loss, so their parameters can be optimised to minimise the reconstruction loss using stochastic gradient descent. This optimisation method is frequently used in machine learning to identify the model parameters that best match the expected and actual outputs.
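The study itself uses the Keras functional API; as a dependency-free illustration of exactly these three ingredients (encoder, decoder, reconstruction loss) plus the gradient-descent loop, the sketch below trains a minimal linear autoencoder in plain NumPy on made-up data. All shapes and the learning rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # toy stand-in for customer-item data

# linear encoder and decoder: 8 features -> 3-dim latent code -> 8 features
W_enc = rng.normal(scale=0.1, size=(8, 3))
W_dec = rng.normal(scale=0.1, size=(3, 8))

def reconstruction_loss(X, W_enc, W_dec):
    # mean squared distance between input and its reconstruction
    return ((X @ W_enc @ W_dec - X) ** 2).mean()

initial = reconstruction_loss(X, W_enc, W_dec)
lr = 0.1
for _ in range(500):
    code = X @ W_enc            # compressed (latent) representation
    err = code @ W_dec - X      # reconstruction error
    # gradients of the mean-squared reconstruction loss
    g_dec = 2 * code.T @ err / X.size
    g_enc = 2 * X.T @ (err @ W_dec.T) / X.size
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
final = reconstruction_loss(X, W_enc, W_dec)
```

A Keras version replaces the hand-written gradients with `Dense` layers, a mean-squared-error loss and an SGD optimiser, but the structure is the same.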

A well-fitted model of this kind can power an algorithm that provides personalised recommendations and enhances the user experience. 

Model 4: Natural Language Processing for deeper insights

To extract features from the text data in order to compute the similarity between them, NLP can be
an optimal solution. The aim is to have text descriptions as vectors.
Word vectors are vectorized representations of words in a document. The vectors have semantic
values. For example, "t-shirt" and "shirt" should have vector representations close to each other,
while "pants" and "necklace" should be far from each other.
Next, Term Frequency-Inverse Document Frequency (TF-IDF) vectors are computed for each text. In essence, the TF-IDF score of a word is its frequency in a document, down-weighted by the number of documents in which it occurs. This reduces the importance of words that occur frequently across product descriptions and, therefore, their weight in the final similarity score. The result is a matrix in which each column represents a word in the vocabulary (all the words that appear in at least one description) and each row represents a description in the dataset.
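A minimal sketch of this step using scikit-learn's `TfidfVectorizer` together with cosine similarity on the resulting vectors; the three product descriptions are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# hypothetical product descriptions
docs = [
    "white cotton t-shirt with short sleeves",
    "cotton shirt in white, relaxed fit",
    "gold-plated necklace with pendant",
]

# rows = descriptions, columns = vocabulary words
tfidf = TfidfVectorizer(stop_words="english")
matrix = tfidf.fit_transform(docs)

# cosine similarity between every pair of descriptions
sim = cosine_similarity(matrix)
```

The two shirt descriptions score closer to each other than either does to the necklace, which is the behaviour the word-vector intuition above calls for.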
Natural Language Processing can be extremely beneficial for understanding customers' needs, allowing manufacturing to be managed accordingly and relevant decisions to be made in time, which can substantially reduce overall costs. It can also yield deeper behavioural insights for personalised product recommendations.

While the above-mentioned models can be extremely beneficial for the firm to adopt, several caveats, such as high costs, should be kept in mind. Data filtration might remove information that an algorithm deems irrelevant but that is of great importance to the company. The accuracy of an algorithm can also be called into question, since personal opinions and actual markets do not always follow fixed patterns and are highly dynamic. Keeping pace with these dynamics requires an AI-literate workforce that is constantly moving forward; otherwise current knowledge quickly becomes redundant. It is also important to ensure that the algorithms are free from bias and that their use is balanced by proper governance. Furthermore, gathering deeper information about customers might threaten their privacy and call the company's intentions into question; drawing a clear boundary between recommendation and manipulation is therefore important. 

An important improvement to the models would be to consider additional variables, such as feedback loops, which have not yet been given due consideration in the current methods. Customer feedback can be extremely helpful for maintaining a continued relationship with customers and a stronger market standing. 
References
Craven, N. (2022). H&M shuts one in five stores to slash costs. [online] This is Money. Available at:
https://www.thisismoney.co.uk/money/markets/article-11446987/H-M-shuts-one-five-stores-slash-
costs.html

Dertat, A. (2017). Applied Deep Learning - Part 3: Autoencoders. [online] Medium. Available at:
https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798
 
discover.certilogo.com (n.d.). The role of data analytics in fashion. [online] Available at:
https://discover.certilogo.com/en/blog/role-of-data-analytics-in-fashion
 
Douglass, R. (2022). H&M links with Google Cloud on data analytics development. [online]
FashionUnited. Available at: https://fashionunited.uk/news/business/h-m-links-with-google-cloud-on-
data-analytics-development/2022063063862 
 
Hasija, S., Sturm, S. and Erasmus, M. (2021). H&M (128x128) dataset. [online] Available at:
https://www.kaggle.com/datasets/odins0n/handm-dataset-128x128/code?datasetId=1920073

Jasmijn (2022). H&M online sales grew 30% in 2021. [online] Ecommerce News. Available at:
https://ecommercenews.eu/hm-online-sales-grew-30-in-2021/

Kim, Y. (2022). [H&M] Customer Segmentation using Clustering. [online] Available at:
https://www.kaggle.com/code/emphymachine/h-m-customer-segmentation-using-clustering

MBA, A.L. (2020). H&M: Utilizing Big Data and Artificial Intelligence. [online] Medium. Available at:
https://medium.com/predict/h-m-utilizing-big-data-and-artificial-intelligence-6c837ceaeaa6
 
Real Python (n.d.). Stochastic Gradient Descent Algorithm With Python and NumPy. [online] realpython.com. Available at: https://realpython.com/gradient-descent-algorithm-python/

Rathore, M., Maheshwari, K. and Jain, S. (2019). Fast Moving H&M: An Analysis of Supply Chain
Management. [online] Available at:
http://ijariie.com/AdminUploadPdf/Fast_Moving_H_M__An_Analysis_Of_Supply_Chain_Manageme
nt_ijariie10784.pdf
 
scikit learn (2009). 1.10. Decision Trees — scikit-learn 0.22 documentation. [online] Scikit-learn.org.
Available at: https://scikit-learn.org/stable/modules/tree.html
 
Sherry (2022). H&M prediction using Decision Tree. [online] Available at:
https://www.kaggle.com/code/haiyanan/h-m-prediction-using-decision-tree/notebook

The Guardian (2022) H&M to cut 1,500 jobs as retailers face slowing sales and rising costs [online]
Available at: https://www.theguardian.com/fashion/2022/nov/30/hm-to-cut-1500-jobs-as-retailers-
face-slowing-sales-and-rising-costs
 
