
Product Deletion Decision based on Interactive Fusion of Multitype Feature: A Deep Learning Approach for
Long- and Short-term Dynamic Temporal Classification

Decui Liang^{a,∗}, Haoxin Tang^{a}


a School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 610054, China

Abstract
Deep learning for time series modeling has received widespread attention in various time-dependent tasks
because it can fit complex dynamic temporal information, which makes it applicable across
different stages of product decision-making. Product deletion, as a critical stage in product management,
enables timely decision-making to remove products, freeing up resources for new production and reducing
the impact of secondary products on consumer loyalty. However, a major challenge in product deletion
lies in the integration of diverse types of complex decision factors into a decision-making system, capturing
the dynamic interactions and fusion dependencies between different types of features and objectives. To
address this, this paper proposes a deep learning-based approach for assisting product deletion decisions,
focusing on the fusion of multidimensional features and feature interactions. Three sub-models, namely
consumer sub-model, market sub-model, and consumer-market sub-model, are constructed considering the
feature types in the domain of product deletion. By combining convolutional neural network (CNN) and
gated recurrent unit (GRU) and incorporating a recurrent-skip mechanism, the proposed approach captures
the long-term and short-term dependencies of multidimensional time series. Additionally, an automatic
logistic regression component with automatic detection is designed to run in parallel with the three
sub-model networks, addressing the challenge of deep learning in capturing fixed-scale changes between features
and objectives. Through effectiveness and robustness experiments, the proposed predictive framework for
product deletion decisions demonstrates superior effectiveness compared to some classical machine learning
algorithms. Furthermore, this paper provides new perspectives and insights for the application of product
deletion decisions, offering potential business value for online retailers.
Keywords: Product deletion decision, Multivariate time series classification, Feature interaction and
fusion, Deep learning

1. Introduction

Product deletion, as a key issue in numerous product management decisions, has a significant impact on a
company’s profitability and sustainable development. In addition to its internal effects on the organization,
finance, operations, and marketing aspects of the company, it also affects the upstream and downstream
of the supply chain, including procurement, manufacturing, delivery, and service processes (Zhu and Shah,
2018). The human and financial resources freed up through product deletion can be reinvested in new product
design and production, leading to better resource allocation (Barwise and Robertson, 1992). However, the
increasing pace of technological updates and the shortening of product lifecycles have heightened market
dynamics. Most companies operate in a dynamic environment characterized by volatility, uncertainty,
complexity, and ambiguity, which present significant opportunities and challenges (Lin et al., 2006; Nandram

∗ Corresponding author.
Email addresses: decuiliang@126.com (Decui Liang), thx379303490@163.com (Haoxin Tang)

Preprint submitted to Annals of Operations Research June 11, 2023


and Bindlish, 2017). Taking online retailers as an example, they rapidly select products favored by consumers
in such an environment. Once consumer preferences change, they can obtain this information through
internet media and introduce new products to meet consumer demands, thereby maximizing their profits.
In this process, retailers make product deletion decisions based on consumer considerations. Essentially,
products are removed from the market because the rapid pace of technology and product lifecycles pushes
them into the declining phase of their lifecycle. To fully depict the product lifecycle, factors such as purchase
volume and promotional impact that reflect or influence consumer demand are primarily considered (Lei et
al., 2023). The application context of this paper lies in such a highly dynamic environment, where a big
data discriminative model is designed based on consumer factors like satisfaction and purchase volume to
assist companies in analyzing product deletion decisions and achieving rational resource allocation.
In the existing research on product deletion decisions, most studies rely on the supply chain and develop
subjective evaluation methods to achieve final product deletion decisions. The process of making product
deletion decisions mainly considers various aspects of the upstream, midstream, and downstream, as well
as factors from different stakeholders. For example, the FMEA method is used to assess and manage po-
tential risks in product deletion decisions along the supply chain (Zhu et al., 2021). Another approach is
the integration of QFD methods, which consolidate the interests of multiple stakeholders from different
organizational levels, considering various functional strategies such as manufacturing, supply chain, finance,
and marketing in product deletion decisions (Golrizgashti et al., 2022). These evaluation methods
rely on the subjective analysis, work experience, and domain expertise of engineers or experts. Moreover,
the complete decision-making process is time-consuming and inefficient (Teoh and Case, 2004). More
importantly, these traditional product deletion methods may not be applicable in rapidly iterating domains such
as online retail platforms. For instance, a single store may have hundreds or thousands of products for sale,
which makes it impractical to ensure quality control for all of them and to apply subjective evaluation
methods to each product deletion decision. In summary, the current state of the sales market calls for a
shift toward data-driven strategies in product deletion decisions, yet research in this area remains
limited.
Beyond this gap in the literature, the task itself is demanding: data-driven product deletion
is essentially a multi-dimensional time series classification task. The data involved in
this task often contain a significant amount of noise, and the positive and negative samples are extremely
imbalanced. This necessitates modeling algorithms with high information capture capabilities (Ang et al.,
2020). Traditional time series modeling requires manual construction of relevant features, which is costly and
relies on expert knowledge. On the other hand, deep learning can automatically extract key features from
time series data and ensure good predictive performance, thereby overcoming the risk of expert cognitive
biases (Gamboa, 2017). However, deep learning also has its limitations in time series modeling, as it cannot
capture fixed linear scale changes between features and targets (Li et al., 2023).
In a highly dynamic sales market, accurate product deletion decisions require not only capturing the
linear scale changes between features and targets but also addressing the challenge of capturing the dynamic
dependencies and interactions among different time steps and variables. The different variables and time
steps can lead to variations in the dynamic dependencies between them and the target variable (Chang et
al., 2018; Shi et al., 2020; Tang et al., 2020). Furthermore, time series data itself exhibits two modes of
dynamic variation: short-term and long-term patterns (Lai et al., 2018). Taking Figure 1 as an example,
if we want to predict the sales volume of Product A, we can observe that the sales generally experience a
significant decline every three months followed by a slow increase. In addition to capturing the short-term
sales information for predicting the current sales, would it be helpful to use sales data from three months
ago? In practical applications, to achieve accurate predictions based on time series data, it is necessary to
capture both short-term and long-term dynamic patterns. However, classic time series forecasting methods
often fail to address this issue adequately as most methods do not differentiate between these two patterns
and do not explicitly and dynamically model their interactions.

Figure 1: Historical sales of product A

Based on the limitations of current literature research on product deletion and the challenges of time se-
ries classification tasks, this paper proposes a deep learning-based multi-dimensional time series classification
framework to assist in product deletion decisions, namely the Interactive Feature Fusion Multi-time Series
Classification Model (IFMFNet). It leverages the advantages of convolutional layers to discover local de-
pendency patterns among multi-dimensional input variables and utilizes recurrent layers to capture complex
long-term dependency relationships. A recurrent-skip mechanism is employed to capture long-term depen-
dency patterns in the data, leveraging the periodic information of the input time series to further improve
the accuracy of the model. Inspired by traditional autoregressive linear models, the IFMFNet framework
also involves a logistic regression component parallel to the nonlinear neural network, enabling the nonlinear
deep learning model to capture fixed linear scale changes and enhancing the model’s robustness. Finally,
sub-models are trained for different types of features to capture their long-term and short-term dependency
relationships. The proposed prediction framework is validated through experiments on a real-world online
retail dataset, comparing it with different methods to demonstrate its effectiveness.
The contributions of this research can be summarized as follows:
• Introducing the perspective of big data modeling, the paper proposes the use of deep learning tech-
niques for assisting product deletion decisions, incorporating convolutional neural networks, recurrent neural
networks, and a recurrent-skip component to capture the temporal information of product-related informa-
tion in product deletion decisions. This distinguishes it from traditional methods and demonstrates the
efficiency and accuracy of the proposed approach.
• Making contributions to deep learning-based time series classification algorithms, the paper designs
three sub-models to explore the long-term and short-term dependency relationships between different types
of time series features and product deletion decision objectives. These sub-models include a consumer sub-
model that explores the interdependencies between product deletion and consumer satisfaction, a market
sub-model that explores the relevant dependencies between product deletion and product lifecycle, and a
consumer-market sub-model that explores the interdependencies between product deletion and comprehen-
sive complex scenarios.
• Improving classical deep learning networks, the paper introduces a logistic regression component to
automatically identify the fixed linear scale changes between product features and the target variable of
product deletion, enhancing the deep learning model’s ability to capture linear information about products.
This can greatly benefit the prediction of product deletion decisions in practical scenarios.
The remainder of this paper is organized as follows: In Section 2, we review the related literature
on product deletion and time series classification. In Section 3, we provide a detailed explanation of the
proposed prediction framework's principles and feature construction for the different sub-models. Section 4
conducts experiments on real data and evaluates the model. Section 5 summarizes this paper and outlines
some possible future research directions.

2. Literature review

This paper is based on deep learning to construct a product deletion decision model, which, at its core,
is essentially a time series classification task. Therefore, in this section, the paper reviews the research in
the field from three perspectives: product deletion decision, time series classification, and the integration of
deep learning models.

2.1. Product deletion


According to the types of literature, research on product deletion can be classified into two categories. The
first category primarily focuses on the concept and significance of product deletion. Traditionally, product
deletion is considered an organizational strategic decision involving financial and marketing aspects. It
is defined as the cessation or elimination of a product from an organization’s product portfolio (Shah et
al., 2017). The decision-making process typically involves identifying potential candidates for deletion,
analyzing their potential for regaining market share, and evaluating the products before selecting those that
need to be discontinued (Avlonitis and Argouslidis, 2012). Product deletion is a critical aspect of product
management as it allows for the release of resources to be invested in new production (Lin and Shih, 2010).
Particularly for small-scale enterprises with short product lifecycles entering the market, product deletion
can enhance profitability and facilitate rapid business growth (Katana et al., 2017). Due to the large number
of sellers, predominantly small-scale, and the extensive variety of products with short lifecycles on online
retail platforms, product deletion is frequently applied (Argouslidis et al., 2015). These studies explore the
nature of product deletion and its application in various industries, deriving some general conclusions.
The second category primarily focuses on the construction of practical decision-making methods for
product deletion. These methods are based on different stages of the supply chain and aim to consider
the requirements of different stages and stakeholders, in order to formulate product deletion decisions that
take into account the overall picture. For example, a multi-level subjective evaluation decision model built
considering lean and sustainable supply chain factors can promote the sustainable development of the supply
chain (Zhu et al., 2018). By using Failure Mode and Effects Analysis (FMEA) as an analytical framework for
product deletion, risk factors are scored and ranked, enabling both product deletion decisions and mitigation
of product risks (Zhu et al., 2021). In addition, some researchers have considered a wider range of decision
factors based on financial and non-financial attributes of the supply chain, introducing competitive factors
to improve the system of product deletion decisions, and utilizing the NFISI method for decision support in
product deletion (Pourhejazy et al., 2019). Interactions among different stages of the supply chain can be
explored by uncovering the relationships between business, supply chain strategies, and customer demands.
The Quality Function Deployment (QFD) method can be employed to develop decisions from multiple
strategies and perspectives of relevant stakeholders, making the product deletion process more scientific
(Golrizgashti et al., 2022). Overall, these methods lean towards subjective evaluation and can effectively
address conflicts arising during the product deletion process while considering the relationships between
factors. However, these methods lack a foundation in objective data modeling, their processes are complex,
and the efficiency of product deletion decision-making is low. Additionally, the calculation of product scores
may vary depending on the perspectives of different stakeholders.
The problem addressed in this study and the methodology employed differ significantly from previous
research. Firstly, this paper focuses on the context of online retail, a highly dynamic marketplace where
products are rapidly updated and have short lifecycles. Retailers often lack the ability to make product
deletion decisions throughout the entire supply chain, considering only consumer demands and product
profitability. The core challenge is to achieve rational product deletion decisions within a short timeframe,

in order to effectively allocate resources to new product operations and enhance store competitiveness.
Secondly, the research problem in this study is fundamentally a data-driven issue, relying on objective sales
data and consumer satisfaction as the basis for constructing machine learning algorithms. Compared to
traditional subjective evaluation methods, data-driven approaches can better address the visibility issues
faced by retailers, improve the efficiency of product deletion decision-making, and drive decision-making
with higher precision.

2.2. Time series classification


Time series are ubiquitous in daily life, and time series classification tasks have found numerous appli-
cations in domains such as finance, healthcare, and industrial anomaly detection. Based on the types of
methods for time series classification, they can be categorized into feature-based methods, ensemble-based
methods, and deep learning-based methods (Wang et al., 2017). Feature-based methods aim to extract
features that can describe both the global and local information of time series. By dividing the subsequence
into small intervals and utilizing the bag-of-features representation (TSBF) in time series classification
frameworks, it is possible to better capture local information (Baydogan et al., 2013). On the other hand,
ensemble-based time series classification primarily involves combining different classifiers to achieve higher
accuracy. One such method involves using a total of 35 classifiers for model ensembling, constructing
classifiers based on elastic distance metrics and standard classifiers in the time, frequency, change, and
shape transformation domains for time series classification (Bagnall et al., 2015). These feature-based and
ensemble-based methods for time series classification typically require extensive feature engineering and
human effort.
Due to the automatic feature extraction capabilities of deep learning methods, many researchers have
been actively developing the application of deep learning in time series classification tasks. The multi-channel
deep convolutional neural network (MC-DCNN) extracts features from each univariate channel of a multivariate
time series, combines these extracted features to form the final feature representation, and utilizes a multi-layer
perceptron network for classification. The effectiveness of this network has been validated on two multivariate
time series datasets (Zheng et al., 2016). The multi-scale convolutional neural network (MCNN) for
univariate time series classification achieves impressive performance on various datasets from the UCR archive
by employing sliding window techniques and manual feature construction. However, the extensive manual
work and parameter settings make it challenging to apply this model (Cui et al., 2016). In light of this,
Wang et al. (2017) proposed three powerful baseline methods for time series classification: deep multi-
layer perceptron (MLP), fully convolutional network (FCN), and residual network (ResNet). Experimental
results demonstrated that these simple neural network baseline models achieved good performance on 44
benchmark time series datasets. Furthermore, if more complex combinations, such as convolutional neural
network (CNN) and recurrent neural network (RNN), are considered, the model’s performance can be
further improved. Numerous studies have shown the effectiveness of the CNN-RNN combination in time
series classification tasks (Ordóñez and Roggen, 2016; Qian et al., 2016; Tsironi et al., 2017).
A key challenge in time series classification is how to capture and utilize the dynamic dependencies
between variables, including a mixture of long-term and short-term dependencies (Lai et al., 2018). This
requires neural networks to have a longer view by extracting long-term dependencies from sequences with
larger time steps as input. However, this can lead to the issue of vanishing gradients in RNN (Thara et
al., 2019). To address this problem, researchers in the field have made various improvements to RNN,
with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) being representative examples
(Hochreiter and Schmidhuber, 1997; Chung et al., 2014). In addition to these approaches, there are several
other techniques that can be effectively applied in time series analysis, including Transformer, adversarial
networks, and time series decomposition (Lim et al., 2021; Tang et al., 2020; Nguyen et al., 2019).
The aforementioned techniques have potential applications in time series classification tasks to address
specific problems in fields such as finance, industrial anomaly detection, and healthcare. However, their
performance in product deletion decisions still needs to be validated. Considering that data for product
deletion decisions is highly imbalanced, with positive samples (products being deleted) distributed towards
the end of a product’s lifecycle, this paper proposes a product deletion decision classification framework
based on deep learning techniques. It combines CNN and GRU networks and utilizes a skip-connection
mechanism to enhance the model’s long-term view. The framework is divided into three sub-models based
on the feature types in the product deletion domain. Additionally, a logistic regression component is designed
to capture linear fixed-scale variations in features. The results demonstrate that the proposed framework
outperforms some classical machine learning models in terms of prediction ability.

3. Problem description and research framework

Regarding the decision of product deletion, this paper treats it as a time series classification problem,
specifically determining whether to initiate product delisting based on historical product data. This section
provides a detailed description of the research problem and research framework.

3.1. Problem definition


Assume the current number of products is $N$. Since different products have different sales time intervals
$1 \sim t_N$, we can obtain the historical data of product $N$ within a specific time interval as
$P_N = \{f_1, f_2, \cdots, f_n\}$, where $f$ is the characteristic value, $n$ is the number of characteristics,
and $f_n = \{f_{1n}, f_{2n}, \cdots, f_{t_N n}\}$. The historical data of product $N$ are slid through a window
of length $W$; the specific process is shown in Figure 2. We then obtain the training set
$P_N^{train} = \{P_N^{1 \sim W}, P_N^{2 \sim (W+1)}, \cdots, P_N^{(t_N - W + 1) \sim t_N}\}$ and the corresponding
labels $Y_N = \{C_W, C_{W+1}, \cdots, C_{t_N}\}$, where $C_W, C_{W+1}, \cdots, C_{t_N - 1} = 0$ and
$C_{t_N} \in \{0, 1\}$. $C_{t_N} = 0$ means that product $N$ is still on sale at time $t_N$, and $C_{t_N} = 1$
indicates that product $N$ has been removed from the shelves at time $t_N$. The training data obtained by
merging the data of all products form the final training data.
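
The window construction described above can be illustrated with a minimal Python sketch. It assumes that every window holds exactly W consecutive time steps and that only the last window of a product can carry the label 1. The function name `make_windows` and the toy `history` values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]], window: int, deleted: bool
) -> Tuple[List[List[Sequence[float]]], List[int]]:
    """Slice one product's history into overlapping windows of length `window`.

    `series[t]` holds the feature values observed at time step t + 1.
    All windows are labelled 0 except the last one, which is labelled 1
    when the product was removed at the final time step.
    """
    t_n = len(series)
    if window < 1 or window > t_n:
        raise ValueError("window must satisfy 1 <= window <= len(series)")
    # Window s covers time steps s+1 .. s+window (1-based), shifted by one step each time.
    windows = [list(series[s:s + window]) for s in range(t_n - window + 1)]
    labels = [0] * len(windows)
    if deleted:
        labels[-1] = 1
    return windows, labels


# Hypothetical history: five time steps, two features (sales volume, rating).
history = [[10.0, 4.5], [8.0, 4.4], [7.0, 4.1], [3.0, 3.9], [1.0, 3.5]]
windows, labels = make_windows(history, window=3, deleted=True)
print(len(windows), labels)  # 3 [0, 0, 1]
```

Because each product contributes at most one positive label, merging the windows of all products reproduces the strong class imbalance discussed in the introduction.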

Figure 2: Construction process of product N training data

3.2. Product deletion decision model framework


3.2.1. Framework overview
Inspired by several studies on time series prediction frameworks (Lai et al., 2018; Li et
al., 2023), we propose a deep learning-based multivariate time series classification framework to
assist with product deletion decisions. The framework consists of several components as shown in Figure 3.
Specifically, we categorize product features into consumer-side features and sales-side features based on the
attribute types of the product-related features. Consumer-side features reflect consumer satisfaction with
the product, providing internal driving force for product operations. Sales-side features reflect the product’s
sales performance, serving as the fundamental guarantee for product sales. Subsequently, we divide them
into consumer submodel {Xc , Y }, market submodel {Xm , Y }, and consumer-market submodel {Xcm , Y }
based on their types.
The consumer submodel {Xc , Y } captures the influence patterns of consumer-side features on product
deletion decisions and extracts the relevant relationship between the overall consumer satisfaction trend
and product deletion decisions. The market submodel {Xm , Y } explores the dynamic relationship between
changes in the sales market, such as price fluctuations, sales volume variations, and competitive market
dynamics, and product deletion decisions. Both consumer-side and sales-side factors have different impact
patterns on product deletion decisions (Ekambaram et al., 2020). Additionally, the consumer-market
submodel {Xcm , Y } examines the combined impact of the two types of features on the target of product deletion
decisions. Previous research has shown that extracting internal relationship information between different
types of features can positively contribute to the final target prediction (Gao et al., 2018). However, in
certain cases, these internal relationships between different types of features may also disturb the accuracy
of the model. Hence, this constitutes one of the reasons for constructing separate submodels {Xc , Y } and
{Xm , Y } for different feature types in this study.
Considering the ability of deep learning to automatically learn feature representations and capture highly
nonlinear information (Ang et al., 2020), the proposed model in this study primarily utilizes deep learning
networks as fundamental building components. In the order from input to output, the basic components
of the DNN model proposed in this study consist of a Convolutional Neural Network (CNN), Short-Term
Gated Recurrent Unit (ST-GRU), Long-Term Gated Recurrent Unit (LT-GRU), and Fully-Connected Neu-
ral Network (FNN). ST-GRU and LT-GRU are parallelized in the same layer. The CNN captures the
interdependencies among features and feeds them into ST-GRU and LT-GRU. ST-GRU captures short-
term dynamic temporal dependencies, while LT-GRU captures long-term dynamic temporal dependencies.
Finally, the outputs of ST-GRU and LT-GRU are fed into the Fully-Connected Neural Network for the
purpose of achieving the final classification. Moreover, due to the inability of DNNs to capture the fixed
linear scaling relationship between features and targets (Li et al., 2023), an additional logistic regression
component is designed to run in parallel with the three sub-models, aiming to capture the linear scaling
variations in the entire time series data. The output of the entire framework is contributed by the following
four parts:

$$\hat{Y}_N = \mathrm{Regression}\big(F_1(\{X_c^N, Y_N\}),\ F_2(\{X_m^N, Y_N\}),\ F_3(\{X_{cm}^N, Y_N\}),\ \mathrm{AutoLR}(X^N, Y_N)\big). \tag{1}$$

In Eq. (1), $F_1$, $F_2$ and $F_3$ represent the DNN network structures designed in this study for the
different feature types. Each consists of four components: CNN, ST-GRU,
LT-GRU and FNN. AutoLR refers to the automatic logistic regression component designed in this study.
Regression is a linear regression method used to assess the importance of the different sub-models and of the
automatic logistic regression component in the network.
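
As an illustration of the combination step in Eq. (1), the sketch below fits one weight per component output plus an intercept. The text does not state how the Regression step is fitted, so ordinary least squares via the normal equations is an assumption here, and the names `solve`, `fit_combiner`, `combine` and all data values are hypothetical.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_combiner(outputs: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Least-squares weights for the component outputs; the last entry is the intercept."""
    rows = [list(r) + [1.0] for r in outputs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def combine(weights: Sequence[float], outputs: Sequence[float]) -> float:
    """Weighted sum of the component outputs plus the intercept."""
    return sum(w * o for w, o in zip(weights, outputs)) + weights[-1]


# Hypothetical outputs of [F1, F2, F3, AutoLR] for six training windows.
component_outputs = [
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.9],
    [0.1, 0.1, 0.1, 0.1],
    [0.8, 0.7, 0.9, 0.6],
]
targets = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
weights = fit_combiner(component_outputs, targets)
score = combine(weights, [0.7, 0.2, 0.8, 0.3])
```

The magnitude of each fitted weight then gives a rough reading of how much the corresponding sub-model or the AutoLR component contributes to the final prediction.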

Figure 3: IFMFNet Prediction Framework

3.2.2. Detailed description of each component


Based on Figure 3, in this section, we will sequentially elucidate the functionality and specific model
details of each component in the framework.
Convolution Component: In time series classification, a single convolutional layer can capture local
patterns between certain features, and therefore, stacking multiple convolutional layers can extract more
complex feature information (Cui et al., 2016). The first DNN block of the proposed IFMFNet submodel in
this paper is a convolutional layer composed of multiple filters with a width of 1 and a height of n, where the
height of the convolutional layer is the same as the number of variables. By sliding the convolutional layer,
it can extract the interdependencies between variables while capturing very short-term dynamic temporal
changes. Let $X_i \in \{X_c, X_m, X_{cm}\}$; the k-th filter scans the input matrix and produces:

$$h_k = \mathrm{RELU}(A_k * X_i + b_k) \tag{2}$$

In Eq. (2), $b_k$ represents the bias term, $*$ denotes the convolution operation, $A_k$ denotes the weights of the
convolutional layer, and the output $h_k$ is a vector. The RELU function is defined as $\mathrm{RELU}(x) = \max(0, x)$.
The size of the output matrix of the convolutional layer is $M \times filter_n$, where $filter_n$ represents the number
of filters, and $M$ is the number of times each filter slides from top to bottom.
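
A minimal Python sketch of Eq. (2) follows. It assumes the input matrix is laid out with one row per time step and one column per variable, so that a filter of width 1 and height n covers all n variables of a single time step. The function name `conv_component` and the toy values are hypothetical.

```python
from typing import List, Sequence


def relu(value: float) -> float:
    """RELU(x) = max(0, x)."""
    return max(0.0, value)


def conv_component(
    x: Sequence[Sequence[float]],
    filters: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[List[float]]:
    """Slide every filter over the time axis.

    x:       M rows (time steps), each with n variables.
    filters: one length-n weight vector A_k per filter.
    biases:  one bias b_k per filter.
    Returns an M x (number of filters) matrix whose column k is h_k.
    """
    output = []
    for row in x:
        output.append([
            relu(sum(a * v for a, v in zip(weights, row)) + bias)
            for weights, bias in zip(filters, biases)
        ])
    return output


# Toy input: three time steps, two variables, two filters.
x = [[1.0, 2.0], [3.0, -1.0], [0.0, 0.5]]
filters = [[1.0, 1.0], [1.0, -2.0]]
biases = [0.0, -1.0]
h = conv_component(x, filters, biases)
print(h)  # [[3.0, 0.0], [2.0, 4.0], [0.5, 0.0]]
```

Each output row mixes all variables of one time step, which is how the layer exposes interdependencies between variables to the recurrent layers that follow.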
Recurrent component: The GRU trains more efficiently than the LSTM
while maintaining satisfactory temporal prediction accuracy (Hewamalage et al., 2021). Within the recurrent
component block constructed in this study, the primary block incorporates a GRU recurrent layer (Chung
et al., 2014), where the input from the convolutional layer is fed into the recurrent component, and the
RELU function is used as the hidden update activation function. The computation of the hidden state for
time step t in the recurrent unit is as follows:

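$$
\begin{aligned}
r_t &= \sigma(x_t A_{xr} + h_{t-1} A_{hr} + b_r) \\
z_t &= \sigma(x_t A_{xu} + h_{t-1} A_{hu} + b_u) \\
c_t &= \mathrm{RELU}(x_t A_{xc} + r_t \otimes (h_{t-1} A_{hc}) + b_c) \\
h_t &= (1 - z_t) \otimes h_{t-1} + z_t \otimes c_t
\end{aligned}
\tag{3}
$$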
In Eq. (3), $\otimes$ represents element-wise multiplication, $\sigma$ denotes the sigmoid activation
function, and $x_t$ represents the input of the layer at time $t$. $r_t$ refers to the reset gate, which determines how
much historical information from the previous state is carried into the current candidate state $c_t$. The
update gate, represented as $z_t$, controls the extent to which the information recorded in the previous time step
is passed into the current state $h_t$. $h_t$ represents the output at time $t$. The overall output of the recurrent
layer is the hidden layer state at time $t$.
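
The update in Eq. (3) can be traced with a short Python sketch of a single time step. It treats $x_t$ and $h_{t-1}$ as row vectors multiplied from the left, as the equations are written. The dictionary keys, the helper names and the zero-valued parameters are hypothetical choices made only for illustration.

```python
import math
from typing import Dict, List, Sequence


def matvec(vec: Sequence[float], mat: Sequence[Sequence[float]]) -> List[float]:
    """Row vector (length d) times matrix (d x h) gives a row vector of length h."""
    return [sum(v * mat[i][j] for i, v in enumerate(vec)) for j in range(len(mat[0]))]


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def relu(value: float) -> float:
    return max(0.0, value)


def gru_step(x_t: Sequence[float], h_prev: Sequence[float], p: Dict[str, list]) -> List[float]:
    """One hidden-state update following Eq. (3), with RELU as the candidate activation."""
    xr, hr = matvec(x_t, p["A_xr"]), matvec(h_prev, p["A_hr"])
    r = [sigmoid(a + b + c) for a, b, c in zip(xr, hr, p["b_r"])]  # reset gate
    xu, hu = matvec(x_t, p["A_xu"]), matvec(h_prev, p["A_hu"])
    z = [sigmoid(a + b + c) for a, b, c in zip(xu, hu, p["b_u"])]  # update gate
    xc, hc = matvec(x_t, p["A_xc"]), matvec(h_prev, p["A_hc"])
    c = [relu(a + ri * b + bc) for a, ri, b, bc in zip(xc, r, hc, p["b_c"])]  # candidate
    return [(1.0 - zi) * hp + zi * ci for zi, hp, ci in zip(z, h_prev, c)]


def zeros(rows: int, cols: int) -> List[List[float]]:
    return [[0.0] * cols for _ in range(rows)]


# Two input features, two hidden units, all parameters set to zero.
params = {
    "A_xr": zeros(2, 2), "A_hr": zeros(2, 2), "b_r": [0.0, 0.0],
    "A_xu": zeros(2, 2), "A_hu": zeros(2, 2), "b_u": [0.0, 0.0],
    "A_xc": zeros(2, 2), "A_hc": zeros(2, 2), "b_c": [0.0, 0.0],
}
h_next = gru_step([1.0, -1.0], [2.0, -4.0], params)
print(h_next)  # [1.0, -2.0]
```

With all parameters at zero both gates equal 0.5 and the candidate state is zero, so the new state is exactly half of the previous one. This makes the interpolation role of the update gate easy to check by hand.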
Recurrent-skip component: Since product deletion decisions rely on the information of the product
throughout almost the entire sales period, the input time series
is often very long. As discussed in the introduction, predicting the sales of a product with a three-month
cycle benefits from incorporating the sales data from three months earlier, that is, from capturing this
long-term dependency. However, traditional RNN models like GRU and LSTM suffer from the vanishing gradient
problem as the input sequence grows, which limits their ability to capture longer-term dynamic temporal
information. To address this limitation, this study follows Lai et al. (2018) and introduces a recurrent-skip
component to enhance the model's ability to capture long-range information. Specifically, a skip interval is
introduced between the hidden units of the input sequence, so that each hidden unit takes as input the hidden
unit one skip interval earlier rather than the immediately preceding one. The update process can be expressed as:

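$$
\begin{aligned}
r_t &= \sigma(x_t A_{xr} + h_{t-o} A_{hr} + b_r) \\
z_t &= \sigma(x_t A_{xu} + h_{t-o} A_{hu} + b_u) \\
c_t &= \mathrm{RELU}(x_t A_{xc} + r_t \otimes (h_{t-o} A_{hc}) + b_c) \\
h_t &= (1 - z_t) \otimes h_{t-o} + z_t \otimes c_t
\end{aligned}
\tag{4}
$$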
Similarly, the input of the recurrent-skip component is the output of the previous convolutional layer,
where σ represents the desired interval between skip hidden units. Taking the example of product A
mentioned in the introduction, let’s assume that product A roughly follows a periodic pattern with a cycle
of three months. In this case, σ = 90 represents the skip interval for the recurrent-skip component. If the
product’s periodic pattern is unknown or not apparent, the parameter σ needs to be explored during the
actual model construction to ensure that the model has the ability to capture long-range information under
suitable σ. In our subsequent experiments, we observed that even for products without a clearly defined
periodic pattern, the introduction of the recurrent-skip component still improved the model’s classification
accuracy. Therefore, this recurrent-skip mechanism is not limited to data with explicit periodic patterns
and can be applied more broadly.
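The update in Eq. (4) can be illustrated with a minimal pure-Python sketch. The parameter names mirror the symbols of Eq. (4); the random initialization, the zero initial hidden state for t < o, and the helper names (vec_mat, skip_gru_step, run_skip_gru, random_params) are assumptions made for this illustration and are not taken from the original implementation.

```python
import math
import random


def vec_mat(v, m):
    """Row vector v (length n) times matrix m (n rows, k columns)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def skip_gru_step(x_t, h_skip, p):
    """One update of Eq. (4); h_skip is the hidden state from o steps earlier."""
    def pre_activation(a_x, a_h, b, reset=None):
        from_input = vec_mat(x_t, p[a_x])
        from_hidden = vec_mat(h_skip, p[a_h])
        if reset is not None:  # r_t (x) (h_{t-o} A_hc)
            from_hidden = [r * v for r, v in zip(reset, from_hidden)]
        return [a + c + d for a, c, d in zip(from_input, from_hidden, p[b])]

    r = [1 / (1 + math.exp(-v)) for v in pre_activation("A_xr", "A_hr", "b_r")]
    z = [1 / (1 + math.exp(-v)) for v in pre_activation("A_xu", "A_hu", "b_u")]
    c = [max(0.0, v) for v in pre_activation("A_xc", "A_hc", "b_c", reset=r)]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_skip, c)]


def run_skip_gru(xs, p, o, hidden):
    """Apply the skip update over a sequence; states before t = o start from zeros."""
    hs = []
    for t, x_t in enumerate(xs):
        h_skip = hs[t - o] if t >= o else [0.0] * hidden
        hs.append(skip_gru_step(x_t, h_skip, p))
    return hs


def random_params(n_in, hidden, rng):
    """Hypothetical random weights, only for demonstration."""
    def mat(rows):
        return [[rng.gauss(0, 0.5) for _ in range(hidden)] for _ in range(rows)]
    p = {}
    for g in ("r", "u", "c"):
        p["A_x" + g] = mat(n_in)
        p["A_h" + g] = mat(hidden)
        p["b_" + g] = [0.0] * hidden
    return p


rng = random.Random(0)
params = random_params(n_in=3, hidden=4, rng=rng)
inputs = [[rng.random() for _ in range(3)] for _ in range(12)]
states = run_skip_gru(inputs, params, o=3, hidden=4)
```

With o = 3, the state at step 3 depends only on the inputs at steps 0 and 3, which is the sense in which the short-term hidden units are skipped.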
Fully connected layer: The recurrent layer and the recurrent-skip layer are parallel components. The
outputs of these two blocks are combined into a single sequence, which is then inputted into a fully connected
neural network. This input comprises the hidden states from the recurrent layer and the hidden states from
the recurrent-skip layer. It’s important to note that the hidden states with skip intervals from the recurrent-
skip layer will not be included as inputs to the fully connected neural network. For example, if o = 90 is
the skip interval, the states between time t − o and t will not be included as inputs. The input to the fully
connected layer consists of the hidden state $h_t^R$ at time $t$ from the recurrent component, as well as the hidden
states from $t-o+1$ to $t-o+W$ from the recurrent-skip component, denoted as $h^S_{t-o+1}, h^S_{t-o+2}, \cdots, h^S_{t-o+W}$.
The output of the fully connected layer is:

\[
h_t^D = A^R h_t^R + \sum_{i=0-w}^{o-1} A_t^S h_{t-i}^S + b \tag{5}
\]

Automatic logistic regression Component: In the process of time series classification modeling,
deep learning networks such as CNN and RNN, both in terms of network construction and activation
functions, are non-linear models. However, they have a weakness in capturing linear relationships effectively
(Li et al., 2023). Therefore, in order to improve the model’s ability to capture the linear scale transformation
relationship of features and time series, this study incorporates an additional linear block, focusing on the
local scale issue, in addition to the DNN framework as the main building component. In the IFMFNet
architecture, we adopt the classic logistic regression model as the linear component.
On the other hand, we want to avoid inputting features in the training data that do not have a linear
relationship with the target variable into the logistic regression block. This is because it can negatively
impact the prediction accuracy of the linear model. Therefore, before training the logistic regression model,
we perform a Spearman correlation test (Ramsey, 1989) to improve the training speed, reduce memory overhead,
and eliminate the influence of variables without a significant correlation with the target on the logistic regression component.
The specific procedure is presented in Algorithm 1.

Algorithm 1 Automatic logistic regression detection algorithm



Input: Multivariate time series: $D = \{Y, f_1, f_2, f_3, \cdots, f_i\}$, $f_i \in X \cup \tilde{Y}$, time step: $T$.
Output: Features $\tilde{D}$ with linear influence, where $\tilde{D}$ is a subset of $D$ and $Y \in \tilde{D}$
1: Define $\tilde{D}$ as an empty set.
2: for each $f_i$ in $D$ do
3: Let $\rho = 1 - \frac{6 \sum_n (f_n - Y_n)^2}{n^3 - n}$, with $\rho \sqrt{n-1} \sim N(0, 1)$
4: if $(1 - \mathrm{normcdf}(\rho \sqrt{n-1})) \times 2 < 0.05$, where normcdf is the cumulative distribution function of the standard normal
distribution then
5: Put $f_i$ into $\tilde{D}$;
6: end if
7: end for
8: return $\tilde{D}$
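As a rough illustration of Algorithm 1, the sketch below screens features with a Spearman test using only the Python standard library. Two points are assumptions of this sketch rather than statements of the original procedure: the squared differences are taken between ranks (the usual definition of the Spearman coefficient, with average ranks for ties), and the absolute value of ρ is used so that the two-sided p-value also covers negative correlations. The function and feature names are hypothetical.

```python
import math
from statistics import NormalDist


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_rho(f, y):
    """rho = 1 - 6 * sum(d^2) / (n^3 - n), with d the rank differences."""
    n = len(f)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(f), ranks(y)))
    return 1 - 6 * d2 / (n ** 3 - n)


def select_features(features, y, alpha=0.05):
    """Keep features whose two-sided p-value is below alpha."""
    n = len(y)
    selected = {}
    for name, f in features.items():
        rho = spearman_rho(f, y)
        p_value = 2 * (1 - NormalDist().cdf(abs(rho) * math.sqrt(n - 1)))
        if p_value < alpha:
            selected[name] = f
    return selected


# Toy data: the label switches from 0 to 1 halfway through the series.
y = [0] * 10 + [1] * 10
features = {
    "trend": list(range(20)),                  # rises together with the label
    "alternating": [i % 2 for i in range(20)]  # unrelated to the label
}
selected = select_features(features, y)
```

In this toy case only the "trend" feature passes the test and would be forwarded to the logistic regression component.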

By Algorithm 1, the final logistic regression model takes the selected features $\tilde{D}$ as input. The
prediction results of the logistic regression component are denoted as $h_t^{LR}$, and the coefficients of the LR
model are represented as $A^{LR}$, with a bias term of $b^{LR}$. The input window size on the input matrix is
denoted as $W^{LR}$. The logistic regression model can be described as follows:

\[
h_t^{LR} = \frac{1}{1 + e^{-\sum_{k=0}^{W^{LR}} A_k^{LR} \tilde{D}_{t-k} + b^{LR}}} \tag{6}
\]
By merging the output of the neural network component and the logistic regression component, the final
prediction of IFMFNet is obtained:
\[
\hat{Y}_t =
\begin{cases}
0, & \text{if } h_t^D + h_t^{LR} < th \\
1, & \text{else}
\end{cases}
\tag{7}
\]

In Eq. (7), $\hat{Y}_t$ represents the model’s final prediction at time $t$. Given a threshold $th$, if $h_t^D + h_t^{LR}$ is lower than
the given threshold, the corresponding product is determined to be in good operational condition during
that period and does not require deletion ($\hat{Y}_t = 0$). Otherwise, the product is marked for deletion ($\hat{Y}_t = 1$). The threshold $th$
for product deletion decisions may vary across different domains in practical applications.
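The fusion of the two outputs can be sketched in a few lines of Python. The sketch follows the textual description of Eq. (7), in which a combined score below the threshold means that no deletion is needed, and writes the exponent of Eq. (6) as printed (minus the weighted sum, plus the bias). The function names and the toy numbers are hypothetical and only serve the illustration.

```python
import math


def lr_component(window, coeffs, bias):
    """Eq. (6): window[k] and coeffs[k] hold the features and weights at lag k."""
    weighted = sum(
        a * d
        for a_k, d_k in zip(coeffs, window)
        for a, d in zip(a_k, d_k)
    )
    # Exponent as printed in Eq. (6); the bias is learned, so its sign
    # convention does not restrict the model.
    return 1 / (1 + math.exp(-weighted + bias))


def final_prediction(h_d, h_lr, th):
    """Eq. (7) as described in the text: 0 = keep the product, 1 = delete it."""
    return 0 if h_d + h_lr < th else 1


# Two lags, two selected features per lag (illustrative values only).
window = [[0.2, 1.0], [0.1, 0.8]]
coeffs = [[0.5, -0.3], [0.4, 0.1]]
h_lr = lr_component(window, coeffs, bias=0.0)
decision = final_prediction(h_d=0.1, h_lr=h_lr, th=0.9)
```

Lowering the threshold th makes the model flag more products for deletion, which trades precision for recall.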

3.2.3. Detailed description of each component


According to the framework proposed in the previous section, this section focuses on the specific process
of feature construction. Consumer-side features reflect the satisfaction of consumers with the product, while
sales-side features reflect the sales performance of the product. In the consumer-side context, consumer
selection based on product reviews is considered more important and prevalent compared to transaction
volume and order ratings (Archak et al., 2011). Furthermore, extracting consumer sentiment from reviews
can reveal the product’s shortcomings and limitations (Netzer et al., 2012). On the sales side, product-related
decisions not only consider the product’s own features but also require adjustments based on current market
information. Utilizing this information can facilitate effective decision-making, such as predicting the sales
of new products based on the sales of similar products, which can guide inventory and sales strategies (Lei
et al., 2023). For platforms like online retail where products are not categorized, one approach is to employ
clustering algorithms to characterize similar products, assisting in the formulation and implementation of
related product decisions (Holý et al., 2017). Based on the above description, in addition to consumer
ratings and company product sales trends, we will use sentiment analysis and cluster analysis to further
characterize different types of features for product deletion decisions.
Feature construction: Considering the feasibility of data collection, the simplest and most relevant
features for product deletion decisions are presented in Table 1. By utilizing the collected information and
combining sentiment analysis and clustering analysis, we can develop a simple yet scalable product deletion
decision model.

Table 1: Simplified consumer sales feature

Types                      Feature name    Symbol
Consumer Feature: $X_c$    Reviews         $f_{review}^{Nt}$
                           Rating          $f_{rating}^{Nt}$
Market Feature: $X_m$      Sales           $f_{sales}^{Nt}$
                           Price           $f_{price}^{Nt}$
Product Feature: $X_p$     Length          $f_{length}^{N}$
                           Width           $f_{width}^{N}$
                           Height          $f_{height}^{N}$
                           Weight          $f_{weight}^{N}$

In Table 1, Xc and Xm represent features from the consumer side and sales market side, respectively.
These features in this section exhibit temporal variability, meaning they reflect different states of the product
over time. Xp represents the inherent characteristics of the product, which are time-invariant and reflect
external factors of the product. By clustering Xp , we can obtain valuable features, which will be further
discussed later.
Sentiment analysis: User review texts carry certain emotions, and identifying these emotions can be
beneficial in analyzing consumer satisfaction with a product during a specific time period. If a product
receives low sentiment scores in consumer reviews, it indicates that consumers are not sufficiently satisfied
with the product. While sentiment analysis is not the main focus of this study, a classic approach using
TextBlob is employed to calculate sentiment scores for the reviews (www.github.com/sloria/TextBlob).
TextBlob provides a floating-point value output ranging from -1 to 1 for sentiment analysis tasks, where
-1.0 represents negative polarity and 1.0 represents positive polarity. The score can also be 0, indicating
a neutral evaluation, for example when a review contains words that are not present in the training set. In
Table 1, $f_{review}^{Nt}$ represents the sentiment score of product $N$ at time $t$ calculated using TextBlob.
Cluster analysis: Even for online retail products that already have category labels, it is still meaningful
to perform clustering based on product-related features. For example, a product may belong to the cosmet-
ics category, but the cosmetics category itself can be further divided into luxury products and affordable
products, each with its specific market environment. Additionally, the user base of any platform remains
relatively stable in the short term, and the demand for certain products also does not undergo significant
changes. Therefore, we aim to characterize the relative status of products by considering both the situation
of homogeneous products and the product’s own features. Specifically, we employ the classical K-means
algorithm to cluster the product features that do not vary with time and assign each product to a category
(Hartigan and Wong, 1979). To ensure the quality of clustering, the silhouette coefficient is used to measure
the clustering quality of each individual category (Kaufman and Rousseeuw, 2009).
Assuming the existence of K clusters represented as C1 , C2 , · · · , CK , the objective of K-means clustering
is to minimize the total deviation:

\[
\sum_{k=1}^{K} \sum_{f_n \in C_k} \sum_{m=1}^{M} \left(f_n^m - u_k^m\right)^2 \tag{8}
\]
In Eq. (8), $f_n = \left(f_n^1, f_n^2, \cdots, f_n^M\right)$ represents the sample data, $U_k = \left(u_k^1, u_k^2, \cdots, u_k^M\right)$ is an $M$-dimensional
vector representing the centroid of cluster $C_k$, and the calculation formula for $u_k^m$ is as follows:
\[
u_k^m = \frac{1}{|C_k|} \sum_{f_n \in C_k} f_n^m, \quad \forall m = 1, 2, \cdots, M, \; k = 1, 2, \cdots, K \tag{9}
\]
Based on the clustering results and the features of each product, market environment features are constructed
to depict the relative status of the products at different times. The specific implementation process
is shown in Algorithm 2.

Algorithm 2 Construction process of relative status feature


Input: The non-temporal features of a product $X_p$
Output: The market environment features for each product: $f_{i-new}^{Nt}$
1: Set the maximum number of clusters $K_0$, and initialize the number of clusters $K = 2$;
2: Randomly select $K$ samples from $X_p$ as the initial cluster centroids;
3: Assign each sample to its nearest cluster centroid so as to minimize the total deviation;
4: Update the positions of the cluster centroids based on the sample data assigned to each category;
5: Repeat steps 3 to 4 until the cluster centroids stop moving;
6: Calculate the value of the silhouette coefficient, and let $K = K + 1$;
7: Repeat steps 2 to 6 until $K > K_0$, then return the optimal number of clusters $K$ corresponding to the
highest silhouette coefficient value and output the corresponding clustering results;
8: Given the time-varying product features $X_c$ and $X_m$ and the category $C_k$ of each product $N$, at time
$t$, we can calculate the relative status features for each product:

\[
f_{i-new}^{Nt} = \frac{f_i^{Nt}}{\frac{1}{|C_k|} \sum_{M \in C_k} f_i^{Mt}}, \quad i \in X_c \cup X_m
\]

9: return $f_{i-new}^{Nt}$
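Steps 1 to 7 of Algorithm 2 can be sketched with a small pure-Python K-means and silhouette computation. To keep the example reproducible, the random seeding of step 2 is replaced here by a deterministic farthest-point seeding; this substitution, the function names, and the toy product dimensions are assumptions of the sketch and not part of the original algorithm.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_point_seeds(points, k):
    """Deterministic stand-in for the random seeding of step 2."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist(p, s) for s in seeds)))
    return [list(s) for s in seeds]


def kmeans(points, k, max_iter=100):
    """Steps 3 to 5: assign, update, repeat until the assignment is stable."""
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_max):
    """Steps 6 to 7: keep the clustering with the highest silhouette value."""
    best = None
    for k in range(2, k_max + 1):
        labels = kmeans(points, k)
        if len(set(labels)) < 2:
            continue
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best


# Toy time-invariant features (e.g., length and weight) forming three groups.
products = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2],
    [10.0, 10.0], [10.2, 9.9], [9.9, 10.1], [10.1, 10.2],
    [20.0, 1.0], [20.2, 0.9], [19.9, 1.1], [20.1, 1.2],
]
best_score, best_k, best_labels = choose_k(products, k_max=5)
```

For these well-separated toy groups the silhouette criterion selects three clusters.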

Based on the clustering results of the time-invariant product features $X_p$, we can calculate new features
$f_{i-new}^{Nt}$ for each time-varying feature $i$ in $X_c \cup X_m$. These new features can be constructed based on more
complex data characteristics. Furthermore, these new features can reflect the relative status of the products.
For example, by considering the ratio between the sentiment score of product N and the average sentiment
score of its category, we can determine the relative user satisfaction of product N at different times. Through
these new features, products from different industries can be compared, allowing for the identification of
differences in specific indicators. This facilitates the development of a generalized product deletion decision
model capable of recognizing different products.
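The relative status feature described above, i.e., the ratio between a product's feature value and the average value of its cluster at the same time step, can be sketched as follows. The function name, the product identifiers, and the handling of a zero cluster mean (returning NaN) are assumptions made for this illustration.

```python
import math
from collections import defaultdict


def relative_status(values, clusters):
    """values: {product: feature value at time t}; clusters: {product: cluster id}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for product, value in values.items():
        totals[clusters[product]] += value
        counts[clusters[product]] += 1
    result = {}
    for product, value in values.items():
        cluster_mean = totals[clusters[product]] / counts[clusters[product]]
        # Assumption: the ratio is undefined when the cluster mean is zero.
        result[product] = value / cluster_mean if cluster_mean != 0 else math.nan
    return result


# Sentiment scores of three products at one time step (illustrative values).
sentiment = {"A": 0.8, "B": 0.4, "C": 0.6}
cluster_of = {"A": 0, "B": 0, "C": 1}
relative = relative_status(sentiment, cluster_of)
```

Here product A scores above its cluster average (ratio above 1) and product B below it, while product C, alone in its cluster, has a ratio of exactly 1.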

4. A case study
We evaluated the proposed temporal classification framework for product deletion decision-making on
a real e-commerce dataset. Our evaluation aimed to address two main questions: Firstly, what are the
advantages of the proposed framework compared to traditional classical machine learning classification
algorithms? Secondly, what is the specific robustness of the framework in practical classification processes?

4.1. Data
The dataset used in this experiment is sourced from the publicly available e-commerce dataset of the
Olist store in Brazil (www.kaggle.com/jainaashish/orders-merged). This dataset comprises 100,000 order
records from multiple markets in Brazil, spanning from September 2016 to September 2018. It provides
a comprehensive view of product status from various perspectives, including market features such as
order quantity, price, customer reviews, and shipping fees, as well as product attribute features such as
size, inventory, and product category. Regarding the labeling for the model, products that have not recorded
any sales for more than three months are categorized as "exit products," while the remaining products are
classified as "non-exit products." To enhance the model’s accuracy, products that had no sales for three
months but resumed sales subsequently have been excluded from the dataset.
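The labeling rule can be sketched as below. The sketch assumes that "three months" is approximated by 90 days and that the end of the observation period is passed in explicitly; the function name and the product identifiers are hypothetical.

```python
from datetime import date


def label_products(sale_dates, data_end, gap_days=90):
    """Label products as 'exit' or 'non-exit'; drop products that resumed sales."""
    labels = {}
    for product, dates in sale_dates.items():
        ordered = sorted(dates)
        gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
        if any(gap > gap_days for gap in gaps):
            continue  # paused for more than three months, then resumed: excluded
        silent_days = (data_end - ordered[-1]).days
        labels[product] = "exit" if silent_days > gap_days else "non-exit"
    return labels


sales = {
    "p1": [date(2018, 1, 5), date(2018, 3, 1), date(2018, 8, 20)],    # long pause, then resumed
    "p2": [date(2018, 1, 10), date(2018, 2, 15), date(2018, 3, 20)],  # no sales after March
    "p3": [date(2018, 6, 1), date(2018, 7, 15), date(2018, 8, 25)],   # still selling
}
labels = label_products(sales, data_end=date(2018, 9, 1))
```

In this toy example p1 is excluded, p2 is labeled as an exit product, and p3 as a non-exit product.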

4.2. Experimental setup


Due to the limitations of traditional machine learning models in addressing long sequential time series
classification problems, this study compares the proposed prediction framework with several approaches,
including Bagging algorithms such as Random Forest, Boosting algorithms like Xgboost, classical Multi-
Layer Perceptron (MLP), Gated Recurrent Unit (GRU), and Convolutional Neural Networks (CNN) (Chung
et al., 2014; Breiman, 2001; Chen and Guestrin, 2016; Riedmiller, 1994). The prediction horizon is set
to 1, 3, 6, and 9 days, and evaluation metrics such as accuracy, precision, recall, and f1-score based on the
confusion matrix are used (Lever, 2016). Given the importance of correctly identifying both "exit" and "non-exit"
products, the f1-score places equal importance on precision and recall, reflecting a balance between the
two metrics. This is because predicting an actual ”exit” product as ”non-exit” and vice versa would both
yield unfavorable results.
It is important to note that the nature of time series classification requires predicting the market exit
status of products based on historical data, including today and future periods. For an exiting product, only
the last day of observation indicates the possibility of exit. This leads to an extremely imbalanced dataset,
where the number of samples labeled as ”exit” is significantly smaller than the number of samples labeled
as ”non-exit”.
Therefore, the challenge and focus of the prediction task lie in correctly identifying the products that
should actually exit the market and minimizing the occurrence of false predictions, where non-exit samples
are predicted as exit and vice versa. Thus, while maintaining a reasonably high accuracy, this study
prioritizes the discussion of precision, recall, and f1-score. The emphasis is on correctly identifying the
exiting products, while minimizing false predictions of both types.
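The reason accuracy alone is misleading on such imbalanced data can be shown with a short sketch of the confusion-matrix metrics. The function name and the toy label counts are illustrative, and defining precision, recall, and f1-score as 0 when their denominators vanish is an assumption of this sketch.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and f1-score with 'exit' (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 8 exit samples among 100; a model that never predicts 'exit'.
y_true = [1] * 8 + [0] * 92
always_non_exit = classification_metrics(y_true, [0] * 100)

# A model that finds 2 of 3 exits at the cost of 1 false alarm.
small = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

The first model reaches an accuracy of 0.92 while its precision, recall, and f1-score are all 0, which is why the discussion focuses on the latter three metrics.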

4.2.1. Basic comparison algorithm analysis


To facilitate comparison and analysis, the input window for all models is set to a uniform length of 90
days. We aim to observe the prediction horizons of these algorithms, which involve using the past 90 days of
data to predict the market situation for the next n days. In this study, we consider prediction horizons of 1
day, 3 days, 6 days, and 9 days. In relative terms, longer prediction horizons hold greater importance, as they
allow businesses to have sufficient time to take appropriate actions based on the predicted outcomes. To
assess the significance of the sub-components of the proposed framework, in addition to other algorithms, we
conducted separate tests on IFMFNet without the recurrent skip component (referred to as IFMFNet-Skip)
and without the logistic regression verification component (referred to as IFMFNet-Log). The results are
presented in Table 2.
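The windowing used in this comparison can be sketched as follows: each sample pairs the previous 90 daily observations with the label n days ahead. The exact label alignment (the label of the n-th day after the window) and the function name are assumptions of this sketch.

```python
def make_samples(series, labels, window=90, horizon=1):
    """Pair each window of past observations with the label `horizon` days ahead."""
    samples = []
    for end in range(window, len(series) - horizon + 1):
        x = series[end - window:end]      # the past `window` days
        y = labels[end + horizon - 1]     # label of the n-th day after the window
        samples.append((x, y))
    return samples


# 100 days of a toy series; only the last day carries the 'exit' label.
series = list(range(100))
labels = [0] * 99 + [1]
one_day = make_samples(series, labels, window=90, horizon=1)
nine_days = make_samples(series, labels, window=90, horizon=9)
```

Only one of the generated samples is positive in either setting, which reflects the imbalance between "exit" and "non-exit" samples in this task.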
According to the comparison results of the basic algorithms in Table 2, we can obtain three valuable
analysis results.
Firstly, both the logistic regression component and the recurrent skip component are meaningful. The
recurrent skip component contributes more significantly than the logistic regression component, and even a
single IFMFNet component performs better than other models.

Table 2: Comparison Results of Basic Algorithms

Methods         Metrics      Horizon 1   Horizon 3   Horizon 6   Horizon 9
XGboost         Accuracy     0.9244      0.9226      0.9137      0.9168
                Precision    0           0           0           0
                Recall       0.098       0.0912      0           0
                F1           0           0           0           0
RandomForest    Accuracy     0.9157      0.909       0.9187      0.9187
                Precision    0           0           0           0
                Recall       0           0           0           0
                F1           0           0           0           0
MLP             Accuracy     0.8973      0.9019      0.8906      0.8881
                Precision    0.2258      0.1094      0.0463      0.0968
                Recall       0.0946      0.0236      0.0188      0.0451
                F1           0.1333      0.0389      0.0267      0.0615
CNN             Accuracy     0.9685      0.9428      0.9563      0.9519
                Precision    0           0           0           0
                Recall       0           0           0           0
                F1           0           0           0           0
GRU             Accuracy     0.8777      0.892       0.874       0.8752
                Precision    0.1748      0.1958      0.1456      0.134
                Recall       0.1353      0.1053      0.1128      0.0977
                F1           0.1525      0.1369      0.1271      0.113
IFMFNet-Log     Accuracy     0.8988      0.8847      0.8755      0.8596
                Precision    0.1942      0.1843      0.234       0.1897
                Recall       0.3045      0.3008      0.2068      0.2218
                F1           0.2372      0.2286      0.2196      0.2045
IFMFNet-Skip    Accuracy     0.8689      0.8914      0.9014      0.8838
                Precision    0.2565      0.3117      0.3297      0.2522
                Recall       0.3007      0.2432      0.2256      0.218
                F1           0.2768      0.2732      0.2679      0.2339
IFMFNet         Accuracy     0.8601      0.9007      0.9034      0.9006
                Precision    0.2537      0.3687      0.3512      0.319
                Recall       0.348       0.2466      0.2218      0.1955
                F1           0.2934      0.2955      0.2719      0.2424

Except for accuracy, all comparative algorithms exhibit performance metrics distributed around 0 to
0.2 for different horizons. However, IFMFNet-Log, IFMFNet-Skip, and IFMFNet achieve precision, recall,
and f1-scores of almost 0.2 or higher. Even a model with a single IFMFNet component outperforms other
comparative algorithms.
Furthermore, IFMFNet-Log shows a precision score ranging from 0.1843 to 0.2340, a recall score ranging
from 0.2068 to 0.3045, and an f1-score ranging from 0.2045 to 0.2372. IFMFNet-Skip exhibits a precision
score ranging from 0.2522 to 0.3297, a recall score ranging from 0.2180 to 0.3007, and an f1-score ranging
from 0.2339 to 0.2768. Relatively speaking, the performance of IFMFNet-Skip is better than that of
IFMFNet-Log, indicating that the long-term temporal feature capturing of the recurrent skip component
contributes more to the model than the linear relationship capturing of the logistic regression verification
component. However, both components contribute significantly to the improvement of the model’s
performance. Therefore, the combined IFMFNet, which incorporates both components, exhibits the best
performance. It shows a precision score ranging from 0.2537 to 0.3687, a recall score ranging from 0.1955
to 0.3480, and an f1-score ranging from 0.2424 to 0.2955.
Secondly, conventional models are inadequate for product discontinuation prediction, and only prediction
models specifically designed for sequential data can be effective.
As shown in Table 2, Xgboost, Random Forest, and CNN models have precision and f1-scores of 0 at every horizon,
indicating their complete inability to predict product discontinuation. This is due to the inherent limitations
of these algorithms in applying to time-series classification tasks and capturing temporal dependencies in
sequential data. Regarding the MLP model, its precision scores range from 0.0463 to 0.2258, recall scores
range from 0.0188 to 0.0946, and f1-scores range from 0.0267 to 0.1333 across different horizons. Although
these scores remain low, the MLP model is able to identify a small number of discontinuing products. This
limited performance could be attributed to the simplistic nature of MLP, which simply concatenates the input
time window and trains a fully connected neural network, thereby failing to capture temporal dependencies.
In contrast, the GRU model demonstrates superior performance compared to MLP, with precision scores
ranging from 0.1340 to 0.1958, recall scores ranging from 0.0977 to 0.1353, and f1-scores ranging from
0.1130 to 0.1525. This improvement can be attributed to the fact that GRU is an RNN architecture that
is capable of capturing sequential patterns. Furthermore, the proposed IFMFNet model in this study is
able to capture both short-term and long-term sequential patterns more effectively, exhibiting significant
superiority in performance compared to the comparative algorithms.
Thirdly, the f1-score of the IFMFNet model tends to decrease as the prediction horizon increases.
The f1-score provides a more comprehensive and balanced analysis of the model’s performance compared
to precision and recall. Therefore, in this study, we analyze the performance of the proposed framework
based on the f1-score for different horizons. Across different horizon values, there is a clear difference in the
predictive performance of IFMFNet-Log, IFMFNet-Skip, and IFMFNet. For IFMFNet-Log, the f1-scores for
horizon values of 1, 3, 6, and 9 are 0.2372, 0.2286, 0.2196, and 0.2045, respectively. For IFMFNet-Skip, the
f1-scores are 0.2768, 0.2732, 0.2679, and 0.2339, and for IFMFNet, the f1-scores are 0.2934, 0.2955, 0.2719,
and 0.2424. It can be observed that the comprehensive f1-scores of all three models exhibit a decreasing
trend as the horizon increases. This indicates that as the prediction length increases, the uncertainty of
events also increases, leading to higher model bias.
Regardless of whether it is IFMFNet-Log, IFMFNet-Skip, or IFMFNet, the framework proposed in this
study demonstrates excellent predictive performance for product deletion decisions. It outperforms other
algorithms and is more applicable to product deletion. However, the accuracy of this data-driven approach,
as presented in this paper, remains at a relatively moderate level. In practical product deletion decisions, it is
advisable to combine the model’s predictions with manual review. Relying solely on the model’s predictions
for product deletion decisions may introduce certain biases. Using the model as an automated early warning
tool and incorporating it into the product deletion decision-making process can effectively improve the
efficiency of product deletion decisions and shorten the decision-making cycle. The model serves as an
assistive tool rather than the sole basis for product deletion decisions, allowing for human intervention and
decision-making.

4.2.2. Sensitivity analysis of skip length on model performance
Given the significant contribution of the skip component to IFMFNet, we conducted a sensitivity analysis
on the core parameter of the skip component, namely the skip length. Considering the characteristics of
the skip parameter, if the underlying data has a monthly cycle, an IFMFNet with a skip length of 45 will
inevitably fail to capture the long-term temporal features of the data. Only when the skip length is set
correctly can the model capture the long-term cyclic patterns.
In this study, we processed the data by refining the monthly records into daily records, allowing for finer
granularity. Consequently, we examined the effects of different skip parameter values (0, 15, 30, 45, 60) on
IFMFNet. Additionally, since the data’s cyclic patterns may vary with time offsets, we also compared the
differences in skip parameter performance across different time windows. The specific results are presented
in Table 3.

Table 3: Performance Variations of IFMFNet with Different skip and windows

Skip   Metrics      Window 60   Window 90   Window 120   Window 150   Window 180
0      Accuracy     0.8249      0.8352      0.8151       0.878        0.8065
       Precision    0.2369      0.2647      0.2271       0.3852       0.1347
       Recall       0.2646      0.2784      0.818        0.1787       0.1111
       F1           0.25        0.2714      0.2515       0.2441       0.1218
15     Accuracy     0.9067      0.9107      0.8499       0.8485       0.826
       Precision    0.3417      0.4309      0.2989       0.2567       0.2568
       Recall       0.277       0.3045      0.268        0.3299       0.323
       F1           0.306       0.3568      0.2826       0.2887       0.2861
30     Accuracy     0.9134      0.9007      0.8283       0.8335       0.8674
       Precision    0.3533      0.3687      0.2515       0.327        0.4268
       Recall       0.1993      0.2466      0.2818       0.391        0.2863
       F1           0.2549      0.2955      0.2658       0.3562       0.3427
45     Accuracy     0.8929      0.8526      0.8336       0.842        0.8627
       Precision    0.288       0.3275      0.2744       0.2968       0.3947
       Recall       0.3007      0.3169      0.3093       0.3162       0.2564
       F1           0.2942      0.3235      0.2908       0.3062       0.3109
60     Accuracy     -           0.8783      0.8571       0.8534       0.8576
       Precision    -           0.2834      0.2952       0.3261       0.3871
       Recall       -           0.3058      0.2131       0.3093       0.3077
       F1           -           0.2942      0.2475       0.3175       0.3429

According to Table 3, when the window is set to 60, the f1-score achieves its maximum value with a skip
of 15 (0.3060), and the second highest value is obtained with a skip of 45 (0.2942). As the skip grows from 0 to 45,
the f1-score rises, falls, and rises again, with the two peaks spaced 30 days apart. Similarly, when the window
is set to 90, the f1-score reaches its maximum value with a skip of 15 and the second highest value with a skip of 45.
When the window is set to 120, the same two skips rank highest, but in reverse order (0.2908 for a skip of 45
versus 0.2826 for a skip of 15). This is consistent with the model capturing long-term
features based on a 30-day cycle in the dataset.
In contrast to windows of 60, 90, and 120 days, when the window is set to 150 days, the f1-score achieves
its maximum value with a skip of 30 and the second highest value with a skip of 60. This is because as
the model’s time window continues to increase, the long-term temporal features of the data undergo some
changes, so that the best skips shift from 15 and 45 to 30 and 60 while remaining 30 days apart, maintaining the capture of the 30-day cyclic
patterns. When the window is set to 180, the two highest f1-scores again occur at skips of 30 and 60, which are
nearly tied (0.3427 and 0.3429), again maintaining the 30-day cycle.

Therefore, different skip intervals should be selected for different time windows to effectively utilize the
cyclic skip component for capturing long-term temporal features. Otherwise, the effectiveness of the skip
mechanism will be compromised. Furthermore, when considering the data time window, even if the entire
dataset exhibits a clear and fixed cyclic pattern, it is still necessary to find the appropriate skip interval to
maximize the effectiveness of the cyclic skip component across different time windows.
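The choice of skip interval per window can be read off Table 3 programmatically. The sketch below copies the f1-scores of Table 3 into a dictionary and ranks the skip values for each window; the variable and function names are hypothetical, and the missing entry (window 60, skip 60) is simply omitted.

```python
# f1-scores from Table 3, indexed as F1_SCORES[window][skip].
F1_SCORES = {
    60: {0: 0.25, 15: 0.306, 30: 0.2549, 45: 0.2942},
    90: {0: 0.2714, 15: 0.3568, 30: 0.2955, 45: 0.3235, 60: 0.2942},
    120: {0: 0.2515, 15: 0.2826, 30: 0.2658, 45: 0.2908, 60: 0.2475},
    150: {0: 0.2441, 15: 0.2887, 30: 0.3562, 45: 0.3062, 60: 0.3175},
    180: {0: 0.1218, 15: 0.2861, 30: 0.3427, 45: 0.3109, 60: 0.3429},
}


def top_two_skips(window):
    """Return the two skip values with the highest f1-score for a window."""
    scores = F1_SCORES[window]
    return sorted(scores, key=scores.get, reverse=True)[:2]


ranking = {window: top_two_skips(window) for window in F1_SCORES}
```

For every window the two best skips are 30 days apart (15 and 45 for the shorter windows, 30 and 60 for the longer ones), and a skip of 0 is never among them.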

4.2.3. Model threshold analysis


In the previous analysis, it was mentioned that the task of product deletion involves highly imbalanced
samples. Therefore, it is essential to set an appropriate threshold that achieves acceptable levels of precision
and recall. Considering this special circumstance of imbalanced samples, the ROC curve is a suitable
evaluation metric. It determines the True Positive Rate and False Positive Rate at different thresholds, thus
helping to identify a better threshold for model classification. Therefore, in the extended analysis of section
4.2.2, this paper explores the sensitivity of the threshold by utilizing ROC curve analysis under different
skip and window parameters (Hanley and McNeil, 1982).
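One common way to turn an ROC curve into a concrete threshold is to maximize the difference between the True Positive Rate and the False Positive Rate (Youden's J statistic). The text above does not state which selection criterion is used, so this criterion, the function names, and the toy scores below are assumptions made purely for illustration.

```python
def roc_points(y_true, scores):
    """Return (threshold, false positive rate, true positive rate) triples."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for th in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= th and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= th and y == 0)
        points.append((th, fp / negatives, tp / positives))
    return points


def best_threshold(y_true, scores):
    """Threshold maximizing TPR - FPR (Youden's J), an assumed criterion."""
    return max(roc_points(y_true, scores), key=lambda p: p[2] - p[1])[0]


# Toy scores for 10 product-days, 3 of which are true 'exit' cases.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.35, 0.4, 0.45, 0.5, 0.7, 0.9]
threshold = best_threshold(y_true, scores)
```

In this toy example the selected threshold is 0.45, which recovers all three exit cases at the cost of one false alarm.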
According to Figure 4, it can be observed that the optimal threshold for IFMFNet varies with different
skip and window parameters. In the range of 0 to 1, when skip is set to 0 and window is set to 60, the
optimal threshold is around the middle. However, when skip is 15 and window is 60, the optimal threshold is
skewed towards the left. This indicates that the optimal threshold varies with different skip windows, which
can be attributed to the differences in capturing long-term features of time series and the identification of
effective variable cycles.
In practical product deletion decisions, if the variable cycles of the product differ from the time windows
of the training data, it is necessary to set a reasonable threshold to ensure the effectiveness of the product
deletion decision. Otherwise, the predicted results of product deletion decisions are likely to be distorted.
Taking actions based on erroneous predictions can lead to wasted resources and reduced profitability for the
company.

Figure 4: ROC curves of IFMFNet under different skips and windows


4.2.4. Ablation analysis


We consider the effects of different types of features and construct corresponding sub-models. Overall,
IFMFNet performs well in the task of product deletion, and its output is contributed by F1 , F2 , F3 and
AutoLR. However, are all sub-models meaningful? To address this question, we conducted a model ablation
analysis by removing different sub-models F1 , F2 and F3 , and outputting the corresponding scores. The
meanings of different ablation models are as follows:
Ŷwithout F1 represents the output when the consumer sub-model is removed from the model.
Ŷwithout F2 represents the output when the market sub-model is removed from the model.
Ŷwithout F3 represents the output when the consumer-market sub-model is removed from the model.
Ŷ represents the output of the original IFMFNet model.
This analysis allows us to observe the impact of the interaction and fusion of different types of features
on product deletion prediction. The results are presented in Figure 5.

Figure 5: Comparison of sub model performance

According to Figure 5, it can be observed that compared to Ŷ , the models Ŷwithout F1 ,
Ŷwithout F2 , and Ŷwithout F3 all show a noticeable decline in performance. The precision, recall, and f1-scores of Ŷ are 0.3270,
0.3910, and 0.3562, respectively. For sub-model Ŷwithout F1 , the corresponding scores are 0.2880, 0.3007, and
0.2942. For sub-model Ŷwithout F2 , the scores are 0.3137, 0.3008, and 0.3071. For sub-model Ŷwithout F3 , the
scores are 0.2644, 0.3797, and 0.3117.
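The size of each decline can be tabulated directly from the scores quoted above. The sketch below computes, for each ablated model, the drop in precision, recall, and f1-score relative to the full IFMFNet; the dictionary layout and function name are hypothetical, and the numbers are those reported in this section.

```python
# (precision, recall, f1) as reported for the full model and each ablation.
SCORES = {
    "full": (0.3270, 0.3910, 0.3562),
    "without F1": (0.2880, 0.3007, 0.2942),
    "without F2": (0.3137, 0.3008, 0.3071),
    "without F3": (0.2644, 0.3797, 0.3117),
}
METRICS = ("precision", "recall", "f1")


def metric_drops(scores):
    """Drop of each metric relative to the full model, per ablated sub-model."""
    full = scores["full"]
    return {
        name: {m: full[i] - values[i] for i, m in enumerate(METRICS)}
        for name, values in scores.items()
        if name != "full"
    }


drops = metric_drops(SCORES)
largest_drop = {m: max(drops, key=lambda name: drops[name][m]) for m in METRICS}
```

Ranking the ablations by drop shows that the ordering depends on the metric: removing F3 costs the most precision, whereas removing F1 costs the most recall and f1-score.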
On the other hand, removing the sub-model F3 results in the lowest precision (0.2644), indicating that
F3 contributes the most to precision. In terms of the f1-score, the largest loss comes from removing F1 (0.2942),
followed by F2 (0.3071) and F3 (0.3117). Since F3 considers
more comprehensive information by incorporating the interaction between consumer and market features,
its contribution to precision is the greatest.
For predicting product deletion, consumer satisfaction is more important than the sales performance
of the product. This is because the loss incurred by removing the consumer sub-model is higher than
removing the market sub-model. In the process of selling products, businesses should prioritize considering
consumer sentiments, actively gather real-time feedback on consumer needs, and take appropriate measures
to address specific issues. By improving consumer satisfaction and avoiding a sole focus on profit, businesses
can maintain a healthy operational status for their products and reduce the likelihood of product deletion
decisions leading to market exit. Additionally, even though the consumer sub-model holds greater importance,
companies should still pay attention to the sales performance of their products. This is fundamental to the
overall development and sustainability of the business.

5. Conclusions

In this paper, we address a novel research problem: data-driven product deletion decision-making. Con-
sidering the characteristics of product deletion decisions, we propose a time-series classification prediction
framework that incorporates feature interaction fusion to tackle the sparsity issue of positive samples in
product deletion decisions. In addition to using a CNN-GRU network structure to capture feature de-
pendencies and long-term temporal information, we introduce a recurrent skip-connection mechanism to
enhance the model’s ability to capture long-term dependencies. Furthermore, we design a logistic regression
component that runs in parallel with the network to learn the linear scaling changes between variables.
Based on the types of features in product deletion decisions, the framework is divided into three sub-models.
The prediction results from these sub-models, along with the logistic regression component, are used to de-
termine whether a product should undergo market exit. The effectiveness and robustness of the model are
analyzed using sales data from an e-commerce platform in Brazil. Experimental results demonstrate that
our proposed prediction framework outperforms classical machine learning algorithms in product deletion
decision-making.
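The interplay between the recurrent layer, the recurrent-skip connection, and the parallel linear component can be illustrated with a minimal sketch. It is not the implementation used in this paper: the convolutional layer and the three sub-models are omitted, features are scalar, only the final hidden state of each recurrent pass is used, and all names and weight values (`gru_step`, `run_gru`, `linear_component`, `predict_deletion`, the dictionary `w`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, w):
    """One scalar GRU update; `w` holds hypothetical weights."""
    z = sigmoid(w["wz"] * x + w["uz"] * h_prev)          # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h_prev)          # reset gate
    c = math.tanh(w["wc"] * x + w["uc"] * (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * c


def run_gru(xs, w, skip=1):
    """Return the final hidden state of a GRU run over `xs`.

    The recurrent input at step t is the state from `skip` steps earlier:
    skip=1 is an ordinary GRU, skip=p is a recurrent-skip GRU that links
    observations one period apart. States before the series start are 0.
    """
    hs = []
    for t, x in enumerate(xs):
        h_prev = hs[t - skip] if t >= skip else 0.0
        hs.append(gru_step(x, h_prev, w))
    return hs[-1]


def linear_component(xs, coeffs, bias=0.0):
    """Parallel linear term over the last len(coeffs) observations."""
    window = xs[-len(coeffs):]
    return bias + sum(a * x for a, x in zip(coeffs, window))


def predict_deletion(xs, w, skip, coeffs):
    """Combine both recurrent passes and the linear term into a probability."""
    h_recent = run_gru(xs, w, skip=1)    # short-range dependencies
    h_skip = run_gru(xs, w, skip=skip)   # dependencies `skip` steps apart
    logit = (w["v_recent"] * h_recent
             + w["v_skip"] * h_skip
             + linear_component(xs, coeffs))
    return sigmoid(logit)


# Hypothetical weights and a short made-up series, for illustration only.
w = {"wz": 0.5, "uz": 0.1, "wr": 0.3, "ur": 0.2, "wc": 0.8, "uc": 0.4,
     "v_recent": 1.0, "v_skip": 1.0}
series = [0.2, -0.1, 0.4, 0.3, -0.5, 0.1]
print(predict_deletion(series, w, skip=3, coeffs=[0.1, 0.2]))
```

Adding the linear term directly to the logit keeps the output sensitive to the scale of the most recent inputs, which is the role the logistic regression component plays alongside the nonlinear network.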
This paper provides a fresh perspective on product deletion research by extracting the temporal infor-
mation from multi-dimensional time series data to assist in product deletion decision-making. As the pace
of product updates accelerates, the value of data-driven product deletion decisions continues to increase.
Consider an online retail company with diverse product categories. Product deletion decisions are crucial for
optimizing resource allocation within the company. If traditional subjective evaluation methods are used,
it would require analyzing and evaluating different products at different times, which not only consumes
significant manpower but also takes up a considerable amount of time. By employing data-driven prod-
uct deletion decision-making methods, it becomes possible to monitor the status of products in real-time,
identify products in critical states, and then consider whether they should be removed individually. This
approach assists companies in formulating effective product strategies and allows for timely actions.
Although the effectiveness and robustness of the proposed prediction framework have been validated in
this study, its predictive accuracy remains relatively low in absolute terms, albeit significantly
better than that of some classical machine learning algorithms. Therefore, it can currently serve only as an auxiliary
tool for product deletion decisions. Furthermore, in this paper, the typical phenomenon of sparse positive
samples related to product deletion, distributed throughout the entire time range, has not been extensively
explored. Although the proposed framework is able to recognize this information and make reasonable
predictions, further research can consider more complex representations to extract information from samples
distributed at the "tail" of the time series. Specific solutions can be proposed to address this issue and
optimize the overall model to improve its predictive accuracy. Once the accuracy reaches a level acceptable for
practical use, the algorithm can be applied directly in real product deletion decisions.

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China (No. 72071030),
the National Key R&D Program of China (No. 2020YFB1711900) and the Planning Fund for the Humanities
and Social Sciences of Ministry of Education of China (No. 19YJA630042).

