
2021 IEEE 6th International Conference on Intelligent Computing and Signal Processing (ICSP 2021)

The Design of Cross-border E-commerce Recommendation System Based on Big Data Technology

2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP) | 978-1-6654-0413-6/20/$31.00 ©2021 IEEE | DOI: 10.1109/ICSP51882.2021.9409014
Jin Chen
Yango University
FuZhou, China

Chunqiong WU*
Yango University
Engineering Research Center of Business Intelligent in Big Data for Fujian Province
FuZhou, China
Corresponding author: cqwu@ygu.edu.cn

Abstract—With the outbreak of the new technological revolution, cross-border e-commerce came into being, opening a new way for enterprises to export goods. It is a trade method that combines commodity trade with Internet information technology to form a convenient and open trade system and to achieve trade interconnection among global economies. Many Chinese enterprises participate in cross-border e-commerce exports, but their export marketing strategies have drawbacks: marketing targeting and precision are low, which hinders Chinese cross-border e-commerce enterprises from carrying out international marketing. In the context of big data, cross-border e-merchants can use collected data to establish a customer database, build a customer portrait model through machine learning technology, and use personalized recommendation systems to realize accurate marketing with two-way customer interaction. To address the problems of cross-border e-commerce marketing in the context of big data, this paper studies the user behavior data generated by cross-border e-commerce. Based on a large amount of low-value-density behavioral data from cross-border e-commerce consumers, this paper designs a cross-border e-commerce precision marketing system suited to processing user behavioral data, and provides a reference for those engaged in cross-border e-commerce.

Keywords-big data; cross-border e-commerce; recommendation algorithm; recommendation system; machine learning

I. INTRODUCTION

The emergence of cross-border e-commerce has shifted the path of international trade. A new generation of information technology, represented by big data, cloud computing, mobile Internet, and the Internet of Things, has begun to emerge in overlapping waves, and through the continuous efforts of all parties involved in the cross-border e-commerce industry, China has created the world's most abundant cross-border trade platform [1]. The expansion of the export scale of cross-border e-commerce enterprises has enabled China to transform and upgrade the way it conducts trade activities. Cross-border e-commerce has had a huge impact on traditional marketing: the commodity marketing model has begun to change from the crude and wasteful model of the past to an increasingly precise and intensive model with a high return on investment, one that measures marketing effectiveness and invests in precise market segmentation, and precision marketing has begun to develop rapidly. Big data means using all of the data for analysis and processing, so the entire analysis and processing process faces a huge, high-speed, and diversified volume of real data [2].

The recommendation system helps users find, among the large amount of information on the network, information resources that they may like now and in the future. It does so by analyzing the connection patterns between users' evaluations and their historical preferences for different items in big data sources, and then provides users with corresponding recommendation services [3]. Improvements to recommendation algorithms build on big data processing technology, and with the development of parallel computing, massive data processing technology has become better adapted to the needs of recommendation systems. Among these technologies, Hadoop is an open source framework capable of distributed processing of massive data, and MapReduce is the core computing framework of Hadoop. Spark is a distributed, memory-based batch processing engine whose biggest feature is low latency. Spark is more suitable for data processing, machine learning, and interactive analysis, and it is more widely used in product precision marketing recommendation systems [4].

II. BIG DATA PROCESSING TECHNOLOGY

The development of big data has become the technical support for business innovation and cross-border e-commerce. In the context of big data, the boundary between retail consumers and Internet users is disappearing, the boundary between traditional trade enterprises and cross-border e-commerce enterprises is blurring, and data has become the linkage axis, profoundly driving all participants in the whole trade field. Because cross-border e-commerce has Internet attributes, big data directly brings both opportunities and challenges for cross-border e-commerce [5]. Big data technology can professionally analyze and process massive information, and it can extract useful information from redundant big data through data processing to accurately segment the global cross-border e-commerce market, realize the analysis of global cross-border e-commerce market demand, realize the tracking of logistics and distribution, realize the design optimization of

product categories, and realize the standardized analysis of product packaging.

Data mining searches for and extracts hidden knowledge and information with potential business value from the huge, incomplete, chaotic, noisy, and random application data of actual production and business processes [6]. From the viewpoint of the data itself, data mining usually requires data cleaning, data transformation, the data mining implementation process, pattern evaluation, and knowledge representation. The basic process of big data mining is shown in Figure 1.

Figure 1. Basic process of big data mining

The Hadoop open source framework provides the foundation platform for big data mining. In a broad sense, the Hadoop ecosystem refers to the open source components or products related to big data technology, including MapReduce, HDFS, the distributed application orchestration service Zookeeper, the structured distributed data warehouse Hive, the unstructured distributed database HBase, the general-purpose computing engine Spark, the log collection tool Flume, the distributed message queue Kafka, etc. The Hadoop ecosystem architecture is shown in Figure 2.

Figure 2. Hadoop ecosystem architecture (components recoverable from the figure: HDFS distributed file storage system; YARN resource management scheduler; MapReduce distributed computing framework; Spark distributed in-memory computing framework; Hive data warehouse; Pig; HBase columnar storage database; Mahout machine learning algorithm library; Spark MLlib machine learning algorithm library; Spark Streaming stream processing tools; Storm streaming real-time computing framework; Shark big data analytics query system; Sqoop inter-database ETL tool; Kafka publish-subscribe messaging system; Flume log collection; Zookeeper distributed orchestration service system)

III. PRECISE MARKETING SYSTEM RECOMMENDATION ALGORITHM

A. Recommendation algorithm based on hotness ranking

This recommendation algorithm, also known as popularity-based recommendation, makes recommendations based on the popularity of a product. Usually, it selects the products with the highest popularity ranking on the e-commerce platform as the recommendation results shown to customers. The ranking is based on product sales popularity, which mainly includes parameters related to the number of clicks, views, purchases, or high ratings. The recommendation algorithm based on hotness ranking is easy to implement, performs well, and helps to solve the user cold start problem. However, it mainly focuses on solving business problems, and its generalization ability is poor. To optimize it, multi-dimensional evaluation indicators are usually introduced, decay weights are introduced to account for the time factor, and the multi-armed bandit (MAB) algorithm is used to prevent the Matthew effect.

B. Content-based recommendation algorithms

The content-based recommendation algorithm takes the metadata of items as input, finds the intrinsic connections between items, and then recommends products to users based on the items they liked in the past [7]. Because the recommendation is derived from item data, the user can intuitively understand the reason for it; the method therefore has strong interpretability and can avoid the cold start problem. The schematic diagram of the content-based recommendation algorithm is shown in Figure 3.

C. Recommendation algorithm based on purchase probability prediction

The core of this recommendation algorithm is to predict the probability that the user will purchase the item. The prediction uses machine learning binary classification models that output the classification result together with a probability, which is taken as the predicted purchase probability. The main machine learning methods include random forest, logistic regression, factorization machines, etc. Random forests are ensemble classifiers built from multiple weak classifiers (decision trees), whose output class is jointly determined by combining the results of all weak classifiers, for example by plurality voting or weighted voting [8]. A random forest builds each decision tree by row sampling and complete splitting, so that each decision tree becomes an expert proficient in a narrow domain, and the final result is obtained by a vote of all the decision trees in the forest, which gives the algorithm strong generalization ability.

Logistic regression is a machine learning classification model that derives its idea from linear regression and is essentially a log-linear model. It is simple to implement, easy to parallelize, easy to scale massively, fast to iterate, and easy to interpret through its features, and the predictive output of this probabilistic model lies between 0 and 1. Logistic regression requires a lot of feature engineering work to obtain better predictive power; its effectiveness depends on engineers' domain knowledge and experience, and it lacks general guidance methods [9].
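The purchase-probability idea described in this subsection can be illustrated with a minimal sketch. It is not an implementation from this paper, which presents a design only. It assumes a logistic regression trained by batch gradient descent in plain Python; the three features (normalized clicks, normalized views, a prior-purchase flag), the toy data, the learning rate, the epoch count, and all function names are hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1), in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the linear score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def purchase_probability(w, b, x):
    """Predicted probability that the user purchases the item described by features x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def rank_by_purchase_probability(w, b, candidates):
    """Sort candidate items by predicted purchase probability, highest first."""
    scored = [(item, purchase_probability(w, b, x)) for item, x in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: [normalized clicks, normalized views, prior-purchase flag]
X_train = [
    [0.9, 0.8, 1.0],
    [0.8, 0.9, 1.0],
    [0.7, 0.6, 0.0],
    [0.2, 0.3, 0.0],
    [0.1, 0.2, 0.0],
    [0.3, 0.1, 0.0],
]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = purchased, 0 = not purchased

weights, bias = train_logistic_regression(X_train, y_train)
candidates = {"item_a": [0.85, 0.9, 1.0], "item_b": [0.15, 0.2, 0.0]}
ranking = rank_by_purchase_probability(weights, bias, candidates)
```

Sorting candidates by the predicted probability in descending order is one simple way to turn the classifier output into a recommendation list. In a production setting the same idea would more likely be run on a distributed engine such as Spark rather than in plain Python.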

Figure 3. Schematic diagram of content-based recommendation algorithm

IV. THE DESIGN OF CROSS-BORDER E-COMMERCE RECOMMENDATION SYSTEM BASED ON BIG DATA TECHNOLOGY

A. User behavior-based framework structure

The system architecture is an abstract model of a user behavior-based recommendation system that depicts the hierarchical structure and relationships of the components in the recommendation system. The filtering candidate layer is responsible for filtering out a subset of recommended products based on certain rules, and this layer is at the bottom of the architecture [10]. The filtering candidate layer may contain multiple filtering substrategies, which run in parallel with each other. The second layer of the architecture is the ranking candidate layer, where the ranking is based on rules, policies, or machine learning models. Machine learning algorithms such as logistic regression, random forest, factorization machines, and deep neural networks are usually used as the specific models [11]. After some processing, the model output must be transformed into a ranked list of products that is passed to the next layer, either by sorting directly on the predicted probability values in descending order or by introducing additional models that learn the ranking, and this list provides the personalized output. The architecture of the product accurate marketing recommendation system based on user behavior is shown in Figure 4.

Figure 4. Architecture of accurate marketing recommendation system based on user behavior

B. Personalized recommendation-based precision marketing system design for cross-border e-commerce

In data processing, the Hadoop-based big data processing method first stores intermediate calculation results on disk, so each data access operation involves access to external storage media, which reduces recommendation efficiency [12]. Spark is a memory-based computing framework, and the continuity of its computation process also increases the computational efficiency of the recommendation algorithm.

The functions to be realized within the cross-border e-commerce precision marketing system include: completing the data storage function; processing massive rating and tagging data through the big data platform and completing the offline recommendation display; starting the algorithm module after the system receives data, so as to update the recommendation results in real time; and providing non-personalized product recommendations when a new user logs in for the first time. According to the specific requirements of a cross-border e-commerce recommendation system based on personalized recommendation, the system architecture can be divided into the user visualization layer, business logic layer, recommendation algorithm layer, data service layer, and other layers. Among them, the user visualization layer interacts with the user directly. The business logic layer is responsible for connecting the user layer with the data layer and handling user operations. In the recommendation algorithm layer, the recommendation algorithm is combined with the Spark recommendation engine and uses business data to provide recommendation services for users. The data service layer mainly handles the real-time operation of the business database. The architecture of the cross-border e-commerce precision marketing system based on personalized recommendation is shown in Figure 5.

Figure 5. Architecture of cross-border e-commerce precision marketing system based on personalized recommendation

C. Cross-border e-commerce real-time recommendation design

A real-time recommendation module is added to the product recommendation system design described above. The offline module performs preference calculation based on all rating records of users, while the real-time calculation reflects the recent preferences of users. When a user gives a high rating to a product, it indicates that the user may like other products similar to that product in the recent period; conversely, when a user gives a poor rating in feedback, it indicates that the user will not buy other products similar to that product in the recent period. Therefore, in the design of real-time product recommendation, in order to make the recommendation result meet the user's recent preferences, the system needs to update the products recommended to the user in real time based on the user's rating of the product. Real-time recommendation of commodities places more emphasis on responsiveness, i.e., the system's reaction when one or more new data items arrive, and the accuracy requirement for recommended commodities can be relaxed appropriately.

The design workflow of real-time product recommendation is as follows. The system writes each received rating to the log and uses the log collection tool to collect and filter it; the data is then passed into the real-time recommendation algorithm engine, which combines it with the cached data and the product similarity matrix from the main database for calculation and returns the result to the database. Usually, we consider that the user's habits in the recent period are similar. When the user rates a product, we obtain that rating data. We first extract the products similar to that product based on the product similarity matrix, extract the K most recent ratings, and calculate the recommendation priority score for each candidate product. The result is then merged with the previous real-time recommendation results and written back to the real-time recommendation list.

V. CONCLUSION

This paper explains the concept of cross-border e-commerce precision marketing and related algorithms, and analyzes the value and importance, for companies implementing cross-border e-commerce precision marketing, of using information such as consumers' personal information and transaction data of purchased goods. On the basis of these data, data mining techniques are used to mine rules with business value. In this paper, after studying the Hadoop ecosystem, distributed file systems, distributed computing engines, and common recommendation algorithms, we study the design of a cross-border e-commerce recommendation system based on big data technology. We give the general architecture design of a cross-border e-commerce accurate marketing recommendation system from different perspectives, hoping to obtain a better recommendation effect. Due to time constraints, the design of the cross-border e-commerce marketing system is not implemented in this paper, and the detailed design and verification analysis will be continued in subsequent research.

ACKNOWLEDGEMENT

This paper was supported by the project of “Engineering Research Center of Business Intelligent in Big Data for Fujian Province”.

REFERENCES

[1] B. H. Wang. Research on the influencing factors of cross-border e-commerce development in the era of big data. Journal of Jiamusi Vocational College, 2014(12):188-189.
[2] Y. Xiong. Influencing factors and research on the development of cross-border e-commerce in the era of big data. Modern Business, 2016(27):32-33.
[3] L. L. Yang. Analysis of the current situation and countermeasures of cross-border e-commerce export of Chinese tea products based on big data and service of "one belt and one road". Statistics and Management, 2016(10):61-65.
[4] W. H. Li. Analysis of precision marketing of traditional retail enterprises in the context of big data. Business Economic Research, 2019(15):71-74.
[5] Y. L. Yu, I. D. Yang. Bilateral matching recommendation model of cross-border e-commerce supply and demand with perceived matchmaker psychological behavior on cross-border e-commerce platform. Journal of Changchun University of Technology (Social Science Edition), 2019,32(06):101-107.
[6] J. H. Li. Personalized recommendation algorithm for information of artificial intelligence cross-border e-commerce shopping guide platform based on big data. Science Technology and Engineering, 2019,19(14):280-285.
[7] C. C. Mo. Research on the application of big data technology in the field of cross-border e-commerce. Journal of Hubei Open Vocational College, 2020,33(01):120-121+126.
[8] S. Yang, Q. C. Liu. Optimization of personalized recommendation strategy for cross-border e-commerce platform based on big data. Foreign Economic and Trade Practice, 2020(11):33-36.
[9] Q. M. Yu, S. D. Zhu, L. Wu, et al. An empirical study on the impact of customer privacy concerns on the effectiveness of enterprise precision marketing in the era of big data. Journal of Chongqing University of Commerce and Industry (Natural Science Edition), 2020,37(04):95-103.
[10] S. F. Diao, L. R. Feng. Application of "big data+" in the digital transformation of cross-border e-commerce. China Statistics, 2020(05):71-73.
[11] P. P. Xu. Analysis of repeat purchase behavior and influencing factors of e-commerce self-owned brands. Business Economics Research, 2020(19):91-94.
[12] J. X. Chen, S. K. Zhao. Discussion on the conversion path of marketing system of China's trade circulation enterprises--a perspective of big data development. Business Economics Research, 2021(04):76-79.
