2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA) | 978-1-6654-3524-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICECA52323.2021.9675845

Abstract—The widespread application of cloud computing technology has caused explosive data growth and poses new challenges to traditional data management technology. Existing cloud storage systems generally use distributed hash tables to access data. This key-value-based model achieves high access efficiency for single-dimensional queries, but it does not support multi-dimensional queries. Therefore, in recent years, auxiliary indexing for cloud storage has become a hot topic in academic research, and related results have been published in top international conferences and journals in the database field. This paper studies distributed multi-dimensional data index strategies in the cloud computing environment. The work is carried out from two aspects: multi-dimensional data indexing in cloud storage and distributed computing.

Keywords—Distributed Computing, Multidimensional Data Index, Cloud Computing, Big Data

I. INTRODUCTION

Multidimensional data indexing has always been an important research problem in the field of data management. With the arrival of the big data era, traditional relational data management systems are gradually unable to meet the needs of practical applications in terms of efficiency and scalability. Large-scale distributed cloud storage systems have become a new carrier of big data. How to improve the performance of multi-dimensional data queries in the cloud computing environment is one of the core issues in the cloud computing field [1-5].

With the rapid development of Internet technology, the data generated in the fields of science, engineering, and business computing has shown an explosive growth trend. An IDC statistical report shows that the total amount of global data in 2009 was about 0.8ZB, and by 2010 it had reached 1.2ZB. In just one year, the amount of data grew by about 50%, and the rate of data growth is still accelerating. It is estimated that by 2020 the total will reach 35ZB, which is 44 times the amount of data in 2009. The rapid growth of data poses severe challenges to the storage and computing capabilities of existing IT architectures in all walks of life. Business departments that try to improve system scalability by continuously increasing hardware investment have been overwhelmed. Since the concept of cloud computing was put forward, it has received extensive attention from industry and academia [6-11].

Cloud computing combines the advantages of distributed processing, parallel processing and grid computing, and is developed on this basis. It can be said that it is the commercial realization of these computer science concepts, and that it is a convenient usage and service model. Major Internet companies and scientific research institutions have invested huge human and financial resources to develop related technologies and research practical applications, such as Google's MapReduce technology, IBM's Blue Cloud project, and the Azure platform provided by Microsoft. At present, major Internet service platforms at home and abroad widely use cloud computing and big data technology and have achieved good results. Most of the text and picture data of the domestic Weibo social platform and of shopping platforms are stored on cloud platforms. The hotspot information of an Internet platform in a fixed time period can be summarized from user visits and the click-through rate of events during that period. A shopping platform can identify the types of products a user is likely to buy in the near term from the user's browsing information, and its recommendation system can then recommend corresponding products to users accurately. According to the results of big data analysis, government departments can keep abreast of social trends and provide correct guidance. It can be seen that the results of big data analysis are very important for the decision-making of enterprises and institutions, and this series of technological advances poses severe challenges to the index management of cloud data [12-16].

The rapid progress of Internet technology and the increasing use of GIS technology in people's lives have produced a large amount of spatial data, and the requirements for managing this spatial data efficiently keep rising over time. In the cloud computing environment, if reasonable storage and efficient indexing of spatial data can be realized, spatial data will become more convenient for users, its applications will fit reality better and cover a wider range, and it will deliver greater value, guide people's daily behavior, and contribute to the development of society [17-20].

As the most basic infrastructure in cloud computing, data storage systems play a very important role. Through cluster technology, distributed computing, virtualization and other technologies, a large number of cheap storage media of different types are managed by the cloud storage system to form a storage resource pool that provides services to users. In the cloud storage model, data storage and management have become both more centralized and more decentralized. Centralization means that data is stored in the cloud in a unified manner for users, and users can obtain data whenever they request it, without paying attention to where the data comes from or how it is managed. The cloud storage
Authorized licensed use limited to: ANNA UNIVERSITY. Downloaded on December 24,2022 at 13:31:58 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Fifth International Conference on Electronics, Communication and Aerospace Technology (ICECA 2021)
IEEE Xplore Part Number: CFP21J88-ART; ISBN: 978-1-6654-3524-6
system provides users with a convenient and efficient user experience. Decentralization applies to cloud data centers. Unlike traditional centralized data storage, the cloud storage system uses a large-scale distributed data storage solution, and the data is stored on a large number of different data nodes. This storage architecture has obvious advantages, mainly reflected in three aspects. High scalability: the cloud storage system adopts a parallel expansion method. A newly purchased data server only needs the operating system and the cloud storage software installed. After a simple configuration, it can be added to the storage pool to achieve capacity expansion [21-24].

II. THE PROPOSED METHODOLOGY

A. Distributed Computing

Most current spatio-temporal indexes are serial indexes designed for a centralized environment, and most spatio-temporal data indexes evolved from spatial indexes, with some in turn built on those evolved spatio-temporal indexes. In theory, the high-dimensional R-tree and its variants can be used as high-dimensional spatio-temporal indexes. They use the smallest space-time bounding rectangle to cluster space-time objects into a hierarchical tree structure. This bounding rectangle may not represent the entire data range, and the rectangles of different nodes may partially overlap. The overlap problem is one of the bottlenecks of indexing methods based on data partitioning, because even a simple point query may need to check multiple query paths. Especially when such indexes are used for current relative-time data and mobile data, obvious node overlap and dead space problems seriously affect index performance.

multi-dimensional unified index; this kind of spatio-temporal index method is mostly extended to tree index. Thus, from the perspective of data structure, the tree structure is the mainstream data structure of spatio-temporal indexes.

1n1 n2vs
v 1 P 1 vs (1)
n1 n2

ai zi ( zi1 2kd 1) (2)

B. Multidimensional Data Index

Multidimensional data indexing has always been one of the key research issues in the database field. There are already some relatively mature indexing technologies in relational databases. This section focuses on the analysis of existing multidimensional indexes based on tree structures, dimensionality reduction methods based on space-filling curves, and bitmap indexes.

A tree structure is more efficient to query than a sequential file, and its maintenance cost is also lower. Therefore, tree structures have attracted much attention in index research. The most representative one is the B-tree index, which has been successfully applied in a large number of data management systems and file systems. However, B-trees only support one-dimensional key-value queries, so multiple B-trees need to be established for multi-dimensional queries, which makes the storage space occupied by the index larger and the maintenance of the index more complicated. Therefore, many tree structures that support multi-dimensional indexing have been proposed, such as the R-tree, KD-tree, quadtree, and octree. This article mainly analyzes the R-tree index and the KD-tree index.
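
To illustrate how a tree-structured multidimensional index answers a query that constrains several dimensions at once, the following minimal Python sketch builds a KD-tree and runs an orthogonal range query over it. It is an illustrative example only and not the index proposed in this paper: the names KDNode, build_kdtree, and range_query are hypothetical, the sample points are made up, and the tree is built in memory on a single node, with no distribution across cloud storage nodes.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    """One node of an in-memory KD-tree (hypothetical helper type)."""
    point: Point
    axis: int                        # dimension this node splits on
    left: Optional["KDNode"] = None  # points with coordinate <= point[axis]
    right: Optional["KDNode"] = None # points with coordinate >= point[axis]


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[KDNode]:
    """Split on the median point, cycling through the dimensions by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(
        point=ordered[mid],
        axis=axis,
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )


def range_query(node: Optional[KDNode], low: Point, high: Point) -> List[Point]:
    """Return every point p with low[i] <= p[i] <= high[i] in all dimensions."""
    if node is None:
        return []
    found: List[Point] = []
    if all(lo <= c <= hi for lo, c, hi in zip(low, node.point, high)):
        found.append(node.point)
    split = node.point[node.axis]
    # Descend only into subtrees whose half-space can intersect the query box.
    if low[node.axis] <= split:
        found.extend(range_query(node.left, low, high))
    if split <= high[node.axis]:
        found.extend(range_query(node.right, low, high))
    return found


# Made-up two-dimensional sample data.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)

# Query: 3 <= x <= 8 and 1 <= y <= 5.
hits = range_query(tree, low=(3, 1), high=(8, 5))
print(sorted(hits))
```

A single B-tree on one attribute could filter on only one of the two conditions and would have to scan the remaining candidates, whereas the KD-tree prunes subtrees on both dimensions in turn. The median split keeps the tree balanced for a static data set; handling inserts and deletes would need additional rebalancing logic that this sketch omits.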