
Contents

Edge computing
OLAP Cube
RollUp Table
Materialized view
CRDT
Interval Tree Clocks
Chord Protocol (Consistent hashing)
Bloom filter

Edge computing
Edge computing is a distributed computing paradigm that brings computation and data
storage closer to the location where it is needed, to improve response times and save
bandwidth. The origins of edge computing lie in content delivery networks created in
the late 1990s to serve web and video content from edge servers that were deployed close to
users.[2] In the early 2000s, these networks evolved to host applications and application
components at the edge servers,[3] resulting in the first commercial edge computing services[4] that
hosted applications such as dealer locators, shopping carts, real-time data aggregators, and ad
insertion engines.
OLAP Cube
OLAP databases are divided into one or more cubes, designed so that creating and
viewing reports is easy.
The OLAP cube is a data structure optimized for very quick data analysis. The term cube
here refers to a multi-dimensional dataset, also called a hypercube when the number of
dimensions is greater than three.
The OLAP cube consists of numeric facts, called measures, which are categorized by
dimensions.

The cube can store and analyze multidimensional data in a logical and orderly
manner.

How does it work?

A data warehouse extracts information from multiple data sources and formats,
such as text files, Excel sheets, and multimedia files.

The extracted data is cleaned and transformed, then loaded into an OLAP server
(or OLAP cube), where information is pre-calculated for further analysis.
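As a rough sketch of that pre-calculation step (the fact rows, dimensions, and figures
here are hypothetical), a cube can be pictured as a lookup table holding the aggregated
measure for every combination of dimension values:

from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, quarter) dimensions plus a sales measure.
facts = [
    ("EU", "shirts", "Q1", 120),
    ("EU", "pants",  "Q1", 80),
    ("US", "shirts", "Q2", 200),
    ("US", "shirts", "Q1", 150),
]
dimensions = ("region", "product", "quarter")

# Pre-calculate the summed measure for every subset of dimensions, so any
# slice or dice of the cube becomes a single dictionary lookup.
cube = defaultdict(int)
for *dims, sales in facts:
    row = dict(zip(dimensions, dims))
    for r in range(len(dimensions) + 1):
        for subset in combinations(dimensions, r):
            key = tuple(sorted((d, row[d]) for d in subset))
            cube[key] += sales

print(cube[(("product", "shirts"),)])                                      # 470
print(cube[(("product", "shirts"), ("quarter", "Q1"), ("region", "EU"))])  # 120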

RollUp Table
Rollup tables are tables where totals for (combinations of) conditions are saved: a
"summary table" that holds pre-aggregated values for all the conditions you may need
to fetch totals for.
If your data table has millions of entries, you can imagine it makes more sense to direct
your queries at a very small rollup table than at the huge primary dataset.
An example rollup table could look like this:

category     color  total
shirts       red    23578347
shirts       green  14364323
shirts       blue   46723343
accessories  red    3452465
accessories  green  867665
accessories  blue   7609852
pants        red    56878766
pants        green  87067876
pants        blue   759457363
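A minimal sketch of how such a table might be maintained in application code (the
function names are illustrative; real systems usually update rollups inside the database
itself):

from collections import defaultdict

# Rollup table: (category, color) -> pre-aggregated total.
rollup = defaultdict(int)

def record_sale(category, color, amount):
    # Update the pre-aggregated total as new rows arrive, instead of
    # re-scanning the huge primary dataset at query time.
    rollup[(category, color)] += amount

def total_for(category, color):
    # Fetching a total is a single O(1) lookup in the rollup table.
    return rollup[(category, color)]

record_sale("shirts", "red", 5)
record_sale("shirts", "red", 3)
print(total_for("shirts", "red"))  # 8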

Materialized view
In computing, a materialized view is a database object that contains the results of
a query. For example, it may be a local copy of data located remotely, or may be a
subset of the rows and/or columns of a table or join result, or may be a summary
using an aggregate function.
The process of setting up a materialized view is sometimes called materialization.[1]
This is a form of caching the results of a query, similar to memoization of the value
of a function in functional languages, and it is sometimes described as a form
of precomputation.[2][3] As with other forms of precomputation, database users
typically use materialized views for performance reasons, i.e. as a form of
optimization.
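To make the caching analogy concrete, here is a minimal sketch in Python (the base
table and query are hypothetical; a real database refreshes views with SQL, not
application code): the view stores the result of a query and serves reads from that
stored result until it is refreshed.

class MaterializedView:
    """Caches the result of a query over a base table until refresh() is called."""

    def __init__(self, base_table, query):
        self.base_table = base_table
        self.query = query
        self.refresh()

    def refresh(self):
        # Re-run the underlying query and store ("materialize") its result.
        self._result = self.query(self.base_table)

    def read(self):
        # Reads are served from the precomputed result, not the base table.
        return self._result

orders = [("shirts", 10), ("pants", 7), ("shirts", 5)]
view = MaterializedView(orders, lambda rows: {
    cat: sum(q for c, q in rows if c == cat) for cat, _ in rows
})
print(view.read())   # {'shirts': 15, 'pants': 7}
orders.append(("shirts", 1))
print(view.read())   # unchanged (stale) until the view is refreshed
view.refresh()
print(view.read())   # {'shirts': 16, 'pants': 7}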

CRDT
A conflict-free replicated data type (CRDT) is an abstract data type, with a well-
defined interface, designed to be replicated at multiple processes and exhibiting the
following properties: (1) any replica can be modified without coordinating with
other replicas; (2) when any two replicas have received the same set of updates,
they reach the same state, deterministically, by adopting mathematically sound rules
to guarantee state convergence.
Development was initially motivated by collaborative text editing and mobile
computing. CRDTs have also been used in online chat systems, online gambling,
and in the SoundCloud audio distribution platform.

Concurrent updates to multiple replicas of the same data, without coordination between the
computers hosting the replicas, can result in inconsistencies between the replicas, which in the
general case may not be resolvable. Restoring consistency and data integrity when there are
conflicts between updates may require some or all of the updates to be entirely or partially
dropped.
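As a minimal sketch of properties (1) and (2), here is a grow-only counter
(G-Counter), one of the simplest CRDTs (the replica IDs are hypothetical):

class GCounter:
    # Grow-only counter CRDT: each replica increments only its own slot.

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> increments observed from that replica

    def increment(self, n=1):
        # Property (1): a replica updates its own entry without coordination.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        # Property (2): element-wise max is commutative, associative, and
        # idempotent, so replicas that see the same updates converge.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("A"), GCounter("B")
a.increment(); a.increment()   # two updates at replica A
b.increment()                  # one concurrent update at replica B
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3  # both replicas converge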

Interval Tree Clocks


Classic causality-tracking mechanisms, such as version vectors and vector clocks, were
designed under the assumption of a fixed, well-known set of participants. These
mechanisms are less than ideal when applied to dynamic scenarios with variable
numbers of participants and churn. For example, in the Amazon Dynamo system, old
entries in version vectors are pruned to conserve space, and errors can be introduced.

The issue with such systems is that, even though there may not be many active devices
at any given point in time, the number of unique node identifiers in version vectors
keeps increasing. We call that issue actor explosion.
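A minimal sketch of actor explosion with a plain version vector (the device IDs are
hypothetical):

def bump(version_vector, node_id):
    # Record one update from node_id; every node that ever writes gains a
    # permanent entry, since entries cannot be safely pruned without risk.
    vv = dict(version_vector)
    vv[node_id] = vv.get(node_id, 0) + 1
    return vv

vv = {}
for device in ("phone-1", "laptop-1", "phone-2", "tablet-1"):
    vv = bump(vv, device)  # devices come and go over time...
print(len(vv))  # ...but the vector keeps one entry per identifier ever seen: 4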

Interval Tree Clocks (ITC) are a clock mechanism that can be used in scenarios with a
dynamic number of participants, allowing completely decentralized creation of
processes/replicas without the need for global identifiers. The mechanism has a
variable-size representation that adapts automatically to the number of existing
entities, growing or shrinking appropriately.

Chord Protocol (Consistent hashing)


Chord is based on consistent hashing, which assigns hash keys to nodes in a way that
doesn't need to change much as nodes join and leave the system.
The Chord protocol supports just one operation: given a key, it determines the node
responsible for storing the key's value. Chord does not itself store keys and values, but
provides primitives that allow higher-layer software to build a wide variety of storage
systems; the Cooperative File System (CFS) is one such use of the Chord primitive.

Consistent hashing assigns keys to slots such that, when adding or removing servers,
the number of keys that need to be relocated is minimized: when a hash table is resized,
only n/m keys need to be remapped on average, where n is the number of keys and m is
the number of slots. In contrast, in most traditional hash tables, a change in the number
of array slots causes nearly all keys to be remapped, because the mapping between the
keys and the slots is defined by a modular operation.
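A quick illustration of that contrast (the keys are hypothetical; roughly 80% of them
move when a modular table grows from 4 to 5 slots, versus about n/m with consistent
hashing):

import hashlib

keys = [f"key-{i}" for i in range(1000)]

def stable_hash(key):
    # A deterministic hash, so the demo gives the same result on every run.
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")

def placement(num_slots):
    # Traditional modular placement: slot = hash(key) mod number_of_slots.
    return {k: stable_hash(k) % num_slots for k in keys}

before, after = placement(4), placement(5)
moved = sum(before[k] != after[k] for k in keys)
print(moved)  # ~800 of 1000 keys land on a different slot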

Each machine acting as a Chord server has a unique 160-bit Chord node identifier,
produced by a SHA-1 hash of the node's IP address. Chord views the IDs as occupying a
circular identifier space. Hash keys are also mapped into this identifier space, by
hashing them to 160-bit key identifiers. Chord defines the node responsible for a key to
be that key's "successor". The successor of a key or node identifier j is j if j is a node
identifier; otherwise it is the first node encountered in the identifier space clockwise
from the position of j. Chord's primary task is to find these successors.
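A minimal sketch of the successor rule on a sorted ring (the node addresses are
hypothetical, and the lookup here is a local binary search rather than Chord's
distributed finger-table routing):

import hashlib
from bisect import bisect_left

def chord_id(name):
    # SHA-1 maps names into the 160-bit circular identifier space.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical Chord servers
ring = sorted(chord_id(n) for n in nodes)

def successor(key):
    # First node ID at or clockwise after the key's position on the ring,
    # wrapping around past the largest identifier back to the smallest.
    i = bisect_left(ring, chord_id(key))
    return ring[i % len(ring)]

print(successor("movie.mp4") in ring)  # True: every key maps to some node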

Bloom filter
A Bloom filter is a data structure designed to tell you, rapidly and memory-
efficiently, whether an element is present in a set.

The price paid for this efficiency is that a Bloom filter is a probabilistic data
structure: it tells us that the element either definitely is not in the set or may be in
the set.

Bit vector

The underlying structure is a bit vector: each cell represents a bit, and its position is its
index. To add an element to the Bloom filter, we simply hash it a few times and set the
bits in the bit vector at the indices of those hashes to 1.
To test for membership, you simply hash the string with the same hash functions,
then see if those values are set in the bit vector. If they aren't, you know that the
element isn't in the set. If they are, you only know that it might be, because another
element or some combination of other elements could have set the same bits.
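A minimal sketch of both operations (the bit-vector size and hash count are arbitrary
here; real filters size them from the expected number of elements and the target
false-positive rate):

import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = 0  # an integer used as a bit vector of `size` bits

    def _indexes(self, item):
        # Derive several hash values by salting one hash function with i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.bits |= 1 << idx

    def might_contain(self, item):
        # False means "definitely not in the set"; True means "maybe".
        return all(self.bits & (1 << idx) for idx in self._indexes(item))

bf = BloomFilter()
bf.add("hello")
print(bf.might_contain("hello"))    # True (maybe in the set)
print(bf.might_contain("goodbye"))  # almost certainly False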
