
ARC II, LESSON I

DISTRIBUTED SYSTEMS
 Network might not be reliable
 Hardware components may fail independently
 No shared memory
 No clock synchronization
 Malicious agents:
1. A server that is misbehaving
2. An agent that interacts with blockchains
OLTP – normalized, read/write
intensive, databases, large number
of short online transactions

OLAP – denormalized, read-intensive,
ad hoc (as the need arises) queries,
data stores, historical data
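To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the `place_order` helper are hypothetical, and a single table serves both workloads only for brevity; in practice the analytic query would usually run against a separate, denormalized store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, month TEXT)"
)

# OLTP-style work: many short transactions, each touching a row or two.
def place_order(customer, amount, month):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (customer, amount, month) VALUES (?, ?, ?)",
            (customer, amount, month),
        )

place_order("alice", 20.0, "2024-01")
place_order("bob", 35.0, "2024-01")
place_order("alice", 15.0, "2024-02")

# OLAP-style work: an ad hoc, read-only query that scans history and aggregates.
monthly_revenue = conn.execute(
    "SELECT month, SUM(amount) FROM orders GROUP BY month ORDER BY month"
).fetchall()
print(monthly_revenue)
```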
DWH (DATA WAREHOUSE)
 A data warehouse is a system that holds historical data in denormalized form in a relational database.
 They use special schemas and are designed for analytic/business purposes.
 Yes, we use relational DBs, BUT we also use special (dimensional) schemas and we DO NOT normalize the data.
 We use dimensional modelling.
 Dimensional modelling means dividing data into fact tables (measurements of business processes)
and dimension tables (descriptive info about some specific aspect of the facts).
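A fact table surrounded by dimension tables can be sketched with sqlite3 as follows. All table names, column names, and rows here are made up for illustration. In this star-shaped layout each dimension joins directly to the fact table; a snowflake layout would split `category` or `country` out into further tables of their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- The fact table holds the measure (amount) plus keys into each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    month      TEXT,
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Espresso', 'Coffee'), (2, 'Green tea', 'Tea');
INSERT INTO dim_store   VALUES (1, 'Berlin', 'Germany'), (2, 'Lyon', 'France');
INSERT INTO fact_sales  VALUES
    (1, 1, '2024-01', 100.0),
    (2, 1, '2024-01',  40.0),
    (1, 2, '2024-02',  70.0);
""")

# An analytic query: join facts to one dimension and aggregate the measure.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```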
STAR SCHEMA – DIMENSIONS
SNOWFLAKE SCHEMA – DIMENSIONS^2
DATA LAKES
A storage and analytics
system that is meant to
ingest big volumes of data
and allow for ETL and
analytics use cases on it.
Basically a very big storage
that doesn’t care about
the format of data.

ETL – extract, transform, load
(copying data into a destination
system which represents data
differently)
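The three ETL steps can be sketched as three small functions. This is only an illustration under assumptions: the CSV contents, the `sales` table, and the choice to store months and integer cents are invented to show a destination that represents the data differently from the source.

```python
import csv
import io
import sqlite3

# Hypothetical source export: a CSV in which every field is a string.
SOURCE_CSV = """date,product,amount
2024-01-03,espresso,2.50
2024-01-03,green tea,3.00
2024-02-10,espresso,2.50
"""

def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: month instead of full date, title-cased name, amount in cents."""
    return [
        (row["date"][:7], row["product"].title(), round(float(row["amount"]) * 100))
        for row in rows
    ]

def load(conn, records):
    """Load: write the transformed records into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (month TEXT, product TEXT, cents INTEGER)")
    with conn:
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT * FROM sales ORDER BY month, product").fetchall()
print(loaded)
```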
DISTRIBUTED FILE SYSTEMS THAT
SATISFY DATA LAKE STORAGE
REQUIREMENTS
MAP REDUCE
 1. Map
 2. Shuffle
 3. Reduce
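The three phases can be illustrated with the classic word-count example. This is a single-process sketch, not a distributed implementation: in a real MapReduce framework the map and reduce calls would run in parallel on many machines and the shuffle would move data between them. The function names and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

documents = ["the cloud is coming", "the edge is near"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(key, values) for key, values in grouped.items())
print(word_counts)
```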
OLAP CUBE
 An OLAP cube is a multi-dimensional array of data.
 Online Analytical Processing (OLAP) is a computer-based
technique of analyzing data to look for insights.
 An OLAP cube is a data structure that overcomes the
limitations of relational databases by providing rapid
analysis of data. Cubes can display and sum large
amounts of data while also providing users with
searchable access to any data points. This way, the data
can be rolled up, sliced, and diced as needed to handle
the widest variety of questions that are relevant to a
user's area of interest.
OLAP CUBE
 OLAP provides convenient, fast tools for accessing, viewing,
and analyzing business information. The user gets a natural,
intuitive data model in which the data is organized as
multidimensional cubes. The axes of the multidimensional
coordinate system are the main attributes of the business
process being analyzed. For sales, for example, these could be
product, region, and customer type. Time is used as one of the
dimensions.
 The three-dimensional cube shown in Fig. 2 uses sales totals
as measures and time, product, and store as dimensions. The
dimensions are presented at specific grouping levels: products
are grouped by category, stores by country, and the times of
transactions by month. Grouping levels (hierarchies) are
covered in more detail a little later.
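A cube like the one described can be sketched as a dictionary whose keys are coordinates along the three dimensions and whose values are the measure. The transactions and the two lookup tables below are hypothetical; they only show how raw records are rolled up to the month, category, and country grouping levels.

```python
from collections import defaultdict

# Hypothetical transactions: (date, product, store, amount).
transactions = [
    ("2024-01-05", "Espresso", "Berlin", 100.0),
    ("2024-01-20", "Latte", "Munich", 50.0),
    ("2024-01-22", "Green tea", "Lyon", 40.0),
    ("2024-02-02", "Espresso", "Lyon", 70.0),
]

# Grouping levels (hierarchies): product -> category, store -> country.
CATEGORY = {"Espresso": "Coffee", "Latte": "Coffee", "Green tea": "Tea"}
COUNTRY = {"Berlin": "Germany", "Munich": "Germany", "Lyon": "France"}

# Each cell is addressed by (month, category, country) and holds the sales sum.
cube = defaultdict(float)
for date, product, store, amount in transactions:
    cell = (date[:7], CATEGORY[product], COUNTRY[store])
    cube[cell] += amount

print(dict(cube))
```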
OLAP CUBE
 A two-dimensional view of the cube
can be obtained by "cutting" it across
one or more axes (dimensions): we fix
the values of all dimensions except
two and get an ordinary
two-dimensional table.
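Slicing can be sketched in a few lines: fix one of three dimensions and what remains is a plain table over the other two. The cube contents and the `slice_by_country` helper are hypothetical.

```python
# Hypothetical cube cells: (month, category, country) -> sales sum.
cube = {
    ("2024-01", "Coffee", "Germany"): 150.0,
    ("2024-01", "Tea", "France"): 40.0,
    ("2024-02", "Coffee", "France"): 70.0,
    ("2024-02", "Tea", "France"): 25.0,
}

def slice_by_country(cube, country):
    """Fix the country dimension and return a month x category table."""
    table = {}
    for (month, category, cell_country), amount in cube.items():
        if cell_country == country:
            table.setdefault(month, {})[category] = amount
    return table

france = slice_by_country(cube, "France")
for month, row in sorted(france.items()):
    print(month, row)
```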
EDGE COMPUTING
 Edge computing is a distributed
computing paradigm that brings
computation and data storage closer to
the location where it is needed, to
improve response times and save
bandwidth.
 The word edge in this context means
literal geographic distribution. Edge
computing is computing that’s done at
or near the source of the data, instead of
relying on the cloud at one of a dozen
data centers to do all the work. It doesn’t
mean the cloud will disappear. It means
the cloud is coming to you.
Article: https://www.theverge.com/circuitbreaker/2018/5/7/17327584/edge-computing-cloud-google-microsoft-apple-amazon
LATENCY
 One great driver for edge computing is the speed of light. If Computer A needs to ask
Computer B, half a globe away, before it can do anything, the user of Computer A perceives
this delay as latency. 
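A back-of-the-envelope calculation shows the floor that the speed of light puts under this delay. The sketch assumes "half a globe away" means half of Earth's circumference along the surface, and that light in optical fiber travels at about two thirds of its vacuum speed; routing detours, switching, and processing time are ignored, so real latency is higher.

```python
import math

EARTH_RADIUS_KM = 6371.0            # mean radius of the Earth
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_FACTOR = 2 / 3                # assumption: light in fiber moves at about 2/3 c

def round_trip_ms(distance_km, speed_km_s):
    """Time for a signal to travel there and back, in milliseconds."""
    return 2 * distance_km / speed_km_s * 1000

half_globe_km = math.pi * EARTH_RADIUS_KM  # half of Earth's circumference
vacuum_rtt = round_trip_ms(half_globe_km, SPEED_OF_LIGHT_KM_S)
fiber_rtt = round_trip_ms(half_globe_km, SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)

print(f"distance: {half_globe_km:.0f} km")
print(f"round trip in vacuum: {vacuum_rtt:.0f} ms")
print(f"round trip in fiber:  {fiber_rtt:.0f} ms")
```

Under these assumptions a single request and reply costs roughly 134 ms in vacuum and roughly 200 ms in fiber, before either computer does any work.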
SECURITY
 It might be weird to think of it this way, but the security and privacy features of an iPhone are
well accepted as an example of edge computing. Simply by doing encryption and storing
biometric information on the device, Apple offloads a ton of security concerns from the
centralized cloud to its diasporic users’ devices.
BANDWIDTH
 For instance, if you buy one security camera, you can probably stream all of its footage to the
cloud. If you buy a dozen security cameras, you have a bandwidth problem. But if the cameras
are smart enough to only save the “important” footage and discard the rest, your internet pipes
are saved.
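The arithmetic behind the camera example can be sketched as follows. The 4 Mbps bitrate per camera and the 5% share of "important" footage are assumptions picked for illustration, not figures from the article.

```python
def upstream_mbps(cameras, bitrate_mbps, kept_fraction=1.0):
    """Average upload bandwidth when each camera sends kept_fraction of its footage."""
    return cameras * bitrate_mbps * kept_fraction

BITRATE_MBPS = 4.0          # assumed bitrate of one HD camera stream
IMPORTANT_FRACTION = 0.05   # assumed share of footage flagged as important

one_camera = upstream_mbps(1, BITRATE_MBPS)
dozen_raw = upstream_mbps(12, BITRATE_MBPS)
dozen_filtered = upstream_mbps(12, BITRATE_MBPS, IMPORTANT_FRACTION)

print(f"one camera, everything uploaded: {one_camera:.1f} Mbps")
print(f"dozen cameras, everything uploaded: {dozen_raw:.1f} Mbps")
print(f"dozen cameras, filtered at the edge: {dozen_filtered:.1f} Mbps")
```

With these numbers, a dozen unfiltered cameras need 48 Mbps of sustained upload, while filtering on the camera brings the average down to about 2.4 Mbps, less than a single unfiltered stream.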
EXAMPLE
 Self-driving cars are, as far as I’m aware, the ultimate example of edge computing. Due to
latency, privacy, and bandwidth, you can’t feed all the numerous sensors of a self-driving car
up to the cloud and wait for a response. Your trip can’t survive that kind of latency, and even if
it could, the cellular network is too inconsistent to rely on it for this kind of work.
