
What is In-Memory Analytics?

To put it simply: performing analysis on data held in main memory (RAM). Memory prices are declining, RAM can now hold terabyte-size datasets, and since 64-bit processors can address up to 16 exabytes of memory, in-memory analytics is becoming more affordable by the year.
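The 16-exabyte figure follows directly from the 64-bit address width; a quick check in Python:

    # A 64-bit processor can address 2**64 bytes; 1 exabyte = 2**60 bytes
    print(2**64 / 2**60)   # -> 16.0, i.e. 16 exabytes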

Why do we need it?

Consumerization is the key to unlocking BI data. Gartner research has concluded that consumerization of data is critical to a successful BI and Big Data implementation. In-memory analytics provides the capability to perform analysis on terabytes of data.

More accessible data leads to more analysis performed by business users. Of the nine emerging technologies impacting BI the most, five are consumer-behavior-driven, and one of the key technologies is in-memory analytics.

Gartner: “In-Memory analytics will have an industry impact comparable to web and cloud”

Drivers for In-Memory Analytics:

– Declining memory prices

– 64-bit computing

– Multicore

Benefits:

– Fast queries against big data and improved response times for analytical applications

– Less need for aggregates and low-latency application messaging

– Real-time end-user analysis and enabling self-service BI

– Detecting correlations/patterns in real time

– Running OLTP and OLAP applications on the same physical dataset (a quick sketch of the in-memory idea follows this list)
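As a rough sketch of the in-memory idea, assuming pandas and a hypothetical sales.csv file with region and amount columns (neither HANA nor Exalytics is involved): load the dataset into RAM once, then query it repeatedly without touching disk.

    # Minimal in-memory analytics sketch; "sales.csv" and its columns
    # (region, amount) are hypothetical stand-ins for a real dataset.
    import pandas as pd

    df = pd.read_csv("sales.csv")    # one pass over disk: load everything into RAM

    # Subsequent queries scan memory only, so they return interactively
    totals = df.groupby("region")["amount"].sum()   # aggregate without pre-built aggregates
    top10 = df.nlargest(10, "amount")               # ad hoc query, no further I/O
    print(totals)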

SAP HANA and Oracle Exalytics

(a couple of the in-memory analytics solutions available today)


Both products provide an in-memory database platform plus a slew of tools and software for building analytical processes and processing terabytes of data in a flash.

SAP HANA

“SAP HANA enables businesses to make smarter, faster decisions through real-time analysis
and reporting combined with dramatically accelerated business processes. Lack of delay
between insight and action turns your business into a real-time business.”

Oracle Exalytics

“The industry’s first in-memory BI machine that delivers the fastest performance for business
intelligence and planning applications.

Oracle Exalytics In-Memory Machine is the world's first engineered system specifically
designed to deliver high performance analysis, modeling and planning. Built using industry-
standard hardware, market-leading business intelligence software and in-memory database
technology, Oracle Exalytics is an optimized system that delivers answers to all your business
questions with unmatched speed, intelligence, simplicity and manageability.”

2. In-database processing with Big Data:


In-database processing enables blending and analysis of large sets of data without moving the data out of a database. It can provide significant performance improvements and deeper insights over traditional approaches, which require data to be moved into a separate environment for processing or force a user to work with only a subset of the data. Doing so eliminates the time and effort required to transform data and move it back and forth between a database and a separate analytics application.

In-database analytics allows analytical data marts to be consolidated into the enterprise data warehouse. Data retrieval and analysis are much faster, and corporate information is more secure because it never leaves the EDW. This approach helps companies make better predictions about future business risks and opportunities, identify trends, and spot anomalies so they can make informed decisions more efficiently and affordably.

Companies use in-database analytics for applications requiring intensive processing – for
example, fraud detection, credit scoring, risk management, trend and pattern recognition, and
balanced scorecard analysis. In-database analytics also facilitates ad hoc analysis, allowing
business users to create reports that do not already exist or drill deeper into a static report to
get details about accounts, transactions, or records.

Advanced Analytics – the Traditional Way:

In-database Analytics – the Latest Way:


Advantages:

– It uses the database engine to access data: as long as the data fits in the database, the analytics can use it, which is especially important in the big data era.

– It eliminates the data loading process, so it is fast, and the analysis can run even when the data won't fit into your machine's (not the database server's) memory or disk.

– It replaces manual SQL coding for large-scale data blending and analytic processes within databases.

– It leverages the full dataset for accurate analytic insights (a sketch contrasting this with the traditional approach follows this list).
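A minimal sketch of the contrast, using SQLite via Python as a stand-in for an EDW (the warehouse.db file and the transactions table with account and amount columns are assumptions):

    import sqlite3

    conn = sqlite3.connect("warehouse.db")

    # Traditional way: pull every row out of the database and analyze it
    # in a separate environment -- the full dataset crosses the wire.
    rows = conn.execute("SELECT account, amount FROM transactions").fetchall()
    totals = {}
    for account, amount in rows:
        totals[account] = totals.get(account, 0) + amount

    # In-database way: push the computation to the database engine --
    # only the small result set ever leaves the database.
    totals_in_db = conn.execute(
        "SELECT account, SUM(amount) FROM transactions GROUP BY account"
    ).fetchall()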

SMP

Multiprocessing (MP) involves a computer hardware and software architecture in which multiple (two or more) processing units execute programs within a single operating (computer) system.

SMP, i.e. symmetric multiprocessing, refers to a computer architecture in which multiple identical processors are connected to a single shared main memory, with full access to all the I/O devices, unlike asymmetric MP. In other words, all the processors share a common memory and the same data path or I/O bus.

Characteristics of SMP

Identical: All the processors are treated equally, i.e. all are identical.

Communication: Shared memory is the mode of communication among processors.

Complexity: SMP systems are complex in design, as all units share the same memory and data bus.

Expensive: They are costlier in nature.

Unlike asymmetric MP, where tasks are handled only by a master processor, here the tasks of the operating system are handled individually by each processor.

Applications

This concept finds application in parallel processing, where time-sharing systems (TSS) assign tasks to different processors running in parallel, and also in TSS that use multithreading, i.e. multiple threads running simultaneously.
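A minimal sketch of the shared-memory programming model using Python threads (note that CPython's GIL serializes CPU-bound threads, so this illustrates the model rather than hardware-level parallelism):

    import threading

    counts = {}                    # one shared memory, visible to every worker
    lock = threading.Lock()        # access to shared memory must be coordinated

    def worker(items):
        for item in items:
            with lock:             # synchronize access to the shared structure
                counts[item] = counts.get(item, 0) + 1

    data = ["a", "b", "a", "c"] * 1000
    threads = [threading.Thread(target=worker, args=(data,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counts["a"])             # every identical worker updated the same memory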

Advantages

Throughput: Since tasks can be run by all the processors, unlike in asymmetric MP, the degree of throughput (processes executed per unit time) increases.

Reliability: The failure of one processor does not bring down the whole system, since all the processors are equally capable, though throughput does drop a little.

Disadvantages

Complex design: Since all the processors are treated equally by the OS, designing and managing such an OS becomes difficult.

Costlier: Since all the processors share the common main memory, a larger memory is required, making the system more expensive.

MPP (massively parallel processing):

MPP (massively parallel processing) is the coordinated processing of a program by multiple processors working on different parts of the program. Each processor has its own operating system and memory. MPP speeds the performance of huge databases that deal with massive amounts of data.

A massively parallel processing (MPP) system consists of a large number of small homogeneous processing nodes interconnected via a high-speed network. The processing nodes in an MPP machine are independent: they typically do not share memory, and each processor may run its own instance of an operating system, although there may be systemic controller applications hosted on leader processing nodes that instruct the individual processing nodes in the MPP configuration on the tasks to perform.
Nodes in MPP machines may also be connected directly to their own I/O devices, or I/O may be channeled into the entire system via high-speed interconnects. Communication between nodes may occur in a coordinated fashion, where all nodes stop processing and participate in an exchange of data across the network, or in an uncoordinated fashion, where messages targeted at specific recipients are injected into the network independently.

Because data can be streamed through the network and targeted for specific nodes, an MPP
machine is nicely suited for data parallel applications. In this case, all processors execute the
same program on different data streams. In addition, because individual processors can
execute different programs, an MPP machine is nicely suited to coarse-grained parallelism
and can be configured for pipelined execution as well.
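A minimal sketch of the data-parallel MPP pattern using Python's multiprocessing module (processes here stand in for independent nodes: each has its own memory, and results are exchanged over an interconnect rather than shared):

    from multiprocessing import Pool

    def partial_sum(partition):            # the same program runs on every "node"
        return sum(partition)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        partitions = [data[i::4] for i in range(4)]       # distribute data across nodes
        with Pool(processes=4) as pool:                   # independent workers, no shared memory
            partials = pool.map(partial_sum, partitions)  # each node processes its own stream
        print(sum(partials))               # a leader combines the partial results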
