Big data analytics is the use of advanced analytic techniques against very large, diverse data sets that
include structured, semi-structured and unstructured data, from different sources, and in different sizes
from terabytes to zettabytes.
In today's world, enormous amounts of data are generated. Big companies utilize this data for their business growth. By analyzing this data, useful decisions can be made in various cases, as discussed below:
1. Tracking Customer Spending Habits and Shopping Behavior: In big retail stores (like
Amazon, Walmart, Big Bazaar, etc.), the management team keeps data on customers'
spending habits (which products they spend on, which brands they prefer, how
frequently they buy), their shopping behavior, and their most-liked products (so that those
products can be kept in stock). Based on which products are searched for or sold the most,
the production or procurement rate of those products is set.
The banking sector uses its customers' spending-behavior data to offer a particular
customer a discount or cashback for buying a product they like with the bank's credit or
debit card. In this way, the right offer can be sent to the right person at the right time.
2. Recommendation: By tracking customers' spending habits and shopping behavior, big
retail stores provide recommendations to their customers. E-commerce sites like Amazon,
Walmart, and Flipkart do product recommendation: they track what products a customer
searches for and, based on that data, recommend similar products to that customer.
As an example, suppose a customer searches for bed covers on Amazon. Amazon now has
data suggesting that this customer may be interested in buying a bed cover, so the next
time the customer visits any Google page, advertisements for various bed covers are shown.
Thus, advertisements for the right product can be sent to the right customer.
YouTube also recommends videos based on the types of videos a user has previously liked
or watched, and it shows relevant advertisements during playback based on the content of
the video being watched. As an example, suppose someone is watching a Big Data tutorial
video; advertisements for other Big Data courses will then be shown during that video.
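The recommendation idea described above can be sketched with a simple item co-occurrence count: items frequently viewed together with what the customer last searched for are suggested first. This is a minimal illustrative sketch, not any retailer's actual algorithm; all the data below is made up.

```python
from collections import Counter

# Made-up browsing histories: each list is the items one customer viewed.
histories = [
    ["bed cover", "pillow", "curtains"],
    ["bed cover", "pillow", "lamp"],
    ["bed cover", "curtains"],
    ["lamp", "desk"],
]

def recommend(item, histories, k=2):
    """Suggest the k items most often co-viewed with `item`."""
    co_counts = Counter()
    for h in histories:
        if item in h:
            co_counts.update(x for x in h if x != item)
    return [x for x, _ in co_counts.most_common(k)]

print(recommend("bed cover", histories))  # pillow and curtains co-occur most
```

Real recommender systems add weighting, recency, and collaborative filtering on top of this basic co-occurrence signal.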
3. Smart Traffic System: Data about traffic conditions on different roads is collected
through cameras placed beside roads and at the entry and exit points of the city, and from
GPS devices placed in vehicles (Ola and Uber cabs, etc.). All such data is analyzed, and
jam-free or less congested, faster routes are recommended. In this way, a smart traffic
system can be built in a city through big data analysis. An added benefit is that fuel
consumption can be reduced.
4. Secure Air Traffic System: Sensors are present at various places on an aircraft (such as
the propellers). These sensors capture data like flight speed, moisture, temperature, and
other environmental conditions. Based on the analysis of this data, environmental
parameters within the aircraft are set and adjusted.
By analyzing an aircraft's machine-generated data, it can be estimated how long a machine
can operate flawlessly and when it should be replaced or repaired.
5. Self-Driving Cars: Big data analysis helps drive a car without human intervention.
Cameras and sensors placed at various spots on the car gather data such as the size of
surrounding cars and obstacles and the distance to them. This data is analyzed, and
calculations such as the angle of rotation, the appropriate speed, and when to stop are
carried out. These calculations enable the car to take action automatically.
6. Virtual Personal Assistant Tools: Big data analysis helps virtual personal assistant tools
(like Siri on Apple devices, Cortana on Windows, and Google Assistant on Android)
answer the various questions asked by users. These tools track the user's location, local
time, season, and other data related to the question, and analyze all of it to provide an
answer.
As an example, suppose a user asks, "Do I need to take an umbrella?" The tool collects
data such as the user's location and the season and weather conditions at that location, then
analyzes this data to decide whether there is a chance of rain, and provides the answer.
7. IoT:
Manufacturing companies install IoT sensors in machines to collect operational
data. By analyzing such data, it can be predicted how long a machine will work
without problems and when it will require repair, so that the company can act
before the machine develops serious issues or breaks down completely.
Thus, the cost of replacing the whole machine can be saved.
In the healthcare field, big data makes a significant contribution. Using
big data tools, data about patient experience is collected and used by
doctors to give better treatment. IoT devices can sense symptoms of a probable
upcoming disease in the human body and prevent it by enabling treatment in advance.
IoT sensors placed near a patient or a newborn baby constantly track health
conditions such as heart rate and blood pressure. Whenever any parameter
crosses the safe limit, an alarm is sent to a doctor, so that steps can be taken
remotely and quickly.
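The predictive-maintenance idea above can be sketched as a simple rule over sensor readings: if a machine's recent vibration readings trend above a safe limit, flag it for repair before it fails. The threshold and readings below are invented for illustration; real systems use statistical or machine-learning models over many signals.

```python
SAFE_LIMIT = 7.0  # hypothetical vibration threshold (mm/s)

def needs_repair(readings, limit=SAFE_LIMIT):
    """Flag a machine when its recent readings average above the safe limit."""
    recent = readings[-3:]              # look at the last 3 samples only
    return sum(recent) / len(recent) > limit

machine_a = [5.1, 5.3, 5.2, 5.4]   # stable: no action needed
machine_b = [5.0, 6.8, 7.5, 8.2]   # rising trend: schedule repair

print(needs_repair(machine_a))  # False
print(needs_repair(machine_b))  # True
```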
8. Education Sector: Organizations that run online educational courses utilize big data
to find candidates interested in those courses. If someone searches for YouTube tutorial
videos on a subject, then online or offline course providers for that subject send that
person ads about their courses.
9. Energy Sector: A smart electric meter reads the power consumed every 15 minutes and
sends this reading to a server, where the data is analyzed to estimate at what times of day
the power load across the city is lowest. Based on this, manufacturing units or households
are advised to run their heavy machines at night, when the power load is low, in order to
enjoy a lower electricity bill.
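The smart-meter analysis described above amounts to grouping the 15-minute readings by hour and picking the hour with the lowest average load. A toy sketch with invented readings:

```python
from collections import defaultdict

# Invented (hour, kW) readings taken every 15 minutes.
readings = [(9, 4.0), (9, 4.2), (14, 3.1), (14, 3.0), (2, 1.2), (2, 1.1)]

loads = defaultdict(list)
for hour, kw in readings:
    loads[hour].append(kw)

# Average load per hour; the minimum marks the cheapest time to run heavy machines.
avg = {h: sum(v) / len(v) for h, v in loads.items()}
off_peak = min(avg, key=avg.get)
print(off_peak)  # 2 (2 a.m. has the lowest average load in this toy data)
```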
10. Media and Entertainment Sector: Media and entertainment service providers like
Netflix, Amazon Prime, and Spotify analyze data collected from their users. Data such as
which types of videos or music users watch or listen to most and how long users spend on
the site is collected and analyzed to set the next business strategy.
Hadoop Ecosystem
Difficulty Level : Easy
Last Updated : 02 Aug, 2021
YARN:
Yet Another Resource Negotiator (YARN), as the name implies, helps to manage
the resources across the clusters. In short, it performs scheduling and resource
allocation for the Hadoop system.
It consists of three major components, i.e.
1. Resource Manager
2. Node Manager
3. Application Manager
The Resource Manager has the privilege of allocating resources for the applications
in the system, whereas Node Managers work on the allocation of resources such as
CPU, memory, and bandwidth per machine and later acknowledge the Resource
Manager. The Application Manager works as an interface between the Resource
Manager and Node Managers and performs negotiations as per the requirements of
the two.
HIVE:
With the help of an SQL-like methodology and interface, HIVE performs reading and
writing of large data sets. Its query language is called HQL (Hive Query Language).
It is highly scalable, as it allows both real-time processing and batch processing.
Also, all the SQL data types are supported by Hive, making query processing easier.
Like other query-processing frameworks, HIVE comes with two components:
JDBC Drivers and the HIVE Command Line.
JDBC, along with ODBC drivers, works on establishing data-storage permissions
and connections, whereas the HIVE Command Line helps in the processing of queries.
Spark:
It's a platform that handles all the process-consumptive tasks like batch
processing, interactive or iterative real-time processing, graph conversions,
visualization, etc.
It consumes in-memory resources, thus being faster than earlier frameworks in terms
of optimization.
Spark is best suited for real-time data, whereas Hadoop is best suited for
structured data and batch processing; hence both are used interchangeably in most
companies.
Apache HBase:
It's a NoSQL database that supports all kinds of data and is thus capable of
handling anything within a Hadoop database. It provides the capabilities of Google's
BigTable and is therefore able to work on big data sets effectively.
When we need to search for or retrieve a few occurrences of something small in a
huge database, the request must be processed within a short span of time. At such
times, HBase comes in handy, as it gives us a tolerant way of storing limited data.
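HBase's data model (like BigTable's) can be pictured as a sparse, sorted map from (row key, column family:qualifier) to values. A toy illustration using a plain dictionary; this is not the real HBase API, just the shape of its data model:

```python
# Toy model of an HBase-style table: row key -> {column family:qualifier -> value}.
table = {
    "user#001": {"info:name": "Asha", "info:city": "Pune"},
    "user#002": {"info:name": "Ravi"},   # sparse: absent columns cost nothing
}

def get(table, row_key, column):
    """Point lookup by row key and column, analogous to HBase's Get operation."""
    return table.get(row_key, {}).get(column)

print(get(table, "user#001", "info:city"))  # Pune
print(get(table, "user#002", "info:city"))  # None (column absent for this row)
```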
Other Components: Apart from all of these, there are some other components too that
carry out a huge task in order to make Hadoop capable of processing large datasets. They
are as follows:
Solr, Lucene: These are two services that perform the tasks of searching and
indexing with the help of Java libraries. Lucene in particular is based on Java
and also provides a spell-check mechanism. Solr is built on top of Lucene.
Zookeeper: There was a huge problem managing coordination and synchronization
among the resources and components of Hadoop, which often resulted in
inconsistency. Zookeeper overcame these problems by performing synchronization,
inter-component communication, grouping, and maintenance.
Oozie: Oozie simply performs the task of a scheduler, scheduling jobs and
binding them together as a single unit. There are two kinds of jobs, i.e., Oozie
workflow jobs and Oozie coordinator jobs. Oozie workflow jobs need to be
executed in a sequentially ordered manner, whereas Oozie coordinator jobs are
triggered when some data or an external stimulus arrives.
242. What do you mean by MapReduce? *
MapReduce is a programming paradigm that enables massive scalability across hundreds or
thousands of servers in a Hadoop cluster. As the processing component, MapReduce is the heart of
Apache Hadoop. The term "MapReduce" refers to the two separate and distinct tasks that Hadoop
programs perform: the map task, which converts input data into intermediate key/value pairs, and
the reduce task, which aggregates those pairs into a smaller set of results.
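The two tasks, map and reduce, can be illustrated with the classic word-count example, simulated here in plain Python: the map step emits (word, 1) pairs and the reduce step sums the counts per word. A real job runs distributed across the cluster; this sketch only shows the programming model.

```python
from collections import defaultdict

docs = ["big data is big", "hadoop handles big data"]

# Map: emit a (key, value) pair for every word.
pairs = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle: group values by key (done by the framework in real Hadoop).
groups = defaultdict(list)
for word, count in pairs:
    groups[word].append(count)

# Reduce: combine each key's values into a single result.
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts["big"])   # 3
print(counts["data"])  # 2
```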
243. What do you mean by Hadoop? *
Hadoop is an open-source software framework for storing data and running applications on
clusters of commodity hardware. It provides massive storage for any kind of data, enormous
processing power and the ability to handle virtually limitless concurrent tasks or jobs.
244. Write down the names of some components of Hadoop. *
The major components of the Hadoop ecosystem include:
HDFS (Hadoop Distributed File System): the storage layer
YARN (Yet Another Resource Negotiator): resource management and scheduling
MapReduce: the distributed processing engine
HIVE: SQL-like querying of large data sets
Spark: in-memory real-time and batch processing
HBase: a NoSQL database modeled on Google's BigTable
Solr and Lucene: searching and indexing
Zookeeper: coordination and synchronization
Oozie: job scheduling
Note: Apart from the above-mentioned components, there are many other components too
that are part of the Hadoop ecosystem. All these toolkits or components revolve around one
term, i.e., data. That's the beauty of Hadoop: it revolves around data, making its synthesis
easier.
Apache Sqoop is a big data tool for transferring data between Hadoop and relational database
servers. Sqoop is used to transfer data from RDBMS (relational database management system) like
MySQL and Oracle to HDFS (Hadoop Distributed File System).
248. What is Flume? *
Apache Flume is an open-source, powerful, reliable, and flexible system used to collect, aggregate, and
move large amounts of unstructured data from multiple data sources into HDFS/HBase (for
example) in a distributed fashion via its strong coupling with the Hadoop cluster.
249. State the difference between Sqoop and Flume. *
Sqoop and Flume are both meant to fulfill data-ingestion needs, but they serve different purposes. Apache
Flume works well for streaming data sources that are generated continuously in a Hadoop environment,
such as log files from multiple servers, whereas Apache Sqoop works well with any RDBMS that has
JDBC connectivity.
Sqoop is meant for bulk data transfers between Hadoop and other structured data stores.
Flume collects log data from many sources, aggregates it, and writes it to HDFS.
Flume:
Flume is a framework for populating Hadoop with data. Agents are deployed throughout one's IT
infrastructure (inside web servers, application servers, and mobile devices, for example) to collect data
and integrate it into Hadoop.
Flume helps to collect data from a variety of sources, like logs, JMS, directories, etc. Multiple Flume
agents can be configured to collect high volumes of data, and it scales horizontally.
Flume is the better choice when moving bulk streaming data from sources like JMS or a spooling
directory, whereas Sqoop is an ideal fit if the data is sitting in databases like Teradata, Oracle, MySQL
Server, Postgres, or any other JDBC-compatible database.
Sqoop:
Sqoop is a connectivity tool for moving data from non-Hadoop data stores (such as relational databases
and data warehouses) into Hadoop. It allows users to specify the target location inside Hadoop and
instruct Sqoop to move data from Oracle, Teradata, or other relational databases to that target.
Sqoop helps to move data between Hadoop and other databases, and it can transfer data in parallel for
performance.
Apache Sqoop also provides direct input, i.e., it can map relational databases and import directly into
HBase and Hive.
264. State the difference between true positive and true negative. *
265. State the difference between false negative and false positive. *
266. State the difference between true negative and false positive. *
267. What is the difference between linear regression and logistic regression? *
S.No.  Logistic Regression vs. Support Vector Machine
1. Logistic Regression: It is an algorithm used for solving classification problems.
   SVM: It is a model used for both classification and regression.
2. Logistic Regression: It does not try to find the best margin; instead, it can have different
   decision boundaries with different weights that are near the optimal point.
   SVM: It tries to find the "best" margin (the distance between the line and the support
   vectors) that separates the classes and thus reduces the risk of error on the data.
3. Logistic Regression: It works with already identified independent variables.
   SVM: It works well with unstructured and semi-structured data like text and images.
4. Problems where the logistic regression algorithm is applied:
   1. Cancer detection: predict whether a patient has cancer (1) or not (0)
   2. Test score: predict whether a student passed (1) or not (0)
   3. Marketing: predict whether a customer will purchase a product (1) or not (0)
   Problems that can be solved using SVM:
   1. Image classification
   2. Handwriting recognition
   3. Cancer detection
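Logistic regression's classification rule, mentioned above, can be sketched in a few lines: a linear score is squashed through the sigmoid, and the class is 1 when the estimated probability exceeds 0.5. The weights below are assumed to be already trained and are invented for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    """Logistic regression decision rule: class 1 when sigmoid(w.x + b) > 0.5."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) > 0.5 else 0

# Invented "trained" weights for a two-feature binary classification toy example.
weights, bias = [1.5, -2.0], 0.3
print(predict([2.0, 0.5], weights, bias))  # z = 3.0 - 1.0 + 0.3 = 2.3 -> class 1
print(predict([0.2, 1.5], weights, bias))  # z = 0.3 - 3.0 + 0.3 = -2.4 -> class 0
```

An SVM instead picks the boundary that maximizes the margin to the nearest training points, so the same fitted decision function can differ even on identical data.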
A graph database is defined as a specialized, single-purpose platform for creating and manipulating
graphs. Graphs contain nodes, edges, and properties, all of which are used to represent and store data in
a way that relational databases are not equipped to do.
Graph analytics is another commonly used term, and it refers specifically to the process of analyzing data
in a graph format using data points as nodes and relationships as edges. Graph analytics requires a
database that can support graph formats; this could be a dedicated graph database, or a converged
database that supports multiple data models, including graph.
There are two popular models of graph databases: property graphs and RDF graphs. The property graph
focuses on analytics and querying, while the RDF graph emphasizes data integration. Both types of
graphs consist of a collection of points (vertices) and the connections between those points (edges). But
there are differences as well.
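The node-and-edge model described above can be sketched as an adjacency structure with properties; a relationship query then becomes a traversal rather than a relational join. Node names, edge types, and data here are illustrative only.

```python
# Toy property graph: nodes carry properties; edges are (type, target) pairs.
nodes = {
    "alice": {"label": "Person"},
    "bob":   {"label": "Person"},
    "acme":  {"label": "Company"},
}
edges = {
    "alice": [("KNOWS", "bob")],
    "bob":   [("WORKS_AT", "acme")],
}

def neighbors(node, edge_type):
    """Follow edges of one type from a node: a one-hop traversal."""
    return [t for et, t in edges.get(node, []) if et == edge_type]

# Who do Alice's acquaintances work for? (a two-hop traversal)
employers = [c for friend in neighbors("alice", "KNOWS")
             for c in neighbors(friend, "WORKS_AT")]
print(employers)  # ['acme']
```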
A spatial database is a database that is enhanced to store and access spatial data or data
that defines a geometric space. These data are often associated with geographic locations
and features, or constructed features like cities. Data on spatial databases are stored as
coordinates, points, lines, polygons and topology. Some spatial databases handle more
complex data like three-dimensional objects, topological coverage and linear networks.
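A core spatial-database operation over stored coordinates is the range (bounding-box) query. The sketch below performs it naively with a linear scan, whereas a real spatial database would use a spatial index such as an R-tree; the coordinates are invented.

```python
# Invented points of interest stored as (name, longitude, latitude).
points = [("cafe", 73.85, 18.52), ("park", 73.90, 18.55), ("museum", 74.10, 18.40)]

def in_bbox(points, min_lon, min_lat, max_lon, max_lat):
    """Return names of points inside the bounding box (a spatial range query)."""
    return [name for name, lon, lat in points
            if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat]

print(in_bbox(points, 73.8, 18.5, 74.0, 18.6))  # ['cafe', 'park']
```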
276. What is the difference between IaaS and PaaS? *
Uses: IaaS is used by network architects; PaaS is used by developers; SaaS is used by end
users.
Model: IaaS is a service model that provides virtualized computing resources over the
internet; PaaS is a cloud computing model that delivers tools used for the development of
applications; SaaS is a service model in cloud computing that hosts software and makes it
available to clients.
Access: IaaS gives access to resources like virtual machines and virtual storage; PaaS gives
access to the run-time environment and to deployment and development tools for
applications; SaaS gives access to the end user.
Technical understanding: IaaS requires technical knowledge; PaaS requires some knowledge
of the subject to understand the basic setup; SaaS requires no knowledge of technicalities,
as the company handles everything.
Popularity: IaaS is popular among developers and researchers; PaaS is popular among
developers who focus on the development of apps and scripts; SaaS is popular among
consumers and companies, for uses such as file sharing, email, and networking.
Cloud services (examples): IaaS: Amazon Web Services, Sun, vCloud Express; PaaS:
Facebook and the Google search engine; SaaS: MS Office Web, Facebook, and Google Apps.
1. Reduced time to benefit
Software as a service (SaaS) differs from the traditional model because the software (application) is
already installed and configured. You can simply provision a server for an instance in the cloud and, in a
couple of hours, have the application ready for use. This reduces the time spent on installation and
configuration and can reduce the issues that get in the way of software deployment.
2. Lower costs
SaaS can provide beneficial cost savings since it usually resides in a shared or multi-tenant environment,
where the hardware and software license costs are low compared with the traditional model.
Another advantage is that you can rapidly scale your customer base, since SaaS allows small and
medium businesses to use software that they otherwise would not use due to the high cost of licensing.
Maintenance costs are reduced as well, since the SaaS provider owns the environment and the cost is
split among all customers that use the solution.
3. Scalability and integration
Usually, SaaS solutions reside in cloud environments that are scalable and have integrations with other
SaaS offerings. Compared with the traditional model, you don't have to buy another server or software.
You only need to enable a new SaaS offering and, in terms of server capacity planning, the SaaS provider
owns that. Additionally, you have the flexibility to scale your SaaS use up and down
based on specific needs.
4. Easy upgrades
With SaaS, the provider upgrades the solution and it becomes available to their customers. The costs and
effort associated with upgrades and new releases are lower than in the traditional model, which usually
forces you to buy an upgrade package and install it (or pay for specialized services to get the environment
upgraded).
5. Ease of use and proof-of-concepts
SaaS offerings are easy to use since they already come with baked-in best practices and samples. Users
can do proof-of-concepts and test the software functionality or a new release feature in advance. You can
also have more than one instance with different versions and do a smooth migration. Even for large
environments, you can use SaaS offerings to test the software before buying.
Multi-tenancy is a kind of software architecture in which a single deployment of a software application serves
multiple customers. Each customer is called a tenant. Tenants may be given the ability to customize some
parts of the application. Nowadays, applications are designed in such a way that, per tenant, the storage area
is segregated by having a different database altogether, different schemas inside a single database, or the
same database with discriminators.
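The "same database with discriminators" option mentioned above means every row carries a tenant id, and every query filters on it. A minimal sketch using SQLite; the table and column names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Shared table: a tenant_id discriminator column separates customers' rows.
conn.execute("CREATE TABLE orders (tenant_id TEXT, item TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("acme", "widget"), ("acme", "gear"), ("globex", "sprocket")])

def orders_for(tenant):
    """Every query must filter by the discriminator to isolate tenants."""
    rows = conn.execute("SELECT item FROM orders WHERE tenant_id = ?", (tenant,))
    return [r[0] for r in rows]

print(orders_for("acme"))    # ['widget', 'gear']
print(orders_for("globex"))  # ['sprocket']
```

In practice the tenant filter is enforced centrally (by an ORM scope or row-level security) rather than repeated by hand in every query.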
Automated Provisioning
Users should be able to access SaaS applications on the fly, which means the process of provisioning
users with the services needs to be automated. SaaS applications are typically used by B2B/B2C
customers, and this requirement demands creating companies and users just by invoking web services and
providing the access credentials. Most SaaS applications provide this critical feature; a great example is
the CREST API from Microsoft. Cloud Services Broker (CSB) platforms can automate this procedure to
provide access to SaaS applications on demand. Another important characteristic is the de-provisioning
ability: removing access from users or organizations whenever the customer decides not to use the
Software as a Service application. A good example of this is Salesforce, used by sales teams to manage
sales-related operations. Typically, a Salesforce tenant is created for an organization with a unique
identification by invoking Salesforce APIs. Another set of APIs is called to create users under the tenant,
and the access credentials are shared with the users. A delete API is likewise called when an organization
decides to discontinue the application.
Single Sign On
An enterprise organization wants to have a single identity system in place to authenticate the
various systems that its users consume. It is also important for enterprises to have a single
page on which to provide login credentials and access all Software as a Service applications provisioned
to the respective users. So, Software as a Service applications should integrate easily with various
identity management systems without much change. It is also a big maintenance overhead for enterprises
to store and maintain multiple credentials per system for enterprise users. It therefore becomes important
to enable Single Sign-On for SaaS applications, so that they authenticate against the existing identity
system and provide the experience of logging in once and using the various systems. Typically, Software
as a Service applications use federation standards such as SAML or OpenID to enable this critical piece.
Another important factor is that SaaS applications are multi-tenant, and each tenant may want to
authenticate against its own identity and access management system.
Subscription-based Billing
SaaS application pricing does not involve the complexity of license costs, upgrade costs, etc. Generally,
Software as a Service applications are subscription based, which enables customers to buy SaaS
applications whenever they require them and discontinue them whenever the enterprise decides they are
no longer needed. SaaS applications generally follow a seat-based charging model: the number of seats
purchased decides the amount to be paid. They can have various pricing models and billing cycles, such
as fixed monthly, quarterly, half-yearly, or annual billing. A few modern SaaS applications also provide
the ability to charge based on usage. Another important characteristic is that SaaS applications should be
able to be invoiced; CSB platforms typically look for this critical feature so that they can dispatch a
single invoice to their customers.
High Availability
SaaS applications are shared by multiple tenants, and the availability of such applications is expected to
be very high throughout. So Software as a Service applications should provide a high degree of SLA to
their customers, and should be accessible 24x7 across the globe. SaaS applications should also expose
management and monitoring APIs to continuously check the health and availability of the service.
Elastic Infrastructure
SaaS application usage is generally not predictable; consumption can vary dramatically in some months.
The infrastructure on which the applications are deployed should therefore have the ability to expand and
shrink the resources used behind the scenes. These days, SaaS applications are designed so that the
behavior of the infrastructure is observed: monitoring agents residing within the deployed resources
inform the respective management servers about the utilization and accessibility of the resources.
Typically, policies and procedures are built as part of the core architecture to expand or shrink the
infrastructure resources; microservice-based SaaS applications are the classic examples, and tools like
Docker and Kubernetes are used to manage the elasticity of SaaS applications. Another way is to build a
policy engine that receives and reacts to events, where an event could be a request to expand or shrink
the infrastructure resources.
Data Security
Ensuring that data and business information are protected from corruption and unauthorized access is very important in today's world. Since Software as a Service applications are designed to be shared by different tenants, it becomes extremely important to know how well the data is secured. Certain types of data must be stored encrypted for a particular tenant, and the same data must not be accessible to another tenant. So, having a good key management framework, or the ability to integrate/interface with external key management frameworks, becomes an essential part of SaaS applications. Integration with a CASB (Cloud Access Security Broker) system will also increase confidence with respect to data security. Very strong role-based access controls need to be in place to protect the data.
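One common way to keep tenants' encrypted data separate is to derive a distinct key per tenant from a master key. The sketch below assumes this approach; in production the master key would live in a key management framework (for example an external KMS), never as a constant in code.

```python
# Sketch: derive a distinct encryption key per tenant, so data encrypted
# for one tenant can never be decrypted with another tenant's key.
# The hard-coded master key is for illustration only.
import hashlib

MASTER_KEY = b"demo-master-key"  # illustrative; real keys belong in a KMS

def tenant_key(tenant_id: str) -> bytes:
    """Derive a 32-byte tenant-scoped key via PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", MASTER_KEY,
                               tenant_id.encode(), 100_000, dklen=32)
```

Because each tenant ID yields a different key, encrypted storage for tenant A is opaque to tenant B even when both share the same database.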
Application Security
SaaS applications should be equipped with protection against vulnerabilities. Typically, they should be protected against the OWASP/SANS-identified vulnerabilities. Also, strong identity and access management controls should be enabled for SaaS applications. The other aspects that make a Software as a Service application secure are the following:
Strong session management, with protection against session hijacking
Identification of unauthorized sessions, protection against multiple concurrent sessions, etc.
Cookies that do not store sensitive data, with secure cookie attributes followed
Step-up authentication, password lockout, etc.
Multi-factor authentication
Strong implementation of separation of duties
Protection against DoS/DDoS attacks
Protection against buffer overflow attacks
Also, integration points open to a CASB will help in gaining the confidence of customers.
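The session-management points above can be sketched with Python's standard library. The 30-minute lifetime, the in-memory store, and the function names are assumptions for illustration; a real service would use a shared session store and transport the token in a secure cookie.

```python
# Sketch of strong session handling: an unguessable token, server-side
# expiry, and rejection of unknown (potentially hijacked) session IDs.
import secrets
import time

SESSION_LIFETIME = 30 * 60            # seconds; illustrative value
_sessions: dict[str, float] = {}      # token -> expiry timestamp

def create_session() -> str:
    token = secrets.token_urlsafe(32)  # cryptographically random token
    _sessions[token] = time.time() + SESSION_LIFETIME
    return token

def is_valid(token: str) -> bool:
    expiry = _sessions.get(token)
    if expiry is None or time.time() > expiry:
        _sessions.pop(token, None)     # drop expired/unknown sessions
        return False
    return True
```

A forged or expired token is simply absent from (or purged out of) the server-side store, so it fails validation regardless of what the client presents.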
Rate Limiting/QoS
Every business has preferred/important users apart from the regular list of users of its applications. These days, in order to provide better service to all classes of customers, rate limiting is a good feature to have: the number of hits or transactions can be technically limited to ensure smooth business transactions. SaaS applications can also be enabled with rate-limiting/QoS configurability, which helps organizations manage their user base.
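A common way to implement this is a token bucket per customer class. The sketch below is illustrative: the capacities, refill rates, and class names are assumptions, not any product's configuration.

```python
# Token-bucket sketch of per-customer rate limiting: preferred customers
# get a larger bucket and faster refill, so every class keeps being served.
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise throttle the request."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # request throttled
```

Usage might look like `premium = TokenBucket(100, 10)` versus `regular = TokenBucket(10, 1)`, giving preferred users ten times the sustained request rate.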
Audit
Generally, SaaS applications are equipped to provide audit logs of business transactions, and this enables customers to work out a business strategy by applying business intelligence plans. These services should also be able to comply with government regulations and internal policies.
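An audit log entry for a business transaction is typically a structured record. The field names below are an illustrative assumption, not a standard schema; real systems often add request IDs, source IPs, and integrity signatures.

```python
# Sketch of a structured audit-log entry for one business transaction.
# Field names are illustrative, not a standard schema.
import json
from datetime import datetime, timezone

def audit_record(tenant: str, user: str, action: str, resource: str) -> str:
    """Serialize one audit entry as a JSON line suitable for log analysis."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tenant": tenant,
        "user": user,
        "action": action,
        "resource": resource,
    })
```

Emitting one JSON object per line keeps the log easy to ingest into business-intelligence and compliance tooling.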
1. Access management
Access management is critical for every SaaS application due to the presence of sensitive data. SaaS
customers need to know whether the single point of access into the public cloud can expose confidential
information. It is also worthwhile to ask questions about the design of access control systems and identify
whether there are any chances for network security issues, like deficient patching and lack of monitoring.
2. Misconfigurations
Most SaaS products add more layers of complexity into their system, thus increasing the chances for
misconfigurations to arise. Even small configuration mistakes can affect the availability of the cloud
infrastructure.
One of the most well-known misconfiguration mistakes occurred in February 2008 when Pakistan
Telecom tried to block YouTube within Pakistan due to some supposedly blasphemous videos. Their
attempt to create a dummy route for YouTube made the platform globally unavailable for two hours.
3. Regulatory compliance
When you are ensuring that your suppliers have strong endpoint security measures in place, ask these
questions:
What is the relevant jurisdiction that governs customer data, and how is it determined?
Do your cloud applications comply with regulatory, privacy, and data protection requirements like GDPR,
HIPAA, SOX, and more?
Are your cloud providers ready to undergo external security audits?
Does your cloud service provider hold any security certifications like ISO, ITIL, and more?
4. Storage
Before you purchase new software, it is important to check where all the data is stored. SaaS users can
ask the following questions to cross-check data storage policies:
Does your SaaS provider allow you to have any control over the location of data stored?
Is data stored with the help of a secure cloud service provider like AWS or Microsoft, or is it stored in a
private data center?
Are security solutions like data encryption available in all stages of data storage?
Can end users share files and objects with other users within and outside their domain?
5. Retention
You need to check how long the SaaS environment retains the sensitive information you enter into the
system. It is recommended to check who owns the data available in the cloud: the SaaS provider or the
user? What is the cloud data retention policy, who enforces it, and are there any exceptions to this?
6. Disaster recovery
Disasters can happen out of the blue and have the capacity to shake the foundations of your business. You
need to ask these questions to get yourself ready to face any impending disasters.
What happens to the cloud application and all your data stored in it during a natural disaster? Does the
force majeure clause in your master service agreement come into play? Does your service provider
promise a complete restoration? If yes, check how long that will take and its procedures.
7. Privacy and data breaches
Privacy and data breaches are common security threats that organizations face every day. Ask these
questions to learn how well your supplier can mitigate and recover from privacy and data breaches.
Types of SaaS
The SaaS model is singled out as a separate market direction for a reason: the diversity of
possibilities in the market for cloud solutions is astonishing.
Let's consider some of the available options. Here are the main types of SaaS.
Billing software: in simple words, this niche contains SaaS products that cover all payment
procedures, making payment and after-payment reporting single-click processes. This market is
predicted to reach $20 billion by 2026, which is great news for such SaaS providers as Xero,
Tipalti or Refrens.
PaaS can streamline workflows when multiple developers are working on the
same development project. If other vendors must be included, PaaS can provide great
speed and flexibility to the entire process. PaaS is particularly beneficial if you need to
create customized applications.
286. What do you mean by PaaS delivery? *
Platform-as-a-service (PaaS) is a type of cloud computing model in which a service provider delivers a
platform to customers. The platform enables the organization to develop, run, and manage business
applications without the need to build and maintain the infrastructure such software development
processes require.
287. Write down the advantages of PaaS. *
PaaS works well for small businesses and startup companies for two very basic reasons. First, it’s cost
effective, allowing smaller organizations access to state-of-the-art resources without the big price tag.
Most small firms have never been able to build robust development environments on premises, so PaaS
provides a path for accelerating software development. Second, it allows companies to focus on what
they specialize in without worrying about maintaining basic infrastructure.
Other advantages include the following:
There are always two sides to every story. While it is easy to make the case for PaaS, there are bound to be
some challenges as well. Some of these hurdles are simply the flip side of the positives and the nature of
the beast; others can be overcome with advance planning and preparation.
Examples of PaaS
AWS Elastic Beanstalk.
Windows Azure.
Heroku.
Force.com.
Google App Engine.
OpenShift.
IaaS is advantageous to companies in scenarios where scalability and quick provisioning are key. In
other words, organizations experiencing rapid growth but lacking the capital to invest in hardware are
great candidates for IaaS models. IaaS can also be beneficial to companies with steady application
workloads that simply want to offload some of the routine operations and maintenance involved in
managing infrastructure.
Pay for What You Use: Fees are computed via usage-based metrics
Reduce Capital Expenditures: IaaS is typically a monthly operational expense
Dynamically Scale: Rapidly add capacity in peak times and scale down as needed
Increase Security: IaaS providers invest heavily in security technology and expertise
Future-Proof: Access to state-of-the-art data center, hardware and operating systems
Self-Service Provisioning: Access via simple internet connection
Reallocate IT Resources: Free up IT staff for higher value projects
Reduce Downtime: IaaS enables instant recovery from outages
Boost Speed: Developers can begin projects once IaaS machines are provisioned
Enable Innovation: Add new capabilities and leverage APIs
Level the Playing Field: SMBs can compete with much larger firms
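The "Pay for What You Use" benefit means fees are computed from usage metrics. The sketch below illustrates such a calculation; the rates are made-up assumptions, not any provider's actual pricing.

```python
# Illustrative usage-based billing calculation ("pay for what you use").
# All rates are made-up assumptions, not any provider's real pricing.
def monthly_cost(vm_hours: float, gb_stored: float, gb_egress: float,
                 rate_vm: float = 0.05, rate_storage: float = 0.02,
                 rate_egress: float = 0.09) -> float:
    """Compute a monthly bill from metered usage, rounded to cents."""
    return round(vm_hours * rate_vm
                 + gb_stored * rate_storage
                 + gb_egress * rate_egress, 2)
```

Because the bill scales with metered usage, capacity that is scaled down in quiet months directly reduces the operational expense, unlike capital spent on owned hardware.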
There are many benefits to using IaaS in an organization, but there are also challenges. Some of these
hurdles can be overcome with advance preparation, but others present risks that a customer should
weigh before deployment.
Challenges may include the following:
Unexpected Costs: Monthly fees can add up, or peak usage may be more than expected
Process Changes: IaaS may require changes to processes and workflows
Runaway Inventory: Instances may be deployed, but not taken down
Security Risks: While IaaS providers secure the infrastructure, businesses are responsible
for anything they host
Lack of Support: Live help is sometimes hard to come by
Complex Integration: Challenges with interaction with existing systems
Security Risks: New vulnerabilities may emerge around the loss of direct control
Limited Customization: Public cloud users may have limited control and ability to
customize
Vendor Lock-In: Moving from one IaaS provider to another can be challenging
Broadband Dependency: Only as good as the reliability of the internet connection
Providers Not Created Equally: Vendor vetting and selection can be challenging
Managing Availability: Even the largest service providers experience downtime
Confusing SLAs: Service level agreements (SLAs) can be difficult to understand
Regulatory Uncertainty: Evolving federal and state laws can impact some industries’ use
of IaaS, especially across country borders
Vendor Consolidation: Providers may be acquired or go out of business
Third-Party Expertise: Lack of mature service providers, guidance or ecosystem support