
PRIYADARSHINI BHAGWATI COLLEGE OF ENGINEERING, NAGPUR

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


SESSION 2016-17 (EVEN SEM)

WINTER- 2016 PAPER SOLUTION

SEMESTER B.E VIII SEM (CSE) (CBS)

SUBJECT CLUSTERING AND CLOUD COMPUTING

DEPARTMENT COMPUTER SCIENCE & ENGINEERING

NAME OF FACULTY MS. SHRUNKHALA S. WANKHEDE

Q.1.a) What are the various legal issues faced when using the cloud model? Discuss in detail. 7 Marks

ANSWER:

Cross Border Legal Issues

● The cloud is inherently stateless, with servers located in different locations and countries, which creates issues related to conflict of laws, applicable law and jurisdiction.

● Cross-border data flow raises issues of potentially conflicting regulations and of which regulations are applicable.

Involvement of multiple parties

● Cloud services usually involve multiple parties, which allows onus and liability to be shifted from one party to another. Liability and responsibility of sub-contractors is often limited or disclaimed entirely.

● Contractual privity is often lacking between the parties, which makes it difficult for the client to hold the provider liable for a breach.

● Agreements should include the provider's liability for the acts of its sub-contractors.

● The customer should be given the right to conduct due diligence and to understand the model of delivery of services.

Privacy and Security



● Multi-tenant architecture

● Data from different users are usually stored on a single virtual server

● Multiple virtual servers run on a single physical server

● Data security depends upon the integrity of the virtualization layer

Service Level Agreements

● Cloud services are usually provided on standard service level agreements which
are usually non-negotiable.

● Even if the SLA itself cannot be negotiated, a higher degree of reporting should be integrated into the agreement.

● Additional options for termination should be provided.

Issues with Service Level Agreements

● Standard mass market contracting terms are used

● Non-negotiable (often click through)

● Little opportunity to conduct due diligence

● Strong limits on liability (including direct liability)

● Terms often subject to change with little or no notice

● Risk is generally shifted to user through provider friendly agreements

Audit Trail

● As data moves and flows continuously within cloud services, the client should have the right to know where and by whom its data is stored, accessed, transferred and altered.

● Confirm whether or not the vendor provides audit trail rights.

Exit Issues

● In case a user has to change provider in the future the options for portability and
interoperability are critical issues to be considered.

● In case of exit can the records be successfully accessed?



● Can data be extracted from the cloud?

● Obligations of each party in case of exit.

Hacking of cloud vendor

● In the event that the cloud vendor's system is hacked, does the owner of the data have the right to move against the vendor to claim lost profits?

Jurisdictional Issues

● In cloud services the location of data is usually uncertain. The owner of the data is not aware of the country where the data is stored. The physical location of the data raises the question of governing law and jurisdiction. It is important to be aware of the prevailing law in that particular nation.

● If a dispute arises, what will be the place of jurisdiction? The owner of the data should be aware of which country's court system will govern any conflict that arises between the parties.

● For example, the owner is based in India and the cloud service provider is based in the US. The vendor would prefer the jurisdiction of an American court, but can the owner afford to contest the matter in an American court?

Q.1.b) What is cloud? Explain the advantages and disadvantages of cloud computing. 6 Marks

ANSWER:

Cloud computing is a type of Internet-based computing that provides shared computer processing resources and data to computers and other devices on demand. It is a model for enabling ubiquitous, on-demand access to a shared pool of configurable computing resources (e.g., computer networks, servers, storage, applications and services), which can be rapidly provisioned and released with minimal management effort.

Advantages:

Easy implementation: Cloud hosting allows businesses to retain the same applications and business processes without having to deal with the backend technicalities. Readily managed over the Internet, a cloud infrastructure can be accessed by enterprises easily and quickly.

Accessibility: Access your data anywhere, anytime. An Internet cloud infrastructure maximizes enterprise productivity and efficiency by ensuring your application is always accessible. This allows for easy collaboration and sharing among users in multiple locations.

No hardware required: Since everything will be hosted in the cloud, a physical storage center is no longer needed. However, a backup could be worth looking into in the event of a disaster that could leave your company's productivity stagnant.

Cost per head: Overhead technology costs are kept at a minimum with cloud hosting
services, enabling businesses to use the extra time and resources for improving the
company infrastructure.

Flexibility for growth: The cloud is easily scalable so companies can add or subtract
resources based on their needs. As companies grow, their system will grow with
them.

Efficient recovery: Cloud computing delivers faster and more accurate retrievals of
applications and data. With less downtime, it is the most efficient recovery plan.

Disadvantages:

No longer in control: When moving services to the cloud, you are handing over your data and information. Companies that have an in-house IT staff will be unable to handle issues on their own. However, Stratosphere Networks has a 24/7 live help desk that can rectify any problems immediately.

May not get all the features: Not all cloud services are the same. Some cloud
providers tend to offer limited versions and enable the most popular features only, so
you may not receive every feature or customization you want. Before signing up,
make sure you know what your cloud service provider offers.

Doesn't mean you should do away with servers: You may have fewer servers to
handle which means less for your IT staff to handle, but that doesn't mean you can let
go of all your servers and staff. While it may seem costly to have data centers and a
cloud infrastructure, redundancy is key for backup and recovery.

No Redundancy: A cloud server is not redundant, nor is it backed up. As technology may fail here and there, avoid getting burned by purchasing a redundancy plan. Although it is an extra cost, in most cases it will be well worth it.

Bandwidth issues: For ideal performance, clients have to plan accordingly and not
pack large amounts of servers and storage devices into a small set of data centers.

Q.2.a) Discuss the challenges in cloud computing. 6 Marks

ANSWER:



Cloud computing challenges have always been there. Companies are increasingly
aware of the business value that cloud computing brings and are taking steps towards
transition to the cloud. A smooth transition entails a thorough understanding of the
benefits as well as challenges involved. Like any new technology, the adoption of
cloud computing is not free from issues. Some of the most important challenges are as
follows.

1. Security and Privacy

The main challenge to cloud computing is how it addresses the security and privacy concerns of businesses thinking of adopting it. The fact that valuable enterprise data will reside outside the corporate firewall raises serious concerns. Hacking and various attacks on cloud infrastructure would affect multiple clients even if only one site is attacked. These risks can be mitigated by using security applications, encrypted file systems, data loss prevention software, and buying security hardware to track unusual behavior across servers.

2. Service Delivery and Billing

It is difficult to assess the costs involved due to the on-demand nature of the services. Budgeting and assessment of the cost will be very difficult unless the provider has some good and comparable benchmarks to offer. The service-level agreements (SLAs) of the provider are not adequate to guarantee availability and scalability. Businesses will be reluctant to switch to the cloud without a strong service quality guarantee.

3. Interoperability and Portability

Businesses should have the leverage of migrating in and out of the cloud and
switching providers whenever they want, and there should be no lock-in period.
Cloud computing services should have the capability to integrate smoothly with the
on-premise IT.

4. Reliability and Availability

Cloud providers still lack round-the-clock service; this results in frequent outages. It is
important to monitor the service being provided using internal or third-party tools. It
is vital to have plans to supervise usage, SLAs, performance, robustness, and business
dependency of these services.



5. Performance and Bandwidth Cost

Businesses can save money on hardware but they have to spend more for the
bandwidth. This can be a low cost for smaller applications but can be significantly
high for the data-intensive applications. Delivering intensive and complex data over
the network requires sufficient bandwidth. Because of this, many businesses are
waiting for a reduced cost before switching to the cloud.

Q.2.b) Compare and contrast cloud computing, cluster computing and grid computing. 7 Marks

ANSWER:



Q.3.a) What is the role of network in cloud? List and explain protocols used in cloud. 7 Marks

ANSWER:

Protocols used in cloud:

1. Gossip Protocol

● It is a communication protocol.

● Also referred to as an epidemic protocol.

● It is a dissemination protocol used to spread information; it basically works by flooding the information through agents in the network.
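
To make the dissemination idea concrete, a minimal sketch of a push-style gossip round is given below in C# (the language used elsewhere in this solution). The node count, fanout and class name are illustrative assumptions only, not part of any particular cloud platform.

using System;
using System.Collections.Generic;
using System.Linq;

// Minimal push-style gossip simulation: each round, every informed node
// forwards the information to a few randomly chosen peers (the "fanout").
class GossipDemo
{
    static void Main()
    {
        const int nodeCount = 20, fanout = 2;
        var rng = new Random(42);
        var informed = new HashSet<int> { 0 };   // node 0 starts with the information
        int round = 0;

        while (informed.Count < nodeCount)
        {
            // every currently informed node pushes to 'fanout' random peers
            foreach (int sender in informed.ToList())
                for (int k = 0; k < fanout; k++)
                    informed.Add(rng.Next(nodeCount));
            round++;
            Console.WriteLine($"Round {round}: {informed.Count}/{nodeCount} nodes informed");
        }
    }
}
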
2. Connectionless Network Protocol (CLNP)

● Works at layer 3 (the network layer) of the OSI model.

● Provides a mechanism for fragmentation (data unit identification, data length and offset address).

● Very similar to IP, but the basic difference is that the CLNP address size is 20 bytes, compared to 4 bytes for IP.

3. State Routing Protocol (SRP)

● Routers communicate with each other; a routing algorithm is used to choose the path along which information is routed.

● Examples of such protocols are IP and IPX.

● Routing Information Protocol (RIP) is used to exchange path/routing information.

4. Internet Group Management Protocol (IGMP)

● A communication protocol used to multicast data to the nodes in a network via routers.

● Can be used for streaming video and gaming over the cloud.

● It operates at the network layer, just like other management protocols such as ICMP.

● Example: watching online video over the cloud.

5. Secure Shell Protocol (SSH)

● A cryptographic network protocol that allows secure remote login over the internet.

● Its advantage is remote login with encryption and secure access to information.

● Two versions exist: SSH-1 and SSH-2.

● The decryption algorithm resides on the remote server for decryption of the user ID and password.

● It replaced Telnet, rlogin and RSH because it encrypts the login ID and password at login time.

6. Converged Enhanced Ethernet Protocol (CEE)

● It addresses network traffic and packet loss issues.

● Packets can be lost when new packets arrive at a congested switch, causing data loss; CEE is a solution to this problem.

● Advantages – handles packet traffic at the data-link layer, lower cost, and buffering of packets.

7. Extensible Messaging & Presence Protocol (XMPP)

● Used for publish-subscribe systems, and for video and file transfer in the cloud.

● Developed by the Jabber open-source community in 1999; it is a free and open protocol.

● In December 2011, Microsoft released an XMPP interface for its messenger service.

8. Advanced Message Queuing Protocol (AMQP)

● Messages/information are routed in a point-to-point manner over the cloud.

● It is a wire-level protocol that provides a description format for the data sent on the network.

● Provides guarantees of message delivery and works at the application layer.

● Used in the cloud and nowadays adopted by Red Hat, Microsoft, Apache, etc.

9. Enhanced Interior Gateway Routing Protocol (EIGRP)

● It replaced IGRP in 1993 because IGRP did not support classless IPv4 addressing, which EIGRP supports.

● If a router does not have a valid path to the destination, it discards the packet and dynamically creates a new path.

● Features – supports load balancing over parallel links.

10. Media Transfer Protocol (MTP)

● Transfers media files, audio files and metadata to and from portable devices over the cloud.

● It is based on PTP (Picture Transfer Protocol), which is used to transfer media files.

● Used for downloading photographs from the cloud.

● Disadvantage – it is not suited to transmitting video files, for which AVTP is used as a solution.

● It allows initiators to identify the capabilities of a device with respect to file formats and functionality.

● MTP is part of Windows Media Player and is used across the Microsoft Windows series.

Q.3.b) Explain deployment models in cloud computing. 6 Marks

ANSWER:

Public Cloud
Public Cloud allows systems and services to be easily accessible to the general public. IT giants such as Google, Amazon and Microsoft offer cloud services via the Internet.
Benefits
There are many benefits of deploying cloud as a public cloud model:

Cost Effective
Since the public cloud shares the same resources with a large number of customers, it turns out to be inexpensive.

Reliability
The public cloud employs a large number of resources from different locations. If any of the resources fails, the public cloud can employ another one.

Flexibility
The public cloud can smoothly integrate with private cloud, which gives customers a
flexible approach.
Location Independence
Public cloud services are delivered through Internet, ensuring location
independence.

Utility Style Costing


Public cloud is also based on pay-per-use model and resources are accessible
whenever customer needs them.

High Scalability
Cloud resources are made available on demand from a pool of resources, i.e., they can be scaled up or down according to the requirement.

Disadvantages
Here are some disadvantages of public cloud model:

Low Security
In the public cloud model, data is hosted off-site and resources are shared publicly; therefore, it does not ensure a high level of security.

Less Customizable
It is comparatively less customizable than private cloud.

Private Cloud
Private Cloud allows systems and services to be accessible within an organization. The Private Cloud is operated only within a single organization. However, it may be managed internally by the organization itself or by a third party.
Benefits
There are many benefits of deploying cloud as a private cloud model:

High Security and Privacy
Private cloud operations are not available to the general public and resources are shared from a distinct pool of resources. Therefore, it ensures high security and privacy.

More Control
The private cloud has more control on its resources and hardware than public cloud
because it is accessed only within an organization.

Cost and Energy Efficiency


The private cloud resources are not as cost effective as resources in public clouds
but they offer more efficiency than public cloud resources.

Disadvantages
Here are the disadvantages of using private cloud model:



Restricted Area of Operation
The private cloud is only accessible locally and is very difficult to deploy globally.

High Priced
Purchasing new hardware in order to fulfill the demand is a costly transaction.

Limited Scalability
The private cloud can be scaled only within capacity of internal hosted resources.

Additional Skills
In order to maintain cloud deployment, organization requires skilled expertise.

Hybrid Cloud
Hybrid Cloud is a mixture of public and private cloud. Non-critical activities are performed using the public cloud while critical activities are performed using the private cloud.
Benefits
There are many benefits of deploying cloud as a hybrid cloud model:

Scalability
It offers features of both, the public cloud scalability and the private cloud scalability.

Flexibility
It offers secure resources and scalable public resources.

Cost Efficiency
Public clouds are more cost effective than private ones. Therefore, hybrid clouds can
be cost saving.

Security
The private cloud in hybrid cloud ensures higher degree of security.

Disadvantages
Networking Issues
Networking becomes complex due to presence of private and public cloud.

Security Compliance
It is necessary to ensure that cloud services are compliant with security policies of
the organization.

Infrastructure Dependency
The hybrid cloud model is dependent on internal IT infrastructure, therefore it is
necessary to ensure redundancy across data centers.



Community Cloud
Community Cloud allows systems and services to be accessible by a group of organizations. It shares the infrastructure between several organizations from a specific community. It may be managed internally by the organizations or by a third party.
Benefits
There are many benefits of deploying cloud as community cloud model.

Cost Effective
Community cloud offers same advantages as that of private cloud at low cost.

Sharing Among Organizations


Community cloud provides an infrastructure to share cloud resources and
capabilities among several organizations.

Security
The community cloud is comparatively more secure than the public cloud but less
secured than the private cloud.

Issues
➢ Since all data is located at one place, one must be careful in storing data in
community cloud because it might be accessible to others.

➢ It is also challenging to allocate responsibilities of governance, security and cost among the organizations.

Q.4.a) Explain cloud computing architecture and cloud components. 7 Marks

ANSWER:

Cloud computing architecture comprises many cloud components, which are loosely coupled. We can broadly divide the cloud architecture into two parts:

 Front End
 Back End
Each of the ends is connected through a network, usually the Internet.



Front End
The front end refers to the client part of the cloud computing system. It consists of interfaces and applications that are required to access the cloud computing platforms, for example a web browser.

Back End
The back end refers to the cloud itself. It consists of all the resources required to provide cloud computing services. It comprises huge data storage, virtual machines, security mechanisms, services, deployment models, servers, etc.

Cloud infrastructure consists of servers, storage devices, network, cloud management software, deployment software, and platform virtualization.

Hypervisor
Hypervisor is a firmware or low-level program that acts as a Virtual Machine Manager. It allows a single physical instance of cloud resources to be shared between several tenants.

Management Software
It helps to maintain and configure the infrastructure.

Deployment Software
It helps to deploy and integrate the application on the cloud.

Network
It is the key component of cloud infrastructure. It allows cloud services to be connected over the Internet. It is also possible to deliver the network as a utility over the Internet, which means the customer can customize the network route and protocol.

Server
The server helps in computing resource sharing and offers other services such as resource allocation and de-allocation, monitoring of resources, and providing security.

Storage
The cloud keeps multiple replicas of storage. If one of the storage resources fails, the data can be retrieved from another one, which makes cloud computing more reliable.

Q.4.b) List and explain three service models of cloud computing. 6 Marks

ANSWER:

Infrastructure as a Service (IAAS)

Infrastructure as a Service (IAAS) is a form of cloud computing that provides virtualized computing resources over the internet. In an IAAS model, a third-party provider hosts hardware, software, servers, storage and other infrastructure components on behalf of its users. IAAS providers also host users' applications and handle tasks including system maintenance, backup and resiliency planning. IAAS platforms offer highly scalable resources that can be adjusted on demand, which makes them well-suited for workloads that are temporary, experimental or change unexpectedly. Other characteristics of IAAS environments include the automation of administrative tasks, dynamic scaling, desktop virtualization and policy-based services.



Technically, the IaaS market has a relatively low barrier of entry, but it may require substantial financial investment in order to build and support the cloud infrastructure. Mature open-source cloud management frameworks like OpenStack are available to everyone, and provide a strong software foundation for companies that want to build their private cloud or become a public cloud provider.

Platform as a Service (PAAS)

Platform as a Service (PAAS) is a cloud computing model that delivers applications over the internet. In a PAAS model, a cloud provider delivers hardware and software tools, usually those needed for application development, to its users as a service. A PAAS provider hosts the hardware and software on its own infrastructure. As a result, PAAS frees users from having to install in-house hardware and software to develop or run a new application.
PAAS doesn't replace a business' entire infrastructure; instead, a business relies on PAAS providers for key services, such as Java development or application hosting. A PAAS provider, however, supports all the underlying computing and software; users only need to log in and start using the platform, usually through a Web browser interface. PAAS providers then charge for that access on a per-use or monthly basis.
Some of the main characteristics of PAAS are :
 Scalability and auto-provisioning of the underlying infrastructure.
 Security and redundancy.
 Build and deployment tools for rapid application management and
deployment.
 Integration with other infrastructure components such as web services,
databases, and LDAP.
 Multi-tenancy, platform service that can be used by many concurrent users.
 Logging, reporting, and code instrumentation.
 Management interfaces and/or API.

Software as a Service (SAAS)

Software as a Service (SAAS) is a software distribution model in which applications are hosted by a vendor or service provider and made available to customers over a network, typically the Internet. SAAS has become an increasingly prevalent delivery model as underlying technologies that support Web services and service-oriented architecture (SOA) mature and new development approaches, such as Ajax, become popular. SAAS is closely related to the ASP (Application Service Provider) and on-demand computing software delivery models. IDC identifies two slightly different delivery models for SAAS, namely the hosted application model and the software development model.
Some of the core benefits of using SAAS model are:
 Easier administration.
 Automatic updates and patch management.
 Compatibility: all users will have the same version of software.
 Easier collaboration, for the same reason.
 Global accessibility.

Q.5.a) What is the use and need of big data? Explain its characteristics. 7 Marks

ANSWER:

Use and Need of big data

● When big data is effectively and efficiently captured, processed, and analyzed,
companies are able to gain a more complete understanding of their business,
customers, products, competitors, etc. which can lead to efficiency improvements,
increased sales, lower costs, better customer service, and/or improved products
and services.

● For example: Manufacturing companies deploy sensors in their products to return a stream of telemetry. Sometimes this is used to deliver services like OnStar, which delivers communications, security and navigation services.

● Perhaps more importantly, this telemetry also reveals usage patterns, failure
rates and other opportunities for product improvement that can reduce
development and assembly costs.

● The proliferation of smart phones and other GPS devices offers advertisers an
opportunity to target consumers when they are in close proximity to a store, a
coffee shop or a restaurant. This opens up new revenue for service providers and
offers many businesses a chance to target new customers.



● Retailers usually know who buys their products. Use of social media and web log
files from their ecommerce sites can help them understand who didn’t buy and
why they chose not to, information not available to them today.

● This can enable much more effective micro customer segmentation and targeted
marketing campaigns, as well as improve supply chain efficiencies.

● Other widely-cited examples of the effective use of big data exist in the following
areas:

■ Using information technology (IT) logs to improve IT troubleshooting and security breach detection, speed, effectiveness, and future occurrence prevention.

■ Use of voluminous historical call center information more quickly, in order to improve customer interaction and satisfaction.

■ Use of social media content in order to better and more quickly understand customer sentiment about you/your customers, and improve products, services, and customer interaction.

■ Fraud detection and prevention in any industry that processes financial transactions online, such as shopping, banking, investing, insurance and health care claims.

■ Use of financial market transaction information to more quickly assess risk and take corrective action.

Characteristics Of 'Big Data'

(i) Volume – The name 'Big Data' itself is related to a size which is enormous. The size of data plays a very crucial role in determining the value of data. Also, whether particular data can actually be considered Big Data or not depends upon the volume of data. Hence, 'Volume' is one characteristic which needs to be considered while dealing with 'Big Data'.

(ii)Variety – The next aspect of 'Big Data' is its variety.

Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. During earlier days, spreadsheets and databases were the only sources of data considered by most applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, etc. is also being considered in analysis applications. This variety of unstructured data poses certain issues for storage, mining and analysing data.



(iii)Velocity – The term 'velocity' refers to the speed of generation of data. How fast
the data is generated and processed to meet the demands, determines real potential
in the data.

Big Data Velocity deals with the speed at which data flows in from sources like
business processes, application logs, networks and social media sites,
sensors, Mobile devices, etc. The flow of data is massive and continuous.

(iv)Variability – This refers to the inconsistency which can be shown by the data at
times, thus hampering the process of being able to handle and manage the data
effectively.

Q.5.b) What is Hadoop? Why do we need it? How does it differ from RDBMS? 7 Marks

ANSWER: Hadoop is an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
Hadoop makes it possible to run applications on systems with thousands of commodity hardware nodes, and to handle thousands of terabytes of data.

Need of Hadoop
Hadoop makes data sharing easy with its high sharing ability:

Organizations use big data to improve the functionality of each and every business unit. This includes research, design, development, marketing, advertising, sales and customer handling. Such data is difficult to share across different platforms. Hadoop is used to create a data pond, a repository of data from various intrinsic or extrinsic sources.

Continuity and Stability:

Data is generated continuously, be it from your social media presence, mobile platforms or other related services. These activities generate data every second and the volume is huge. The solutions need to scale quickly, and in a cost-effective and secure manner.

Hadoop supports Advanced Analytics:

As compared to traditional tools, Hadoop provides more accurate facts and figures. Hadoop supports advanced features like data visualization and predictive analytics in order to provide and represent useful insights in a clear graphical manner. It can help to optimize performance using a single server and handle huge volumes of information.

Hadoop is considered affordable for both enterprise and small business which makes
it an attractive solution with endless potential. With the passage of time, companies
and enterprises are getting closer to Hadoop. They are moving to implement big data
to support the marketing and other efforts and resources.

Q.6.a) Explain in steps, how Hadoop is configured 7 Marks

ANSWER:

Q.6.b) What is MapReduce? Explain how a MapReduce program is executed. 7 Marks

ANSWER:

What is MapReduce?
MapReduce is a processing technique and a program model for distributed
computing based on java. The MapReduce algorithm contains two important tasks,
namely Map and Reduce. Map takes a set of data and converts it into another set of
data, where individual elements are broken down into tuples (key/value pairs).
Secondly, reduce task, which takes the output from a map as an input and combines
those data tuples into a smaller set of tuples. As the sequence of the name
MapReduce implies, the reduce task is always performed after the map job.

MapReduce programs work in two phases:

1. Map phase

2. Reduce phase.



Input to each phase are key-value pairs. In addition, every programmer needs to
specify two functions: map function and reduce function.

The data goes through following phases

Input Splits:

Input to a MapReduce job is divided into fixed-size pieces called input splits. An input split is a chunk of the input that is consumed by a single map task.

Mapping

This is the very first phase in the execution of a map-reduce program. In this phase, the data in each split is passed to a mapping function to produce output values. In our example, the job of the mapping phase is to count the number of occurrences of each word from the input splits (described above) and prepare a list in the form of <word, frequency>.

Shuffling

This phase consumes the output of the Mapping phase. Its task is to consolidate the relevant records from the Mapping phase output. In our example, the same words are clubbed together along with their respective frequency.

Reducing

In this phase, output values from Shuffling phase are aggregated. This phase combines
values from Shuffling phase and returns a single output value. In short, this phase
summarizes the complete dataset.

How MapReduce works

Let's understand this with an example.

Consider you have the following input data for your MapReduce program:

Welcome to Hadoop Class

Hadoop is good

Hadoop is bad



The final output of the MapReduce task is

bad 1

Class 1

good 1

Hadoop 3

is 2

to 1

Welcome 1
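
The same word-count flow can be sketched in C# (the language used elsewhere in this solution) using LINQ. This is only a conceptual illustration of the map, shuffle and reduce phases, not the actual Hadoop API; the class and variable names are assumed for the example.

using System;
using System.Collections.Generic;
using System.Linq;

class WordCountDemo
{
    static void Main()
    {
        string[] splits =              // the three input splits from the example above
        {
            "Welcome to Hadoop Class",
            "Hadoop is good",
            "Hadoop is bad"
        };

        // Map phase: each split is turned into <word, 1> pairs
        var mapped = splits.SelectMany(split =>
            split.Split(' ').Select(word => new KeyValuePair<string, int>(word, 1)));

        // Shuffle phase: pairs with the same key are grouped together
        var shuffled = mapped.GroupBy(pair => pair.Key);

        // Reduce phase: the values of each group are summed into a single output value
        foreach (var group in shuffled.OrderBy(g => g.Key, StringComparer.OrdinalIgnoreCase))
            Console.WriteLine($"{group.Key} {group.Sum(pair => pair.Value)}");
    }
}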

Q.7.a) Explain security concerns in cloud computing. 7 Marks

ANSWER:
1. Data Breaches

Cloud computing and services are relatively new, yet data breaches in all forms have
existed for years. The question remains: “With sensitive data being stored online
rather than on premise, is the cloud inherently less safe?”

A study conducted by the Ponemon Institute entitled “Man In Cloud Attack” reports
that over 50 percent of the IT and security professionals surveyed believed their
organization’s security measures to protect data on cloud services are low. This study
used nine scenarios, where a data breach had occurred, to determine if that belief was
founded in fact.

2. Hijacking of Accounts

The growth and implementation of the cloud in many organizations has opened a
whole new set of issues in account hijacking.

Attackers now have the ability to use your (or your employees’) login information to
remotely access sensitive data stored on the cloud; additionally, attackers can falsify
and manipulate information through hijacked credentials.

3. Insider Threat

An attack from inside your organization may seem unlikely, but the insider threat
does exist. Employees can use their authorized access to an organization’s
cloud-based services to misuse or access information such as customer accounts,
financial forms, and other sensitive information.

4. Malware Injection

Malware injections are scripts or code embedded into cloud services that act as “valid
instances” and run as SaaS to cloud servers. This means that malicious code can be
injected into cloud services and viewed as part of the software or service that is
running within the cloud servers themselves.

5. Abuse of Cloud Services

The expansion of cloud-based services has made it possible for both small and enterprise-level organizations to host vast amounts of data easily. However, the cloud's unprecedented storage capacity has also allowed both hackers and authorized users to easily host and spread malware, illegal software, and other digital properties.

6. Insecure APIs

Application Programming Interfaces (API) give users the opportunity to customize their cloud experience.

However, APIs can be a threat to cloud security because of their very nature. Not only
do they give companies the ability to customize features of their cloud services to fit
business needs, but they also authenticate, provide access, and effect encryption.

7. Denial of Service Attacks

Unlike other kinds of cyberattacks, which are typically launched to establish a long-term foothold and hijack sensitive information, denial of service assaults do not attempt to breach your security perimeter. Rather, they attempt to make your website and servers unavailable to legitimate users. In some cases, however, DoS is also used as a smokescreen for other malicious activities, and to take down security appliances such as web application firewalls.

8. Insufficient Due Diligence

Most of the issues we’ve looked at here are technical in nature, however this
particular security gap occurs when an organization does not have a clear plan for its
goals, resources, and policies for the cloud. In other words, it’s the people factor.

9. Shared Vulnerabilities

Cloud security is a shared responsibility between the provider and the client.

This partnership between client and provider requires the client to take preventative
actions to protect their data. While major providers like Box, Dropbox, Microsoft, and
Google do have standardized procedures to secure their side, fine grain control is up
to you, the client.

10. Data Loss

Data on cloud services can be lost through a malicious attack, natural disaster, or a data wipe by the service provider. Losing vital information can be devastating to businesses that don't have a recovery plan. Amazon is an example of an organization that suffered data loss by permanently destroying many of its own customers' data in 2011.

Q.7.b) Explain how data security is maintained in cloud computing. 7 Marks

ANSWER:



Q.8.a) Explain the importance of Authentication and Authorization in cloud computing. 7 Marks

ANSWER:

Q.8.b) Explain various technologies used for data security in cloud computing. 6 Marks

ANSWER:
Computer and network security is fundamentally about three goals/objectives:
-- confidentiality (C)
-- integrity (I), and
-- availability (A).

Confidentiality refers to keeping data private. Privacy is of the utmost importance as data leaves the borders of the organization. Not only must internal secrets and sensitive personal data be safeguarded, but metadata and transactional data can also leak important details about firms or individuals. Confidentiality is supported by, among other things, technical tools such as encryption and access control, as well as legal protections.

Integrity is the degree of confidence that the data in the cloud is what it is supposed to be, and is protected against accidental or intentional alteration without authorization. It also extends to the hurdles of synchronizing multiple databases. Integrity is supported by well-audited code, well-designed distributed systems, and robust access control mechanisms.

Availability means being able to use the system as anticipated. Cloud technologies can increase availability through widespread internet-enabled access, but the client is dependent on the timely and robust provision of resources. Availability is supported by capacity building and good architecture by the provider, as well as well-defined contracts and terms of agreement.

Latest technologies used in data security in cloud computing:

In order to address the aforementioned challenges, Fujitsu Laboratories developed new cloud information gateway technology that can flexibly control data, including data content, transmitted from the inside of a company to a cloud and between multiple clouds.

In addition to the option of blocking confidential data, the data gateway also includes
the following three features.

1. Data Masking Technology

2. Secure Logic Migration and Execution Technology

3. Data Traceability Technology

Data Masking Technology :

● Data masking is a technique that is intended to remove all identifiable and distinguishing characteristics from data in order to render it anonymous and yet still be operable.
● This technique is aimed at reducing the risk of exposing sensitive information.

● Data masking has also been known by such names as data obfuscation,
de-identification, or depersonalization.

● Using masking technology, when data passes through the information gateway,
confidential parts of the data can be deleted or changed before the data are
transmitted to an external cloud.
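
As an illustration only, the following C# sketch masks part of an e-mail address before a record would leave for an external cloud. The field and masking rule are assumptions for the example, not Fujitsu's actual gateway implementation.

using System;
using System.Text.RegularExpressions;

class MaskingDemo
{
    // Replace all but the first character of the local part of an e-mail address,
    // so the record stays usable for analysis but no longer identifies the person.
    static string MaskEmail(string email) =>
        Regex.Replace(email, @"(?<=^.).*?(?=@)", m => new string('*', m.Length));

    static void Main()
    {
        string record = "alice.kumar@example.com";   // assumed sample record
        Console.WriteLine(MaskEmail(record));         // prints a**********@example.com
    }
}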

Secure Logic Migration and Execution Technology:

● For confidential data that cannot be released outside of the company, even in a form where certain aspects of the data are concealed, the information gateway can, simply on the basis of the defined security level of the data, transfer the cloud-based application to the in-house sandbox for execution.

● The sandbox will block access to data or networks that lack pre-authorized access,
so even applications transferred from the cloud can be safely executed.

Data Traceability Technology :

● The information gateway tracks all information flowing into and out of the cloud,
so these flows and their content can be checked.

● Data traceability technology uses the logs obtained on data traffic as well as the
characteristics of the related text to make visible the data used in the cloud.

Authentication and Identity:

● Maintaining confidentiality, integrity, and availability for data security is a function of the correct application and configuration of familiar network, system, and application security mechanisms at various levels in the cloud infrastructure.

● Authentication of users takes several forms, but all are based on a combination of
authentication factors: something an individual knows (such as a password),
something they possess (such as a security token), or some measurable quality
that is intrinsic to them (such as a fingerprint).

Q.9.a) List and explain features of C#.NET. 7 Marks

ANSWER:



1. SIMPLE

1. Pointers are missing in C#.

2. Unsafe operations such as direct memory manipulation are not allowed.

3. In C# there is no usage of "::" or "->" operators.

4. Since it's on .NET, it inherits the features of automatic memory management and garbage collection.

5. Varying ranges of the primitive types like Integer,Floats etc.

6. Integer values of 0 and 1 are no longer accepted as boolean values. Boolean values are pure true or false values in C#, so there are no more errors from mixing up the "=" operator and the "==" operator.

7. "==" is used for comparison operation and "=" is used for assignment
operation.

2. MODERN

1. C# is based on current trends and is very powerful and simple for building interoperable, scalable, robust applications.

2. C# includes built in support to turn any component into a web service that can
be invoked over the internet from any application running on any platform.

3. OBJECT ORIENTED

1. C# supports data encapsulation, inheritance, polymorphism and interfaces.

2. Primitive types (int, float, double) are not objects in Java, but C# introduces structures (structs) which enable the primitive types to be treated as objects.

int i = 1;
string a = i.ToString(); // conversion
object o = i;            // boxing

4. TYPE SAFE

1. In C# we cannot perform unsafe casts like converting a double to a boolean.

2. Value types (primitive types) are initialized to zeros and reference types (objects and classes) are initialized to null by the compiler automatically.

3. Arrays are zero-base indexed and are bounds checked.

4. Overflow of types can be checked.

5. INTEROPERABILITY

1. C# includes native support for the COM and windows based applications.

2. Allows restricted use of native pointers.

3. Users no longer have to explicitly implement the IUnknown and other COM interfaces; those features are built in.

4. C# allows the users to use pointers as unsafe code blocks to manipulate your
old code.

5. Components from VB.NET and other managed-code languages can directly be used in C#.

6. SCALABLE AND UPDATEABLE

1. .NET has introduced assemblies, which are self-describing by means of their manifest. The manifest establishes the assembly identity, version, culture, digital signature, etc. Assemblies need not be registered anywhere.

2. To scale our application we delete the old files and update them with new ones; there is no registering of dynamic link libraries.

3. Updating software components is an error-prone task, since revisions made to the code can affect the existing program. C# supports versioning in the language. Native support for interfaces and method overriding enables complex frameworks to be developed and evolved over time.

7. Abstraction in C#

The word abstract means a concept or an idea not associated with any specific
instance. In programming we apply the same meaning of abstraction by making
classes not associated with any specific instance. The abstraction is done when we
need to only inherit from a certain class, but do not need to instantiate objects of that
class. In such case the base class can be regarded as "Incomplete". Such classes are
known as an "Abstract Base Class".

8. Encapsulation in C#



Object-oriented programming can give a very unnatural impression to a programmer with a lot of procedural programming experience. In object-oriented programming, encapsulation comes first. Encapsulation is the procedure of covering up data and functions into a single unit (called a class). An encapsulated object is often called an abstract data type.

9. Inheritance in C#

Inheritance is one of the three foundational principles of Object-Oriented


Programming (OOP) because it allows the creation of hierarchical classifications.
Using inheritance you can create a general class that defines traits common to a set of
related items. This class can then be inherited by other, more specific classes, each
adding those things that are unique to it.

In the language of C#, a class that is inherited is called a base class. The class that does
the inheriting is called the derived class. Therefore a derived class is a specialized
version of a base class. It inherits all of the variables, methods, properties, and
indexers defined by the base class and adds its own unique elements.
10. Polymorphism in C#

Polymorphism means the same operation may behave differently on different classes.

 Example of Compile Time Polymorphism: Method Overloading

 Example of Run Time Polymorphism: Method Overriding

 Method Overloading: A method with the same name but with different arguments is called method overloading. Method overloading forms compile-time polymorphism.
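
A short sketch of both forms is given below; the Shape and Circle classes are assumed purely for illustration.

using System;

class Shape
{
    public virtual double Area() => 0;                 // to be overridden (run-time polymorphism)
    public void Describe() => Console.WriteLine("A shape with area " + Area());
    public void Describe(string name) =>               // overload (compile-time polymorphism)
        Console.WriteLine(name + " with area " + Area());
}

class Circle : Shape
{
    private readonly double r;
    public Circle(double radius) { r = radius; }
    public override double Area() => Math.PI * r * r;  // overriding the base implementation
}

class PolymorphismDemo
{
    static void Main()
    {
        Shape s = new Circle(2.0);   // base-class reference, derived-class object
        s.Describe();                // calls the overridden Circle.Area()
        s.Describe("Circle");        // resolved at compile time by the argument list
    }
}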

11. Exception handling in C#

Exception handling is a built-in mechanism in the .NET framework to detect and handle run-time errors. The .NET framework contains many standard exceptions. Exceptions are anomalies that occur during the execution of a program. They can be because of user, logic or system errors. If a user (programmer) does not provide a mechanism to handle these anomalies, the .NET run-time environment provides a default mechanism that terminates the program execution.

C# provides the three keywords try, catch and finally to do exception handling. The
try block encloses the statements that might throw an exception whereas catch
handles an exception if one exists. The finally can be used for doing any clean-up
process.
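
A minimal sketch of the three keywords in use follows; the divide-by-zero scenario is assumed only for illustration.

using System;

class ExceptionDemo
{
    static void Main()
    {
        int[] values = { 10, 0 };
        try
        {
            int result = values[0] / values[1];      // throws DivideByZeroException
            Console.WriteLine(result);
        }
        catch (DivideByZeroException ex)             // handles this specific anomaly
        {
            Console.WriteLine("Cannot divide by zero: " + ex.Message);
        }
        finally                                      // always runs, used for clean-up
        {
            Console.WriteLine("Calculation attempted.");
        }
    }
}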



Q.9.b) What is dataset in C#? How to declare dataset in C#? Explain important property of dataset. 7 Marks

ANSWER:

DataSet:

The dataset represents a subset of the database. It does not have a continuous
connection to the database. To update the database a reconnection is required. The
DataSet contains DataTable objects and DataRelation objects. The DataRelation
objects represent the relationship between two tables.

The DataSet, which is an in-memory cache of data retrieved from a data source, is a
major component of the ADO.NET architecture. The DataSet consists of a collection of
DataTable objects that you can relate to each other with DataRelation objects.

How to Declare Dataset in C#

using System.Data;
DataSet objDS = new DataSet();

● Dataset is mostly used to populate the server control like DataGrid, DataList,
DropDown and datarepeater. You can pass data in binary format means
serialization is also possible using Dataset.

● Before using Dataset it must be connected with some source, once you connected
with the source you can populate any of the control with the filled dataset.
Most Important Property of DataSet
Tables: The most important property of your DataSet; from it you can come to know whether your queries returned something or not. It contains the collection of tables in your dataset, like a one-dimensional array. You can assign a name to each table in your dataset:
if (objDS.Tables[0].Rows.Count > 0)
{
//do some action
}

DataSetName: You can give a name to your dataset, which makes it easy to remember the purpose of the dataset. You can do this in two different ways:
objDS.DataSetName = "EmployeeDS";
DataSet objDS = new DataSet("EmployeeDS");

HasErrors: It is used to check for errors in your DataSet.



if (objDS.HasErrors){ //do some action }

Relations: It is used to define the relationships between different tables. It defines each relationship on the basis of certain keys.
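
A small sketch of defining such a relation in memory is given below; the table and column names are assumed only for illustration.

using System;
using System.Data;

class RelationDemo
{
    static void Main()
    {
        // Build two in-memory tables and relate them on DeptId, purely as an illustration.
        DataSet objDS = new DataSet("EmployeeDS");

        DataTable dept = objDS.Tables.Add("Department");
        dept.Columns.Add("DeptId", typeof(int));
        dept.Columns.Add("DeptName", typeof(string));

        DataTable emp = objDS.Tables.Add("Employee");
        emp.Columns.Add("EmpId", typeof(int));
        emp.Columns.Add("DeptId", typeof(int));

        // The DataRelation links the parent (Department) and child (Employee) tables.
        objDS.Relations.Add("DeptEmployees", dept.Columns["DeptId"], emp.Columns["DeptId"]);

        Console.WriteLine(objDS.Relations.Count);   // 1
    }
}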

Following table shows some important properties of the DataSet class:

Property – Description
CaseSensitive – Indicates whether string comparisons within the data tables are case-sensitive.
Container – Gets the container for the component.
DataSetName – Gets or sets the name of the current data set.
DefaultViewManager – Returns a view of data in the data set.
DesignMode – Indicates whether the component is currently in design mode.
EnforceConstraints – Indicates whether constraint rules are followed when attempting any update operation.
Events – Gets the list of event handlers that are attached to this component.
ExtendedProperties – Gets the collection of customized user information associated with the DataSet.
HasErrors – Indicates if there are any errors.
IsInitialized – Indicates whether the DataSet is initialized.
Locale – Gets or sets the locale information used to compare strings within the table.
Namespace – Gets or sets the namespace of the DataSet.
Prefix – Gets or sets an XML prefix that aliases the namespace of the DataSet.
Relations – Returns the collection of DataRelation objects.
Tables – Returns the collection of DataTable objects.

Q.10.a) Write a program in C# to design a calculator as a console-based application. 7 Marks

ANSWER:

using System;

class Program
{
    static void Main(string[] args)
    {
        int num1;
        int num2;
        string operand;
        float answer;

        Console.Write("Please enter the first integer: ");
        num1 = Convert.ToInt32(Console.ReadLine());

        Console.Write("Please enter an operand (+, -, /, *): ");
        operand = Console.ReadLine();

        Console.Write("Please enter the second integer: ");
        num2 = Convert.ToInt32(Console.ReadLine());

        switch (operand)
        {
            case "-":
                answer = num1 - num2;
                break;
            case "+":
                answer = num1 + num2;
                break;
            case "/":
                answer = (float)num1 / num2;
                break;
            case "*":
                answer = num1 * num2;
                break;
            default:
                answer = 0;
                break;
        }

        Console.WriteLine(num1.ToString() + " " + operand + " " + num2.ToString() + " = " + answer.ToString());
        Console.ReadLine();
    }
}

Q.10.b) What is ADO.NET? Explain its architecture in detail. 7 Marks

ANSWER:

ADO.NET provides a bridge between the front end controls and the back end
database. The ADO.NET objects encapsulate all the data access operations and the
controls interact with these objects to display data, thus hiding the details of
movement of data.

ADO.NET Architecture



ADO.NET consists of a set of objects that expose data access services to the .NET environment. It is a data access technology from the Microsoft .NET Framework which provides communication between relational and non-relational systems through a common set of components.

The System.Data namespace is the core of ADO.NET and it contains classes used by all data providers. ADO.NET is designed to be easy to use, and Visual Studio provides several wizards and other features that you can use to generate ADO.NET data access code.

Data Providers and DataSet

The two key components of ADO.NET are Data Providers and DataSet . The Data
Provider classes are meant to work with different kinds of data sources. They are
used to perform all data-management operations on specific databases. DataSet class
provides mechanisms for managing data when it is disconnected from the data
source.

Data Providers

The .Net Framework includes mainly three Data Providers for ADO.NET. They are the
Microsoft SQL Server Data Provider , OLEDB Data Provider and ODBC Data Provider .
SQL Server uses the SqlConnection object , OLEDB uses the OleDbConnection Object
and ODBC uses OdbcConnection Object respectively.




A data provider contains Connection, Command, DataAdapter, and DataReader objects. These four objects provide the functionality of Data Providers in ADO.NET.

Connection

The Connection Object provides physical connection to the Data Source. Connection
object needs the necessary information to recognize the data source and to log on to it
properly, this information is provided through a connection string.


Command

The Command object is used to execute a SQL statement or stored procedure at the Data Source. The command object provides a number of Execute methods that can be used to perform SQL queries in a variety of fashions.


DataReader

The DataReader object provides a stream-based, forward-only, read-only retrieval of query results from the Data Source and does not update the data. The DataReader requires a live connection with the database and provides a very intelligent way of consuming all or part of the result set.


DataAdapter

The DataAdapter object populates a DataSet object with results from a Data Source. It is a special class whose purpose is to bridge the gap between the disconnected DataSet objects and the physical data source.


DataSet



DataSet provides a disconnected representation of result sets from the Data Source, and it is completely independent from the Data Source. DataSet provides much greater flexibility when dealing with related result sets.

A DataSet contains rows, columns, primary keys, constraints, and relations with other DataTable objects. It consists of a collection of DataTable objects that you can relate to each other with DataRelation objects. The DataAdapter object provides a bridge between the DataSet and the Data Source.
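
The typical disconnected pattern can be sketched as follows, assuming SQL Server with the SqlClient provider; the connection string, table and column names are placeholders for illustration only.

using System;
using System.Data;
using System.Data.SqlClient;

class AdoNetDemo
{
    static void Main()
    {
        // Placeholder connection string and query; adjust for a real database.
        string connStr = "Data Source=.;Initial Catalog=CompanyDB;Integrated Security=True";

        using (SqlConnection con = new SqlConnection(connStr))
        {
            // The DataAdapter bridges the data source and the disconnected DataSet.
            SqlDataAdapter adapter = new SqlDataAdapter("SELECT EmpId, EmpName FROM Employee", con);
            DataSet ds = new DataSet();
            adapter.Fill(ds, "Employee");          // opens and closes the connection itself

            foreach (DataRow row in ds.Tables["Employee"].Rows)
                Console.WriteLine(row["EmpId"] + " " + row["EmpName"]);
        }
    }
}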

Q.11.a) Explain Azure life cycle in detail. 7 Marks

ANSWER:

The objective of Windows Azure is to automate the service life cycle as much as
possible. Windows Azure service life cycle has five distinct phases and four different
roles.

Figure : The Windows Azure service life cycle



The five phases are as follows:

Design and development: In this phase, the on-premise team plans, designs,
and develops a cloud service for Windows Azure. The design includes quality
attribute requirements for the service and the solution to fulfill them. This
phase is conducted completely on-premise, unless there is some proof of
concept (POC) involved. The key roles involved in this phase are on-premise
stakeholders. For the sake of simplicity, I have combined these on-site design
roles into a developer role.

Testing: In this phase, the quality attributes of the cloud service are tested.
This phase involves on-premise as well as Windows Azure cloud testing. The
tester role is in charge of this phase and tests end-to-end quality attributes of
the service deployed into cloud testing or staging environment.

Provisioning: Once the application is tested, it can be provisioned to the Windows Azure cloud. The deployer role deploys the cloud service to the Windows Azure cloud. The deployer is in charge of service configurations and makes sure the service definition of the cloud service is achievable through production deployment in the Windows Azure cloud. The configuration settings are defined by the developer, but the production values are set by the deployer. In this phase, the role responsibilities transition from on-premise to the Windows Azure cloud. The fabric controller in Windows Azure assigns the allocated resources as per the service model defined in the service definition. The load balancers and virtual IP addresses are reserved for the service.

Deployment: In the deployment phase, the fabric controller commissions the allocated hardware nodes into the end state and deploys services on these nodes as defined in the service model and configuration. The fabric controller also has the capability of upgrading a service in running state without disruptions. The fabric controller abstracts the underlying hardware commissioning and deployment from the services. The hardware commissioning includes commissioning the hardware nodes, deploying operating system images on these nodes, and configuring switches, access routers, and load-balancers for the externally facing roles (e.g., Web role).

Maintenance: Windows Azure is designed with the assumption that failures will occur in hardware and software. Any service on a failed node is redeployed automatically and transparently, and the fabric controller automatically restarts any failed service roles. The fabric controller allocates new hardware in the event of a hardware failure. Thus, the fabric controller always maintains the desired number of roles irrespective of any service, hardware or operating system failures. The fabric controller also provides a range of dynamic management capabilities, like adding capacity, reducing capacity and service upgrades, without any service disruptions.



Q.11.b) Compare Azure table storage and SQL Database. 6 Marks

ANSWER:

Q.12.b) Explain how Azure maximizes data availability and minimizes security risks. 6 Marks

ANSWER:
Maximizing data availability:

WA provides robust availability based on extensive redundancy achieved with virtualization:
a. Replicated data

Azure provides failover clustering of thrice-replicated data and keeps hosted application instances running when a failure occurs.

b. Geographically distributed data:

Customers can leverage the geographically distributed nature of WA infrastructure by creating a second storage account to provide hot-failover capability. Customers may also write customized roles to extract data from storage for offsite private backup.

Minimizing security risks:

a. Secure socket layer transmission encryption for web roles:

Azure services can enable Transport layer security (TLS) to use secure HTTP protocol
(HTTPS) for transmission of encrypted requests to and responses from production
Hosted services and storage accounts for web roles.

b. Encrypting information in Azure storage services:

.NET 3.5 provides implementations of many standard cryptographic algorithms, including symmetric (shared secret key) and asymmetric (Public Key Infrastructure, PKI) algorithms.
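
As a hedged illustration of this point, the sketch below encrypts a string with the AES implementation from System.Security.Cryptography before it would be written to cloud storage; it is not Azure-specific code, and the sample text is assumed.

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

class EncryptDemo
{
    // Encrypt plaintext with a shared secret key before uploading it to cloud storage.
    static byte[] Encrypt(string plainText, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            using (var ms = new MemoryStream())
            {
                using (var cs = new CryptoStream(ms, aes.CreateEncryptor(), CryptoStreamMode.Write))
                using (var writer = new StreamWriter(cs))
                    writer.Write(plainText);
                return ms.ToArray();
            }
        }
    }

    static void Main()
    {
        using (Aes aes = Aes.Create())              // generates a random key and IV
        {
            byte[] cipher = Encrypt("confidential payroll record", aes.Key, aes.IV);
            Console.WriteLine(Convert.ToBase64String(cipher));
        }
    }
}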

c. Azure’s conformance with SAS70 and ISO/IEC 27001:2005 certifications.

i. Statement on Auditing Standards No. 70 (SAS 70)

It includes the service auditor's opinion on the fairness of presentation of the service organization's description of controls that had been placed in operation and the suitability of the design of the controls to achieve the specified control objectives.

ii. The ISO/IEC 27001:2005 standard

An ISO/IEC 27001-compliant system will provide a systematic approach to ensuring the availability, confidentiality and integrity of corporate information, using controls based on identifying and combating the entire range of potential risks to the organization's information assets.

