What is DevOps?
DevOps stands for Development and Operations. It is a software engineering practice
that focuses on bringing the development team and the operations team together
to automate the project at every stage. This approach helps automate project
service management, supports objectives at the operational level, and improves
understanding of the technology stack used in the production environment.
This practice is closely related to the agile methodology and mainly focuses on team
communication, resource management, and teamwork. The main benefits of
following this structure are faster development, quicker resolution of issues at the
production environment level, more stable applications, and greater room for
innovation.
DevOps Tools
DevOps is a methodology aimed at increased productivity and quality of product
development. The main tools used in this methodology are:
Version Control System tools, e.g., Git
Continuous Integration tools, e.g., Jenkins
Continuous Testing tools, e.g., Selenium
Configuration Management and Deployment tools, e.g., Puppet, Chef, Ansible
Continuous Monitoring tools, e.g., Nagios
Containerization tools, e.g., Docker
These days, the market window of products has shrunk drastically. We see new
products almost daily. This gives consumers a myriad of choices, but it comes at
the cost of heavy competition in the market. Organizations can't afford to release big
features after a long gap. They tend to ship small features as regular releases to
customers so that their products don't get lost in this sea of competition.
Customer satisfaction is now a motto for organizations and has become
the goal of any successful product. To achieve this, companies need to do
the following:
Frequent feature deployments
Reduce time between bug fixes
Reduce failure rate of releases
Quicker recovery time in case of release failures.
In order to achieve the above points, and thereby seamless product
delivery, the DevOps culture acts as a very useful tool. Due to these advantages,
multinational companies like Amazon and Google have adopted the
methodology, which has resulted in improved performance.
It also helps in bringing consistency and improving the product development process
by employing means of design streamlining, extensive documentation, control, and
change implementation during various phases/releases of the project.
The CI process can be briefly summarized as follows.
Developers regularly check out code into their local workspaces and work on the
features assigned to them.
Once they are done working on it, the code is committed and pushed to the
remote shared repository which is handled by making use of effective version
control tools like git.
The CI server keeps track of the changes done to the shared repository and it
pulls the changes as soon as it detects them.
The CI server then triggers the build of the code and runs unit and integration
test cases if set up.
The team is informed of the build results. In case of the build failure, the team
has to work on fixing the issue as early as possible, and then the process repeats.
Doing this drastically speeds up the developer workflow, since no manual
intervention is needed to rebuild the project and run the automated test cases
every time changes are made.
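The build-and-notify loop above can be sketched as a tiny shell helper (the function name and messages are illustrative, not taken from any real CI server):

```shell
#!/bin/sh
# Minimal sketch of what a CI server does per commit: run the build
# command it was given, then report the result to the team.
ci_build() {
    if "$@"; then
        echo "BUILD SUCCESS"
    else
        echo "BUILD FAILURE: notify team"
        return 1
    fi
}
```

A real CI server would substitute the project's actual build and test commands, e.g. `ci_build make test`, and send the result to the team's notification channel.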
This phase allows for automation of code validation, build, and testing. This ensures
that the changes are made properly without development environment errors and
also allows the identification of errors at an initial stage.
Tools like Jenkins, CircleCI, etc. are used here.
Deployment:
DevOps aids in the deployment automation process by making use of tools and
scripts that have the final goal of automating the process by means of feature
activation. Here, cloud services can assist in upgrading from finite infrastructure
management to cost-optimized management with potentially infinite resources.
Tools like Microsoft Azure, Amazon Web Services, Heroku, etc. are used.
Operations:
This phase usually occurs throughout the lifecycle of the product/software due to
dynamic infrastructural changes. It provides the team with opportunities to
increase availability and scalability and to transform the product effectively.
Tools like Loggly, BlueJeans, Appdynamics, etc are used commonly in this phase.
Monitoring:
Monitoring is a permanent phase of the DevOps methodology. This phase is used for
monitoring and analyzing information to know the status of software applications.
Tools like Nagios, Splunk, etc are commonly used.
Agile Methodology
This type of branching is done once a set of features meant for a release is
completed; they are cloned into a branch called the release branch. No
further features are added to this branch.
Only bug fixes, documentation, and release-related activities are done in a
release branch.
Once things are ready, the release branch is merged into the main branch and
tagged with the release version number.
These changes also need to be merged back into the develop branch, which
will have progressed with new feature development.
The branching strategies followed would vary from company to company based on
their requirements and strategies.
17. Can you list down certain KPIs which are used for gauging
the success of DevOps?
KPI stands for Key Performance Indicator. Some of the popular KPIs used for
gauging the success of DevOps are:
Application usage, performance, and traffic
Automated Test Case Pass Percentage.
Application Availability
Change volume requests
Customer tickets
Successful deployment frequency and time
Error/Failure rates
Failed deployments
Mean time to detection (MTTD)
Mean time to recovery (MTTR)
The challenges and needs of each team can be summarized as follows:

Challenges:
Developers tend to focus a lot of time on tooling rather than delivering results.
The Operations team would require uniform technology that can be used easily by groups with different skill sets.
The Quality Assurance team would require keeping track of what has been changed in a feature and when it was changed.

Needs:
Developers need to respond to new features/bugs and scale their efforts based on demand.
The Operations team needs a central governing tool to monitor different systems and their workloads.
The Quality Assurance team needs to focus on reducing the risk of human error as much as possible for a bug-free product.
Work side by side with the development team while creating the deployment
and test case automation. This is the first and most obvious step in achieving shift
left. It is done because of the well-known fact that failures noticed in the
production environment quite often are not seen earlier. These failures
can be linked directly to:
Different deployment procedures used by the development team while
developing their features.
Production deployment procedures sometimes tend to be very different
from the development procedure. There can be differences in tooling, and
sometimes the process might also be manual.
Both the dev team and the operations teams are expected to take ownership to
develop and maintain standard procedures for deployment by making use of the
cloud and the pattern capabilities. This aids in giving the confidence that the
production deployments would be successful.
Use pattern capabilities to avoid configuration-level inconsistencies across the
different environments in use. This requires the dev team and the
operations team to come together and develop a standard process
that guides developers to test their application in the development environment
in the same way as they test it in the production environment.
Jenkins follows a master-slave architecture. The master pulls the latest code from
the GitHub repository whenever a commit is made to the code. The
master asks the slaves to perform operations like build, test, and run, and to produce
test case reports. This workload is distributed uniformly across all the slaves.
Jenkins also uses multiple slaves because different test case suites may need to be
run for different environments once the code commits are done.
Jenkins Architecture
This concept came into prominence because of the limitations associated with the
traditional way of managing the infrastructure. Traditionally, the infrastructure was
managed manually and the dedicated people had to set up the servers physically.
Only after this step was done would the application be deployed. Manual
configuration and setup were constantly prone to human errors and inconsistencies.
This also involved increased cost in hiring and managing multiple people ranging
from network engineers to hardware technicians to manage the infrastructural tasks.
The major problem with the traditional approach was decreased scalability and
application availability which impacted the speed of request processing. Manual
configurations were also time-consuming and in case the application had a sudden
spike in user usage, the administrators would desperately work on keeping the
system available for a large load. This would impact the application availability.
IaC solved all the above problems. IaC can be implemented using two approaches:
Imperative approach: This approach “gives orders” and defines a sequence of
instructions that can help the system in reaching the final output.
Declarative approach: This approach “declares” the desired outcome first based
on which the infrastructure is built to reach the final result.
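A toy shell contrast of the two approaches (the paths and names are illustrative): an imperative script issues each command in order, while a declarative tool checks the current state and only acts on the difference:

```shell
#!/bin/sh
# Imperative style: issue the orders, step by step, every run:
#   mkdir /srv/app && useradd appuser && start-the-service ...
# Declarative style: describe the end state ("this directory exists")
# and converge on it, doing nothing if the state is already correct.
ensure_dir() {
    [ -d "$1" ] || mkdir -p "$1"   # act only on drift
    echo "$1 present"
}
```

Running `ensure_dir /srv/app` twice is safe: the second run finds the state already correct and changes nothing, which is the idempotency that declarative IaC tools provide.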
In this pattern, the team has to maintain two identical production environments, but
only one of them is LIVE at a given point in time. Since the blue environment is more
stable, the LIVE one is usually the blue environment.
Production traffic is moved gradually from the blue to the green environment,
and once it is fully transferred, the blue environment is kept on hold in case a
rollback is needed.
Sanity testing, also known as smoke testing, is a process used to determine whether
it is reasonable to proceed with further testing.
Git provides a hook called pre-commit which is triggered right before a
commit happens. A simple script using this hook can be written to
perform the smoke test.
The script can be used to run other tools like linters and perform sanity checks on the
changes that would be committed into the repository.
The following snippet is an example of one such script:
#!/bin/sh
# Collect the staged Python files (Added, Copied, or Modified).
files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.py$')
if [ -z "$files" ]; then
    exit 0
fi
# pyfmt -l lists the files that are not properly formatted.
unfmtd=$(pyfmt -l $files)
if [ -z "$unfmtd" ]; then
    exit 0
fi
echo "Some .py files are not properly fmt'd"
exit 1
The above script checks whether any .py files that are about to be committed are
properly formatted, using the Python formatting tool pyfmt. If the files are not
properly formatted, the script prevents the changes from being committed to the
repository by exiting with status 1.
35. How can you ensure a script runs every time repository gets
new commits through git push?
There are three ways of setting up a script on the destination repository to be
executed, depending on exactly when the script has to be triggered. These
mechanisms are called hooks, and they come in three types:
Pre-receive hook: This hook is invoked before the references are updated when
commits are being pushed. This hook is useful in ensuring the scripts related to
enforcing development policies are run.
Update hook: This hook triggers the script to run before any updates are
actually made. This hook is called once for every commit which has been pushed
to the repository.
Post-receive hook: This hook triggers the script after the updates or
changes have been accepted by the destination repository. This hook is ideal for
configuring deployment scripts, continuous integration scripts, email notifications
to the team, etc.
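As a sketch, a post-receive hook reads the updated refs from standard input, one per line, and can react to pushes on a particular branch (the branch name and the commented-out deploy step are illustrative):

```shell
#!/bin/sh
# post-receive sketch: Git feeds "<old-sha> <new-sha> <refname>" lines
# on stdin, one line per updated ref.
post_receive() {
    while read oldrev newrev refname; do
        if [ "$refname" = "refs/heads/main" ]; then
            echo "main updated: $oldrev -> $newrev"
            # ./deploy.sh "$newrev"   # hypothetical deployment trigger
        fi
    done
}
```

Saved as `hooks/post-receive` in the destination repository (and made executable), this would run after every accepted push.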
Conclusion
DevOps is a culture-shifting practice that has helped, and continues to help, many
businesses and organizations tremendously. It helps in bridging the gap
between the conflicting goals and priorities of the developers (constant need for
change) and the operations team (constant resistance to change) by creating a
smooth path for Continuous Development and Continuous Integration. Being a
DevOps engineer has huge benefits due to the ever-increasing demand for the
DevOps practice.
DevOps
I. Jenkins : This is an open source automation server used as a continuous integration tool. We can build,
deploy and run automated tests with Jenkins.
II. GIT : It is a version control tool used for tracking changes in files and software.
III. Docker : This is a popular tool for containerization of services. It is very useful in Cloud based deployments.
IV. Nagios : We use Nagios for monitoring of IT infrastructure.
V. Splunk : This is a powerful tool for log search as well as monitoring production systems.
VI. Puppet : We use Puppet to automate our DevOps work so that it is reusable.
I. Release Velocity : DevOps practices help in increasing the release velocity. We can release code to
production more often and with more confidence.
II. Development Cycle : With DevOps, the complete Development cycle from initial design to production
deployment becomes shorter.
III. Deployment Rollback : In DevOps, we plan for rollback of a deployment in case of a failure due to a bug in code or an
issue in production. This gives confidence in releasing features without worrying about the downtime needed for rollback.
IV. Defect Detection : With DevOps approach, we can catch defects much earlier than releasing to production.
It improves the quality of the software.
V. Recovery from Failure : In case of a failure, we can recover very fast with DevOps process.
VI. Collaboration : With DevOps, collaboration between development and operations professionals increases.
VII. Performance-oriented : With DevOps, organization follows performance-oriented culture in which teams
become more productive and more innovative.
3. What is the typical DevOps workflow you use in your organization?
The typical DevOps workflow in our organization is as follows:
I. CloudFormation : We use AWS CloudFormation to create and deploy AWS resources by using templates.
We can describe our dependencies and pass special parameters in these templates. CloudFormation can read
these templates and deploy the application and resources in AWS cloud.
II. OpsWorks : AWS provides another service called OpsWorks that is used for configuration management by
utilizing Chef framework. We can automate server configuration, deployment and management by using
OpsWorks. It helps in managing EC2 instances in AWS as well as any on-premises servers.
For this case, we can write a Client-side post-commit hook. This hook will execute a custom script in which we can add the
message and code that we want to run automatically with each commit.
Once the template is ready and submitted to AWS, CloudFormation will create all the resources in the template. This helps in
automation of building new environments in AWS.
Continuous Integration (CI) : In CI all the developer work is merged to main branch several times a day. This helps in
reducing integration problems.
In CI we try to minimize the duration for which a branch remains checked out. A developer gets early feedback on the new
code added to main repository by using CI.
Continuous Delivery (CD) : In CD, a software team plans to deliver software in short cycles. They perform development,
testing and release in such a short time that incremental changes can be easily delivered to production.
In CD, as a DevOps we create a repeatable deployment process that can help achieve the objective of Continuous Delivery.
I. Build Automation : In CI, we create a build environment in which the build can be triggered with a single
command. This automation extends all the way to deployment to the Production environment.
II. Main Code Repository : In CI, we maintain a main branch in code repository that stores all the Production
ready code. This is the branch that we can deploy to Production any time.
III. Self-testing build : Every build in CI should be self-tested. It means with every build there is a set of tests that
runs to ensure that changes are of high quality.
IV. Every day commits to baseline : Developers commit all of their changes to the baseline every day. This
ensures that there is no big pileup of code waiting for integration with the main repository for a long time.
V. Build every commit to baseline : With Automated Continuous Integration, every time a commit is made into
baseline, a build is triggered. This helps in confirming that every change integrates correctly.
VI. Fast Build Process : One of the requirements of CI is to keep the build process fast so that we can quickly
identify any problem.
VII. Production-like environment testing : In CI, we maintain a production-like environment, also known as a pre-
production or staging environment, which is very close to the Production environment. We perform testing in this
environment to check for any integration issues.
VIII. Publish Build Results : We publish build results on a common site so that everyone can see these and take
corrective actions.
IX. Deployment Automation : The deployment process is automated to the extent that in a build process we can
add the step of deploying the code to a test environment. On this test environment all the stakeholders can
access and test the latest delivery.
I. First we have to set up the Security Realm. We can integrate Jenkins with LDAP server to create user
authentication.
II. Second part is to set the authorization for users. This determines which user has access to what resources.
I. Cloud Deployment : We can use Chef to perform automated deployment in Cloud environment.
II. Multi-cloud support : With Chef we can even use multiple cloud providers for our infrastructure.
III. Hybrid Deployment : Chef supports both Cloud based as well as datacenter-based infrastructure.
IV. High Availability : With Chef automation, we can create high availability environment. In case of hardware
failure, Chef can maintain or start new servers in automated way to maintain highly available environment.
I. Client : These are the nodes or individual users that communicate with Chef server.
II. Chef Manage : This is the web console that is used for interacting with Chef Server.
III. Load Balancer : All the Chef server API requests are routed through Load Balancer. It is implemented in
Nginx.
IV. Bookshelf : This is the component that stores cookbooks. All the cookbooks are stored in a repository. It is
separate storage from the Chef server.
V. PostgreSQL : This is the data repository for Chef server.
VI. Chef Server : This is the hub for configuration data. All the cookbooks and policies are stored in it. It can
scale to the size of any enterprise.
II. Automation : Ansible provides very good options for automation. With automation, people can focus on
delivering smart solutions.
III. Large-scale : Ansible can be used in small as well as very large-scale organizations.
IV. Simple DevOps : With Ansible, we can write automation in a human-readable language. This simplifies the
task of DevOps.
I. App Deployment : With Ansible, we can deploy apps in a reliable and repeatable way.
II. Configuration Management : Ansible supports the automation of configuration management across multiple
environments.
III. Continuous Delivery : We can release updates with zero downtime with Ansible.
V. Compliance : Ansible helps in verifying an organization's systems against applicable rules and
regulations.
VI. Provisioning : We can provision new systems and resources for other users with Ansible.
VII. Orchestration : Ansible can be used in orchestration of complex deployment in a simple way.
Docker Hub is a central repository for container image discovery, distribution, change management, workflow automation and
team collaboration.
I. Bash : On Unix based systems we use Bash shell scripting for automating tasks.
II. Python : For complicated programming and large modules we use Python. We can easily use a wide variety of
standard libraries with Python.
III. Groovy : This is a Java-based scripting language. We need a JVM installed in the environment to use Groovy. It
is very powerful and provides rich features.
IV. Perl : This is another language that is very useful for text parsing. We use it in web applications.
19. What is Multi-factor authentication?
In security implementation, we use Multi-factor authentication (MFA). In MFA, a user is authenticated by multiple means
before giving access to a resource or service. It is different from simple user/password based authentication.
The most popular implementation of MFA is Two-factor authentication. In most of the organizations, we use
username/password and an RSA token as two factors for authentication.
With MFA, the system becomes more secure and it cannot be easily hacked.
I. Monitor : DevOps can configure Nagios to monitor IT infrastructure components, system metrics and
network protocols.
II. Alert : Nagios will send alerts when a critical component in infrastructure fails.
IV. Report : Periodically Nagios can publish/send reports on outages, events and SLAs etc.
VI. Planning : Based on past data, Nagios helps in infrastructure planning and upgrades.
In State Stalking, we can enable stalking on a host. Nagios will monitor the state of the host very carefully and it will log any
changes in the state.
By this we can identify what changes might be causing an issue on the host.
II. Monitoring : We can monitor all the mission critical infrastructure components with Nagios.
III. Proactive Planning : With Capacity Planning and Trending we can proactively plan to scale up or scale down
the infrastructure.
The system configuration described in Puppet’s language can be distributed to a target system by using REST API calls.
I. Configuration Language : Puppet provides a language that is used to configure Resources. We have to
specify what Action has to be applied to which Resource.
The Action has three items for each Resource: type, title and list of attributes of a resource. Puppet code is
written in Manifests files.
II. Resource Abstraction : We can create Resource Abstraction in Puppet so that we can configure resources
on different platforms. Puppet agent uses a Facter for passing the information of an environment to Puppet
server. In Facter we have information about IP, hostname, OS etc of the environment.
III. Transaction : In Puppet, Agent sends Facter to Master server. Master sends back the catalog to Client.
Agent applies any configuration changes to system. Once all changes are applied, the result is sent to Server.
It is an open source system based on concepts similar to Google’s deployment process of millions of containers.
In Kubernetes we can create a cluster of servers that are connected to work as a single unit. We can deploy a containerized
application to all the servers in a cluster without specifying the machine name.
We have to package applications in such a way that they do not depend on a specific host.
Master : There is a master node that is responsible for managing the cluster. Master performs following functions in a cluster.
I. Scheduling Applications
II. Maintaining desired state of applications
III. Scaling applications
IV. Applying updates to applications
Nodes : A Node in Kubernetes is responsible for running an application. The Node can be a Virtual Machine or a Computer
in the cluster. There is software called Kubelet on each node. This software is used for managing the node and communicating
with the Master node in cluster.
There is a Kubernetes API that is used by Nodes to communicate with the Master. When we deploy an application on
Kubernetes, we request Master to start application containers on Nodes.
In a Kubernetes cluster, there is a Deployment Controller. This controller monitors the instances created by Kubernetes in a
cluster. Once a node or the machine hosting the node goes down, Deployment Controller will replace the node.
Therefore in Kubernetes cluster, Kubernetes Deployment Controller is responsible for starting the instances as well as
replacing the instances in case of a failure.
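Assuming a working cluster and a recent kubectl, this self-healing behaviour can be observed with a Deployment (the name `web` and the nginx image are illustrative):

```shell
# Create a Deployment with 3 replicas; the Deployment Controller
# will keep 3 Pods running, replacing any that die.
kubectl create deployment web --image=nginx --replicas=3
kubectl rollout status deployment/web   # wait until all replicas are ready
# Simulate a failure by deleting one Pod of the Deployment.
kubectl delete "$(kubectl get pods -l app=web -o name | head -n 1)"
# Listing the Pods again shows a replacement already being started.
kubectl get pods -l app=web
```

The controller reconciles the observed state (2 Pods) against the desired state (3 replicas) and starts a new Pod without any manual intervention.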
Running tests manually is a time-consuming process. Therefore, we first prepare automation tests and then deliver software. This
ensures that we catch any defects early in our process.
Chaos Monkey is a concept made popular by Netflix. In Chaos Monkey, we intentionally try to shut down the services or
create failures. By failing one or more services, we test the reliability and recovery mechanism of the Production architecture.
It checks whether our applications and deployment have survival strategy built into it or not.
With a Jenkins job, we can automate all these tasks. Once all the automated tests pass, we consider the build as green. This
helps in deployment and release processes to build confidence on the application software.
32. What are the main services of AWS that you have used?
We use following main services of AWS in our environment:
I. EC2 : This is the Elastic Compute Cloud by Amazon. It is used for providing compute capacity to a
system. We can use it in place of our standalone servers. We can deploy different kinds of applications on
EC2.
II. S3 : We use S3 in Amazon for our storage needs.
III. DynamoDB : We use DynamoDB in AWS for storing data in NoSQL database form.
33. Why GIT is considered better than CVS for version control system?
GIT is a distributed system. In GIT, any person can create their own branch and start checking in code. Once the code is
tested, it is merged into the main GIT repo. In between, Dev, QA, and Product can validate the implementation of that code.
In CVS, there is a centralized system that maintains all the commits and changes.
GIT is open source software and there are plenty of extensions in GIT for use by our teams.
A Container uses APIs of an Operating System (OS) to provide runtime environment to an application.
A Container just provides the APIs that are required by the application.
Another concept in Serverless Architecture is to treat code as a service or Function as a Service (FAAS). We just write code
that can be run on any environment or server without the need of specifying which server should be used to run this code.
II. Automated : To enable us to make releases more often, we automate the operations from code check-in to
deployment in Production.
III. Collaborative : DevOps is not only responsibility of Operations team. It is a collaborative effort of Dev, QA,
Release and DevOps teams.
IV. Iterative : DevOps is based on Iterative principle of using a process that is repeatable. But with each iteration
we aim to make the process more efficient and better.
V. Self-Service : In DevOps, we automate things and give self-service options to other teams so that they are
empowered to deliver the work in their domain.
As a DevOps person, I give first priority to the needs of the organization and project. At times I may have to perform a lot
of operations work, but with each iteration I aim to bring DevOps changes to the organization incrementally.
Over time, the organization/project starts seeing the results of DevOps practices and embraces them fully.
Since REST is lightweight, it has very good performance in a software system. It is also one of the foundations for creating
highly scalable systems that serve a large number of clients.
Another key feature of a REST service is that as long as the interface is kept the same, we can change the underlying
implementation. For example, clients of a REST service can keep calling the same service while we change the implementation
from PHP to Java.
I. The First Way: Systems Thinking : In this principle, we see DevOps as a flow of work from left to right:
the time taken from code check-in to the feature being released to the end customer. In the DevOps culture,
we try to identify the bottlenecks in this flow.
II. The Second Way: Feedback Loops : Whenever there is an issue in production, it is feedback about the
whole development and deployment process. We try to make the feedback loop more efficient so that teams
get feedback much faster. It is a way of catching defects much earlier in the process than having them reported
by a customer.
III. The Third Way: Continuous Learning : We make use of first and second way principles to keep on making
improvements in the overall process. This is the third principle in which over the time we make the process and
our operations highly efficient, automated and error free by continuously improving them.
I. Automated Security Testing : We automate and integrate Security testing techniques for Software
Penetration testing and Fuzz testing in software development process.
II. Early Security Checks : We ensure that teams know about the security concerns at the beginning of a
project, rather than at the end of delivery. It is achieved by conducting Security trainings and knowledge
sharing sessions.
III. Standard Process : In DevOps, we try to follow a standard deployment and development process that has
already gone through security audits. This helps minimize the introduction of new security loopholes
due to changes in the standard process.
41. What is Self-testing Code?
Self-testing Code is an important feature of DevOps culture. In DevOps culture, development team members are expected to
write self-testing code. It means we have to write code along with the tests that can test this code. Once the test passes, we
feel confident to release the code.
If we get an issue in production, we first write an automation test to validate that the issue happens in current release. Once the
issue in release code is fixed, we run the same test to validate that the defect is not there. With each release we keep running
these tests so that the issue does not appear anymore.
One of the techniques of writing Self-testing code is Test Driven Development (TDD).
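A toy illustration of the idea in shell (the function and its expected value are made up for this example): the code ships together with a test that exercises it, and a failing test blocks the release:

```shell
#!/bin/sh
# The unit under test: lowercase a string and replace spaces with dashes.
slugify() {
    echo "$1" | tr 'A-Z ' 'a-z-'
}

# The self-test that travels with the code; run it on every build.
[ "$(slugify 'Hello World')" = "hello-world" ] || {
    echo "self-test failed"
    exit 1
}
```

As long as the self-test passes on every build, the team can release the code with confidence; if a later change breaks `slugify`, the build fails immediately.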
After that we run the automated tests. Depending on the scenario, there are stages like performance testing, security check,
usability testing etc in a Deployment Pipeline.
In DevOps, our aim is to automate all the stages of Deployment Pipeline. With a smooth running Deployment Pipeline, we can
achieve the goal of Continuous Delivery.
I. Image Repositories : In Docker Hub we can push, pull, find and manage Docker Images. It is a big library
that has images from community, official as well as private sources.
II. Automated Builds : We can use Docker Hub to create new images by making changes to source
code repository of the image.
III. Webhooks : With Webhooks in Docker Hub we can trigger actions that can create and build new images by
pushing a change to repository.
IV. Github/Bitbucket integration : Docker Hub also provides integration with Github and Bitbucket systems.
44. What are the security benefits of using Container based system?
Some of the main security benefits of using a Container based system are as follows:
I. Segregation : In a Container based system we segregate the applications on different containers. Each
application may be running on same host but in a separate container. Each application has access to ports, files
and other resources that are provided to it by the container.
II. Transient : In a Container-based system, each application is considered a transient system. This is better
than a static system with a fixed environment that can be exposed over time.
III. Control: We use repeatable scripts to create the containers. This provides us tight control over the software
application that we want to deploy and run. It also reduces the risk of unwanted changes in setup that can
cause security loopholes.
IV. Security Patch: In a Container based system; we can deploy security patches on multiple containers in a
uniform way. Also it is easier to patch a Container with an application update.
The results of Passive checks are submitted to Nagios. There are two main use cases of Passive checks:
I. We use Passive checks to monitor asynchronous services that do not give reliable results with Active checks
run at regular intervals of time.
II. We can use Passive checks to monitor services or applications that are located behind a firewall.
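For example, an application can submit a Passive check result by writing to the Nagios external command file. A sketch of this, where the host name, service name and command file path are placeholders for the actual configuration:

% echo "[$(date +%s)] PROCESS_SERVICE_CHECK_RESULT;myhost;myservice;0;OK - job finished" >> /usr/local/nagios/var/rw/nagios.cmd

Here 0 is the OK return code; non-zero codes report WARNING or CRITICAL states.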
Since Docker Container is very lightweight, multiple containers can be run simultaneously on a single server or virtual machine.
With a Docker Container we can create an isolated system with restricted services and processes. A Container has private
view of the operating system. It has its own process ID space, file system, and network interface.
If we want to find the IDs of all the Docker images in our local system, we can use the docker images command. With the -q
option it lists only the image IDs.
% docker images -q
I. Setting up Development Environment : We can use Docker to set the development environment with the
applications on which our code is dependent.
II. Testing Automation Setup : Docker can also help in creating the Testing Automation setup. We can setup
different services and apps with Docker to create the automation-testing environment.
III. Production Deployment : Docker also helps in implementing the Production deployment for an application.
We can use it to create the exact environment and process that will be used for doing the production
deployment.
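As a sketch, a development environment with an app and the database it depends on could be described in a docker-compose.yml like the following (the image names and ports are illustrative, not from any real project):

version: '2'
services:
  web:
    build: .
    ports:
      - "8000:8000"
  db:
    image: postgres

Every developer then gets the same environment by running docker-compose up.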
50. Can we lose our data when a Docker Container exits?
A Docker Container has its own file-system. An application running in a Docker Container can write to this file-system.
When the container exits, the data written to the file-system still remains, as long as the container itself is not removed. When
we restart the same container, the same data can be accessed again.
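To keep data beyond the life of a container, we can store it in a volume. A sketch of this, with hypothetical volume, image and container names:

% docker run -v mydata:/data myapp
% docker rm <container-id>
% docker run -v mydata:/data myapp

Even after the first container is removed, the files written under /data survive in the mydata volume and are visible to the new container.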
Docker Questions
Docker is Open Source software. It provides the automation of Linux application deployment in a software container.
Docker can package software in a complete file system that contains software code, runtime environment, system tools, &
libraries that are required to install and run the software on a server.
52. What is the difference between Docker image and Docker container?
A Docker image is an immutable file, which is a snapshot of a container. We create an image with the build command.
In a Docker environment, we just deploy the application in Docker. There is no separate guest OS layer in this environment. We
package the application with the libraries it needs, and the kernel is shared with the host through the Docker engine.
If we use a json file then we have to specify in docker command that we are using a json file as follows:
% docker-compose -f docker-compose.json up
55. Can we run multiple apps on one server with Docker?
Yes, theoretically we can run multiple apps on one Docker server. But in practice, it is better to run different components in
separate containers.
This gives us a cleaner environment, and the same components can be reused for multiple purposes.
I. Multiple environments on same Host : We can use it to create multiple environments on the same host
server.
II. Preserve Volume Data on Container Creation : Docker compose also preserves the volume data when
we recreate a container.
III. Recreate the changed Containers : Compose recreates only the containers whose configuration has
changed, and reuses the rest.
IV. Variables in Compose file : Docker compose also supports variables in the compose file. In this way we can
create variations of our containers.
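For example, variables from the shell environment can be substituted in the compose file. A sketch with a hypothetical TAG variable and image name:

web:
  image: "myapp:${TAG}"

% TAG=v1.5 docker-compose up

This way the same compose file can start different variations of the container, such as different versions of the image.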
We use Docker for the complete build flow: development work, test runs and deployment to the production environment.
Docker is one of the very good outcomes of open source software, and it has very powerful features.
Docker has wide acceptance due to its usability as well as its open source approach of integrating with different systems.
59. What is the difference between Docker commands: up, run and start?
The up, run and start commands are all available in docker-compose; docker itself also has a run command.
a. Up : We use this command to build, create, start or restart all the services in a docker-compose.yml file. It
also attaches to the containers for a service.
b. Run : We use this command for adhoc requests. It just starts the service that we specifically want to start.
We generally use it to run specific tests or administrative tasks.
c. Start : This command is used to start containers that were previously created but are not currently
running. This command does not create new containers.
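As an illustration (the service name web and the pytest command are hypothetical):

% docker-compose up -d
% docker-compose run web pytest
% docker-compose start

The first command builds, creates and starts all services in the background, the second runs an adhoc command in a one-off container for the web service, and the third starts existing stopped containers.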
A Docker Image is the blueprint that is used to create a Docker Container. Whenever we want to run a container, we have to
specify the image that we want to run.
There are many Docker images available online for standard software. We can use these images directly from the source.
The standard set of Docker Images is stored in the Docker Hub Registry. We can download images from this location and use
them in our environment.
We can also create our own Docker Image with the software that we want to run as a container.
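A minimal sketch of creating our own image, assuming a simple application binary; the names myapp and the tag are placeholders:

FROM ubuntu
COPY myapp /usr/local/bin/myapp
CMD ["myapp"]

% docker build -t myapp:1.0 .
% docker run myapp:1.0

The Dockerfile above defines the image contents, docker build creates the image, and docker run starts a container from it.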
Some of the popular Docker machine commands enable us to start, stop, inspect and restart a managed host.
Docker Machine provides a Command Line Interface (CLI), which is very useful in managing multiple hosts.
65. Why do we use Docker Machine?
There are two main uses of Docker Machine:
I. Old Desktop : If we have an old desktop and we want to run Docker then we use Docker Machine to run
Docker. It is like installing a virtual machine on an old hardware system to run Docker engine.
II. Remote Hosts : Docker Machine is also used to provision Docker hosts on remote systems. By using
Docker Machine you can install Docker Engine on remote hosts and configure clients on them.
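As an illustration of provisioning a host with Docker Machine (the driver and host name are just common examples):

% docker-machine create --driver virtualbox default
% docker-machine ls
% eval $(docker-machine env default)
% docker run busybox echo hello

The eval line points the local Docker client at the newly created host, so subsequent docker commands run against it.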
69. What are the objects created by Docker Cloud in Amazon Web
Services (AWS) EC2?
I. VPC : Docker Cloud creates a Virtual Private Cloud with the tag name dc-vpc. It also creates a Classless
Inter-Domain Routing (CIDR) block with the range 10.78.0.0/16 .
II. Subnet : Docker Cloud creates a subnet in each Availability Zone (AZ). In Docker Cloud, each subnet
is tagged with dc-subnet.
III. Internet Gateway : Docker Cloud also creates an internet gateway with name dc-gateway and attaches it
to the VPC created earlier.
IV. Routing Table : Docker Cloud also creates a routing table named dc-route-table in Virtual Private Cloud. In
this Routing Table Docker Cloud associates the subnet with the Internet Gateway.
70. How will you take backup of Docker container volumes in AWS S3?
We can use a utility named Dockup provided by Docker Cloud to take backup of Docker container volumes in S3.
II. Services : Then we define the services that make our app in docker-compose.yml. By using this file we
can define how these services can be run together in an environment.
III. Run : The last step is to run the Docker Container. We use docker-compose up to start and run the
application.
With the Pluggable Storage Driver architecture, we can use multiple kinds of file systems in our Docker Containers. The docker
info command shows the Storage Driver that is set on a Docker daemon.
We can even plug in shared storage systems with the Pluggable Storage Driver architecture.
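For example, we can check the current driver, and set one explicitly in the daemon configuration file if needed (overlay2 here is just one common choice of driver):

% docker info | grep "Storage Driver"

/etc/docker/daemon.json:
{
  "storage-driver": "overlay2"
}

After changing daemon.json, the Docker daemon has to be restarted for the new driver to take effect.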
73. What are the main security concerns with Docker based containers?
Docker based containers have following security concerns:
I. Kernel Sharing : In a container-based system, multiple containers share the same Kernel. If one container causes
the Kernel to go down, it takes down all the containers. In a virtual machine environment we do not have this
issue.
II. Container Leakage : If a malicious user gains access to one container, they can try to access the other
containers on the same host. If a container has security vulnerabilities, it can allow the user to access other
containers on the same host machine.
III. Denial of Service : If one container occupies the resources of the Kernel, then the other containers are starved
of resources. This can create a Denial of Service like situation.
IV. Tampered Images : Sometimes a container image can be tampered with. This can lead to further security
concerns. An attacker can try to run a tampered image to exploit the vulnerabilities in host machines and
other containers.
V. Secret Sharing : Generally one container accesses other services. To access a service it requires a Key or
Secret. A malicious user can gain access to this secret. Since multiple containers share the secret, it may lead
to further security concerns.
We can use the docker ps -a command to get the list of all the containers in Docker. This command also returns the status of these containers.
Prior to Docker, developers would develop software and pass it to QA for testing, and then it was sent to the Build & Release team for deployment.
In the Docker workflow, the Developer builds an Image after developing and testing the software. This Image is shipped to a Registry. From the Registry it is
available for deployment to any system. The development process is simpler, since steps like testing take place before the Image
is built. So the Developer gets feedback early.
Docker is built on a client-server model. The Docker server is used to run the images, and we use the Docker client to communicate with the Docker server.
Clients tell the Docker server, via commands, what to do.
Additionally, there is a Registry that stores Docker Images. The Docker server can directly contact the Registry to download images.
78. What are the popular tasks that you can do with Docker Command
line tool?
The Docker Command Line (DCL) tool is implemented in the Go language. It can compile and run on most of the common operating systems. Some of
the tasks that we can do with the Docker Command Line tool are as follows:
I. We can download images from Registry with DCL.
II. We can start, stop or terminate a container on a Docker server by DCL.
III. We can retrieve Docker Logs via DCL.
IV. We can build a Container Image with DCL.
79. What type of applications- Stateless or Stateful are more suitable for
Docker Container?
It is preferable to create Stateless applications for Docker Containers. We can create a container out of our application and take the configurable
state parameters out of the application. Now we can run the same container in Production as well as QA environments with different parameters. This helps
in reusing the same Image in different scenarios. Also, a stateless application is much easier to scale with Docker Containers than a stateful
application.
Docker works directly with Linux kernel level libraries. Every Linux distribution runs on the same Linux kernel, and Docker containers share that
kernel with the host.
Since all the distributions share the same Kernel, a container built on one distribution can run on any of them.
Generally we use Docker on top of a virtual machine to ensure isolation of the application. On a virtual machine we can get the advantage of
security provided by hypervisor. We can implement different security levels on a virtual machine. And Docker can make use of this to run the
application at different security levels.
We can run multiple Docker containers on the same host, and these containers can share the Kernel resources of that host. Each container has its
own user-space and libraries, but not its own Operating System kernel.
So, within its own namespaces, a Docker container does not share resources. But the resources that are not in an isolated namespace are shared
between containers. These are the Kernel resources of the host machine, of which there is just one copy.
So in the back-end there is the same set of Kernel resources that Docker Containers share.
Both the ADD and COPY instructions of a Dockerfile can copy new files from a source location to a destination in the Container's file path.
They behave almost the same.
The main difference between the two is that ADD can also fetch files from a URL, and can extract a local tar archive into the destination.
As per Docker documentation, the COPY instruction is preferable. Since COPY only supports copying local files into a Container, its behavior is more
transparent, so it is preferred over ADD.
We use a Docker Entrypoint to set the starting command of a Docker Image.
The entrypoint is used as the command for running the Image in a container.
E.g. We can define the following entrypoint in a Dockerfile; when we run the image, mycmd is executed:
ENTRYPOINT ["mycmd"]
% docker run myimage
We use the ONBUILD instruction in Docker to register instructions that execute later, when the current image is used as the base image of another build.
It is used to build a hierarchy of images, in which child images have to be built after the parent image is built.
When a child Dockerfile builds FROM the parent image, Docker will first execute the parent's ONBUILD instructions and then execute the other instructions in the child Dockerfile.
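A sketch of this, with hypothetical image names: the parent image defers a COPY to the child build.

Dockerfile of the parent image my-base:
FROM ubuntu
ONBUILD COPY . /app

Dockerfile of a child image:
FROM my-base
RUN ls /app

When the child image is built, the deferred COPY . /app runs first, so the RUN instruction already sees the child's source tree in /app.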
We use the EXPOSE instruction to inform Docker that the Container will listen on a specific network port at runtime.
But these ports on the Container are not automatically accessible from the host. We can use the -p option to publish a port or a range of ports from the Container.
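For example, with a hypothetical image myimage whose application listens on port 8080:

In the Dockerfile:
EXPOSE 8080

% docker run -p 8080:8080 myimage
% docker run -P myimage

The -p option publishes container port 8080 on host port 8080, while -P publishes all exposed ports on random host ports.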
In a Container we have an isolated environment with namespace for each resource that a kernel provides. There are mainly six types of
namespaces in a Container.
I. UTS Namespace : UTS stands for Unix Timesharing System. In UTS namespace every container gets its own hostname and
domain name.
II. Mount Namespace : This namespace provides its own file system within a container. With this namespace we get root like / in the
file system on which rest of the file structure is based.
III. PID Namespace : This namespace contains all the processes that run within a Container. We can run ps command to see the
processes that are running within a Docker container.
IV. IPC Namespace : IPC stands for Inter Process Communication. This namespace covers shared memory, semaphores, named
pipes and other resources that are shared by processes. The items in this namespace do not cross the container boundary.
V. User Namespace : This namespace contains the users and groups that are defined within a container.
VI. Network Namespace : With this namespace, container provides its own network resources like- ports, devices etc. With this
namespace, Docker creates an independent network stack within each container.
90. How will you monitor Docker in production?
Docker provides tools like docker stats and docker events to monitor Docker in production.
We can get reports on important statistics with these commands.
Docker stats : When we call docker stats with a container id, we get the CPU, memory usage etc. of that container. It is similar to the top command in
Linux.
Docker events : docker events is a command to see the stream of activities that are going on in the Docker daemon.
Some of the common Docker events are: attach, commit, die, detach, rename, destroy etc.
We can also use various options to limit or filter the events that we are interested in.
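As an illustration (the container id is a placeholder):

% docker stats <container-id>
% docker events --since '1h' --filter 'event=die'

The first command streams live resource usage for one container; the second shows only the die events from the last hour.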
92. How can we control the startup order of services in Docker compose?
In Docker compose we can use the depends_on option to control the startup order of services.
With compose, the services will start in dependency order. Dependencies can be defined with options like depends_on, links, volumes_from and
network_mode.
But Docker compose only waits until a container is running; it does not wait until the container is ready.
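A sketch of a compose file using depends_on (the service names are illustrative):

version: '2'
services:
  web:
    build: .
    depends_on:
      - db
  db:
    image: postgres

Here db is started before web, but compose does not wait for db to be ready to accept connections before starting web.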
93. Why Docker compose does not wait for a container to be ready before
moving on to start next service in dependency order?
The problem with waiting for a container to be ready is that, in a Distributed system, some services or hosts may become unavailable at any time.
Similarly, during startup some services may also be down.
Therefore, we have to build resiliency into our application, so that even if some services are down we can continue our work or wait for the service
to become available again.
We can use tools like wait-for-it or dockerize to build this kind of resiliency.
94. How will you customize Docker compose file for different
environments?
In Docker compose there are two files docker-compose.yml and docker-compose.override.yml. We specify our base configuration in docker-
compose.yml file. For any environment specific customization we use docker-compose.override.yml file.
We can specify a service in both the files. Docker compose will merge these files based on following rules:
For single value options, the new value replaces the old value.
For multi-value options, compose will concatenate both sets of values.
We can also use extends field to extend a service configuration to multiple environments. With extends, child services can use the common
configuration defined by parent service.
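As a sketch, an environment specific override could look like this (the service name and values are illustrative):

docker-compose.yml:
web:
  image: myapp
  ports:
    - "80:80"

docker-compose.override.yml:
web:
  environment:
    - DEBUG=1

Running docker-compose up merges the two files, so the web service gets both the port mapping from the base file and the DEBUG variable from the override.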
I. Flexibility : The businesses that have fluctuating bandwidth demands need the flexibility of Cloud Computing. If you need high
bandwidth, you can scale up your cloud capacity. When you do not need high bandwidth, you can just scale down. There is no
need to be tied into an inflexible fixed capacity infrastructure.
II. Disaster Recovery : Cloud Computing provides robust backup and recovery solutions that are hosted in cloud. Due to this there
is no need to spend extra resources on homegrown disaster recovery. It also saves time in setting up disaster recovery.
III. Automatic Software Updates : Most of the Cloud providers give automatic software updates. This reduces the extra task of
installing new software version and always catching up with the latest software installs.
IV. Low Capital Expenditure : In Cloud computing the model is Pay as you Go. This means there is very less upfront capital
expenditure. There is a variable payment that is based on the usage.
V. Collaboration: In a cloud environment, applications can be shared between teams. This increases collaboration and
communication among team members.
VI. Remote Work: Cloud solutions provide flexibility of working remotely. There is no on site work. One can just connect from
anywhere and start working.
VII. Security: Cloud computing solutions are more secure than regular onsite work. Data stored in local servers and computers is
prone to security attacks. In Cloud Computing, there are very few loose ends. Cloud providers give a secure working environment
to its users.
VIII. Document Control: Once the documents are stored in a common repository, it increases the visibility and transparency among
companies and their clients. Since there is one shared copy, there are fewer chances of discrepancies.
IX. Competitive Pricing: In Cloud computing there are multiple players, so they keep competing among themselves and provide very
good pricing. This comes out much cheaper compared to other options.
X. Environment Friendly: Cloud computing also saves precious environmental resources, by not keeping servers and
bandwidth blocked when they are not needed.
On-demand Computing is the latest model in enterprise systems. It is related to Cloud computing. It means IT resources can be provided on
demand by a Cloud provider.
In an enterprise system demand for computing resources varies from time to time. In such a scenario, On-demand computing makes sure that
servers and IT resources are provisioned to handle the increase/decrease in demand.
A cloud provider maintains a pool of resources. The pool contains networks, servers, storage, applications and services. This pool can
serve the varying demand of resources and computing from various enterprise clients.
There are many concepts like- grid computing, utility computing, autonomic computing etc. that are similar to on-demand computing.
I. Infrastructure as a Service (IAAS): IAAS providers give low-level abstractions of physical devices. Amazon Web Services
(AWS) is an example of IAAS. AWS provides EC2 for computing, S3 buckets for storage etc. Mainly the resources in this layer
are hardware like memory, processor speed, network bandwidth etc.
II. Platform as a Service (PAAS): PAAS providers offer managed services like Rails, Django etc. One good example of PAAS is
Google App Engine. These are the environments in which developers can develop sophisticated software with ease.
Developers just focus on developing software, whereas scaling and performance are handled by the PAAS provider.
III. Software as a Service (SAAS) : SAAS providers offer an actual working software application to clients. Salesforce and Github
are two good examples of SAAS. They hide the underlying details of the software and just provide an interface to work on the
system. Behind the scenes, the version of the Software can be easily changed.
An IAAS provider can give physical, virtual or both kinds of resources. These resources are used to build cloud.
IAAS provider handles the complexity of maintaining and deploying these services.
IAAS provider also handles security and backup recovery for these services. The main resources in IAAS are servers, storage, routers, switches
and other related hardware etc.
Platform as a service (PaaS) is a kind of cloud computing service. A PaaS provider offers a platform on which clients can develop, run and
manage applications without the need of building the infrastructure.
In PAAS clients save time by not creating and managing infrastructure environment associated with the app that they want to develop.
I. It allows development work at a higher level of programming, with much less complexity.
II. Teams can focus on just the development of the application, which makes the application very effective.
III. Maintenance and enhancement of the application is much easier.
IV. It is suitable for situations in which multiple developers work on a single project but are not co-located.
The biggest disadvantage of PaaS is that a developer can only use the tools that the PaaS provider makes available. A developer cannot use the full range
of conventional tools.
Some PaaS providers lock clients into their platform. This also decreases the flexibility of clients using PaaS.
I. Private Cloud: Some companies build their private cloud. A private cloud is a fully functional platform that is owned, operated
and used by only one organization.
Primary reason for private cloud is security. Many companies feel secure in private cloud. The other reasons for building
private cloud are strategic decisions or control of operations.
There is also a concept of Virtual Private Cloud (VPC). In VPC, private cloud is built and operated by a hosting company.
But it is exclusively used by one organization.
II. Public Cloud: Some companies provide cloud platforms that are open for use and deployment by the general public as well as big
companies. E.g. Google Apps, Amazon Web Services etc.
The public cloud providers focus on layers and applications like cloud application, infrastructure management etc. In this model
resources are shared among different organizations.
III. Hybrid Cloud: The combination of public and private cloud is known as Hybrid cloud. This approach provides the benefits of both
the approaches: private and public cloud. So it is a very robust platform.
A client gets functionalities and features of both cloud platforms. By using a Hybrid cloud an organization can create its own
cloud as well as pass the control of its cloud to a third party.
Scalability is the ability of a system to handle the increased load on its current hardware and software resources. In a highly scalable system it is
possible to increase the workload without increasing the resource capacity. Scalability supports any sudden surge in the demand/traffic with
current set of resources.
Elasticity is the ability of a system to handle increased workload by adding hardware/software resources dynamically. Highly elastic systems can
handle the increased demand and traffic by dynamically commissioning and decommissioning resources. Elasticity is an important characteristic of Cloud
Computing applications. Elasticity means how well your architecture can adapt to the workload in real time.
E.g. If one server can handle 100 users, 2 servers can handle 200 users and 10 servers can handle 1000 users, the design is scalable. But if for adding
every X users you need 2X the number of servers, then it is not a scalable design.
Let's say you have just one user login every hour on your site; one server can handle this load. But if suddenly 1000 users login at once, can
your system quickly start new web servers on the fly to handle this load? Your design is elastic if it can handle such a sudden increase in traffic that
quickly.
Software as Service is a category of cloud computing in which Software is centrally hosted and it is licensed on a subscription basis. It is also
known as On-demand software. Generally, clients access the software by using a thin-client like a web browser.
Many applications like Google docs, Microsoft office etc. provide SaaS model for their software.
The benefit of SaaS is that a client can add more users on the fly based on its current needs. And client does not need to install or maintain any
software on its premises to use this software.
Cloud computing consists of different types of Datacenters linked in a grid structure. The main types of Datacenters in Cloud computing are:
I. Containerized Datacenter
As the name suggests, containerized datacenter provides high level of customization for an organization. These are traditional kind of
datacenters. We can choose the different types of servers, memory, network and other infrastructure resources in this datacenter. Also
we have to plan temperature control, network management and power management in this kind of datacenter.
II. Low-density Datacenter
In a Low-density datacenter, we get a high level of performance. In such a datacenter, if we increase the density of servers, power
becomes an issue: with a high density of servers, the area gets heated. In such a scenario, effective heat and power management is
needed. To reach a high level of performance, we have to optimize the number of servers in the datacenter.
106. Explain the various modes of Software as a Service (SaaS) cloud
environment?
Software as a Service (SaaS) is used to offer different kinds of software applications in a Cloud environment. Generally these are offered on
subscription basis. Different modes of SaaS are:
I. Simple multi-tenancy : In this setup, each client gets its own resources. These resources are not shared with other clients. It is
the more secure option, since there is no sharing of resources. But it is an inefficient option, since for each client more money is needed to
scale it with rising demands. It also takes time to scale up the application in this mode.
II. Fine grain multi-tenancy : In this mode, the features provided to each client are the same. The resources are shared among multiple
clients. It is an efficient mode of cloud service, in which data is kept private between clients but computing resources are
shared. It is also easier and quicker to scale up the SaaS implementation for different clients.
107. What are the important things to care about in Security in a cloud
environment?
With growing concern of hacking, every organization wants to make its software system and data secure. Since in a cloud computing environment,
Software and hardware is not on the premises of an organization, it becomes more important to implement the best security practices.
Organizations have to keep their Data most secure during the transfer between two locations. They also have to keep data secure when it is stored
at a location. Hackers can hack into the application, or they can get an unauthorized copy of the data. So it becomes important to encrypt the data
in transit as well as at rest to protect it from unwanted hackers.
Application Programming Interfaces (APIs) are used in a cloud computing environment for accessing many services. APIs are very easy to use. They
provide a quick option to create different sets of applications in a cloud environment.
An API provides a simple interface that can be used in multiple scenarios.
There are different types of clients for cloud computing APIs. It is easier to serve different needs of multiple clients with APIs in cloud computing
environment.
III. Authentication : In this area, we check the credentials of a user and confirm that it is the correct user. Generally this is done by
user password and multi-factor authentication, like verification by a one-time code on a cell phone.
IV. Authorization : In this aspect, we check for the permissions that are given to a user or role. If a user is authorized to access a
service, they are allowed to use it in the cloud environment.
110. What are the main cost factors of cloud based data center?
Costs in a Cloud based data center are different from a traditional data center. Main cost factors of cloud based data center are as follows:
I. Labor cost : We need skilled staff that can work with the cloud-based datacenter that we have selected for our operation. Since
cloud is not a very old technology, it may be difficult to find people with the right skills for handling a cloud based datacenter.
II. Power cost : In some cloud operations, power costs are borne by the client. Since it is a variable cost, it can increase with the
increase in scale and usage.
III. Computing cost : The biggest cost in Cloud environment is the cost that we pay to Cloud provider for giving us computing
resources. This cost is much higher compared to the labor or power costs.
In a cloud-computing environment we pay for the services that we use. So the main criterion for measuring a cloud based service is its usage.
For a computing resource, we measure usage in terms of time and the power of the computing resource.
For a storage resource, we measure usage in terms of bytes (gigabytes) stored and the bandwidth used in data transfer.
Another important aspect of measuring a cloud service is its availability. A cloud provider has to specify the service level agreement (SLA) for the
time for which service will be available in cloud.
In a traditional datacenter the cost of increasing the scale of computing environment is much higher than a Cloud computing environment. Also in a
traditional data center, there are not much benefits of scaling down the operation when demand decreases. Since most of the expenditure is in
capital spent of buying servers etc., scaling down just saves power cost, which is very less compared to other fixed costs.
Also, in a Cloud environment there is no need to hire a large number of operations staff to maintain the datacenter. The Cloud provider takes care of
maintaining and upgrading the resources in the Cloud environment.
With a traditional datacenter, the people cost is very high, since we have to hire a large number of technical operations people for an in-house datacenter.
In a Cloud environment, it is important to optimize the availability of an application by implementing disaster recovery strategy. For disaster
recovery we create a backup application in another location of cloud environment. In case of complete failure at a data center we use the disaster
recovery site to run the application.
Another aspect of cloud environment is that servers often fail or go down. In such a scenario it is important to implement the application in such a
way that we just kill the slow server and restart another server to handle the traffic seamlessly.
114. What are the requirements for implementing IaaS strategy in Cloud?
I. Operating System (OS): We need an OS to support hypervisor in IaaS. We can use open source OS like Linux for this
purpose.
II. Networking : We have to define and implement networking topology for IaaS implementation. We can use public or private
network for this.
III. Cloud Model : We have to select the right cloud model for implementing the IaaS strategy. It can be a public, private or hybrid cloud.
115. What is the scenario in which public cloud is preferred over private
cloud?
In startup mode, we often want to test our idea quickly. In such a scenario it makes sense to set up the application in a public cloud. It is much faster and
cheaper to use a public cloud than a private cloud.
Remember that security is a major concern in a public cloud. But with time and changes in technology, even the public cloud has become quite secure.
Cloud Computing is a highly scalable, highly available and cost effective solution for software and hardware needs of an application.
Cloud Computing provides great ease of use in running the software in cloud environment. It is also very fast to implement compared with any
other traditional strategy.
In Client Server architecture there is one-to-one communication between client and server. The server is often in an in-house datacenter and a client can
access the same server from anywhere. If a client is at a remote location, the communication can have high latency.
In Cloud Computing there can be multiple servers in the cloud. There will be a Cloud controller that directs the requests to right server node. In
such a scenario clients can access cloud-based service from any location and they can be directed to the one nearest to them.
Another reason for Cloud computing architecture is high availability. Since there are multiple servers behind the cloud, even if one server is down,
another server can serve the clients seamlessly.
I. Elasticity : In Cloud Computing the system is highly elastic, in the sense that it can easily adapt itself to an increase or decrease in load.
There is no need to take urgent actions when there is a surge in traffic requests.
II. Self-service provisioning : In Cloud environment users can provision new resources on their own by just calling some APIs.
There is no need to fill forms and order actual hardware from vendors.
III. Automated de-provisioning : In case demand/load decreases, extra resources can be automatically shut down in Cloud
computing environment.
IV. Standard Interface : There are standard interfaces to start, stop, suspend or remove an instance in Cloud environment. Most of
the services are accessible via public and standard APIs in Cloud computing.
V. Usage based Billing : In a Cloud environment, users are charged for their usage of resources. They can forecast their bill and
costs based on the growth they are expecting in their load.
119. How are databases in Cloud computing different from traditional
databases?
In a Cloud environment, companies often have to store many different kinds of data, such as email, images, video, PDF and graph data. To store this
data, NoSQL databases are often used.
A NoSQL database like MongoDB provides storage and retrieval of data that cannot be stored efficiently in a traditional RDBMS.
A database like Neo4j provides features to store graph data, such as the social graphs of Facebook or LinkedIn, in a cloud environment.
Hadoop-based databases help in storing Big Data. They can handle the very large-scale information that is generated in such
environments.
In a Cloud environment, we can create a virtual private network (VPN) that is used by only one client. This is a secure network in
which data transfer between servers of the same VPN is very secure.
By using VPN, an organization uses the public network in a private manner. It increases the privacy of an organization’s data transfer in a cloud
environment.
I. Network Access Server (NAS): A NAS server is responsible for setting up the tunnels in a VPN that is accessed remotely. It
maintains these tunnels that connect clients to the VPN.
II. Firewall : It is the software that creates barrier between VPN and public network. It protects the VPN from malicious activity that
can be done from the outside network.
III. AAA Server : This is an authentication and authorization server that controls the access and usage of VPN. For each request to
use VPN, AAA server checks the user for correct permissions.
IV. Encryption : In a VPN, encryption algorithms protect the important private data from malicious users.
122. How will you secure the application data for transport in a cloud
environment?
With ease of use in Cloud environment comes the important aspect of keeping data secure. Many organizations have data that is transferred from
their traditional datacenter to Cloud datacenter.
During the transit of data it is important to keep it secure. One of the best ways to secure data is by using the HTTPS protocol over Secure Sockets
Layer (SSL)/TLS.
Another important point is to keep the data always encrypted. This protects data from being accessed by any unauthorized user during transit.
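As a small sketch of keeping data encrypted, the openssl command-line tool can encrypt a file with AES-256 before it leaves a datacenter. The file names and passphrase below are illustrative only, and the -pbkdf2 flag assumes OpenSSL 1.1.1 or later:

```shell
# Encrypt a file with AES-256-CBC; -pbkdf2 derives the key safely
# from the passphrase (file names and passphrase are illustrative).
echo "sensitive payload" > plain.txt
openssl enc -aes-256-cbc -pbkdf2 -salt -in plain.txt -out cipher.bin -pass pass:MySecret

# Decrypt it back to verify the round-trip.
openssl enc -d -aes-256-cbc -pbkdf2 -in cipher.bin -out restored.txt -pass pass:MySecret
cmp -s plain.txt restored.txt && echo "round-trip OK"
```

In practice the passphrase would come from a key management service rather than the command line, but the round-trip illustrates encryption at rest and before transit.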
In Cloud computing scale is not a limit. So there are very large-scale databases available from cloud providers. Some of these are:
I. Amazon DynamoDB : Amazon Web Services (AWS) provides a NoSQL web service called DynamoDB that provides highly
available and partition tolerant database system. It has a multi-master design. It uses synchronous replication across multiple
datacenters. We can easily integrate it with MapReduce and Elastic MapReduce of AWS.
II. Google Bigtable : This is a very large-scale, high performance cloud based database option from Google. It is available on Google
Cloud. It can be scaled to petabytes. It is a Google proprietary implementation. In Bigtable, two arbitrary string values, row key
and column key, and a timestamp are mapped to an arbitrary byte array. In Bigtable the MapReduce algorithm is often used for modifying and
generating the data.
III. Microsoft Azure SQL Database : Microsoft Azure provides cloud based SQL database that can be scaled very easily for
increased demand. It has very good security features and it can be even used to build multi-tenant apps to service multiple
customers in cloud.
124. What are the options for open source NoSQL database in a Cloud
environment?
Most of the cloud-computing providers support Open Source NoSQL databases. Some of these databases are:
I. Apache CouchDB : It is a document based NoSQL database from Apache Open Source. It is compatible with Couch
Replication Protocol. It can communicate in native JSON and can store binary data very well.
II. HBase : It is a NoSQL database for use with Hadoop based software. It is also available as Open Source from Apache. It is a
scalable and distributed Big Data database.
III. MongoDB : It is an open source database system that offers a flexible data model that can be used to store various kinds of data.
It provides high performance and always-on user experience.
125. What are the important points to consider before selecting cloud
computing?
Cloud computing is a very good option for an organization to scale and outsource its software/hardware needs. But before selecting a cloud
provider it is important to consider following points:
I. Security : One of the most important points is security of the data. We should ask the cloud provider about the options to keep
data secure in cloud during transit and at rest.
II. Data Integrity : Another important point is to maintain the integrity of data in cloud. It is essential to keep data accurate and
complete in cloud environment.
III. Data Loss : In a cloud environment, there are chances of data loss. So we should know the provisions to minimize the data loss. It
can be done by keeping backup of data in cloud. Also there should be reliable data recovery options in case of data loss.
IV. Compliance : While using a cloud environment one must be aware of the rules and regulations that have to be followed to use the
cloud. There are compliance issues with storing user data on an external provider's locations/servers.
V. Business Continuity : In case of any disaster, it is important to create business continuity plans so that we can provide
uninterrupted service to our end users.
VI. Availability : Another important point is the availability of data and services in a cloud-computing environment. It is very important
to provide high availability for a good customer experience.
VII. Storage Cost : Since data is stored in cloud, it may be very cheap to store the data. But the real cost can come in transfer of data
when we have to pay by bandwidth usage. So storage cost of data in cloud should also include the access cost of data transfer.
VIII. Computing Cost : One of the highest costs of cloud is computing cost. It can be very high cost with the increase of scale. So
cloud computing options should be wisely considered in conjunction with computing cost charged for them.
Often an organization does not know all the options available in a Cloud computing environment. Here comes the role of a System Integrator (SI)
who specializes in implementing Cloud computing environment.
The SI creates the strategy of the cloud setup. It designs the cloud platform for the use of its client. It creates the cloud architecture for the business needs of the
client.
The SI oversees the overall implementation of the cloud strategy and plan. It also guides the client in choosing the right options in the cloud computing
platform.
Virtualization is the core of cloud computing platform. In cloud we can create a virtual version of hardware, storage and operating system that can
be used to deploy the application.
A cloud provider gives options to create virtual machines in cloud that can be used by its clients. These virtual machines are much cheaper than
buying a few high end computing machines.
In cloud we can use multiple cheap virtual machines to implement a resilient software system that can be scaled very easily and quickly. Whereas
buying an actual high-end machine to scale the system is very costly and time-consuming.
Eucalyptus is an open source software platform to build private and hybrid clouds that are compatible with Amazon Web Services (AWS).
It stands for Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems.
We can create our own datacenter in a private cloud by using Eucalyptus. It makes use of pooling the computing and storage resources to scale up
the operations.
In Eucalyptus, we create images of software applications. These images are deployed to create instances. These instances are used for computing
needs.
I. Cloud Controller (CLC) : This is the controller that manages virtual resources like servers, network and storage. It is at the
highest level in hierarchy. It is a Java program with web interface for outside world. It can do resource scheduling as well as
system accounting. There is only one CLC per cloud. It can handle authentication, accounting, reporting and quota management in
cloud.
II. Walrus : This is another Java program in Eucalyptus that is equivalent to AWS S3 storage. It provides persistent storage. It also
contains images, volumes and snapshots similar to AWS. There is only one Walrus in a cloud.
III. Cluster Controller (CC) : It is a C program that is the front end for a Eucalyptus cloud cluster. It can communicate with Storage
controller and Node controller. It manages the instance execution in cloud.
IV. Storage Controller (SC) : It is a Java program equivalent to EBS in AWS. It can interface with Cluster Controller and Node
Controller to manage persistent data via Walrus.
V. Node Controller (NC) : It is a C program that can host a virtual machine instance. It is at the lowest level in Eucalyptus cloud. It
downloads images from Walrus and creates an instance for computing requirements in cloud.
VI. VMWare Broker : It is an optional component in Eucalyptus. It provides AWS compatible interface to VMWare environment.
Amazon Web Services (AWS) provides an important feature called Auto-scaling in the cloud. With Auto-scaling setup we can automatically
provision and start new instances in AWS cloud without any human intervention.
Let us say the load reaches a threshold; we can set up auto-scaling to kick in and start a new server to handle the additional load.
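The threshold idea can be sketched as a simple check. This is illustrative logic only — real AWS Auto-scaling is configured through scaling policies and CloudWatch alarms rather than a script, and the threshold and load values here are assumptions:

```shell
# Illustrative scale-out decision: compare current load against a threshold.
# In real AWS this decision is made by Auto Scaling policies, not a script.
THRESHOLD=75
current_load=80   # e.g. CPU utilization percent, assumed input

if [ "$current_load" -gt "$THRESHOLD" ]; then
  echo "scale out: start a new instance"
else
  echo "load normal: no action"
fi
```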
Utility computing is a cloud service model in which the provider gives computing resources to users on a need basis.
I. Pay per use : Since a user pays only for usage, the cost model of Utility computing is pay per use. We pay for the number of servers or
instances that we use in cloud.
II. Easy to Scale : It is easier to scale up the operations in Utility computing. There is no need to plan for time consuming and costly
hardware purchase.
III. Maintenance : In Utility computing maintenance of servers is done by cloud provider. So a user can focus on its core business. It
need not spend time and resources on maintenance of servers in cloud.
Hypervisor is also known as virtual machine monitor (VMM). It is a computer software/hardware that can create and run virtual machines.
Hypervisor runs on a host machine. Each virtual machine is called Guest machine.
Hypervisor derives its name from term supervisor, which is a traditional name for the kernel of an operating system.
Hypervisor provides a virtual operating platform to the guest operating system. It manages the execution of guest OS.
Examples of Type-1 are: Xen, Oracle VM Server for SPARC, Oracle VM Server for x86, the Citrix XenServer, Microsoft
Hyper-V and VMware ESX/ESXi.
II. Type-2, hosted hypervisors: Type 2 hypervisor runs like a regular computer program on an operating system. The guest
operating system runs like a process on the host machine. It creates an abstract guest operating system different from the host
operating system.
Examples of Type-2 are: VMware Workstation, VMware Player, VirtualBox, Parallels Desktop for Mac and QEMU.
Type-1 Hypervisor has better performance than Type-2 hypervisor because Type-1 hypervisor skips the host operating system and it runs directly
on host hardware. So it can utilize all the resources of host machine.
In cloud computing Type-1 hypervisors are more popular since Cloud servers may need to run multiple operating system images.
CaaS is also known as Communication as a Service. It is available in Telecom domain. One of the examples for CaaS is Voice Over IP (VoIP).
CaaS offers business features like desktop call control, unified messaging, and fax via desktop.
CaaS also provides services for Call Center automation like- IVR, ACD, call recording, multimedia routing and screen sharing.
Since Mobile devices are getting connected to the Internet in large numbers, we often use Cloud computing for Mobile devices.
In mobile applications, there can be sudden increase in traffic as well as usage. Even some applications become viral very soon. This leads to very
high load on application.
In such a scenario, it makes sense to use Cloud Computing for mobile devices.
Also, since mobile devices keep changing over time, the standard interfaces of cloud computing help in handling a wide variety of devices.
One of the main reasons for selecting Cloud architecture is scalability of the system. In case of heavy load, we have to scale up the system so that
there is no performance degradation.
While scaling up the system we have to start new instances. To provision new instances we have to deploy our application on them.
In such a scenario, if we want to save time, it makes sense to automate the deployment process. When new instances are provisioned and deployed automatically in response to load, this is known as Auto-scaling.
With a fully automated deployment process we can start new instances based on automated triggers that are raised by load reaching a threshold.
Amazon provides a wide range of products in Amazon Web Services for implementing Cloud computing architecture. In AWS some of the main
components are as follows:
I. Amazon EC2 : This is used for creating instances and getting computing power to run applications in AWS.
II. Amazon S3 : This is a Simple Storage Service from AWS to store files and media in cloud.
III. Amazon DynamoDB : It is the database solution by AWS in cloud. It can store very large-scale data to meet needs of even
BigData computing.
IV. Amazon Route53 : This is a cloud based Domain Name System (DNS) service from AWS.
V. Amazon Elastic Load Balancing (ELB): This component can be used to load balance the various nodes in AWS cloud.
VI. Amazon CodeDeploy : This service provides feature to automate the code deployment to any instance in AWS.
Google Cloud is a newer alternative to AWS, but it provides many additional features. Some of the main components of
Google Cloud are as follows:
I. Compute Engine : This component provides computing power to Google Cloud users.
II. Cloud Storage : As the name suggests this is a cloud storage solution from Google for storing large files for application use or just
serving over the Internet.
III. Cloud Bigtable : It is Google's proprietary database, available in Google Cloud. Users can use this unique database for creating
their applications.
IV. Cloud Load Balancing : This is a cloud-based load balancing service from Google.
V. BigQuery : It is a data-warehouse solution from Google in Cloud to perform data analytics of large scale.
VI. Cloud Machine Learning Platform : It is a powerful cloud based machine learning product from Google to perform machine
learning with APIs like- Job Search, Text Analysis, Speech Recognition, Dynamic translation etc.
VII. Cloud IAM : This is an Identity and Access management tool from Google to help administrators run the security and
authorization/authentication policies of an organization.
Microsoft is a relatively new entrant to Cloud computing with its Azure cloud offering. Some of the main products of the Microsoft cloud are as follows:
I. Azure Container Service : This is a cloud computing service from Microsoft to run and manage Docker based containers.
III. App Service : By using App Services, users can create Apps for mobile devices as well as websites.
VI. Azure Bot Service : We can use Azure Bot Service to create serverless bots that can be scaled up on demand.
VII. Azure IoT Hub : It is a solution for Internet of Things services in cloud by Microsoft.
These days Cloud Computing is one of the most favorite architecture among organizations for their systems. Following are some of the reasons for
popularity of Cloud Computing architecture:
I. IoT : With the Internet of Things, there are many types of machines joining the Internet and creating various types of interactions. In
such a scenario, Cloud Computing serves well to provide scalable interfaces to communicate between the machines in IoT.
II. Big Data : Another major trend in today’s computing is Big Data. With Big Data there is very large amount of user / machine data
that is generated. Using in-house solution to handle Big Data is very costly and capital intensive. In Cloud Computing we can
handle Big Data very easily since we do not have to worry about capital costs.
III. Mobile Devices : A large number of users are going to Mobile computing. With a mobile device users can access a service from
any location. To handle wide-variety of mobile devices, standard interfaces of Cloud Computing are very useful.
IV. Viral Content : With the growth of Social Media, content and media can go viral, i.e. it takes a very short time for the traffic
on a server to increase exponentially. In such a scenario the Auto-scaling of Cloud Computing architecture can handle such spikes very easily.
142. What are the Machine Learning options from Google Cloud?
Google provides a very rich library of Machine Learning options in Google Cloud. Some of these API are:
I. Google Cloud ML : This is a general purpose Machine Learning API in cloud. We can use pre-trained models or generate new
models for machine learning with this option.
II. Google Cloud Jobs API : It is an API to link Job Seekers with Opportunities. It is mainly for job search based on skills, demand
and location.
III. Google Natural Language API : This API can do text analysis of natural language content. We can use it for analyzing the
content of blogs, websites, books etc.
IV. Google Cloud Speech API : It is a Speech Recognition API from Google to handle spoken text. It can recognize more than 80
languages and their related variants. It can even transcribe the user speech into written text.
V. Google Cloud Translate API : This API can translate content from one language to another language in cloud.
VI. Google Cloud Vision API : It is a powerful API for Image analysis. It can recognize faces and objects in an image. It can even
categorize images in multiple relevant categories with a simple REST API call.
In a Cloud Computing environment we pay by usage, so costs grow directly with usage. To optimize the Cloud Computing
environment we have to keep a balance between our performance needs and our usage costs.
If we are paying for computing instances, we can choose options like Lambda in AWS, which is a much cheaper option for computing in cloud.
In the case of Storage, if the data to be stored is not going to be accessed frequently, we can go for the Glacier option in AWS.
Similarly when we pay for bandwidth usage, it makes sense to implement a caching strategy so that we use less bandwidth for the content that is
accessed very frequently.
It is a challenging task for an architect in cloud to match the options available in cloud with the budget that an organization has to run its
applications.
Optimizations like server-less computing, load balancing, and storage selection can help in keeping the Cloud computing costs low with no
degradation in User experience.
Yes, in Cloud Computing we are using resources that are owned by the Cloud provider. Due to this our data resides on the servers that can be
shared by other users of Cloud.
There are regulations and laws for handling user data. We have to ensure that these regulations are met while selecting and implementing a Cloud
computing strategy.
Similarly, if we are in a contract with a client to provide certain Service Level Agreement (SLA) performance, we have to implement the cloud
solution in such a way that there is no breach of SLA agreement due to Cloud provider’s failures.
For security there are laws that have to be followed irrespective of Cloud or Co-located Data center. This is in the interest of our end-customer as
well as for the benefit of business continuity.
With Cloud computing architecture we have to do due diligence in selecting Security and Encryption options in Cloud.
Unix Questions
145. How will you remove all files in current directory? Including
the files that are two levels down in a sub-directory.
In Unix we have the rm command to remove files and sub-directories. The rm command has a -r option that stands for recursive. The -r option
can delete all files in a directory recursively.
My_dir
My_dir/Level_1_dir
My_dir/Level_1_dir/Level_2_dir
My_dir/Level_1_dir/Level_2_dir/a.txt
With the rm -r * command, run inside My_dir, we can delete the file a.txt as well as the sub-directories Level_1_dir and Level_2_dir.
Command:
rm -r *
The asterisk (*) is a wild card character that stands for all the files with any name.
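The example layout can be recreated and cleaned up as follows (directory and file names are taken from the example above):

```shell
# Recreate the example layout.
mkdir -p My_dir/Level_1_dir/Level_2_dir
touch My_dir/Level_1_dir/Level_2_dir/a.txt

# Remove everything under My_dir recursively.
cd My_dir
rm -r *     # deletes Level_1_dir, Level_2_dir and a.txt
ls          # prints nothing: the directory is now empty
cd ..
```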
146. What is the difference between the –v and –x options in Bash shell
scripts?
In a BASH Unix shell we can specify the options -x and -v at the top of a script as follows:
#!/bin/bash -xv
With the -x option, the BASH shell will echo commands like for, select, case etc. after substituting the arguments and variables. So it shows the
expanded form of each command, revealing all the actions of the script. It is very useful for debugging a shell script.
With the -v option, the BASH shell will echo every command before substituting the values of arguments and variables. With the -v option, the shell prints each
line as it reads it.
With the -v option, if we run the script, the shell prints the script lines and then executes them. If we run the script interactively, it shows each command after we
press enter.
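A quick way to see the difference is to run a one-line command under each option:

```shell
# -v echoes each line as read (before expansion), then runs it.
bash -v -c 'msg=hello; echo $msg'   # prints the raw line "msg=hello; echo $msg", then "hello"

# -x echoes each command after expansion, prefixed with "+".
bash -x -c 'msg=hello; echo $msg'   # prints "+ msg=hello", "+ echo hello", then "hello"
```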
In Unix there are many Filter commands like cat, awk, grep, head, tail, cut etc.
A Filter is a software program that takes an input and produces an output, and it can be used in a stream operation.
We can mix and match multiple filters to create a complex command that can solve a problem.
Awk and Sed are complex filters that provide fully programmable features.
Even Data scientists use Unix filters to get the overview of data stored in the files.
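As a small illustration of mixing filters, a stream can be sorted, de-duplicated and counted in one pipeline:

```shell
# Count the unique words in a stream by chaining three filters:
# sort groups duplicates, uniq removes them, wc -l counts what remains.
printf 'apple\nbanana\napple\ncherry\n' | sort | uniq | wc -l   # prints 3
```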
A Kernel is the main component that can control everything within Unix OS.
It is the first program that is loaded on startup of Unix OS. Once it is loaded it will manage the rest of the startup process.
Kernel manages memory, scheduling as well as communication with peripherals like printers, keyboards etc.
But Kernel does not directly interact with a user. For a new task, Kernel will spawn a shell and user will work in a shell.
Kernel provides many system calls. A software program interacts with Kernel by using system calls.
Kernel has a protected memory area that cannot be overwritten accidentally by any process.
Shell in Unix is a user interface that is used by a user to access Unix services.
Generally a Unix Shell is a command line interface (CLI) in which users enter commands by typing or uploading a file.
We use a Shell to run different commands and programs on Unix operating system.
A Shell also has a command interpreter that can take our commands and send these to be executed by Unix operating system.
Some of the popular Shells on Unix are: Korn shell, BASH, C shell etc.
150. What are the different shells in Unix that you know about?
We use ls -l command to list the files and directories in a directory. With -l option we get long listing format.
In this format the first character identifies the entry type. The entry type can be one of the following:
In a Multi-tasking environment, same user can submit more than one tasks and operating system will execute them at the same time.
In a Multi-user environment, more than one user can interact with the operating system at the same time.
Each Inode has a number that is used in the index table. Unix kernel uses Inode number to access the contents of an Inode.
154. What is the difference between absolute path and relative path in
Unix file system?
Absolute path is the complete path of a file or directory from the root directory. In general root directory is represented by / symbol. If we are in a
directory and want to know the absolute path, we can use pwd command.
E.g. In a directory structure /var/user/kevin/mail if we are in kevin directory then pwd command will give absolute path as /var/user/kevin.
The absolute path of the mail folder is /var/user/kevin/mail. From the kevin folder, ./mail is the relative path of the mail directory.
1. Program Execution: A shell is responsible for executing the commands and script files in Unix. User can either interactively enter the commands
in Command Line Interface called terminal or they can run a script file containing a program.
2. Environment Setup: A shell can define the environment for a user. We can set many environment variables in a shell and use the value of these
variables in our program.
3. Interpreter: A shell acts as an interpreter for our scripts. It has a built in programming language that can be used to implement the logic.
4. Pipeline: A shell can also hook up a pipeline of commands. When we run multiple commands separated by the | pipe character, the shell takes the
output of one command and passes it to the next one in the pipeline.
5. I/O Redirection: Shell is also responsible for taking input from command line interface (CLI) and sending the output back to CLI. We use >, <,
>> characters for this purpose.
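The redirection operators can be sketched in a few lines (the file name notes.txt is illustrative):

```shell
# > creates/overwrites a file, >> appends, < feeds a file to stdin.
echo "first line" > notes.txt     # create or overwrite notes.txt
echo "second line" >> notes.txt   # append a second line
wc -l < notes.txt                 # reads notes.txt from stdin; prints 2
```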
To use a Shell variable in a script we use $ sign in front of the variable name.
157. What are the important Shell variables that are initialized on starting
a Shell?
There are following important Shell variables that are automatically initialized when a Shell starts:
user: the login name of the current user
term: the type of terminal being used
home: the home directory of the user
path: the list of directories searched for commands
These Shell variables take values from environment variables.
If we change the value of these Shell variables then the corresponding environment variable value is also changed.
158. How will you set the value of Environment variables in Unix?
We can use the 'setenv' command (in the C shell) to set the value of environment variables.
E.g. % setenv [Name] [value]
% setenv MAX_TIME 10
If we just use printenv then it lists all the environment variables and their values.
To use an environment variable in a command we use the prefix $ with the name of variable.
What is the special rule about Shell and Environment variable in Bourne Shell?
In Bourne Shell, there is not much difference between Shell variable and Environment variable.
Once we start a Bourne Shell, it gets the value of environment variables and defines a corresponding Shell variable. From that time onwards the
shell only refers to Shell variable. But if a change is made to a Shell variable, then we have to explicitly export it to environment so that other shell
or child processes can use it.
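The export behavior can be demonstrated by checking whether a child shell sees the variable before and after export (the variable name is illustrative):

```shell
# A plain shell variable is not visible to child processes until exported.
GREETING=hello
bash -c 'echo "child sees: ${GREETING:-nothing}"'   # prints: child sees: nothing

export GREETING
bash -c 'echo "child sees: ${GREETING:-nothing}"'   # prints: child sees: hello
```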
159. What is the difference between a System Call and a library function?
System calls are low-level kernel calls. They are handled by the kernel and implemented in the kernel of Unix. An application has to
execute a special hardware- and system-dependent instruction to make a System call.
A library function, on the other hand, is implemented in user space. A library call is a regular function call whose code resides in a shared
library.
160. What are the networking commands in Unix that you have used?
Some of the popular networking commands in Unix that we use are as follows:
I. ping : We use this command to test the reachability of a host on an Internet Protocol (IP) network.
II. telnet : This is another useful command to access another machine on the network. This command uses the Telnet protocol.
III. traceroute : (tracert on Windows) It is a diagnostic command to display the route and transit delays of packets across an Internet
Protocol network.
IV. ftp : We use ftp commands to transfer files over the network. ftp uses File Transfer Protocol.
V. su : This unix command is used to execute commands with the privileges of another user. It is also known as switch user, substitute
user.
VI. ssh : This is a secure command that is preferred over Telnet for connecting to another machine. It creates a secure channel over an
unsecured network. It uses cryptographic protocol to make the communication secure.
A Pipeline in Unix is a chain of commands that are connected through a stream in such a way that output of one command becomes input for
another command.
For example:
ls -l | grep "abc" | wc -l
In the above example we have created a pipeline of three commands: ls, grep and wc.
First the ls -l command is executed and gives the list of files in a directory. Then the grep command searches for any line with the word "abc" in it. Finally the wc
-l command counts the number of lines that are returned by the grep command.
In general a Pipeline is uni-directional. The data flows from left to right direction.
We can use tee command to split the output of a program so that it is visible on command line interface (CLI) as well as stored on a file for later
use.
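A minimal tee example (log.txt is an illustrative file name):

```shell
# tee duplicates the stream: the line goes to stdout and to log.txt.
echo "build finished" | tee log.txt   # prints "build finished"
cat log.txt                           # prints "build finished" again
```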
163. How will you count the number of lines and words in a file in Unix?
We can use wc (word count) command for counting the number of lines and words in a file. The wc command provides very good options for
collecting statistics of a file. Some of these options are:
In case we give more than one file as input to the wc command, it gives statistics for individual files as well as total statistics for all the files.
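For example, with an illustrative two-line file:

```shell
# -l counts lines, -w counts words, -c counts bytes.
printf 'one two\nthree\n' > sample.txt
wc -l sample.txt   # reports 2 lines
wc -w sample.txt   # reports 3 words
```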
Bash stands for Bourne Again Shell. It is free software written to replace Bourne shell.
We can use grep command to search for a name or any text in a Unix file.
Grep command can search for a text in one file as well as multiple files.
% grep ^z *.txt
Above command searches for lines starting with letter z in all the .txt files in current directory.
In Unix, grep is one of the very useful commands. It provides many useful options. Some of the popular options are:
% grep -v : We use this option to find the lines that do not contain the text we are searching for.
% grep -A 10 : This option displays 10 lines after the match is found.
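A short demonstration of grep and its options on an illustrative file:

```shell
# Build a small file, then match, invert-match (-v) and count (-c).
printf 'zebra\napple\nzoo\n' > words.txt
grep '^z' words.txt      # prints zebra and zoo
grep -v '^z' words.txt   # prints apple
grep -c '^z' words.txt   # prints 2
```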
Both the commands whoami and who am i are used to get the user information in Unix.
When we login as root user on the network, then both whoami and who am i commands will show the user as root.
But when any other user, let us say john, logs in remotely and runs su - root, whoami will show root, but who am i will show the original user john.
Superuser is a special user account. It is used for Unix system administration. This user can access all files on the file system. The Superuser can
also run any command on the system.
Most of the users work on their own user accounts. But when they need to run some additional commands, they can use su to switch to Superuser
account.
169. How will you check the information about a process in Unix?
We can use ps command to check the status of a process in Unix. It is short for Process Status.
On running ps command we get the list of processes that are executing in the Unix environment.
Generally we use the ps -ef command. In this, e stands for every process and f stands for full format.
This command gives us id of the process. We can use this id to kill the process.
If a file is very big, its contents will not fit on one screen. The screen keeps scrolling forward, and in the end we see only the last page of information from the file.
With the more command we can pause this scrolling. If we pipe the output of cat to more, we first see just the first page of the file. On pressing the enter key, more advances one page at a time. In this way it is easier to view the information in a file.
When using the cat command to display file contents, large data that does not fit on the screen would scroll off without pausing, therefore making it
difficult to view. On the other hand, using the more command is more appropriate in such case because it will display file contents one screen page
at a time.
The combination of these three sets (owner, group, others) specifies the permissions of a file in Unix.
E.g. If a file has permissions -rwxr-xr--, it means that the owner has read, write and execute access, the group has read and execute access, and others have
just read access. So the owner or admin has to specifically grant access to others to execute the file.
172. We wrote a shell script in Unix but it is not doing anything. What
could be the reason?
After writing a shell script we have to give it execute permission so that it can be run in Unix shell.
We can use chmod command to change the permission of a file in Unix. In general we use chmod +x to give execute permission to users for
executing the shell script.
E.g. chmod +x abc.txt will give execute permission to users for executing the file abc.txt.
With the chmod command we can also specify to which user/group the permission should be granted. The options are: u (owner), g (group), o (others) and a (all).
We use chmod command to change the permissions of a file in Unix. In this command we can pass the file permissions in the form of a three-digit
number.
In this number 755, first digit 7 is the permissions given to owner, second digit 5 is the permissions of group and third digit 5 is the permissions of
all others.
In our example, 755 means the owner has read, write and execute permissions, while group and others have read and execute permissions.
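For example:

```shell
touch script.sh
chmod 755 script.sh   # rwxr-xr-x: owner rwx, group r-x, others r-x
ls -l script.sh
chmod 644 script.sh   # rw-r--r--: a common setting for plain data files
ls -l script.sh
```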
178. How can we run a process in background in Unix? How can we kill a
process running in background?
Once we use the & option, the process runs in the background and the shell prints its process ID. We can note down this process ID for use in the kill command.
We can also use the ps -ef command to get the process IDs of processes running in the background.
Once we know the process ID of a process we can kill it by following command:
% kill -9 processId
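Putting these steps together (sleep 300 stands in here for any long-running command):

```shell
# & runs the command in background; the shell stores its
# process ID in the special variable $!.
sleep 300 &
bgpid=$!
echo "background PID: $bgpid"
kill -9 "$bgpid"   # force-terminate the background process
```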
We can create a file with the Vi editor, cat or any other command. Once the file is created we have to give it read-only permissions. To change the file
permission to read only we can use the chmod command.
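For example, using chmod with a numeric mode (the file name notes.txt is illustrative):

```shell
echo "some content" > notes.txt
chmod 444 notes.txt   # r--r--r--: read-only for owner, group and others
ls -l notes.txt
```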
We use alias in Unix to give a short name to a long command. We can even use it to combine multiple commands under one short, convenient
name.
E.g. % alias c='clear'
With this alias we just need to type c to run the clear command.
To get the list of all active alias in a shell we can run the alias command without any argument on command line.
% alias
alias h='history'
alias ki='kill -9'
alias l='last'
In Unix we can redirect the output of a command or operation to a file instead of the command line interface (CLI). For this we use the redirection operators > and >>. The > operator overwrites the target file, while >> appends to it.
If we want to append the contents of one file at the end of another file we use following:
% cat srcFile >> appendToFile
182. What are the main steps taken by a Unix Shell for processing a
command?
I. Parse : The first step is to parse the command or set of commands given on the Command Line Interface (CLI). In this step multiple
consecutive spaces are replaced by a single space. Multiple commands that are delimited by a separator symbol (such as ;) are divided into multiple
individual actions.
II. Variable Substitution : In the next step, the shell identifies the variables mentioned in the commands. Generally any word prefixed by a $ sign is a variable.
III. Command Substitution : In this step, Shell executes the commands that are surrounded by back quotes and replaces that section
with the output from the command.
IV. Wild Card : Once these steps are done, Shell replaces the Wild card characters like asterisk * with the relevant substitution.
V. Execute : Finally, Shell executes all the commands and follows the sequence in which Commands are given in CLI.
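Steps II to IV can be seen together on a single command line; for example:

```shell
touch demo.txt
name="world"
# $name is replaced by variable substitution; `date` is replaced
# by command substitution before echo runs.
echo "Hello $name, today is `date`"
# *.txt is expanded by wild card substitution: demo.txt (and any
# other .txt files in the current directory) appear in the listing.
ls *.txt
```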
Sometimes when we give write permission to another user then that user can delete the file without the owner knowing about it. To prevent such an
accidental deletion of file we use sticky bit.
When we mark a file/directory with a sticky bit, no user other than the owner of the file/directory (and root) gets the privilege to delete it.
To set the sticky bit we use following command:
% chmod +t filename
When we do ls for a file or directory, the entries with sticky bit are listed with letter t in the end of permissions.
E.g. % ls -lrt
184. What are the different outputs from Kill command in Unix?
EPERM denotes that the system does not permit the process to be killed.
ESRCH denotes that the process with the PID mentioned in the kill command does not exist anymore, or that due to security restrictions we
cannot access that process.
In Unix, almost all the popular shells provide options to customize the environment by using environment variables. To make these customizations
permanent we can write these to special files that are specific to a user in a shell.
Once we write our customizations to these files, we keep on getting same customization when we open a new shell with same user account.
The special files for storing customization information for different shells at login time are:
I. C shell: /etc/.login or ~/.cshrc
II. TC shell: /etc/.login or ~/.tcshrc
III. Korn shell: /etc/ksh.kshrc or ~/.kshrc
IV. Bash: ~/.bash_profile
186. What are the popular commands for user management in Unix?
I. id : This command gives the active user id with login and groups to which user belongs.
II. who : This command gives the user that is currently logged on system. It also gives the time of login.
III. last : This command shows the previous logins to the system in a chronological order.
A shell script is a program that can be executed in Unix shell. Sometimes a shell script does not work as intended. To debug and find the problem
in shell script we can use the options provided by shell to debug the script.
In bash shell there are x and v options that can be used while running a script.
With option v all the input lines are printed by the shell. With option x all the simple commands are printed in expanded format. We can see all the
arguments passed to a command with the -x option.
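A quick sketch (demo.sh is a throwaway script created for the demo):

```shell
cat > demo.sh <<'EOF'
total=$((2 + 3))
echo "total=$total"
EOF
sh -x demo.sh   # -x traces each command with its expanded arguments
sh -v demo.sh   # -v echoes every input line as the shell reads it
```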
Sometimes a child process terminates in Unix, but its parent has not yet read its exit status with the wait() system call. Such a terminated-but-not-reaped process is called a Zombie process.
A Zombie process is different from an Orphan process. An orphan process is a child process whose parent process had died. Once a process is
orphan it is adopted by init process. So effectively it is not an orphan.
Therefore if a process exits without cleaning its child processes, they do not become Zombie. Instead init process adopts these child processes.
Zombie processes are the ones that are not yet adopted by init process.
We can use one of the networking commands in Unix. It is called ping. With ping command we can ping a remote host.
Ping utility sends packets in an IP network with ICMP protocol. Once the packet goes from source to destination and comes back it records the
time.
We can even specify the number of packets we want to send so that we collect more statistics to confirm the result.
% ping www.google.com
190. How will you get the last executed command in Unix?
We can use the history command to get the list of commands that were executed in Unix. Since we are only interested in the last executed command, we
pipe it to tail to get the last entry, e.g. history | tail -1.
We can use “2>&1” in a command so that all the errors from standard error go to standard output.
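For example (out.log is an arbitrary file name):

```shell
touch exists.txt
# Without 2>&1 the "No such file" error would still reach the
# terminal; with it, both streams end up in out.log.
ls exists.txt /nonexistent > out.log 2>&1
cat out.log
```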
192. How will you find which process is taking most CPU time in Unix?
In Unix, we can use top command to list the CPU time and memory used by various processes. The top command lists the process IDs and CPU
time, memory etc used by top most processes.
Top command keeps refreshing the screen at a specified interval. So we can see over the time which process is always appearing on the top most
row in the result of top command.
193. What is the difference between Soft link and Hard link in Unix?
A soft link is a pointer to a file, directory or a program located in a different location. A hard link can point to a program or a file but not to a
directory.
If we move, delete or rename a file, the soft link will be broken. But a hard link still remains after moving the file/program.
We use the command ln -s for creating a soft link. A hard link can be created by the ln command without the -s option.
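The difference is easy to demonstrate:

```shell
echo "data" > original.txt
ln -s original.txt soft.txt   # soft (symbolic) link
ln original.txt hard.txt      # hard link
mv original.txt moved.txt     # rename the original file
cat hard.txt                  # still prints "data"
cat soft.txt                  # fails: the soft link is now broken
```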
194. How will you find which processes are using a file?
We can use lsof command to find the list of Process IDs of the processes that are accessing a file in Unix.
In Unix, nohup command can be used to run a command in background. But it is different from & option to run a process in background.
Nohup stands for No Hangup. A nohup process does not stop even if the Unix user that started the process has logged out from the system.
But the process started with option & will stop when the user that started the process logs off.
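For example (long_job.sh is a hypothetical script standing in for any long-running job):

```shell
# Survives logout; output goes to nohup.out unless redirected.
nohup ./long_job.sh &
# For comparison, this job receives SIGHUP and stops when the
# login session that started it ends:
./long_job.sh &
```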
196. How will you remove blank lines from a file in Unix?
We can use the grep command for this. Grep provides the -v option to exclude the lines that match a pattern.
In an empty line there is nothing from start to end. In grep, ^ denotes the start of a line and $ denotes the end of a line.
% grep -v '^$' file.txt lists the lines of file.txt that are not empty.
Once we get this result, we can use the > operator to write the output to a new file. So the exact command will be:
% grep -v '^$' file.txt > newfile.txt
197. How will you find the remote hosts that are connecting to your
system on a specific port in Unix?
We can use netstat command for this purpose. Netstat command lists the statistics about network connections. We can grep for the port in which
we are interested.
Exact command will be:
% netstat -a | grep "port number"
We use xargs command to build and execute commands that take input from standard input. It is generally used in chaining of commands.
Xargs breaks the list of arguments into small sub lists that can be handled by a command.
% find /path -type f | xargs rm
The above command uses find to get the list of all files in the /path directory. Then xargs passes this list to the rm command so that they can be
deleted.
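A concrete sketch (the scratch directory is created just for the demo):

```shell
mkdir -p scratch
touch scratch/a.tmp scratch/b.tmp
# find produces the file list; xargs packs it onto rm's command line.
find scratch -name '*.tmp' | xargs rm
ls scratch   # the directory is empty now
```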
Thanks!!!
TOP 250+ Interviews Questions on AWS
Answer:AWS stands for Amazon Web Services. AWS is a platform that provides on-demand
resources for hosting web services, storage, networking, databases and other resources over the
internet with a pay-as-you-go pricing.
Answer: EC2 – Elastic Compute Cloud, S3 – Simple Storage Service, Route53, EBS – Elastic Block
Store, CloudWatch and Key-Pairs are a few of the components of AWS.
Answer:Key-pairs are secure login information for your instances/virtual machines. To connect to the
instances we use key-pairs that contain a public-key and private-key.
Answer:S3 stands for Simple Storage Service. It is a storage service that provides an interface that
you can use to store any amount of data, at any time, from anywhere in the world. With S3 you pay
only for what you use and the payment model is pay-as-you-go.
• On-demand
• Reserved
• Spot
• Scheduled
• Dedicated
Answer:EBS stands for Elastic Block Stores. They are persistent volumes that you can attach to the
instances. With EBS volumes, your data will be preserved even when you stop your instances, unlike
your instance store volumes where the data is deleted when you stop the instances.
• General purpose
• Provisioned IOPS
• Magnetic
• Cold HDD
• Throughput optimized
• General purpose
• Compute Optimized
• Storage Optimized
• Memory Optimized
• Accelerated Computing
Answer: Auto scaling allows you to automatically scale-up and scale-down the number of instances
depending on the CPU utilization or memory utilization. There are 2 components in Auto scaling, they
are Auto-scaling groups and Launch Configuration.
Answer: Reserved instances let you reserve a fixed capacity of EC2 instances. With reserved
instances you get into a contract of 1 year or 3 years.
Q12)What is an AMI?
Answer: AMI stands for Amazon Machine Image. AMI is a template that contains the software
configurations, launch permission and a block device mapping that specifies the volume to attach to
the instance when it is launched.
Answer: Cloudwatch is a monitoring tool that you can use to monitor your various AWS resources.
Like health check, network, Application, etc.
Answer: There are 2 types in cloudwatch. Basic monitoring and detailed monitoring. Basic monitoring
is free and detailed monitoring is chargeable.
Q16) What are the cloudwatch metrics that are available for EC2 instances?
Q17) What is the minimum and maximum size of individual objects that you can store in S3
Answer: The minimum size of individual objects that you can store in S3 is 0 bytes and the maximum
bytes that you can store for individual objects is 5TB.
Answer: Glacier is the back up or archival tool that you use to back up your data in S3.
Answer: There are two ways that you can control the access to your S3 buckets,
Answer: You can encrypt the data by using the below methods,
• Storage used
• Number of requests you make
• Storage management
• Data transfer
• Transfer acceleration
Q24) What is the pre-requisite to work with Cross region replication in S3?
Answer: You need to enable versioning on both source bucket and destination to work with cross
region replication. Also both the source and destination bucket should be in different region.
Answer: Roles are used to provide permissions to entities that you trust within your AWS account.
Roles can also grant access to users in another account. Roles are similar to users, but with roles you do not need to create
any username and password to work with the resources.
Q26) What are policies and what are the types of policies?
Answer: Policies are permissions that you can attach to the users that you create. These policies will
contain that access that you have provided to the users that you have created. There are 2 types of
policies.
• Managed policies
• Inline policies
Answer: Cloudfront is an AWS web service that provides businesses and application developers an
easy and efficient way to distribute their content with low latency and high data transfer speeds.
Cloudfront is the content delivery network of AWS.
Answer: Edge location is the place where the contents will be cached. When a user tries to access
some content, the content will be searched in the edge location. If it is not available then the content
will be made available from the origin location and a copy will be stored in the edge location.
Q29) What is the maximum individual archive that you can store in glacier?
Answer: VPC stands for Virtual Private Cloud. VPC allows you to easily customize your networking
configuration. VPC is a network that is logically isolated from other network in the cloud. It allows
you to have your own IP address range, subnets, internet gateways, NAT gateways and security
groups.
Answer: VPC peering connection allows you to connect 1 VPC with another VPC. Instances in these
VPC behave as if they are in the same network.
Answer: NAT stands for Network Address Translation. NAT gateways enables instances in a private
subnet to connect to the internet but prevent the internet from initiating a connection with those
instances.
Answer: You can use security groups and NACL (Network Access Control List) to control the
security to your
VPC.
Q34) What are the different types of storage gateway?
• File gateway
• Volume gateway
• Tape gateway
Answer: Snowball is a data transport solution that uses secure appliances to transfer large amounts of
data into and out of AWS. Using Snowball, you can move huge amounts of data from one place to
another, which reduces your network costs and long transfer times, and also provides better security.
• Aurora
• Oracle
• MYSQL server
• Postgresql
• MariaDB
• SQL server
Answer: Amazon redshift is a data warehouse product. It is a fast and powerful, fully managed,
petabyte scale data warehouse service in the cloud.
• Simple routing
• Latency routing
• Failover routing
• Geolocation routing
• Weighted routing
• Multivalue answer
Answer: Multi-AZ (Availability Zone) RDS allows you to have a replica of your production database
in another availability zone. Multi-AZ (Availability Zone) database is used for disaster recovery. You
will have an exact copy of your database. So when your primary database goes down, your application
will automatically failover to the standby database.
• Automated backups
• Manual backups which are known as snapshots.
Q44) What is the difference between security groups and network access control list?
Answer:
Security Groups:
• Can control the access at the instance level
• Can add rules for “allow” only
• Evaluates all rules before allowing the traffic
• Can assign unlimited number of security groups
• Stateful filtering

Network Access Control List:
• Can control access at the subnet level
• Can add rules for both “allow” and “deny”
• Rules are processed in order number when allowing traffic
• Can assign up to 5 security groups
• Stateless filtering
Q45) What are the types of load balancers in EC2?
Answer: ELB stands for Elastic Load balancing. ELB automatically distributes the incoming
application traffic or network traffic across multiple targets like EC2, containers, IP addresses.
Q47) What are the two types of access that you can provide when you are creating users?
Answer: Following are the two types of access that you can create.
• Programmatic access
• Console access
Answer: A security group acts as a firewall that controls the traffic for one or more instances. You can
associate one or more security groups with your instances when you launch them. You can add rules to
each security group that allow traffic to and from its associated instances. You can modify the rules of
a security group at any time; the new rules are automatically and immediately applied to all the
instances that are associated with the security group.
Answer: Shared AMIs are AMIs that are created by one developer and made available for other
developers to use.
Q51)What is the difference between the classic load balancer and application load balancer?
Answer: Dynamic port mapping and multiple listeners on multiple ports are supported by the Application Load
Balancer; one port with one listener is what the Classic Load Balancer provides.
Answer: 5
Answer: Remove IGW & add NAT Gateway, Associate subnet in Private route table
Q55) Is it possible to reduce an EBS volume?
Answer: No, it's not possible; we can increase an EBS volume but not reduce it.
Answer: These are IPv4 addresses which are used to connect to an instance from the internet. They are
charged when they are not attached to a running instance.
Q57) One of my s3 is bucket is deleted but i need to restore is there any possible way?
Q58) When I try to launch an EC2 instance I am getting a "Service limit exceeded" error. How do I fix the
issue?
Answer: By default AWS offers a service limit of 20 running instances per region. To fix the issue, we
need to contact AWS support to increase the limit based on the requirement.
Q59) I need to modify the EBS volumes in Linux and Windows. Is it possible?
Answer: Yes, it's possible. From the console, use the Modify Volume option and give the size you need. Then, for
Windows, extend the disk in Disk Management; for Linux, grow and remount the filesystem to complete the modification.
Answer: Yes, it's possible to stop RDS instances that are non-production and not Multi-AZ.
Q61) What is meant by parameter groups in RDS, and what is their use?
Answer: Since RDS is a managed service, AWS exposes a wide set of database parameters through parameter
groups, which can be modified as per requirement.
Q62) What is the use of tags and how are they useful?
Answer: Tags are used for identifying and grouping AWS resources.
Q63) I am viewing the AWS Console but am unable to launch an instance; I receive an IAM error.
How can I rectify it?
Answer: The AWS user does not have the required permissions; the user needs to be granted permissions to use the service.
Q64) I don't want my AWS account id to be exposed to users. How can I avoid it?
Answer: In the IAM console there is a sign-in URL option where I can replace the AWS account id with
an account alias of my own.
Q66) You have enabled sticky sessions with ELB. What does it do with your instance?
Answer: Binds the user session with a specific instance
Q67) Which type of load balancer makes routing decisions at either the transport layer or the
Q68) Which is virtual network interface that you can attach to an instance in a VPC?
Q69) You have launched a Linux instance in AWS EC2. While configuring the security group, you
have selected the SSH, HTTP and HTTPS protocols. Why do we need to select SSH?
Answer: So that there is a rule that allows SSH traffic from your computer to the EC2 instance.
Q70) You have chosen a windows instance with Classic and you want to make some change to
the
Q71) Load Balancer and DNS service comes under which type of cloud service?
Answer: IAAS-Storage
Q72) You have an EC2 instance that has an unencrypted volume. You want to create another
encrypted volume from this unencrypted volume. Which of the following steps can achieve
this?
Answer: Create a snapshot of the unencrypted volume (applying encryption parameters), copy the
snapshot, and create a volume from the copied snapshot.
Q73) Where does the user specify the maximum number of instances with the auto scaling
commands?
Q75) After configuring ELB, you need to ensure that the user requests are always attached to a
single instance. What setting can you use?
Q76) When do I prefer Provisioned IOPS over standard RDS storage?
Answer: If you have batch-oriented workloads.
Q77) If I am running a Multi-AZ deployment on my DB instance, can I use the standby
DB instance for read or write operations along with the primary DB instance?
Q78) Which AWS services will you use to collect and process e-commerce data for
near real-time analysis?
Q79) A company is deploying a new two-tier web application in AWS. The company has
limited staff and requires high availability, and the application requires complex queries
and table joins. Which configuration provides the solution for the company's requirements?
Q80) Which use cases are suitable for Amazon DynamoDB?
Answer: Storing metadata for Amazon S3 objects, and running relational joins and
complex updates.
Q81) Your application has to retrieve data from your users' mobile devices every 5 minutes,
and the data is stored in DynamoDB. Later, every day at a particular time, the data is
extracted into S3 on a per-user basis, and your application is then used to visualize the
data to the user. You are asked to optimize the architecture of the backend system to
lower cost. What would you recommend?
Answer: Introduce Amazon ElastiCache to cache reads from the Amazon DynamoDB table and
reduce the provisioned read throughput.
Q82) You are running a website on EC2 instances deployed across multiple Availability
Zones with a Multi-AZ RDS MySQL Extra Large DB instance. The site performs a high
number of small reads and writes per second and relies on the eventual consistency
model. After comprehensive tests you discover that there is read contention on RDS
MySQL. Which are the best approaches to meet these requirements?
Answer: Deploy an ElastiCache in-memory cache running in each Availability Zone, then
increase the RDS MySQL instance size and implement provisioned IOPS.
Q83) A startup is running a pilot deployment of around 100 sensors to measure street
noise and air quality in urban areas for 3 months. It was noted that every month
around 4GB of sensor data is generated. The company uses a load-balanced, auto-
scaled layer of EC2 instances and an RDS database with 500 GB of standard storage. The pilot
was a success and now they want to deploy at least 100K sensors, which need to be
supported by the backend. You need to store the data for at least 2 years to analyze it. Which
of the following setups would you prefer?
Answer: Replace the RDS instance with a 6-node Redshift cluster with 96TB of storage.
Q84) Suppose you have an application where you have to render images and also do
some general computing. Which service will best fit your need?
Answer: Use an Application Load Balancer.
Q85) How will you change the instance type for the instances which are running in your
application tier and using Auto Scaling? Where will you change it?
Q86) You have a content management system running on an Amazon EC2 instance that is
approaching 100% CPU utilization. Which option will reduce the load on the Amazon EC2
instance?
Answer: Create a load balancer, and register the Amazon EC2 instance with it.
Q87) What does Connection Draining do?
Answer: It re-routes traffic away from instances which are to be updated or have failed a health check.
Q88) When an instance is unhealthy, it is terminated and replaced with a new one.
Which of the services does that?
Q89) What are lifecycle hooks used for in Auto Scaling?
Answer: They are used to put an additional wait time on scale-in or scale-out events.
Q90) A user has set up an Auto Scaling group. Due to some issue, the group has failed to
launch a single instance for more than 24 hours. What will happen to Auto Scaling in
this condition?
Q91) You have an EC2 Security Group with several running EC2 instances. You
changed the Security Group rules to allow inbound traffic on a new port and protocol,
and then launched several new instances in the same Security Group. Will the new rules
apply?
Q92) To create a mirror image of your environment in another region for disaster
recovery, which of the following AWS resources do not need to be recreated in the second
region?
Q93) A customer wants to capture all client connection information from his load
balancers at an interval of 5 minutes. Which option should he choose for his
application?
Answer: Enable AWS CloudTrail for the load balancers.
Q94) Which of the following services would you not use to deploy an app?
Answer: Lambda is not used to deploy an app.
Q96) I created a key in the Oregon region to encrypt my data in the North Virginia region for
security purposes. I added two users to the key and an external AWS account. I wanted to
encrypt an object in S3, but when I tried, the key that I had just created was not listed. What
could be the reason and the solution?
Q98) An organization that is currently using consolidated billing has recently acquired
another company that already has a number of AWS accounts. How could an administrator
ensure that all AWS accounts, from both the existing company and the acquired company, are
billed to a single account?
Answer: Invite the acquired company's AWS accounts to join the existing company's
organization using AWS Organizations.
Q99) A user has created an application which will be hosted on EC2. The application
makes calls to DynamoDB to fetch certain data. The application uses the DynamoDB
SDK to connect from the EC2 instance. Which is the best practice for security in this
scenario?
Answer: The user should attach an IAM role with DynamoDB access to the EC2 instance.
Q100) You have an application running on an EC2 instance which allows users to
download files from a private S3 bucket using a pre-signed URL. Before generating the
URL, the application should verify the existence of the file in S3. How should the application
use AWS credentials to access the S3 bucket securely?
Answer: Create an IAM role for EC2 that allows list access to objects in the S3 bucket. Launch
the instance with this role, and retrieve the role's credentials from the EC2 instance metadata.
Q101) You use Amazon CloudWatch as your primary monitoring system for your
web application. After a recent software deployment, your users are getting
intermittent 500 Internal Server Errors when using the web application. You want
to create a CloudWatch alarm and notify the on-call engineer when these occur. How can
you accomplish this using AWS services?
Answer: Create a CloudWatch Logs group and define metric filters that capture
500 Internal Server Errors. Set a CloudWatch alarm on the metric, and use
Amazon Simple Notification Service to notify the on-call engineer when the
CloudWatch alarm is triggered.
Q102) You are designing a multi-platform web application for AWS. The application will
run on EC2 instances and will be accessed from PCs, tablets and smartphones. The
supported platforms are Windows, macOS, iOS and Android. Separate
sticky-session and SSL certificate setups are required for the different platform types.
Which option describes the most cost-effective and performance-efficient architecture
setup?
Answer: Assign multiple ELBs to an EC2 instance or group of EC2 instances running the common
component of the web application, one ELB for each platform type. Session
stickiness and SSL termination are done at the ELBs.
Q103) You are migrating a legacy client-server application to AWS. The application responds
to a specific DNS-visible domain (e.g. www.example.com) and has a 2-tier architecture, with
multiple application servers and a database server. Remote clients use TCP to connect
to the application servers. The application servers need to know the IP address of the clients in
order to function properly and are currently taking that information from the TCP socket.
A Multi-AZ RDS MySQL instance will be used for the database. During the migration you can change
the application code, but you have to file a change request. How would you implement the
architecture on AWS in order to maximize scalability and high availability?
Answer: File a change request to implement Proxy Protocol support in the application. Use an
ELB with a TCP listener and Proxy Protocol enabled to distribute the load to the application
servers in different AZs.
Q104) Your application currently leverages AWS Auto Scaling to grow and shrink as
load increases/decreases, and it has been performing well. Your marketing team expects a
steady ramp-up in traffic to follow an upcoming campaign that will result in 20x growth in
traffic over 4 weeks. Your forecast for the approximate number of Amazon EC2 instances
necessary to meet peak demand is 175. What should you do to avoid potential service
disruptions during the ramp-up in traffic?
Answer: Check the service limits in Trusted Advisor and adjust as necessary, so that the forecasted
count remains within the limits.
Q105) You have a web application running on six Amazon EC2 instances, consuming about
45% of the resources on each instance. You are using auto-scaling to make sure that six
instances are running at all times. The number of requests this application processes is
consistent and does not experience spikes. The application is critical to your business and
you want high availability at all times. You want the load to be distributed evenly
between all instances. You also want to use the same Amazon Machine Image (AMI) for all
instances. Which architectural choices should you make?
Answer: Deploy 3 EC2 instances in one availability zone and 3 in another availability zone,
and use an Amazon Elastic Load Balancer.
Q106) You are designing an application that contains protected health information.
Security and compliance requirements for your application mandate that all protected
health information in the application uses encryption at rest and in transit. The
application uses a three-tier architecture, where data flows through the load
balancer and is stored on Amazon EBS volumes for processing, and the results are
stored in Amazon S3 using the AWS SDK. Which of the options satisfy the security
requirements?
Answer: Use TCP load balancing on the load balancer, SSL termination on the Amazon EC2
instances, OS-level disk encryption on the Amazon EBS volumes, and Amazon S3 with
server-side encryption; or use SSL termination on the load balancer, an SSL listener on the
Amazon EC2 instances, Amazon EBS encryption on the EBS volumes containing the PHI,
and Amazon S3 with server-side encryption.
Q107) A startup deploys its photo-sharing site in a VPC. An Elastic Load Balancer
distributes web traffic across two subnets. The load balancer session stickiness is
configured to use the AWS-generated session cookie, with a session TTL of 5 minutes. The
web server Auto Scaling group is configured as min-size=4, max-size=4. The
startup is preparing for a public launch, by running load-testing software installed on
a single Amazon Elastic Compute Cloud (EC2) instance running in us-west-2a. After 60
minutes of load testing, the web server logs show the following:
WEBSERVER LOGS | # of HTTP requests from load tester | # of HTTP requests from private beta users
webserver #1 (subnet in us-west-2a): | 19,210 | 434
webserver #2 (subnet in us-west-2a): | 21,790 | 490
webserver #3 (subnet in us-west-2b): | 0 | 410
webserver #4 (subnet in us-west-2b): | 0 | 428
Which recommendation can help ensure that load-testing HTTP requests are evenly
distributed across the four web servers?
Answer: Re-configure the load-testing software to re-resolve DNS for each web request.
Q108) To serve web traffic for a popular product, your chief financial officer and IT
director have purchased 10 m1.large heavy-utilization Reserved Instances (RIs), evenly
spread across two Availability Zones; Route 53 is used to deliver the traffic to an Elastic Load
Balancer (ELB). After several months, the product grows even more popular and you
need additional capacity. As a result, your company purchases two c3.2xlarge medium-utilization
RIs. You register the two c3.2xlarge instances with your ELB and quickly find
that the m1.large instances are at 100% capacity and the c3.2xlarge instances have significant
capacity that's unused. Which option is the most cost effective and uses EC2 capacity
most effectively?
Answer: Use a separate ELB for each instance type and distribute load to the ELBs with a
Route 53 weighted round robin.
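The weighted round-robin answer to Q108 can be modeled locally. A sketch (the ELB names and weights below are invented for illustration; in practice you would set the weights on two Route 53 weighted record sets, one per ELB):

```python
import random

# Sketch: distribute requests between two ELBs in proportion to
# Route 53 weighted-record weights. The weights, chosen to send more
# traffic to the c3.2xlarge-backed ELB, are illustrative only.
WEIGHTS = {"elb-m1-large": 10, "elb-c3-2xlarge": 16}

def pick_elb(weights, rng):
    """Choose an ELB with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in WEIGHTS}
for _ in range(10_000):
    counts[pick_elb(WEIGHTS, rng)] += 1
# Roughly 10/26 of requests land on the m1.large pool, 16/26 on c3.2xlarge.
```

Tuning the weights to each pool's capacity is what lets both instance types run near full utilization.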
Q109) An AWS customer is deploying a web application that is composed of a front-end
running on Amazon EC2 and confidential data that is stored on Amazon S3. The
customer's security policy is that all access operations to this sensitive data must be authenticated
and authorized by a centralized access management system that is operated by a separate
security team. In addition, the web application team that owns and administers the EC2 web
front-end instances is prohibited from having any ability to access the data in a way that circumvents
this centralized access management system. Which configuration will support these
requirements?
Answer: Configure the web application to authenticate end users against the centralized access
management system. Have the web application provision trusted users with STS tokens entitling
the download of approved data directly from Amazon S3.
Q110) An enterprise customer is starting their migration to the cloud. Their main reason for
migrating is agility, and they want to make their internal Microsoft Active Directory
available to the many applications running on AWS, so that internal users only have to
remember one set of credentials and there is a central point of user control for leavers
and joiners. How could they make their Active Directory secure and highly available
with minimal on-premises infrastructure changes, in the most cost- and time-efficient
way?
Answer: Using a VPC, they could create an extension to their data center and make use
of resilient hardware IPsec tunnels; they could then have two domain controller
instances that are joined to the existing domain and reside within different subnets in different
Availability Zones.
Answer:
• Private Cloud
• Public Cloud
• Hybrid cloud
• Community cloud
Answer: SAAS (Software as a Service): It is a software distribution model in which applications are
hosted by a vendor over the internet for the end user, freeing the user from complex software and
hardware management. (Ex: Google Drive, Dropbox)
PAAS (Platform as a Service): It provides a platform and environment to allow developers to build
applications. It frees developers from the complexity of building and maintaining the
infrastructure. (Ex: AWS Elastic Beanstalk, Windows Azure)
IAAS (Infrastructure as a Service): It provides virtualized computing resources over the internet, such
as CPU, memory, switches, routers, firewalls, DNS, and load balancers. (Ex: Azure, AWS)
Answer: Amazon Web Services is a secure cloud services platform offering compute power, database
storage, content delivery and other functionality to help businesses scale and grow.
Availability Zones: An Availability Zone is simply a data center, designed as an independent failure
zone, with high-speed connectivity and low latency.
Edge Locations: Edge locations are an important part of the AWS infrastructure. Edge locations are CDN
endpoints for CloudFront to deliver content to end users with low latency.
Answer:
• AWS Console
• AWS CLI (Command line interface)
• AWS SDK (Software Development Kit)
Amazon Elastic Compute Cloud is a web service that provides resizable compute capacity in the
cloud. AWS EC2 provides scalable computing capacity in the AWS Cloud. These are virtual
servers, also called instances. We can use the instances on a pay-per-use basis.
Benefits:
Answer:
• On-Demand Instances
• Reserved Instances
• Spot Instances
• Dedicated Host
Answer:
• General Purpose
• Compute Optimized
• Memory optimized
• Storage Optimized
• Accelerated Computing (GPU Based)
Q122) What is AMI? What are the types in AMI?
Answer:
An Amazon Machine Image is a special type of virtual appliance that is used to create a virtual machine
within Amazon Elastic Compute Cloud. The AMI defines the initial software that will be on an instance
when it is launched.
Types of AMI:
• Published by AWS
• AWS Marketplace
• Generated from existing instances
• Uploaded virtual server
Answer:
• Public Domain Name System (DNS) name: When you launch an instance, AWS creates a DNS
name that can be used to access the instance.
• Public IP: A launched instance may also have a public IP address. This IP address is assigned
from the addresses reserved by AWS and cannot be specified.
• Elastic IP: An Elastic IP address is an address unique on the internet that you reserve
independently and associate with an Amazon EC2 instance. This IP address persists until the
customer releases it and is not tied to the instance.
Answer: AWS allows you to control traffic in and out of your instance through a virtual firewall called
security groups. Security groups allow you to control traffic based on port, protocol and
source/destination.
Answer: The Retired state is only available for Reserved Instances. Once the Reserved Instance's
reservation term (1 yr/3 yr) ends, it shows the Retired state.
Q126) Scenario: My EC2 instance's IP address changes automatically when the instance is stopped
and started. What is the reason for that, and what is the solution?
Answer: AWS assigns a public IP automatically, but it changes dynamically on stop and start. In
that case we need to assign an Elastic IP to that instance; once assigned, it doesn't change automatically.
Answer: AWS Elastic Beanstalk is the fastest and simplest way to get an application up and running on
AWS. Developers can simply upload their code, and the service automatically handles all the details
such as resource provisioning, load balancing, auto scaling and monitoring.
Answer: Lightsail is designed to be the easiest way to launch and manage a virtual private server with
AWS. Lightsail plans include everything you need to jumpstart your project: a virtual machine, SSD-based
storage, data transfer, DNS management and a static IP.
Answer: Amazon EBS provides persistent block-level storage volumes for use with Amazon EC2
instances. An Amazon EBS volume is automatically replicated within its Availability Zone to protect
against component failure, offering high availability and durability. Amazon EBS volumes are available in a
variety of types that differ in performance characteristics and price.
Answer: Magnetic Volume: Magnetic volumes have the lowest performance characteristics of all
Amazon EBS volume types.
EBS volume size: 1 GB to 1 TB. Average IOPS: 100. Maximum throughput: 40-90 MB/s.
General-Purpose SSD: General-purpose SSD volumes offer cost-effective storage that is ideal for a
broad range of workloads. General-purpose SSD volumes are billed based on the amount of space
provisioned, regardless of how much data you actually store on the volume.
EBS volume size: 1 GB to 16 TB. Maximum IOPS: up to 10,000. Maximum throughput: 160 MB/s.
Provisioned IOPS SSD: Provisioned IOPS SSD volumes are designed to meet the needs of I/O-intensive
workloads, particularly database workloads that are sensitive to storage performance and
consistency in random-access I/O throughput. Provisioned IOPS SSD volumes provide predictable,
high performance.
EBS volume size: 4 GB to 16 TB. Maximum IOPS: up to 20,000. Maximum throughput: 320 MB/s.
Answer: Cold HDD: Cold HDD volumes are designed for less frequently accessed workloads. These
volumes are significantly less expensive than Throughput-Optimized HDD volumes.
EBS volume size: 500 GB to 16 TB. Maximum IOPS: 200. Maximum throughput: 250 MB/s.
Throughput-Optimized HDD: Throughput-optimized HDD volumes are low-cost HDD volumes
designed for frequently accessed, throughput-intensive workloads such as big data and data warehousing.
EBS volume size: 500 GB to 16 TB. Maximum IOPS: 500. Maximum throughput: 500 MB/s.
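The volume-type figures quoted above can be turned into a small selection helper. A sketch (the table simply restates the numbers from this document; real EBS limits change over time, so treat them as illustrative):

```python
# Sketch: pick the first listed EBS volume type (cheapest classes first)
# that satisfies an IOPS and throughput requirement, using the figures
# quoted in the answers above.
EBS_TYPES = [
    # (name, max_iops, max_throughput_mb_s)
    ("Magnetic", 100, 90),
    ("Cold HDD", 200, 250),
    ("Throughput-Optimized HDD", 500, 500),
    ("General-Purpose SSD", 10_000, 160),
    ("Provisioned IOPS SSD", 20_000, 320),
]

def choose_volume(min_iops, min_throughput):
    """Return the first listed type meeting both requirements, or None."""
    for name, iops, throughput in EBS_TYPES:
        if iops >= min_iops and throughput >= min_throughput:
            return name
    return None

# A database needing 15,000 IOPS lands on Provisioned IOPS SSD.
```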
Q132) What are Amazon EBS-optimized instances?
Answer: Amazon EBS-optimized instances ensure that the Amazon EC2 instance is prepared to
take full advantage of the I/O of the Amazon EBS volume. An Amazon EBS-optimized instance uses an
optimized configuration stack and provides additional dedicated capacity for Amazon EBS I/O. When you
select Amazon EBS-optimized for an instance, you pay an additional hourly charge for that instance.
Answer:
• Snapshots back up the data on the EBS volume. Snapshots are incremental backups.
• If this is your first snapshot, it may take some time to create. Snapshots are point-in-time
copies of volumes.
Answer: We cannot attach an EBS volume to multiple instances, but we can attach
multiple EBS volumes to a single instance.
Answer: Hardware-assisted virtualization (HVM): HVM instances are presented with a fully virtualized set of
hardware, and they boot by executing the master boot record of the root block device of the
image. It is the default virtualization.
Paravirtualization (PV): These AMIs boot with a special boot loader called PV-GRUB. The ability of the guest
kernel to communicate directly with the hypervisor results in greater performance levels than other
virtualization approaches, but PV instances cannot take advantage of hardware extensions such as enhanced
networking, GPUs etc. It is a customized virtualization image that can be used only for particular
services.
Answer:
Block Storage: Block storage operates at a lower level, the raw storage device level, and manages data as a
set of numbered, fixed-size blocks.
File Storage: File storage operates at a higher level, the operating system level, and manages data as a
named hierarchy of files and folders.
Q138) What are the things we need to remember while creating an S3 bucket?
Answer:
Answer:
• Amazon S3 Standard
• Amazon S3 Standard-Infrequent Access
• Amazon S3 Reduced Redundancy Storage
• Amazon Glacier
Answer: With Amazon S3 lifecycle configuration rules, you can significantly reduce your storage costs by
automatically transitioning data from one storage class to another, or even automatically deleting data
after a period of time.
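A lifecycle rule like the one described can be expressed as the configuration document S3 expects. A sketch (the rule ID, prefix and day counts are illustrative values; the boto3 call shown in the comment is how such a document could plausibly be applied):

```python
# Sketch: an S3 lifecycle configuration that transitions objects to
# Glacier after 90 days and deletes them after 365 days. The rule ID,
# prefix and day counts are illustrative, not recommendations.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",      # hypothetical rule name
            "Filter": {"Prefix": "logs/"},    # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER"}
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

# With boto3 this could be applied roughly as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=lifecycle_configuration)
```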
Answer: To encrypt Amazon S3 data at rest, you can use several variations of Server-Side Encryption (SSE).
Amazon S3 encrypts your data at the object level as it writes it to disks in its data centers and decrypts
it for you when you access it. All SSE performed by Amazon S3 and AWS Key Management Service
(AWS KMS) uses the 256-bit Advanced Encryption Standard (AES).
Answer: Cross-region replication is a feature that allows you to asynchronously replicate all new objects in
the source bucket in one AWS region to a target bucket in another region. To enable cross-region
replication, versioning must be turned on for both source and destination buckets. Cross-region
replication is commonly used to reduce the latency required to access objects in Amazon S3.
Answer:
Stateful Firewall: A security group is a virtual stateful firewall that controls inbound and outbound
network traffic to AWS resources and Amazon EC2 instances. It operates at the instance level and
supports allow rules only. Return traffic is automatically allowed, regardless of any rules.
Stateless Firewall: A network access control list (ACL) is a virtual stateless firewall at the subnet
level. It supports both allow rules and deny rules. Return traffic must be explicitly allowed by rules.
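The stateful/stateless distinction can be illustrated with a tiny simulation. A sketch (the rule format and helper names are invented for illustration, not an AWS API):

```python
# Sketch: contrast a stateful security group with a stateless network
# ACL. Rules are (direction, port) pairs that are explicitly allowed.

def stateful_allows(rules, direction, port, is_return_traffic=False):
    """Security-group style: return traffic is always allowed."""
    if is_return_traffic:
        return True
    return (direction, port) in rules

def stateless_allows(rules, direction, port, is_return_traffic=False):
    """Network-ACL style: every packet must match a rule, even returns."""
    return (direction, port) in rules

sg_rules = {("inbound", 443)}    # allow inbound HTTPS only
nacl_rules = {("inbound", 443)}  # the same single rule, but on a NACL

# The security group lets the HTTPS response back out automatically...
assert stateful_allows(sg_rules, "outbound", 443, is_return_traffic=True)
# ...but the NACL drops it until an explicit outbound rule is added.
assert not stateless_allows(nacl_rules, "outbound", 443, is_return_traffic=True)
```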
Answer:
NAT instance: A network address translation (NAT) instance is an Amazon Linux Machine Image
(AMI) that is designed to accept traffic from instances within a private subnet, translate the source IP
address to the public IP address of the NAT instance, and forward the traffic to the IGW.
NAT Gateway: A NAT gateway is an Amazon-managed resource that is designed to operate just like
a NAT instance, but it is simpler to manage and highly available within an Availability Zone. It allows
instances within a private subnet to access internet resources through the IGW via the NAT gateway.
Answer: An Amazon VPC peering connection is a networking connection between two Amazon VPCs that
enables instances in either VPC to communicate with each other as if they were within the
same network. You can create an Amazon VPC peering connection between your own Amazon VPCs, or
with an Amazon VPC in another AWS account, within a single region.
Answer: Multi-factor authentication can add an extra layer of security to your infrastructure by
adding a second method of authentication beyond just a password or access key.
Answer:
• User Name/Password
• Access Key
• Access Key/ Session Token
A data warehouse is a central repository for data that can come from one or more sources. Organizations
typically use a data warehouse to compile reports and search the database using highly complex queries.
A data warehouse is also typically updated on a batch schedule multiple times per day or per hour, compared
to an OLTP (Online Transaction Processing) relational database, which can be updated thousands of times
per second.
Answer: Multi-AZ allows you to place a secondary copy of your database in another Availability Zone
for disaster recovery purposes. Multi-AZ deployments are available for all types of Amazon RDS
database engines. When you create a Multi-AZ DB instance, a primary instance is created in one
Availability Zone and a secondary instance is created in another Availability Zone.
Answer: Amazon DynamoDB is a fully managed NoSQL database service that provides fast and
predictable performance with seamless scalability. DynamoDB makes it simple and cost effective to
store and retrieve any amount of data.
Answer: CloudFormation is a service which creates AWS infrastructure using code. It helps
reduce the time needed to manage resources, and lets us create our resources quickly.
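The idea of "infrastructure as code" is easiest to see in a template. A minimal sketch of a CloudFormation template (the AMI ID and key pair name are placeholders, not real values):

```yaml
# Minimal CloudFormation template launching a single EC2 instance.
AWSTemplateFormatVersion: "2010-09-09"
Description: Sketch - one EC2 instance defined as code
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-12345678        # placeholder AMI ID
      InstanceType: t2.micro
      KeyName: my-key-pair         # placeholder key pair name
```

Because the infrastructure is described declaratively, the same template can be deployed repeatedly to create identical stacks.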
Answer:
• Manual Scaling
• Scheduled Scaling
• Dynamic Scaling
Answer: An Auto Scaling group is a collection of Amazon EC2 instances managed by the Auto Scaling
service. Each Auto Scaling group contains configuration options that control when Auto Scaling should
launch new instances or terminate existing instances.
Answer:
Basic Monitoring: Basic monitoring sends data points to Amazon CloudWatch every five minutes for
a limited number of preselected metrics, at no charge.
Detailed Monitoring: Detailed monitoring sends data points to Amazon CloudWatch every minute and
allows data aggregation, for an additional charge.
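The practical difference between the two modes is the data-point frequency. A quick sketch of the arithmetic (the period values simply restate the intervals above):

```python
# Sketch: data points per hour for each CloudWatch monitoring mode.
SECONDS_PER_HOUR = 3600

def datapoints_per_hour(period_seconds):
    return SECONDS_PER_HOUR // period_seconds

basic = datapoints_per_hour(300)    # every 5 minutes -> 12 per hour
detailed = datapoints_per_hour(60)  # every minute    -> 60 per hour
```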
Answer: With CloudFront we deliver content from edge locations, so Route 53 can be used with a
Content Delivery Network: if you are using Amazon CloudFront, you can configure
Route 53 to route internet traffic to those resources. Route 53 supports the following routing policies:
• Simple
• Weighted
• Latency Based
• Failover
• Geolocation
Answer: Amazon ElastiCache is a web service that simplifies the setup and management of
distributed in-memory caching environments.
• Cost Effective
• High Performance
• Scalable Caching Environment
• Using Memcached or Redis Cache Engine
Q159)What is SES, SQS and SNS?
Answer: SES (Simple Email Service): SES is an SMTP server provided by Amazon which is designed to
send bulk mail to customers in a quick and cost-effective manner. SES does not allow you to configure
your own mail server.
SQS (Simple Queue Service): SQS is a fast, reliable, scalable, fully managed message queuing
service. Amazon SQS makes it simple and cost effective. It is a temporary repository for messages
waiting for processing, and acts as a buffer between the producer component and the consumer.
SNS (Simple Notification Service): SNS is a web service that coordinates and manages the delivery or
sending of messages to recipients.
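SQS's role as a buffer between producer and consumer can be imitated locally with Python's standard queue module. This is only an analogy for the buffering behaviour; a real integration would use boto3's send_message/receive_message calls, and the message bodies here are invented:

```python
import queue

# Sketch: a producer enqueues messages faster than the consumer drains
# them; the queue absorbs the burst, just as SQS buffers between
# decoupled components.
buffer = queue.Queue()

def producer(messages):
    for body in messages:
        buffer.put(body)  # analogous to sqs.send_message(...)

def consumer(batch_size):
    received = []
    for _ in range(batch_size):
        if buffer.empty():
            break
        received.append(buffer.get())  # analogous to receive_message(...)
    return received

producer(["order-1", "order-2", "order-3"])
first_batch = consumer(batch_size=2)
```

Because the queue sits between the two components, the consumer can process at its own pace without losing the messages the producer has already sent.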
Answer: Amazon Web Services is a secure cloud services platform, offering compute power, database
storage, content delivery and other functionality to help businesses scale and grow.
Answer: Low price – consume only the amount of compute, storage and other IT resources needed. No
long-term commitment, minimum spend or up-front expenditure is required.
Elastic and scalable – quickly increase and decrease the resources allocated to applications to satisfy
customer demand and control costs. Avoid provisioning resources up front for projects with variable
consumption rates or short lifetimes.
Q162) What is the way to secure data for transfer to the cloud?
Answer:
Answer: Cloud computing can be divided into three main services: Software-as-a-Service (SaaS),
Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS), with SaaS on top, PaaS in the
middle, and IaaS at the bottom.
Answer: Lambda@Edge lets you run Lambda functions to modify content that CloudFront delivers,
executing the functions in AWS locations closer to the viewer. The functions run in response to CloudFront
events, without provisioning or managing servers.
Answer: Cloud computing offers businesses flexibility and scalability when it comes to computing
needs:
Flexibility. Cloud computing allows your employees to be more flexible – both in and out of the
workplace. Employees can access files using web-enabled devices such as smartphones, laptops and
notebooks. In this way, cloud computing enables the use of mobile technology.
Scalability. One of the key benefits of using cloud computing is its scalability. Cloud computing allows your
business to easily upscale or downscale your IT requirements as and when required. For example, most
cloud service providers will allow you to increase your existing resources to accommodate increased
business needs or changes. This will allow you to support your business growth without expensive
changes to your existing IT systems.
IaaS providers include Amazon Web Services, Microsoft Azure and Google Compute Engine.
Users: IT administrators
Answer: PaaS provides cloud platforms and runtime environments to develop, test and manage software.
Answer: In SaaS, cloud providers host and manage the software application on a pay-as-you-go pricing
model.
Answer: An Amazon Machine Image (AMI) defines the programs and settings that will be applied
when you launch an EC2 instance. Once you have finished configuring the data, services, and
applications on your ArcGIS Server instance, you can save your work as a custom AMI stored in
Amazon EC2. You can scale out your site by using this custom AMI to launch additional instances. Use
the following process to create your own AMI using the AWS Management Console:
*Configure an EC2 instance and its attached EBS volumes in the exact way you want them created in
the custom AMI.
Read the message box that appears. To view the AMI status, go to the AMIs page. Here you can
see your AMI being created. It can take a while to create the AMI. Plan for at least 20 minutes, or
longer if you've installed a lot of additional applications or data.
Answer: Amazon CloudFront is a web service that speeds up delivery of your static and dynamic web
content, such as .html, .css, .js, and image files, to your users. CloudFront delivers your content through
a global network of data centers called edge locations.
Answer: Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure,
resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for
developers. Amazon EC2's simple web service interface allows you to obtain and configure capacity
with minimal friction.
Answer: An instance store is a temporary storage type located on disks that are physically attached to
a host machine. … This article will introduce you to the AWS instance store storage type, compare it to
AWS Elastic Block Storage (AWS EBS), and show you how to back up data stored on instance stores
to AWS EBS.
Amazon SQS is a message queue service used by distributed applications to exchange messages through a
polling model, and can be used to decouple sending and receiving components.
Q174) When attached to an Amazon VPC, which two components provide connectivity with
external networks?
Answer:
Answer:
Q177) What is the best approach to securing information for transfer to the cloud?
Answer: Back up data locally. One of the most important considerations when managing
information is to ensure that you have backups of your data.
Answer: AWS Certificate Manager is a service that lets you easily provision, manage,
and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates
for use with AWS services and your internal connected resources. SSL/TLS certificates are
used to secure network communications and establish the identity of websites over the internet, as
well as resources on private networks. AWS Certificate Manager removes the time-consuming manual
process of purchasing, uploading, and renewing SSL/TLS certificates.
Answer: AWS Key Management Service (AWS KMS) is a managed service that makes it simple for
you to create and control the encryption keys used to encrypt your data. … AWS KMS is
also integrated with AWS CloudTrail to provide encryption key usage logs to help meet your
auditing, regulatory and compliance needs.
Answer: Amazon Kinesis Data Firehose is the easiest way to reliably load streaming
data into data stores and analytics tools. … It is a fully managed service
that automatically scales to match the throughput of your data and requires no ongoing
administration.
Answer: Amazon CloudSearch is a scalable cloud-based search service that forms part of
Amazon Web Services (AWS). CloudSearch is typically used to integrate customized search capabilities
into other applications. According to Amazon, developers can set a search application up and
deploy it fully in under 60 minutes.
Q183) Is it feasible for an EC2-Classic instance to become a member of a virtual
private cloud?
Answer: Amazon Virtual Private Cloud (Amazon VPC) enables you to define a virtual network
in your own logically isolated area within the AWS cloud, known as a virtual private
cloud (VPC). You can launch your Amazon EC2 resources, such as instances, into the subnets of
your VPC. Your VPC closely resembles a traditional network that you might operate in your own
data center, with the benefits of using scalable infrastructure from AWS. You can configure your
VPC; you can select its IP address range, create subnets, and configure route tables, network gateways,
and security settings. You can connect instances in your VPC to the internet or to your own data
center.
Answer: VPCs and subnets. A virtual private cloud (VPC) is a virtual network dedicated to your AWS
account. It is logically isolated from other virtual networks in the AWS Cloud. You can launch
your AWS resources, such as Amazon EC2 instances, into your VPC.
Q185) How can one connect a VPC to a corporate data center?
Answer: AWS Direct Connect enables you to securely connect your AWS environment to your
on-premises data center or office location over a standard 1 gigabit or 10 gigabit Ethernet fiber-optic
connection. AWS Direct Connect offers dedicated high-speed, low-latency connections, which bypass
internet service providers in your network path. An AWS Direct Connect location provides access to Amazon Web
Services in the region it is associated with, as well as access to other US regions. AWS Direct Connect
allows you to logically partition the fiber-optic connections into multiple logical connections
called Virtual Local Area Networks (VLANs). You can take advantage of these logical connections to
improve security, separate traffic, and achieve compliance requirements.
Answer: Yes, it can be used for instances with root devices backed by local
instance storage. By using Amazon S3, developers have access to the same extremely
scalable, reliable, fast, low-cost data storage infrastructure that Amazon uses to run
its own global network of websites. In order to execute systems in the Amazon EC2 environment,
developers use the tools provided to load their Amazon Machine Images (AMIs) into Amazon S3 and to
transfer them between Amazon S3 and Amazon EC2. An additional use case might be for websites hosted on
EC2 to load their static content from S3.
Answer: EBS is for mounting directly onto EC2 server instances. S3 is object-oriented
storage that isn't constantly waiting to be accessed (and is consequently cheaper). There is then the
much cheaper AWS Glacier, which is for long-term storage where you don't really expect
to need to access the data, but wouldn't want to lose it.
There are then two principal kinds of EBS – HDD (hard disk drives, i.e. magnetic spinning disks),
which are fairly slow to access, and SSD, which are solid-state drives that are very
fast to access, but more expensive.
Answer: This is one of the commonly asked AWS developer interview questions. This question checks
your essential AWS knowledge, so the answer should be clear. Amazon Web Services
(AWS) is a cloud service platform which offers computing power, analytics, content delivery,
database storage, deployment and various other services to help you in your business
growth. These services are highly scalable, reliable, secure, and inexpensive cloud
computing services which are designed to work together, and applications thus made are
more advanced and scalable.
Simple Storage Service (S3): S3 is the most widely used AWS storage web service.
Simple E-mail Service (SES): SES is a hosted transactional email service and enables one
to smoothly send deliverable messages using a RESTful API call or through ordinary SMTP.
Identity and Access Management (IAM): IAM provides enhanced identity and security management for
your AWS account.
Elastic Compute Cloud (EC2): EC2 is a central piece of the AWS ecosystem. It is responsible for
providing on-demand and flexible computing resources with a "pay as you go" pricing model.
Elastic Block Store (EBS): EBS offers a persistent storage solution that can be seen in
instances as a regular hard drive.
CloudWatch: CloudWatch enables the administrator to view and collect key metrics and
also set a series of alarms to be notified if there is any trouble.
This is among frequently asked AWS developer interview questions. Just read the interviewer's mind
and answer appropriately, either with service names or with the descriptions alongside.
Q190) What do you mean by AMI? What does it include?
Answer: You may come across one or more AMI-related AWS developer questions during your AWS developer
interview. So, prepare yourself with a good knowledge of AMIs.
AMI stands for the term Amazon Machine Image. It's an AWS template which provides the information (an
application server, an operating system, and applications) required to perform the launch of an
instance. The AMI is the copy of the image that runs in the cloud as a virtual server. You
can launch instances from as many different AMIs as you require. An AMI consists of
the following:
Launch permissions to determine which AWS accounts can use the AMI in order to launch
instances
A block device mapping to determine the total volumes that will be attached to the instance at
the time of launch
This is one of the common AWS developer interview questions. If the interviewer is expecting
a detailed answer from you, explain the process of vertical scaling.
Answer: Various types of instances can be launched from one AMI. The type of an instance essentially
determines the hardware components of the host computer that is used for the instance. Each type of
instance has distinct computing and memory capacity.
Once an instance is launched, it acts as a host, and user interaction with it is the same
as with any other computer, but we have completely controlled access to our instances. AWS
developer interviews may contain one or more AMI-based questions, so prepare yourself for the
AMI topic very well.
Amazon S3 vs Amazon EC2:
The full form of S3 is Simple Storage Service; the full form of EC2 is Elastic Compute Cloud.
S3 is a data storage service which is used to store large binary files; EC2 is a
cloud web service which is used to host the applications created.
When you are going for an AWS developer interview, prepare yourself with the concepts of Amazon S3 and
EC2, and the difference between them.
Q194) How many storage options are there for an EC2 instance?
• Amazon EBS
• Amazon EC2 Instance Store
• Amazon S3
• Adding Storage
Amazon EC2 is a basic topic you may come across while going through AWS developer interview
questions. Get a thorough knowledge of the EC2 instance and all the storage options for the EC2
instance.
Q195) What are the security best practices for Amazon EC2 instances?
Answer: There are a number of accepted best practices for securing Amazon EC2 instances that are
applicable whether instances are running in on-premises data centers or on virtual machines. Let's
view some broad best practices:
Least Access: Make sure that your EC2 instance has controlled access to the instance
as well as to the network. Grant access rights only to trusted entities.
Least Privilege: Follow the essential principle of least privilege for instances and users to perform their
functions. Create roles with restricted access for the instances.
Configuration Management: Consider each EC2 instance a configuration item and use AWS configuration
management services to have a baseline for the setup of the instances, as these services
include updated anti-virus software, security features and so on.
Whatever the job role, you may come across security-based AWS interview questions. So,
get prepared with this question to crack the AWS developer interview.
Answer: This is a very simple question but it ranks high among AWS developer interview questions. Answer it directly: the default number of buckets that can be created in each AWS account is 100.
Answer: This is among the frequently asked AWS developer interview questions. In simple terms, a buffer is mainly used to manage load by synchronizing different components, i.e., to make the system fault tolerant. Without a buffer, components do not use any balanced method to receive and process requests. The buffer makes the components work in an orderly way and at the same speed, which results in faster services.
Answer: When stopping an Amazon EC2 instance, a normal shutdown is performed and the instance then changes to the stopped state. During this, all of the Amazon EBS volumes remain attached to the instance, and the instance can be started again at any time. Instance hours are not billed while the instance is in the stopped state.
When terminating an Amazon EC2 instance, a normal shutdown is performed and then the attached Amazon EBS volumes are deleted, unless the volume's deleteOnTermination attribute is set to false. The instance itself is also deleted on termination, so it cannot be started again.
Answer: In an AWS DevOps Engineer interview, this is one of the most common AWS questions for DevOps. To answer it, mention the popular DevOps tools along with the type of tool each one is –
Answer: Roles are for AWS services: with a role we can grant one AWS service permission to use another service.
Policies are for users and groups: we attach policies to assign permissions to users and groups.
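Both roles and policies express their permissions as JSON policy documents. Below is a minimal sketch of one, built as a Python dictionary; the bucket name `example-bucket` is hypothetical and the actions shown are just a read-only S3 example.

```python
import json

# A minimal IAM policy document granting read-only access to one S3 bucket.
# The bucket name "example-bucket" is made up for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

# Serialize to the JSON form IAM expects when the policy is attached.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The same document shape is used whether the policy is attached to a user, a group, or a role; only the principal it is attached to differs.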
Q204) What are the default components we get when we create a custom AWS VPC?
Answer:
• Route Table
• Network ACL
• Security Group
Q205) What is the difference between a public subnet and a private subnet?
Answer: A public subnet has an Internet Gateway attached to its associated route table, while a private subnet does not. As a result, a public subnet has direct internet access and a private subnet does not.
Q206) How do you access an EC2 instance that has only a private IP and sits in a private subnet?
Answer: We can access it using a VPN, if a VPN is configured into the particular VPC containing that subnet. We can also access it through another EC2 instance in the same VPC that has public access (a bastion host).
Q207) We have a custom VPC configured and a MySQL database server in a private subnet, and we need to update the MySQL database server. What are the options to do so?
Answer: Use a NAT Gateway in the VPC, or launch a NAT instance (EC2). Place the NAT Gateway in a public subnet (one whose route table has a route to the IGW) and add a route to it in the route table that is attached to the private subnet.
Q208) What are the differences between Security Groups and Network ACLs?
Answer: Security Groups operate at the instance level and are stateful (return traffic is automatically allowed), whereas Network ACLs operate at the subnet level and are stateless (return traffic must be explicitly allowed). Security Groups support allow rules only, while Network ACLs support both allow and deny rules.
Answer: Amazon Route 53 manages DNS. Route 53 gives you a web interface through which DNS can be managed, and using Route 53 it is possible to direct and fail over traffic. This is achieved by using DNS routing policies.
One such routing policy is the Failover routing policy: we set up a health check to monitor the application endpoints, and if one of the endpoints is not available, Route 53 automatically forwards the traffic to the other endpoint.
ELB automatically scales depending on demand, so sizing the load balancers in advance to handle peak traffic is not required.
Q210) What are the DB engines which can be used in AWS RDS?
Answer:
• MariaDB
• MySQL
• Microsoft SQL Server
• PostgreSQL
• Oracle
Answer: System status checks – These look for problems with the underlying infrastructure that require AWS involvement to resolve. When you see a system status check failure, you can wait for AWS to resolve the issue, or resolve it yourself. Typical causes:
• Network connectivity
• System power
• Software issues at the data center
• Hardware issues
Instance status checks – These look for issues that need our involvement to fix. If an instance status check fails, we can reboot that particular instance. Typical causes:
• Failed system status checks
• Memory full
• Corrupted file system
• Kernel issues
Q212) To establish a peering connection between two VPCs, what condition must be met?
Answer: The CIDR blocks of the two VPCs must not overlap.
• If the instance check state is 0/2, there might be a hardware issue. • If it is 1/2, there might be an issue with the OS.
Workaround: restart the instance; if that still does not work, the logs will help to fix the issue.
Q215) What is the difference between EBS, EFS and S3?
Answer:
• EBS is a block-level storage volume; we can use it only after mounting it on an EC2 instance, and at a time an EBS volume can be mounted on only one instance.
• EFS can be shared among multiple instances at the same time.
• S3 can be accessed without mounting it on an instance.
Answer: 100 buckets can be created by default in an AWS account. To get more buckets, you have to raise a limit increase request with Amazon.
Answer: A maximum of 20 instances can be created by default. We can create 20 reserved instances and request spot instances as per demand.
Answer: EBS provides high-performance block-level storage which can be attached to a running EC2 instance. The storage can be formatted and mounted on the EC2 instance, after which it can be accessed.
Q220) What is the process to mount an EBS volume on an EC2 instance?
Answer:
• fdisk -l (list the disks to identify the new volume, e.g. /dev/xvdf)
• df -k (check the currently mounted filesystems)
• mkfs.ext4 /dev/xvdf (create a filesystem on the volume)
• mkdir /my5gbdata (create a mount point)
• mount /dev/xvdf /my5gbdata (mount the volume)
Answer: With each restart, the volume will get unmounted from the instance. To keep it mounted, add an entry for the volume in /etc/fstab, for example:
/dev/xvdf /my5gbdata ext4 defaults,nofail 0 2
Q222) What is the difference between a Service Role and a SAML Federated Role?
Answer: Service roles are meant for use by AWS services; based on the policies attached to it, a service role has the scope to do its task. Example: in case of automation we can create a service role and attach it to the resource.
Federated roles are meant for user access, granting access to AWS as per the designed role. Example: we can create a federated role for our office employees; correspondingly a group will be created in AD and users will be added to it.
Answer: The root user has access to the entire AWS environment and does not have any policy attached to it, while an IAM user can only do tasks based on the policies attached to it.
Q226) What do you mean by the principle of least privilege in terms of IAM?
Answer: The principle of least privilege means granting a user or role only the minimum permissions required to perform its task, and nothing more.
Answer: When an IAM user is created without any policy attached, the user will not be able to access any AWS service until a policy is attached.
Q228) What is the precedence level between an explicit allow and an explicit deny?
Answer: An explicit deny always overrides an explicit allow.
Q230) What is the difference between Administrator Access and Power User Access in terms of pre-built policies?
Answer: Administrator Access grants full access to AWS resources, while Power User Access grants admin access except for the user/group (IAM) management permissions.
Answer: An identity provider helps in building trust between AWS and the corporate AD environment when we create a federated role.
Answer: It helps in securing the AWS environment, as we need not embed or distribute long-term AWS security credentials in the application. As the credentials are temporary, we need not rotate or revoke them.
Q233) What is the benefit of creating an AWS Organization?
Answer: It helps in managing IAM policies, creating AWS accounts programmatically, and managing payment methods and consolidated billing.
Answer: 5TB
Answer:Yes
Q239) Which service is used to distribute content to end users using a global network of edge locations?
Answer: Amazon CloudFront.
Q243) I have some private servers on my premises, and I have also distributed some of my workload on the public cloud. What is this architecture called?
Answer: Hybrid cloud.
Answer: False
Q245) Is simple workflow service one of the valid Simple Notification Service subscribers?
Answer: No
Q246) which cloud model do Developers and organizations all around the world leverage
extensively?
Q247) Can CloudFront serve content from a non-AWS origin server?
Answer: Yes, CloudFront supports custom (non-AWS) origin servers.
Answer: Yes
Q249) Which AWS service will you use to collect and process e-commerce data for near real-time analysis?
Answer: Amazon Kinesis.
Q250) A high demand of IOPS performance is expected, around 15,000. Which EBS volume type would you recommend?
Answer: Provisioned IOPS SSD (io1).
DevOps
1. What is Source Code Management?
It is a process through which we can store and manage any code. Developers write code, testers write test cases, and DevOps engineers write scripts; all of this can be stored and managed in source code management. Different teams can store code simultaneously, and it saves all changes separately. We can retrieve this code at any point in time.
3. What is Git?
Git is one of the source code management tools where we can store any type of code. Git is the most advanced tool in the market now. We also call Git a version control system because every update is stored as a new version, and at any point in time we can retrieve or go back to any previous version. Every version has a unique number, called a commit ID. Using this commit ID, we can track each change, i.e., who did what and when. For every version Git takes an incremental backup instead of a full backup; that is why Git occupies less space and, as a result, is very fast.
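The commit IDs mentioned above are SHA-1 hashes computed from the stored content, which is why they are unique and why any change produces a new ID. A small sketch of the idea, using the scheme Git applies to blob objects:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the ID Git assigns to a blob object: SHA-1 over a small
    header ("blob <size>\\0") followed by the file content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The same content always hashes to the same ID; any change yields a new one.
print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
print(git_blob_id(b"hello!\n") != git_blob_id(b"hello\n"))  # True
```

You can verify the first value with `echo "hello" | git hash-object --stdin`. Commit objects are hashed the same way, over a header plus the commit's metadata and tree reference.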
3. Local repository: It is the place where Git stores all commits locally. It is a hidden directory, so no one can delete it accidentally. Every commit has a unique commit ID.
4. Central repository: It is the place where Git stores all commits centrally. It belongs to everyone working on your project. GitHub is one of the central repositories, used for storing code and sharing it with others on the team.
security. We can also copy a repository from one account to another; this process is called a "fork". In this repository too we can create branches; the default branch is "master".
commits which are there in your current branch. So picking a particular commit and merging it into your current branch is what we call git cherry-pick.
29. What are the problems that system admins used to face earlier when there were no
configuration management tools?
1. Managing users & groups is a hectic thing (creating, deleting, and editing users and groups)
2. Dealing with packages (installing, upgrading & uninstalling)
3. Taking backups manually on a regular basis
4. Deploying all kinds of applications on servers
5. Configuring services (starting, stopping and restarting services)
These are some problems that system administrators used to face earlier in their manual process of managing the configuration of machines.
in whatever order you mention them in the run-list, because sometimes order is important, especially when we deal with dependent recipes.
74. What are the advantages of Ansible over other configuration management tools?
• Agentless
• Relies on “ssh”
• Uses python
• Push mechanism
section. In the above example, the package task is mentioned in the tasks section and the service task in the handlers section, so that the service is started only after the package is installed.
94. What are the ways through which we can do Continuous Integration?
There are in total three ways through which we can do Continuous Integration:
1. Manually: write code manually, then build manually, test manually by writing test cases, and deploy manually onto the client's machine.
2. Scripts: do the above process by writing scripts so that they perform CI/CD automatically. The complexity here is that writing scripts is not easy.
3. Tool: using tools like Jenkins is very handy. Everything is preconfigured in these types of tools, so there is less manual intervention. This is the most preferred way.
• Jenkins also acts as a cron server replacement, i.e., it can do repeated tasks automatically
• Running some scripts regularly
• E.g.: an automatic daily alarm
• Can create labels (groups of slaves) to restrict where a project has to run
DevOps
DevOps interview questions, from beginner to expert level DevOps professional. These questions cover a wide range of topics any DevOps professional needs to know to nail an interview.
Table of Contents:
DevOps is gaining more popularity day by day. Here are some benefits of implementing
DevOps Practice.
Development Cycle: DevOps shortens the development cycle from initial design to
production.
Full Automation: DevOps helps to achieve full automation from testing, to build, release
and deployment.
Deployment Rollback: In DevOps, we plan for any failure in deployment rollback due to a
bug in code or issue in production. This gives confidence in releasing feature without
worrying about downtime for rollback.
Defect Detection: With DevOps approach, we can catch defects much earlier than
releasing to production. It improves the quality of the software.
Agile is a set of values and principles about how to develop software in a systematic way, whereas DevOps is a way to quickly, easily and repeatably move that software into production infrastructure in a safe and simple way.
The most important aspect of DevOps is to get changes into production as quickly as possible while minimizing risks in software quality assurance and compliance. This is the primary objective of DevOps.
Code is deployed by adopting continuous delivery best practices. This means that checked-in code is built automatically and the artifacts are published to repository servers. On the application servers there are deployment triggers, usually timed using cron jobs. All the artifacts are then downloaded and deployed automatically.
Gradle is an open-source build automation system that builds upon the concepts of Apache Ant and Apache Maven. Instead of an XML configuration file, Gradle uses a proper programming language, Groovy, for its build scripts.
Gradle uses a directed acyclic graph ("DAG") to determine the order in which tasks can be
run.
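The idea of a task DAG can be sketched in a few lines of Python (a toy illustration with made-up task names, not Gradle's actual scheduler): each task lists the tasks it depends on, and a topological sort yields an order in which every dependency runs before the task that needs it.

```python
from graphlib import TopologicalSorter  # Python 3.9+ standard library

# A toy task graph in the spirit of Gradle's DAG: each task maps to the
# set of tasks it depends on. The task names here are illustrative only.
deps = {
    "compile": set(),
    "test": {"compile"},
    "jar": {"compile"},
    "assemble": {"jar"},
    "build": {"test", "assemble"},
}

# static_order() emits tasks so that dependencies always come first.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Because the graph is acyclic, such an order always exists; a cycle (task A depending on B and B on A) would make the build impossible, which is why Gradle rejects cyclic task dependencies.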
Gradle was designed for multi-project builds, which can grow to be quite large. It supports
incremental builds by intelligently determining which parts of the build tree are up to date,
any task dependent only on those parts does not need to be re-executed.
There isn't great support for multi-project builds in Ant and Maven; developers end up doing a lot of coding to support multi-project builds.
Also, having some build-by-convention is nice and makes build scripts more concise. Maven, however, takes build-by-convention too far, and customizing your build process becomes a hack.
Maven also promotes every project publishing an artifact. Maven does not support
subprojects to be built and versioned together.
But with Gradle developers can have the flexibility of Ant and build by convention of
Maven.
Groovy is easier and clean to code than XML. In Gradle, developers can define
dependencies between projects on the local file system without the need to publish
artifacts to repository.
The following is a summary of the major differences between Gradle and Apache Maven:
Flexibility: Google chose Gradle as the official build tool for Android; not because build
scripts are code, but because Gradle is modeled in a way that is extensible in the most
fundamental ways.
Both Gradle and Maven provide convention over configuration. However, Maven provides a
very rigid model that makes customization tedious and sometimes impossible.
While this can make it easier to understand any given Maven build, it also makes it
unsuitable for many automation problems. Gradle, on the other hand, is built with an
empowered and responsible user in mind.
Performance
Both Gradle and Maven employ some form of parallel project building and parallel dependency resolution. The biggest differences are Gradle's mechanisms for work avoidance and incrementality. The following features make Gradle much faster than Maven:
Incrementality: Gradle avoids work by tracking the inputs and outputs of tasks and only running what is necessary.
Build Cache: Reuses the build outputs of any other Gradle build with the same inputs.
Gradle Daemon: A long-lived process that keeps build information "hot" in memory.
User Experience
Maven has very good support for various IDEs. Gradle's IDE support continues to improve quickly but is not as great as Maven's.
Although IDEs are important, a large number of users prefer to execute build operations
through a command-line interface. Gradle provides a modern CLI that has discoverability
features like `gradle tasks`, as well as improved logging and command-line completion.
Dependency Management
Both build systems provide built-in capability to resolve dependencies from configurable
repositories. Both are able to cache dependencies locally and download them in parallel.
As a library consumer, Maven allows one to override a dependency, but only by version.
Gradle provides customizable dependency selection and substitution rules that can be
declared once and handle unwanted dependencies project-wide. This substitution
mechanism enables Gradle to build multiple source projects together to create composite
builds.
Maven has few, built-in dependency scopes, which forces awkward module architectures in
common scenarios like using test fixtures or code generation. There is no separation
between unit and integration tests, for example. Gradle allows custom dependency scopes,
which provides better-modeled and faster builds.
A Gradle build script handles projects and tasks; every Gradle build comprises one or more projects.
The wrapper is a batch script on Windows, and a shell script for other operating systems.
Gradle Wrapper is the preferred way of starting a Gradle build.
When a Gradle build is started via the wrapper, Gradle will automatically download and run
the build.
The build script file is named build.gradle. It is written in the Gradle scripting language (a Groovy DSL).
To make sure that a dependency for your project is added, you need to declare it in a configuration, such as the compile block of the dependencies section of the build.gradle file. Dependency configurations comprise external dependencies, which need to be downloaded from the web. The key configurations are:
1. compile: the dependencies required to compile the production source of the project.
2. runtime: the dependencies required by the production classes at runtime.
3. testCompile: the dependencies required to compile the test source of the project.
4. testRuntime: the dependencies required to run the tests; by default this also includes the runtime dependencies.
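The four configurations above appear in a build.gradle dependencies block roughly as follows. This is a sketch; the artifact versions are illustrative, and newer Gradle versions replace compile/runtime/testCompile/testRuntime with implementation, runtimeOnly, testImplementation and testRuntimeOnly.

```groovy
dependencies {
    compile 'org.apache.commons:commons-lang3:3.12.0'   // needed to compile and run
    runtime 'mysql:mysql-connector-java:8.0.33'         // needed only at runtime
    testCompile 'junit:junit:4.13.2'                    // needed to compile tests
    testRuntime 'org.slf4j:slf4j-simple:1.7.36'         // needed only to run tests
}
```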
Gradle runs on the Java Virtual Machine (JVM) and uses several supporting
libraries that require a non-trivial initialization time.
As a result, it can sometimes seem a little slow to start. The solution to this
problem is the Gradle Daemon: a long-lived background process that
executes your builds much more quickly than would otherwise be the case.
Software projects rarely work in isolation. In most cases, a project relies on reusable
functionality in the form of libraries or is broken up into individual components to compose a
modularized system.
1. It has good UX
2. It is very powerful
3. It is aware of the resource
4. It is well integrated with the Gradle Build scans
5. It is enabled by default
Multi-project builds help with modularization. They allow a person to concentrate on one area of work in a larger project, while Gradle takes care of dependencies from other parts of the project.
A multi-project build in Gradle consists of one root project, and one or more subprojects
that may also have subprojects.
While each subproject could configure itself in complete isolation of the other subprojects, it
is common that subprojects share common traits.
A Gradle build is made up of one or more projects, and a project represents what is being done with Gradle.
The Gradle build life cycle consists of the following three phases:
-Initialization phase: In this phase the project layout is determined and project objects are created
-Configuration phase: In this phase the build scripts are evaluated, all the tasks available for the current build are configured, and a task dependency graph is created
-Execution phase: In this phase the selected tasks are executed
The Java plugin adds Java compilation along with testing and bundling capabilities to the project. It introduces the notion of a SourceSet, which acts as a group of source files compiled and executed together.
Compile:
The dependencies required to compile the production source of the project.
Runtime:
The dependencies required by the production classes at runtime.
Test Compile:
The dependencies required to compile the test source of the project.
Test Runtime:
The dependencies required to run the tests; by default this includes the runtime dependencies.
Question: What is Groovy?
It is both a static and dynamic language with features similar to those of Python, Ruby, Perl,
and Smalltalk.
It can be used as both a programming language and a scripting language for the Java
Platform, is compiled to Java virtual machine (JVM) bytecode, and interoperates
seamlessly with other Java code and libraries.
Groovy uses a curly-bracket syntax similar to Java. Groovy supports closures, multiline
strings, and expressions embedded in strings.
And much of Groovy's power lies in its ASTtransformations, triggered through annotations.
Groovy is documented rather badly. The core documentation of Groovy is limited, and there is no information regarding the complex and run-time errors that happen. Developers are largely on their own and normally have to figure out the explanations about internal workings by themselves.
Groovy adds the execute method to String to make executing shells fairly easy
println "ls".execute().text
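For comparison, the closest equivalent of that one-liner in Python is the standard subprocess module (shown here with `echo` rather than `ls` so the output is predictable):

```python
import subprocess

# Groovy's "ls".execute().text runs a command and captures its output;
# in Python, subprocess.run with capture_output does the same job.
result = subprocess.run(["echo", "hello"], capture_output=True, text=True)
print(result.stdout)
```

Groovy's version is shorter because it adds the execute() method directly onto String, an example of how Groovy extends the JDK classes.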
-Application Servers
-Servlet Containers
It is possible, but in this case the features are limited. Groovy cannot be made to handle all tasks in the manner it has to.
Groovy can perform optimally in every situation. There are many Java-based components in Groovy, which makes it even easier to work with Java applications.
A closure in Groovy is an open, anonymous, block of code that can take arguments, return
a value and be assigned to a variable. A closure may reference variables declared in its
surrounding scope. In opposition to the formal definition of a closure, Closure in the
Groovy language can also contain free variables which are defined outside of its
surrounding scope.
When a parameter list is specified, the -> character is required and serves to separate the
arguments from the closure body. The statements portion consists of 0, 1, or many Groovy
statements.
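The defining property, capturing variables from the surrounding scope, is common to closures in many languages. As an illustration of the concept (in Python rather than Groovy syntax), here is a closure that keeps a captured counter alive between calls:

```python
def make_counter(start=0):
    count = start  # free variable captured by the closure

    def increment(step=1):
        # "nonlocal" lets the closure rebind the captured variable,
        # much as a Groovy closure can modify variables from its scope.
        nonlocal count
        count += step
        return count

    return increment

counter = make_counter(10)
print(counter())   # 11
print(counter(5))  # 16
```

Each call to make_counter produces an independent closure with its own captured count, so two counters never interfere with each other.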
Through this class programmers can add properties, constructors, methods and operations to a task. It is a powerful option available in Groovy.
By default this class does not handle inheritance; to enable that, users need to call "ExpandoMetaClass.enableGlobally()" explicitly.
It might take you some time to get used to its unusual syntax and optional typing.
Its documentation is thin.
class Test {
    static void main(String[] args) {
        println 'Hello World'
    }
}
Groovy tries to be as natural as possible for Java developers. Here are all the major
differences between Java and Groovy.
-Default imports
In Groovy all these packages and classes are imported by default, i.e. Developers do not
have to use an explicit import statement to use them:
java.io.*
java.lang.*
java.math.BigDecimal
java.math.BigInteger
java.net.*
java.util.*
groovy.lang.*
groovy.util.*
-Multi-methods
In Groovy, the methods which will be invoked are chosen at runtime. This is called runtime
dispatch or multi-methods. It means that the method will be chosen based on the types of
the arguments at runtime. In Java, this is the opposite: methods are chosen at compile
time, based on the declared types.
-Array initializers
In Groovy, the { … } block is reserved for closures. That means that you cannot create
array literals with this syntax:
int[] arraySyntex = { 6, 3, 1}
-ARM blocks
ARM (Automatic Resource Management) block from Java 7 are not supported in Groovy.
Instead, Groovy provides various methods relying on closures, which have the same effect
while being more idiomatic.
-GStrings
As double-quoted string literals are interpreted as GString values, Groovy may fail with
compile error or produce subtly different code if a class with String literal containing a
dollar character is compiled with Groovy and Java compiler.
While typically, Groovy will auto-cast between GString and String if an API declares
the type of a parameter, beware of Java APIs that accept an Object parameter and then
check the actual type.
assert 'c'.getClass()==String
assert "c".getClass()==String
assert "c${1}".getClass() in GString
Groovy will automatically cast a single-character String to char only when assigning to
a variable of type char . When calling methods with arguments of type char we need to
either cast explicitly or make sure the value has been cast in advance.
char a='a'
assert Character.digit(a, 16)==10 : 'But Groovy does boxing'
assert Character.digit((char) 'a', 16)==10
try {
assert Character.digit('a', 16)==10
assert false: 'Need explicit cast'
} catch(MissingMethodException e) {
}
Groovy supports two styles of casting and in the case of casting to char there are subtle
differences when casting a multi-char strings. The Groovy style cast is more lenient and will
take the first character, while the C-style cast will fail with exception.
-Behaviour of ==
In Java == means equality of primitive types or identity for objects. In
Groovy == translates to a.compareTo(b)==0 , if they are Comparable ,
and a.equals(b) otherwise. To check for identity, there is is . E.g. a.is(b) .
The Groovy programming language comes with great support for writing tests. In addition to the language features, it offers test integration with state-of-the-art testing libraries and frameworks.
The Groovy ecosystem has borne a rich set of testing libraries and frameworks.
JUnit integrations
Groovy also has excellent built-in support for a range of mocking and stubbing alternatives.
When using Java, dynamic mocking frameworks are very popular.
A key reason for this is that it is hard work creating custom hand-crafted mocks using Java.
Such frameworks can be used easily with Groovy.
Writing tests means formulating assumptions by using assertions. In Java this can be done with the assert keyword, but Groovy comes with a more powerful variant of assert, also known as the power assertion statement.
Groovy's power assert differs from the Java version in its output when the boolean expression evaluates to false:
def x = 1
assert x == 2
// Output:
//
// Assertion failed:
// assert x == 2
// | |
// 1 false
def x = [1,2,3,4,5]
assert (x << 6) == [6,7,8,9,10]
// Output:
//
// Assertion failed:
// assert (x << 6) == [6,7,8,9,10]
// | | |
// | | false
// | [1, 2, 3, 4, 5, 6]
// [1, 2, 3, 4, 5, 6]
Design patterns can also be used with Groovy. Here are the important points:
-Some patterns carry over directly (and can make use of normal Groovy syntax improvements for greater readability)
-Some patterns are no longer required because they are built right into the language or because Groovy supports a better way of achieving the intent of the pattern
-Some patterns that have to be expressed at the design level in other languages can be implemented directly in Groovy (due to the way Groovy can blur the distinction between design and implementation)
Groovy comes with integrated support for converting between Groovy objects and JSON.
The classes dedicated to JSON serialisation and parsing are found in
the groovy.json package.
JsonSlurper is a class that parses JSON text or reader content into Groovy data
structures (objects) such as maps, lists and primitive types
like Integer , Double , Boolean and String .
The class comes with a bunch of overloaded parse methods plus some special methods
such as parseText , parseFile and others
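The maps, lists and primitives that JsonSlurper produces correspond closely to what json.loads returns in Python; for comparison, the same parse-to-native-structures idea looks like this:

```python
import json

# JsonSlurper.parseText in Groovy and json.loads in Python both turn
# JSON text into native data structures (maps/dicts, lists, primitives).
text = '{"name": "Groovy", "versions": [3, 4], "dynamic": true}'
data = json.loads(text)

print(data["name"])      # Groovy
print(data["versions"])  # [3, 4]
print(data["dynamic"])   # True
```

In both languages the parsed result can then be navigated with ordinary indexing and iteration, with no schema or class definitions required.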
XmlParser and XmlSlurper are used for parsing XML with Groovy. Both have the same approach to parsing an XML document.
Both come with a bunch of overloaded parse methods plus some special methods such as parseText, parseFile and others.
XmlSlurper
XmlParser
def text = '''
<list>
<technology>
<name>Groovy</name>
</technology>
</list>
'''
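Parsing the sample document above with XmlParser or XmlSlurper yields a node tree you can navigate by tag name (e.g. list.technology.name). For comparison, the same document parsed with Python's standard library:

```python
import xml.etree.ElementTree as ET

# The same sample XML as in the Groovy example above.
text = '''
<list>
  <technology>
    <name>Groovy</name>
  </technology>
</list>
'''

root = ET.fromstring(text)
# Navigate by tag path, much like list.technology.name in Groovy.
name = root.find("technology/name").text
print(name)  # Groovy
```

As with the Groovy parsers, the result is a navigable tree rather than raw text, so elements can be found, read and modified programmatically.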
Question: What is Maven?
Maven is a build automation tool used primarily for Java projects. Maven addresses two
aspects of building software:
Unlike earlier tools like Apache Ant, it uses conventions for the build procedure, and only
exceptions need to be written down.
An XML file describes the software project being built, its dependencies on other external
modules and components, the build order, directories, and required plug-ins.
It comes with pre-defined targets for performing certain well-defined tasks such as
compilation of code and its packaging.
Maven dynamically downloads Java libraries and Maven plug-ins from one or more
repositories such as the Maven 2 Central Repository, and stores them in a local cache.
This local cache of downloaded artifacts can also be updated with artifacts created by local
projects. Public repositories can also be updated.
Question: What Are The Benefits Of Maven?
One of the biggest benefits of Maven is that its design regards all projects as having a certain structure and a set of supported task work-flows.
Maven offers quick project setup: no complicated build.xml files, just a POM and go
All developers in a project use the same jar dependencies due to the centralized POM.
With Maven you get a number of reports and metrics for a project "for free"
It reduces the size of source distributions, because jars can be pulled from a central location
Maven lets developers get package dependencies easily
With Maven there is no need to add jar files manually to the class path
Build lifecycle is a list of named phases that can be used to give order to goal execution.
One of Maven's standard life cycles is the default lifecycle, which includes the following
phases, in this order
1 validate
2 generate-sources
3 process-sources
4 generate-resources
5 process-resources
6 compile
7 process-test-sources
8 process-test-resources
9 test-compile
10 test
11 package
12 install
13 deploy
Build tools are programs that automate the creation of executable applications from source
code. Building incorporates compiling, linking and packaging the code into a usable or
executable form.
In small projects, developers will often manually invoke the build process. This is not practical for larger projects, where it is very hard to keep track of what needs to be built, in what sequence, and what dependencies there are in the building process. Using an automation tool like Maven, Gradle or Ant allows the build process to be more consistent.
Question: What Is The Dependency Management Mechanism In
Maven?
For example, if a project needs the Hibernate library, it simply declares Hibernate's
project coordinates in its POM.
Maven will automatically download the dependency and the dependencies that Hibernate
itself needs and store them in the user's local repository.
The Maven 2 Central Repository is used by default to search for libraries, but developers can
configure custom repositories to be used (e.g., company-private repositories) within the
POM.
The Central Repository Search Engine, can be used to find out coordinates for different
open-source libraries and frameworks.
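As a minimal sketch of such a declaration (the group/artifact coordinates are real Hibernate coordinates; the version shown is only an illustrative choice), a dependency section of a pom.xml could look like this:

```xml
<!-- Illustrative fragment of a pom.xml; the version number is an assumption -->
<dependencies>
  <dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-core</artifactId>
    <version>5.6.15.Final</version>
  </dependency>
</dependencies>
```

With this in place, Maven resolves hibernate-core and its transitive dependencies into the local repository automatically.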
Most of Maven's functionality is in plugins. A plugin provides a set of goals that can be
executed using the following syntax:
mvn [plugin-name]:[goal-name]
For example, a Java project can be compiled with the compiler-plugin's compile-goal by
running mvn compiler:compile . There are Maven plugins for building, testing, source
control management, running a web server, generating Eclipse project files, and much
more. Plugins are introduced and configured in a <plugins>-section of a pom.xml file.
Some basic plugins are included in every project by default, and they have sensible default
settings.
Ant vs Maven:
Ant doesn't have formal conventions; Maven has a convention for placing source code, compiled code, etc.
Ant scripts are not reusable; Maven plugins are reusable.
A Project Object Model (POM) provides all the configuration for a single project. General
configuration covers the project's name, its owner and its dependencies on other projects.
One can also configure individual phases of the build process, which are implemented
as plugins.
For example, one can configure the compiler-plugin to use Java version 1.5 for compilation,
or specify packaging the project even if some unit tests fail.
Larger projects should be divided into several modules, or sub-projects, each with its own
POM. One can then write a root POM through which one can compile all the modules with a
single command. POMs can also inherit configuration from other POMs. All POMs inherit
from the Super POM by default. The Super POM provides default configuration, such as
default source directories, default plugins, and so on.
In Maven, an artifact is simply a file or JAR that is deployed to a Maven repository. An artifact
has:
-Group ID
-Artifact ID
-Version string
The three together uniquely identify the artifact. All the project dependencies are specified
as artifacts.
In Maven a goal represents a specific task which contributes to the building and managing
of a project.
It may be bound to one or many build phases. A goal not bound to any build phase can be
executed outside of the build lifecycle by invoking it directly.
In Maven a build profile is a set of configurations. This set is used to define or override the
default behaviour of a Maven build.
Build profiles help developers customize the build process for different environments.
For example, you can set up profiles for Test, UAT, Pre-prod and Prod environments, each
with its own configuration.
There are six main build phases: validate, compile, test, package, install, deploy.
Target: this folder holds the compiled units of code produced by the build process.
Source: this folder usually holds the Java source code.
Test: this directory contains all the unit-testing code.
Compile: compiles the source code of the project.
Install: installs the package into the local repository, for use as a dependency in other
projects locally.
Question: What is Linux?
Linux is the best-known and most-used open source operating system. As an operating
system, Linux is software that sits underneath all of the other software on a computer,
receiving requests from those programs and relaying them to the computer's
hardware.
In many ways, Linux is similar to other operating systems such as Windows, OS X, or iOS.
But Linux also differs from other operating systems in many important ways.
First, and perhaps most importantly, Linux is open source software. The code used to
create Linux is free and available to the public to view, edit, and—for users with the
appropriate skills—to contribute to.
Kernel: Linux is a monolithic kernel that is free and open source software that is
responsible for managing hardware resources for the users.
System Library: System libraries play a vital role because application programs
access the kernel's features through them.
System Utility: System Utility performs specific and individual level tasks.
Question: What Is Difference Between Linux & Unix?
Unix and Linux are similar in many ways, and in fact, Linux was originally created to be
similar to Unix.
Both have similar tools for interfacing with the systems, programming tools, filesystem
layouts, and other key components.
However, Unix is not free. Over the years, a number of different operating systems have
been created that attempted to be “unix-like” or “unix-compatible,” but Linux has been the
most successful, far surpassing its predecessors in popularity.
BASH stands for Bourne Again Shell. BASH is the default UNIX shell for the GNU operating
system. It is the command language interpreter that lets you enter input and retrieve
information.
In straightforward language, BASH is a program that understands the data entered by the
user, executes commands, and gives output.
The crontab (short for "cron table") is a list of commands that are scheduled to run at
regular time intervals on a computer system. The crontab command opens the crontab for
editing, and lets you add, remove, or modify scheduled tasks.
The daemon which reads the crontab and executes the commands at the right time is
called cron, named after chronos, the Greek word for time.
Command syntax
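A crontab entry is five time fields followed by the command to run. A sketch (the script paths below are hypothetical examples, not from the source):

```shell
# m    h    dom  mon  dow  command
# minute (0-59), hour (0-23), day of month (1-31), month (1-12), day of week (0-6, Sunday=0)
30 2 * * *      /home/user/backup.sh            # run a backup at 02:30 every day
*/10 * * * *    /usr/bin/uptime >> /tmp/up.log  # append uptime every 10 minutes
```

Running `crontab -e` opens the current user's table for editing; `crontab -l` lists it.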
A daemon is a type of program on Linux operating systems that runs unobtrusively in the
background, rather than under the direct control of a user, waiting to be activated by the
occurrence of a specific event or condition.
Unix-like systems typically run numerous daemons, mainly to accommodate requests for
services from other computers on a network, but also to respond to other programs and to
hardware activity.
Examples of actions or conditions that can trigger daemons into activity are a specific time
or date, passage of a specified time interval, a file landing in a particular directory, receipt of
an e-mail or a Web request made through a particular communication line.
It is not necessary that the originator of the action or condition be aware that a daemon
is listening, although programs frequently will perform an action only because they know
that they will implicitly arouse a daemon.
Processes are managed by the kernel (i.e., the core of the operating system), which
assigns each a unique process identification number (PID).
-Batch: Batch processes are submitted from a queue of processes and are not associated
with the command line; they are well suited for performing recurring tasks when system
usage is otherwise low.
-Daemon: Daemons are recognized by the system as any processes whose parent
process has a PID of one.
CLI (Command Line Interface) is a type of human-computer interface that relies solely on
textual input and output.
That is, the entire display screen, or the currently active portion of it, shows
only characters (and no images), and input is usually performed entirely with a keyboard.
A kernel is the lowest level of easily replaceable software that interfaces with the hardware
in your computer.
It is responsible for interfacing all of your applications that are running in "user mode" down
to the physical hardware, and for allowing processes, known as servers, to get information
from each other using inter-process communication (IPC).
Microkernel:A microkernel takes the approach of only managing what it has to: CPU,
memory, and IPC. Pretty much everything else in a computer can be seen as an accessory
and can be handled in user mode.
Hybrid Kernel: Hybrid kernels can pick and choose what they want to run in
user mode and what they want to run in supervisor mode. Because the Linux kernel is
monolithic, it has the largest footprint and the most complexity of these kernel types.
This was a design choice that was under quite a bit of debate in the early days
of Linux, and Linux still carries some of the design flaws that are inherent to
monolithic kernels.
Partial backup refers to selecting only a portion of file hierarchy or a single partition to back
up.
The root account is the system administrator account. It provides you full access to and
control of the system.
The admin can create and maintain user accounts, assign different permissions for each
account, etc.
One of the main differences between cron and anacron jobs is that cron works on
systems that are running continuously, while anacron is used for systems that are not
running continuously.
1. Cron jobs can run as often as every minute, but anacron jobs can be run only once a
day.
2. Any normal user can schedule cron jobs, but anacron jobs can be scheduled only by
the superuser.
3. Cron should be used when a job needs to execute at a specific given time, while
anacron should be used when there is no restriction on the timing and the job can be
executed at any time.
4. As for which one is ideal where: cron should be used for servers, while anacron
should be used for desktops or laptops.
Linux Loader (LILO) is a boot loader for the Linux operating system. It loads Linux into main
memory so that it can begin its operations.
Swap space is disk space that Linux uses as an extension of physical memory, to hold
pages of concurrently running programs temporarily.
This situation usually occurs when RAM does not have enough capacity to support all
concurrently running programs.
This memory management involves swapping pages of memory between RAM and physical
storage.
There are several hundred Linux distributions. Let us see some of the important ones:
Ubuntu: It is a well-known Linux distribution with a lot of pre-installed apps and
easy-to-use repositories. It is very easy to use and feels similar to macOS.
Linux Mint: It uses the Cinnamon and MATE desktops. It is a good choice for
newcomers, especially those coming from Windows.
Debian: It is among the most stable and user-friendly Linux distributions.
Fedora: It is less stable but provides the latest versions of software. It has the
GNOME 3 desktop environment by default.
Red Hat Enterprise Linux: It is used commercially and is well tested before
release. It usually provides a stable platform for a long time.
Arch Linux: Every package has to be installed by you, and it is not suitable for
beginners.
Read: The user can read the file and list the directory.
Write: The user can write new files in the directory.
Execute: The user can access and run the file in a directory.
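A small sketch of these permission bits in action (the file name and temporary directory are made up; assumes a POSIX shell):

```shell
# Sketch: set read/write/execute bits with chmod and inspect them with ls
set -e
workdir=$(mktemp -d)          # throwaway directory for the demo
cd "$workdir"
echo 'echo hello' > run.sh
chmod 754 run.sh              # owner rwx (7), group r-x (5), others r-- (4)
ls -l run.sh                  # mode column begins with -rwxr-xr--
```

The three digits of the octal mode map directly to the owner, group, and others permission triplets.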
Free command: This is the simplest command to check memory usage. For example:
'$ free -m'; the option '-m' displays all the data in MBs.
/proc/meminfo: The next way to determine memory usage is to read the
/proc/meminfo file. For example: '$ cat /proc/meminfo'
Vmstat: This command lays out the memory usage statistics. For example:
'$ vmstat -s'
Top command: This command shows total memory usage and also
monitors RAM usage in real time.
Htop: This command also displays memory usage along with other details.
pwd: It is a built-in command which stands for ‘print working directory’. It displays
the current working location, working path starting with / and directory of the user.
Basically, it displays the full path to the directory you are currently in.
ls: This command lists all the files in the given directory.
cd: This stands for 'change directory'. This command is used to change from the
present directory to the directory you want to work in. We just need to type cd
followed by the directory name to access that particular directory.
mkdir: This command is used to create an entirely new directory.
rmdir: This command is used to remove a directory from the system.
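The commands above can be strung together as a short sketch (the directory names are invented for illustration):

```shell
# Sketch: pwd, cd, mkdir and rmdir in sequence inside a throwaway directory
set -e
base=$(mktemp -d)
cd "$base"
mkdir demo          # create a new directory
cd demo             # move into it
pwd                 # prints the full path, ending in /demo
cd ..               # back to the parent
rmdir demo          # remove the (now empty) directory
```

Note that rmdir only removes empty directories; a non-empty one needs rm -r.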
The shell reads this file and carries out the commands as though they have been entered
directly on the command line.
The shell is somewhat unique, in that it is both a powerful command line interface to the
system and a scripting language interpreter.
As we will see, most of the things that can be done on the command line can be done in
scripts, and most of the things that can be done in scripts can be done on the command
line.
We have covered many shell features, but we have focused on those features most often
used directly on the command line.
The shell also provides a set of features usually (but not always) used when writing
programs.
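A tiny illustration of that point: the same construct works typed at the prompt or saved in a script file (the variable and loop values here are invented):

```shell
# Sketch: a loop usable both interactively and inside a shell script
result=""
for word in one two three; do
  result="$result$word-"    # accumulate each item with a separator
done
echo "$result"              # prints: one-two-three-
```

Saved to a file and run with sh, this behaves exactly as it does when typed line by line at the prompt.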
vmstat
netstat
iostat
ifstat
mpstat.
These are used for reporting statistics from different system components such as virtual
memory, network connections and interfaces, CPU, input/output devices and more.
Dstat comes with extra features and counters and is highly extensible; users with Python
knowledge can build their own plugins.
Features of dstat:
1. Joins information from vmstat, netstat, iostat, ifstat and mpstat tools
2. Displays statistics simultaneously
3. Orders counters and highly-extensible
4. Supports summarizing of grouped block/network devices
5. Displays interrupts per device
6. Works on accurate timeframes, no timeshifts when a system is stressed
7. Supports colored output, it indicates different units in different colors
8. Shows exact units and limits conversion mistakes as much as possible
9. Supports exporting of CSV output to Gnumeric and Excel documents
The child process will have the same environment as its parent, but only the process ID
number is different.
There are two conventional ways used for creating a new process in Linux:
Using the system() function – this method is relatively simple; however, it's
inefficient and carries certain significant security risks.
Using the fork() and exec() functions – this technique is a little more advanced but offers
greater flexibility, speed, and security.
Because Linux is a multi-user system, meaning different users can be running various
programs on the system, each running instance of a program must be identified uniquely
by the kernel.
A program is identified by its process ID (PID) as well as its parent process's ID
(PPID), therefore processes can further be categorized into:
Parent processes – these are processes that create other processes during run-
time.
Child processes – these processes are created by other processes during run-time.
The init process is the mother (parent) of all processes on the system; it's the first program
that is executed when the Linux system boots up, and it manages all other processes on the
system. It is started by the kernel itself, so in principle it does not have a parent process.
The init process always has process ID of 1. It functions as an adoptive parent for all
orphaned processes.
# pidof systemd
# pidof top
# pidof httpd
To find the process ID and parent process ID of the current shell, run:
$ echo $$
$ echo $PPID
During execution, a process changes from one state to another depending on its
environment/circumstances. In Linux, a process has the following possible states:
Running – here it’s either running (it is the current process in the system) or it’s
ready to run (it’s waiting to be assigned to one of the CPUs).
Waiting – in this state, a process is waiting for an event to occur or for a system
resource. Additionally, the kernel also differentiates between two types of waiting
processes; interruptible waiting processes – can be interrupted by signals and
uninterruptible waiting processes – are waiting directly on hardware conditions and
cannot be interrupted by any event/signal.
Stopped – in this state, a process has been stopped, usually by receiving a signal.
For instance, a process that is being debugged.
Zombie – here, a process is dead; it has been halted, but it still has an entry in the
process table.
1. ps Command
It displays information about a selection of the active processes on the system as shown
below:
# ps
# ps -e | head
top is a powerful tool that offers you a dynamic real-time view of a running system as shown
in the screenshot below:
#top
#glances
The fundamental way of controlling processes in Linux is by sending signals to them. There
are multiple signals that you can send to a process, to view all the signals run:
$ kill -l
Most signals are for internal use by the system, or for programmers when they write
code. The following are signals which are useful to a system user:
However, a system user with root privileges can influence this with
the nice and renice commands.
From the output of the top command, the NI shows the process nice value:
$ top
$ renice +8 2687
$ renice +8 2103
Git is a version control system for tracking changes in computer files and coordinating work
on those files among multiple people.
It is primarily used for source code management in software development but it can be
used to keep track of changes in any set of files.
As a distributed revision control system it is aimed at speed, data integrity, and support for
distributed, non-linear workflows.
By far, the most widely used modern version control system in the world today is Git. Git is
a mature, actively maintained open source project originally developed in 2005 by Linus
Torvalds. Git is an example of a Distributed Version Control System: in Git, every
developer's working copy of the code is also a repository that can contain the full history of
all changes.
Ease of use
Data redundancy and replication
High availability
Superior disk utilization and network performance
Only one .git directory per repository
Collaboration friendly
Any kind of projects from large to small scale can use GIT
The purpose of Git is to manage a project, or a set of files, as they change over time. Git
stores this information in a data structure called a repository. A git repository contains,
among other things, the following:
Files: a list of files that represent the state of the project at a specific point in time.
References: references to commit objects (e.g. heads and tags).
The Git repository is stored in the same directory as the project itself, in a subdirectory
called .git. Note differences from central-repository systems like CVS or Subversion:
There is only one .git directory, in the root directory of the project.
The repository is stored in files alongside the project. There is no central server
repository.
Staging is a step before the commit process in git. That is, a commit in git is performed in
two steps:
-Staging and
-Actual commit
As long as a change set is in the staging area, git allows you to edit it as you like
(replace staged files with other versions of staged files, remove changes from staging, etc.)
Often, when you’ve been working on part of your project, things are in a messy state and
you want to switch branches for a bit to work on something else.
The problem is, you don’t want to do a commit of half-done work just so you can get back to
this point later. The answer to this issue is the git stash command. Stashing takes the
dirty state of your working directory — that is, your modified tracked files and staged
changes — and saves it on a stack of unfinished changes that you can reapply at any time.
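The stash round-trip can be sketched in a throwaway repository (assumes git is installed; paths, names and file contents below are invented):

```shell
# Sketch: stash dirty changes, get a clean tree, then reapply them
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > notes.txt
git add notes.txt
git commit -qm "initial"
echo half-done >> notes.txt   # messy, uncommitted work
git stash push -q             # working tree is clean again: notes.txt == "base"
git stash pop -q              # the unfinished change comes back
```

After the pop, the stash entry is removed from the stack and notes.txt again contains the half-done line.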
Given one or more existing commits, revert the changes that the related patches introduce,
and record some new commits that record them. This requires your working tree to be
clean (no modifications from the HEAD commit).
Use the git remote rm command to remove a remote URL from your repository.
The git remote rm command takes one argument: the name of the remote, for example
origin.
In case we do not need a specific stash, we use the git stash drop command to remove it
from the list of stashes.
To remove a specific stash, we pass its name as the argument, as in
git stash drop <stash_name> (for example, git stash drop stash@{1}).
GIT fetch – It downloads only the new data from the remote repository and does not
integrate any of the downloaded data into your working files. Providing a view of the data is
all it does.
GIT pull – It downloads as well as merges the data from the remote repository into the local
working files.
This may also lead to merge conflicts if the user's local changes are not yet committed.
The "git stash" command can be used to set those local changes aside first.
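The fetch/pull difference can be sketched with two local directories standing in for the remote and local repositories (assumes git is installed; all names are invented):

```shell
# Sketch: fetch downloads without touching working files; merging is what pull adds
set -e
src=$(mktemp -d)
dst=$(mktemp -d)
git init -q "$src"
cd "$src"
git config user.email demo@example.com
git config user.name demo
echo one > notes.txt
git add notes.txt
git commit -qm "first"
git clone -q "$src" "$dst/work"     # "local" copy of the "remote"
echo two >> notes.txt
git commit -qam "second"            # new commit appears upstream
cd "$dst/work"
git fetch -q                        # new data downloaded; notes.txt still has only "one"
git merge -q FETCH_HEAD             # this merge step is what git pull performs after fetching
```

After the merge, notes.txt in the clone contains both lines, matching the upstream state.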
A fork is a copy of a repository. Forking a repository allows you to freely experiment with
changes without affecting the original project.
A fork is really a Github (not Git) construct to store a clone of the repo in your user account.
As a clone, it will contain all the branches in the main repo at the time you made the fork.
Cherry picking in git means to choose a commit from one branch and apply it onto another.
This is in contrast with other ways such as merge and rebase, which normally apply many
commits onto another branch.
Make sure you are on the branch you want to apply the commit to: git checkout master
Then execute: git cherry-pick <commit-hash>
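As a sketch in a throwaway repository (assumes git is installed; branch and file names are invented, and the current-branch name is detected so the demo works whether the default is master or main):

```shell
# Sketch: pick one commit from a feature branch onto the default branch
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > app.txt
git add app.txt
git commit -qm "base"
main=$(git rev-parse --abbrev-ref HEAD)   # master or main, depending on git version
git checkout -qb feature
echo fix > fix.txt
git add fix.txt
git commit -qm "bug fix"
fixhash=$(git rev-parse HEAD)             # the commit we want to pick
git checkout -q "$main"                   # back on the target branch
git cherry-pick "$fixhash"                # apply just that one commit here
```

Only the "bug fix" commit is copied; any other commits on feature stay behind.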
Much of Git is written in C, along with some BASH scripts for UI wrappers and other bits.
Rebasing is the process of moving a branch to a new base commit. The golden rule of git
rebase is to never use it on public branches.
The only way to synchronize the two master branches is to merge them back together,
resulting in an extra merge commit and two sets of commits that contain the same
changes.
Question: What is ‘head’ in git and how many heads can be created in a
repository?
There can be any number of heads in a GIT repository. By default there is one head known
as HEAD in each repository in GIT.
HEAD is a ref (reference) to the currently checked out commit. In normal states, it's actually
a symbolic ref to the branch the user has checked out.
If you look at the contents of .git/HEAD you'll see something like "ref: refs/heads/master".
The branch itself is a reference to the commit at the tip of the branch.
git diff – It shows the changes between commits, and between a commit and the working tree.
git status – It shows the difference between the working directory and the index.
git stash apply – It is used to bring back the saved changes to the working
directory.
git rm – It removes files from the staging area and also from the disk.
git log – It is used to find specific commits in the history.
git add – It adds file changes in the existing directory to the index.
git reset – It is used to reset the index as well as the working directory to the
state of the last commit.
git checkout – It is used to update the directories of the working tree with those
from another branch without merging.
git ls-tree – It lists a tree object, including the mode and the name of each
item.
git instaweb – It automatically launches a web browser and runs a web server with
an interface into your local repository.
The index is a single, large, binary file under the .git folder, which lists all files in the current
branch, their SHA-1 checksums, time stamps and file names. Before a commit is
completed, changes are formatted and reviewed in this intermediate area, known as the
index or the staging area.
In case the commit that needs to be reverted has already been published, or changing the
repository history is not an option, git revert can be used to revert commits. Running the
following command will revert the last two commits:
git revert HEAD~2..HEAD
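A sketch of reverting the last two commits in a throwaway repository (assumes git is installed; file names and values are invented; --no-edit just skips the commit-message editor):

```shell
# Sketch: undo the last two commits as new commits, without rewriting history
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
for v in 1 2 3; do
  echo "$v" > value.txt
  git add value.txt
  git commit -qm "set value to $v"
done
git revert --no-edit HEAD~2..HEAD   # reverts the "3" and "2" commits, newest first
cat value.txt                       # back to 1; the original commits remain in history
```

History now has five commits: the original three plus two revert commits, so nothing published is rewritten.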
Alternatively, one can always checkout the state of a particular commit from the past, and
commit it anew.
Squashing multiple commits into a single commit will overwrite history, and should be done
with caution. However, this is useful when working in feature branches.
To squash the last N commits of the current branch, run the following command (with {N}
replaced with the number of commits that you want to squash):
git rebase -i HEAD~{N}
Upon running this command, an editor will open with a list of these N commit messages,
one per line.
Each of these lines will begin with the word “pick”. Replacing “pick” with “squash” or “s” will
tell Git to combine the commit with the commit before it.
To combine all N commits into one, set every commit in the list to squash except the first
one.
Upon exiting the editor, and if no conflict arises, git rebase will allow you to create a new
commit message for the new combined commit.
To configure a script to run every time a repository receives new commits through push,
one needs to define either a pre-receive, update, or a post-receive hook depending on
when exactly the script needs to be triggered.
Pre-receive hook in the destination repository is invoked when commits are pushed to it.
Any script bound to this hook will be executed before any references are updated.
This is a useful hook to run scripts that help enforce development policies.
Update hook works in a similar manner to pre-receive hook, and is also triggered before
any updates are actually made.
However, the update hook is called once for every commit that has been pushed to the
destination repository.
Finally, post-receive hook in the repository is invoked after the updates have been accepted
into the destination repository.
This is an ideal place to configure simple deployment scripts, invoke some continuous
integration systems, dispatch notification emails to repository maintainers, etc.
Hooks are local to every Git repository and are not versioned. Scripts can either be created
within the hooks directory inside the “.git” directory, or they can be created elsewhere and
links to those scripts can be placed within the directory.
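Installing a hook can be sketched as follows (assumes git is installed; the hook body and log file are invented for illustration — a post-commit hook is used here because it is easy to demonstrate locally):

```shell
# Sketch: drop an executable script into .git/hooks and watch it fire on commit
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# hypothetical hook body: record each commit in a log file
echo "commit recorded" >> .git/commit.log
EOF
chmod +x .git/hooks/post-commit   # hooks must be executable to run
echo hello > readme.txt
git add readme.txt
git commit -qm "first"            # the hook fires right after this commit
```

The same mechanism applies to pre-receive, update, and post-receive hooks on the server side; only the event that triggers the script differs.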
In Git each commit is given a unique hash. These hashes can be used to identify the
corresponding commits in various scenarios (such as while trying to checkout a particular
state of the code using the git checkout {hash} command).
Additionally, Git also maintains a number of aliases to certain commits, known as refs.
Also, every tag that you create in the repository effectively becomes a ref (and that is
exactly why you can use tags instead of commit hashes in various git commands).
Git also maintains a number of special aliases that change based on the state of the
repository, such as HEAD, FETCH_HEAD, MERGE_HEAD, etc.
Git also allows commits to be referred as relative to one another. For example, HEAD~1
refers to the commit parent to HEAD, HEAD~2 refers to the grandparent of HEAD, and so
on.
In case of merge commits, where the commit has two parents, ^ can be used to select one
of the two parents, e.g. HEAD^2 can be used to follow the second parent.
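The relative notation can be exercised in a throwaway repository (assumes git is installed; the commit subjects are invented):

```shell
# Sketch: three commits, then resolve relative refs back to their subjects
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
for n in 1 2 3; do
  echo "$n" > f.txt
  git add f.txt
  git commit -qm "commit $n"
done
git log -1 --format=%s HEAD     # commit 3 (the tip)
git log -1 --format=%s HEAD~1   # commit 2 (the parent)
git log -1 --format=%s HEAD~2   # commit 1 (the grandparent)
```

Any command that accepts a commit (checkout, diff, revert, and so on) accepts these relative forms as well.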
And finally, refspecs. These are used to map local and remote branches together.
However, these can be used to refer to commits that reside on remote branches allowing
one to control and manipulate them from a local Git environment.
Question: What Is Conflict In GIT?
A conflict arises when more than one commit that has to be merged has some change in
the same place or same line of code.
Git will not be able to predict which change should take precedence. This is a git conflict.
To resolve the conflict in git, edit the files to fix the conflicting changes and then add the
resolved files by running git add.
After that, to commit the repaired merge, run git commit. Git remembers that you are in
the middle of a merge, so it sets the parents of the commit correctly.
Git hooks are scripts that can run automatically on the occurrence of an event in a Git
repository. These are used for automation of workflow in GIT. Git hooks also help in
customizing the internal behavior of GIT. These are generally used for enforcing a GIT
commit policy.
Binary Files: If we have a lot of binary (non-text) files in our project, then Git becomes very
slow, e.g. projects with a lot of images or Word documents.
Steep Learning Curve: It takes some time for a newcomer to learn Git. Some of the Git
commands are non-intuitive to a fresher.
Slow remote speed: Sometimes the use of remote repositories is slow due to network
latency. Still, Git is faster than other VCSs.
Question: What Is GIT reset command?
The git reset command is used to reset the current HEAD to a specified state. By default it
reverses the action of the git add command. So we use the git reset command to undo the
changes staged by git add.
Git uses the SHA-1 hash algorithm to identify objects. This algorithm is quite strong and
fast. It protects source code and other contents of the repository against possible malicious
attacks.
This algorithm also maintains the integrity of the Git repository by protecting the change
history against accidental changes.
Continuous Integration is the process of continuously integrating the code, often multiple
times per day. The purpose is to find problems quickly and deliver fixes more
rapidly.
CI is a best practice for software development. It is done to ensure that after every code
change there is no issue in software.
Build automation is the process of automating the creation of a software build and the
associated processes, including compiling source code into binary code, packaging the
binary code, and running automated tests.
It enables you to quickly learn what to expect every time you deploy an environment with
much faster results.
This combined with Build Automation can save development teams a significant amount of
hours.
Automated Deployment saves clients from being extensively offline during development
and allows developers to build while “touching” fewer of a clients’ systems.
An automated system reduces human error, and when errors do occur, developers are
able to catch them before live deployment, saving time and headaches.
You can even automate the contingency plan and make the site rollback to a working or
previous state as if nothing ever happened.
Clearly, this automated feature is super valuable in allowing applications and sites to
continue during fixes.
Different tools for supporting Continuous Integration are Hudson, Jenkins and Bamboo.
Jenkins is currently the most popular one. They provide integration with various version
control systems and build tools.
Source code repository: to commit code and changes, for example Git.
Server: the Continuous Integration software, for example Jenkins or TeamCity.
Build tool: builds the application in a particular way, for example Maven or Gradle.
Deployment environment: where the application will be deployed.
Jenkins is a self-contained, open source automation server used to automate all sorts of
tasks related to building, testing, and delivering or deploying software.
Jenkins is one of the leading open source automation servers available. Jenkins has an
extensible, plugin-based architecture, enabling developers to create 1,400+ plugins to
adapt it to a multitude of build, test and deployment technology integrations.
Jenkins Pipeline (or simply "Pipeline") is a suite of plugins which supports implementing
and integrating continuous delivery pipelines into Jenkins.
Maven and Ant are Build Technologies whereas Jenkins is a continuous integration tool.
The Jenkins software enables developers to find and solve defects in a code base rapidly
and to automate testing of their builds.
Jenkins supports version control tools, including AccuRev, CVS, Subversion, Git, Mercurial,
Perforce, ClearCase and RTC, and can execute Apache Ant, Apache Maven and arbitrary
shell scripts and Windows batch commands.
Pipeline adds a powerful set of automation tools onto Jenkins, supporting use cases that
span from simple continuous integration to comprehensive continuous delivery pipelines.
By modeling a series of related tasks, users can take advantage of the many features of
Pipeline:
Code: Pipelines are implemented in code and typically checked into source control,
giving teams the ability to edit, review, and iterate upon their delivery pipeline.
Durable: Pipelines can survive both planned and unplanned restarts of the Jenkins
master.
Pausable: Pipelines can optionally stop and wait for human input or approval before
continuing the Pipeline run.
Versatile: Pipelines support complex real-world continuous delivery requirements,
including the ability to fork/join, loop, and perform work in parallel.
Extensible: The Pipeline plugin supports custom extensions to its DSL and multiple
options for integration with other plugins.
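As a sketch, a minimal declarative Jenkinsfile showing the "Pipeline as code" idea (the stage commands assume a Maven project, and deploy.sh is a hypothetical script):

```groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                // Compile and package; -B keeps Maven output non-interactive
                sh 'mvn -B clean package'
            }
        }
        stage('Test') {
            steps {
                sh 'mvn test'
            }
        }
        stage('Deploy') {
            steps {
                // Placeholder for a real deployment step
                sh './deploy.sh'
            }
        }
    }
}
```

Because this file lives in source control alongside the application, the delivery pipeline itself can be reviewed, versioned, and iterated upon like any other code.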
The Multibranch Pipeline project type enables you to implement different Jenkinsfiles for
different branches of the same project.
In a Multibranch Pipeline project, Jenkins automatically discovers, manages, and executes
Pipelines for branches that contain a Jenkinsfile in source control.
Jenkins can be used to perform the typical build server work, such as doing
continuous/official/nightly builds, run tests, or perform some repetitive batch tasks. This is
called “free-style software project” in Jenkins.
Question: How do you configure automatic
builds in Jenkins?
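A sketch of an answer: automatic builds are configured with build triggers, typically either polling the SCM on a schedule, receiving a webhook from the repository host (for example, the GitHub plugin's hook trigger), or a cron-style timer. In a declarative Jenkinsfile this can look like the following (the polling schedule and build command are illustrative):

```groovy
pipeline {
    agent any
    triggers {
        // Poll the SCM roughly every five minutes; 'H' spreads load across the hour
        pollSCM('H/5 * * * *')
    }
    stages {
        stage('Build') {
            steps {
                sh 'mvn -B clean package'
            }
        }
    }
}
```

Webhooks are generally preferred over polling in practice, since they trigger a build immediately on push instead of waiting for the next poll.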
Amazon Web Services provides services that help you practice DevOps at your company
and that are built first for use with AWS.
These tools automate manual tasks, help teams manage complex environments at scale,
and keep engineers in control of the high velocity that is enabled by DevOps.
Get Started Fast: Each AWS service is ready to use if you have an AWS account. There is
no setup required or software to install.
Fully Managed Services: These services can help you take advantage of AWS resources
quicker. You can worry less about setting up, installing, and operating infrastructure on
your own. This lets you focus on your core product.
Built For Scalability: You can manage a single instance or scale to thousands using AWS
services. These services help you make the most of flexible compute resources by
simplifying provisioning, configuration, and scaling.
Programmable: You have the option to use each service via the AWS Command Line
Interface or through APIs and SDKs. You can also model and provision AWS resources
and your entire AWS infrastructure using declarative AWS CloudFormation templates.
Automation: AWS helps you use automation so you can build faster and more efficiently.
Using AWS services, you can automate manual tasks or processes such as deployments,
development & test workflows, container management, and configuration management.
Secure: Use AWS Identity and Access Management (IAM) to set user permissions and
policies. This gives you granular control over who can access your resources and how they
access those resources.
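For illustration, a minimal CloudFormation template that declaratively provisions a single EC2 instance, as mentioned under "Programmable" above (the AMI ID is a placeholder and is region-specific):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal example - one EC2 instance
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      ImageId: ami-0123456789abcdef0  # placeholder AMI ID
```

Running the template through CloudFormation creates the instance; deleting the stack tears it down, keeping the infrastructure reproducible.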
The AWS Developer Tools help you securely store and version your application’s source
code and automatically build, test, and deploy your application to AWS.
An Elastic Load Balancer ensures that the incoming traffic is distributed optimally across
various AWS instances.
A buffer synchronizes different components and makes the system more resilient to
bursts of load or traffic. Without it, components receive and process requests at uneven
rates. The buffer creates equilibrium between the various components, making them work
at the same rate to supply faster services.
Amazon S3: object storage used to store and retrieve the data involved in the cloud
architecture, including the results produced for a given key.
Amazon EC2 instance: helpful for running a large distributed system such as a Hadoop
cluster. Automatic parallelization and job scheduling can be achieved with this component.
Amazon SQS: acts as a mediator (message queue) between different controllers, and is
also used for buffering requests received by other components.
Amazon SimpleDB: stores the transitional state logs and the tasks executed by the
consumers.
Question: How is a Spot instance different from an On-
Demand instance or Reserved Instance?
Spot Instances, On-Demand Instances, and Reserved Instances are all pricing models.
Spot Instances let customers purchase compute capacity with no upfront commitment, at
hourly rates usually lower than the On-Demand rate in each region.
Spot Instances work like bidding; the bidding price is called the Spot Price. The Spot Price
fluctuates based on supply and demand for instances, but customers will never pay more
than the maximum price they have specified.
If the Spot Price moves higher than a customer’s maximum price, the customer’s EC2
instance will be shut down automatically.
But the reverse is not true: if the Spot Price comes down again, your EC2 instance will not
be launched automatically; one has to do that manually.
With Spot and On-Demand Instances there is no commitment on duration from the user's
side; however, with Reserved Instances one has to stick to the time period chosen.
Question: What are the best practices for security in Amazon EC2?
There are several best practices to secure Amazon EC2. A few of them are given below:
Use AWS Identity and Access Management (IAM) to control access to your AWS
resources.
Restrict access by allowing only trusted hosts or networks to access ports on your
instance.
Review the rules in your security groups regularly, and ensure that you apply the
principle of least privilege: open only the permissions that you require.
Disable password-based logins for instances launched from your AMI. Passwords
can be found or cracked, and are a security risk.
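As a sketch of the least-privilege idea, an IAM policy that allows only a single read-only EC2 action (the action list is illustrative; real policies would scope the resource more tightly as well):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ec2:DescribeInstances"],
      "Resource": "*"
    }
  ]
}
```

Any action not explicitly allowed is denied by default, which is exactly the behavior least privilege relies on.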
AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and
produces software packages that are ready to deploy.
With CodeBuild, you don’t need to provision, manage, and scale your own build servers.
CodeBuild scales continuously and processes multiple builds concurrently, so your builds
are not left waiting in a queue.
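CodeBuild reads its build instructions from a buildspec.yml file in the source root; a minimal sketch for a Maven project might look like this (the phases and artifact paths are illustrative):

```yaml
version: 0.2
phases:
  build:
    commands:
      - mvn -B clean package
artifacts:
  files:
    - target/*.jar
```

CodeBuild runs each phase's commands in a managed container and uploads the declared artifacts when the build succeeds.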
Question: What is Amazon Elastic Container Service in AWS Devops?
Amazon Elastic Container Service (ECS) is a highly scalable, high performance container
management service that supports Docker containers and allows you to easily run
applications on a managed cluster of Amazon EC2 instances.
AWS Lambda lets you run code without provisioning or managing servers. With Lambda,
you can run code for virtually any type of application or backend service, all with zero
administration.
Just upload your code and Lambda takes care of everything required to run and scale your
code with high availability.
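A minimal handler sketch in Python (the event shape and function name are assumptions; Lambda only requires a callable that accepts an event and a context):

```python
import json

def handler(event, context):
    """Minimal AWS Lambda-style handler: echo a greeting from the event payload."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

You would upload this function (zipped, or inline in the console), set `handler` as the entry point, and Lambda runs and scales it on each invocation with no server management.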
The platform of Splunk allows you to get visibility into machine data generated from
different networks, servers, devices, and hardware.
It can give insights into application management, threat visibility, compliance, security,
etc., so it is used to analyze machine data. The data is collected by the forwarder from the
source and forwarded to the indexer, where it is stored locally on a host machine or in the
cloud. The search head then searches, visualizes, and analyzes the data stored in the
indexer and performs various other functions.
The main components of Splunk are Forwarders, Indexers, and Search Heads. A
Deployment Server (or Management Console Host) comes into the picture in the case of a
larger environment.
Deployment Servers act like an antivirus policy server for setting up exceptions and
groups, so that you can map and create a different set of data collection policies for
Windows-based, Linux-based, or Solaris-based servers. Splunk has four important
components: the Forwarder, the Indexer, the Search Head, and the Deployment Server.
An alert is an action that a saved search triggers at regular intervals, set over a time range,
based on the results of the search.
When an alert is triggered, various actions can occur as a consequence, for instance
sending an email to a predefined list of people.
1. Per-result alerts: The most commonly used alert type; runs in real time over an all-
time span. These alerts are designed to trigger whenever a search returns a result.
2. Scheduled alerts: The second most common type; scheduled alerts are set up to
evaluate the results of a historical search running over a set time range on a regular
schedule. You can define the time range, schedule, and trigger condition for the
alert.
3. Rolling-window alerts: These are a hybrid of per-result and scheduled alerts. Like
the former, they are based on a real-time search, but they do not trigger each time
the search returns a matching result. Instead they examine all events mapping into
the rolling window in real time, and trigger whenever the specified condition is met
by the events in that window, the way a scheduled alert is triggered on a scheduled
search.
1. Sorting Results – Ordering results and (optionally) limiting the number of results.
2. Filtering Results – Taking a set of events or results and filtering them into a smaller
set of results.
3. Grouping Results – Grouping events so you can see patterns.
4. Filtering, Modifying and Adding Fields – Filtering out some fields to focus on the
ones you need, or modifying or adding fields to enrich your results or events.
5. Reporting Results – Taking search results and generating a summary for reporting.
In case the license master is unreachable, then it is just not possible to search the data.
However, the data coming in to the Indexer will not be affected. The data will continue to
flow into your Splunk deployment.
The Indexers will continue to index the data as usual; however, you will get a warning
message on top of your Search Head or web UI saying that you have exceeded the
indexing volume, and you will either need to reduce the amount of data coming in or buy
a higher-capacity license. Basically, the candidate is expected to answer that indexing
does not stop; only searching is halted.
The Splunk KV store runs on port 8191 by default.
A directory that contains indexed data is known as a Splunk bucket. It also contains events
of a certain period. Bucket lifecycle includes following stages:
Hot – It contains newly indexed data and is open for writing. For each index, there
are one or more hot buckets available
Warm – Data rolled from hot
Cold – Data rolled from warm
Frozen – Data rolled from cold. The indexer deletes frozen data by default but users
can also archive it.
Thawed – Data restored from an archive. If you archive frozen data, you can later
return it to the index by thawing (defrosting) it.
Data models are used for creating a structured hierarchical model of data. It can be used
when you have a large amount of unstructured data, and when you want to make use of
that information without using complex search queries.
Create Sales Reports: If you have a sales report, then you can easily create the total
number of successful purchases, below that you can create a child object containing
the list of failed purchases and other views
Set Access Levels: If you want a structured view of users and their various access
levels, you can use a data model
On the other hand with pivots, you have the flexibility to create the front views of your
results and then pick and choose the most appropriate filter for a better view of results.
All of Splunk’s configurations are written in .conf files. There can be multiple copies of
each of these files, so it is important to know the role these files play when a Splunk
instance is running or restarted. To determine the priority among copies of a
configuration file, Splunk software first determines the directory scheme, which is either
a) global or b) app/user. When the context is global (that is, where there’s no app/user
context), directory priority descends in this order:
1. System local directory — highest priority
2. App local directories
3. App default directories
4. System default directory — lowest priority
When the context is app/user, directory priority descends from user to app to system:
1. User directories for current user — highest priority
2. App directories for currently running app (local, followed by default)
3. App directories for all other apps (local, followed by default) — for exported settings
only
4. System directories (local, followed by default) — lowest priority
Search time field extraction refers to the fields extracted while performing searches.
Whereas, fields extracted when the data comes to the indexer are referred to as Index time
field extraction.
You can set up index time field extraction either at the forwarder level or at the indexer
level.
Another difference is that Search time field extraction’s extracted fields are not part of the
metadata, so they do not consume disk space.
Whereas index time field extraction’s extracted fields are a part of metadata and hence
consume disk space.
SOS stands for Splunk on Splunk. It is a Splunk app that provides a graphical view of your
Splunk environment's performance and issues.
It has following purposes:
Diagnostic tool to analyze and troubleshoot problems
Examine Splunk environment performance
Solve indexing performance issues
Observe scheduler activities and issues
See the details of scheduler and user driven search activity
Search, view and compare configuration files of Splunk
The indexer is a Splunk Enterprise component that creates and manages indexes. The
main functions of an indexer are:
Indexing incoming data
Searching indexed data
A Splunk indexer has the following stages:
Input: Splunk Enterprise acquires raw data from various input sources, breaks it into 64K
blocks, and assigns them some metadata keys. These keys include the host, source, and
source type of the data.
Parsing: Also known as event processing. During this stage, Splunk Enterprise analyzes
and transforms the data, breaks it into streams, identifies, parses, and sets timestamps,
and performs metadata annotation and transformation of the data.
Indexing: In this phase, the parsed events are written to the index on disk, including both
the compressed data and the associated index files.
Searching: The search function handles all searching aspects (interactive and scheduled
searches, reports, dashboards, alerts) on the indexed data, and stores saved searches,
events, field extractions, and views.
Stats – This command produces summary statistics of all existing fields in your search
results and stores them as values in new fields. Eventstats – The same as the stats
command, except that the aggregation results are added inline to every event, and only if
the aggregation is applicable to that event. It computes the requested statistics like stats
does, but aggregates them into the original raw events.
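As a quick illustration of the difference (the index and field names here are hypothetical):

```
index=web | stats avg(bytes) AS avg_bytes BY host
index=web | eventstats avg(bytes) AS avg_bytes
```

The stats version collapses the results into one summary row per host, while the eventstats version keeps every original event and attaches avg_bytes as an extra field on each.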
log4j is a reliable, fast and flexible logging framework (APIs) written in Java, which is
distributed under the Apache Software License.
log4j has been ported to the C, C++, C#, Perl, Python, Ruby, and Eiffel languages.
log4j is highly configurable through external configuration files at runtime. It views the
logging process in terms of levels of priorities and offers mechanisms to direct logging
information to a great variety of destinations.
Following are the pros and cons of logging. Logging is an important component of
software development. Well-written logging code offers quick debugging, easy
maintenance, and structured storage of an application's runtime information. Logging
does have its drawbacks: it can slow down an application, and if too verbose, it can cause
scrolling blindness. To alleviate these concerns, log4j is designed to be reliable, fast and
extensible. Since logging is rarely the main focus of an application, the log4j API strives to
be simple to understand and to use.
Logger Object − The top-level layer of log4j architecture is the Logger which provides the
Logger object.
The Logger object is responsible for capturing logging information and they are stored in a
namespace hierarchy.
The layout layer of log4j architecture provides objects which are used to format logging
information in different styles. It provides support to appender objects before publishing
logging information.
Layout objects play an important role in publishing logging information in a way that is
human-readable and reusable.
The Appender object is responsible for publishing logging information to various preferred
destinations such as a database, file, console, UNIX Syslog, etc.
This object uses Layout objects to prepare the final logging information before publishing it.
The LogManager object manages the logging framework. It is responsible for reading the
initial configuration parameters from a system-wide configuration file or a configuration
class.
Appender can have a threshold level associated with it independent of the logger level.
The Appender ignores any logging messages that have a level lower than the threshold
level.
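A sketch of a log4j.properties file demonstrating an appender threshold (the appender name and pattern are conventional examples):

```properties
# Root logger accepts DEBUG and above, routed to the CONSOLE appender
log4j.rootLogger=DEBUG, CONSOLE

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
# The appender's own threshold drops anything below WARN,
# independent of the logger level set above
log4j.appender.CONSOLE.Threshold=WARN
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d %-5p %c - %m%n
```

Here DEBUG and INFO messages reach the appender but are filtered by its WARN threshold, so only WARN, ERROR, and FATAL messages are printed.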
Docker is a tool designed to make it easier to create, deploy, and run applications by using
containers.
Containers allow a developer to package up an application with all of the parts it needs,
such as libraries and other dependencies, and ship it all out as one package.
By doing so, the developer can rest assured that the application will run on any other Linux
machine, regardless of any customized settings that machine might have that could differ
from the machine used for writing and testing the code. In a way, Docker is a bit like a
virtual machine. But unlike a virtual machine, rather than creating a whole virtual operating
system, Docker allows applications to use the same Linux kernel as the system they're
running on, and only requires applications to be shipped with things not already running on
the host computer. This gives a significant performance boost and reduces the size of the
application.
Linux containers, in short, contain applications in a way that keeps them isolated from the
host system they run on.
Containers allow a developer to package up an application with all of the parts it needs,
such as libraries and other dependencies, and ship it all out as one package.
And they are designed to make it easier to provide a consistent experience as developers
and system administrators move code from development environments into production in a
fast and replicable way.
Docker is a tool that is designed to benefit both developers and system administrators,
making it a part of many DevOps (developers + operations) toolchains.
For developers, it means that they can focus on writing code without worrying about
the system that it will ultimately be running on.
It also allows them to get a head start by using one of thousands of programs already
designed to run in a Docker container as a part of their application.
For operations staff, Docker gives flexibility and potentially reduces the number of systems
needed because of its small footprint and lower overhead.
Question: What Is Docker Container?
Docker containers include the application and all of its dependencies, but share the kernel
with other containers, running as isolated processes in user space on the host operating
system.
Docker containers are not tied to any specific infrastructure: they run on any computer, on
any infrastructure, and in any cloud.
Now explain how to create a Docker container: Docker containers can be created either by
creating a Docker image and then running it, or by using Docker images that are already
present on Docker Hub. Docker containers are basically runtime instances of Docker
images.
Docker image is the source of Docker container. In other words, Docker images are used
to create containers.
Images are created with the build command, and they produce a container when started
with run.
Images are stored in a Docker registry such as registry.hub.docker.com. Because they can
become quite large, images are designed to be composed of layers of other images,
allowing a minimal amount of data to be sent when transferring images over the network.
Docker hub is a cloud-based registry service which allows you to link to code repositories,
build your images and test them, stores manually pushed images, and links to Docker
cloud so you can deploy images to your hosts.
It provides a centralized resource for container image discovery, distribution and change
management, user and team collaboration, and workflow automation throughout the
development pipeline.
Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single,
virtual Docker host.
Because Docker Swarm serves the standard Docker API, any tool that already
communicates with a Docker daemon can use Swarm to transparently scale to multiple
hosts.
I would also suggest you mention some supporting tools:
Dokku
Docker Compose
Docker Machine
Jenkins
A Dockerfile is a text document that contains all the commands a user could call on the
command line to assemble an image.
Using docker build users can create an automated build that executes several command-
line instructions in succession.
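A minimal Dockerfile sketch for a small Python application (the file names app.py and requirements.txt are assumptions):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Copy and install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```

Running `docker build -t myapp .` in the directory containing this file produces an image from these instructions, layer by layer.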
Docker containers are easy to deploy in a cloud. They can get more applications running
on the same hardware than other technologies.
We can use a Docker image to create a Docker container using the below command (the
image name is a placeholder):

docker run -it <image_name>

This command will create and start a container. You should also add: if you want to check
the list of all containers with their status on a host, use the below command:

docker ps -a

In order to stop a Docker container, you can use the below command:

docker stop <container_id>
Question: What is the difference between docker run and docker create?
The primary difference is that ‘docker create’ creates a container in a stopped state.
Bonus point: you can use ‘docker create’ and store an outputted container ID for later
use. The best way to do it is to use ‘docker run’ with --cidfile FILE_NAME, as running it
again won’t be allowed to overwrite the file.
A Docker container can be in one of the following states:
Running
Paused
Restarting
Exited
Docker registry is a service for hosting and distributing images. Docker repository is a
collection of related Docker images.
The simplest way is to use network port mapping. There’s also the --link flag, which is
deprecated.
A CMD does not execute anything at build time, but specifies the intended command for
the image.
RUN actually runs a command and commits the result.
If you would like your container to run the same executable every time, then you should
consider using ENTRYPOINT in combination with CMD.
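For example, the following sketch fixes the executable with ENTRYPOINT while CMD supplies default arguments that `docker run <image> <args>` can override:

```dockerfile
FROM alpine:3.19
# The container always runs ping...
ENTRYPOINT ["ping"]
# ...with these default arguments unless they are overridden at run time
CMD ["-c", "3", "localhost"]
```

Running the image with no arguments pings localhost three times; `docker run <image> -c 1 example.com` replaces only the CMD portion, keeping ping as the executable.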
As for the number of containers that can be run, this really depends on your environment.
The size of your applications as well as the amount of available resources will affect the
number of containers that can run in your environment.
Containers unfortunately are not magical. They can’t create new CPU from scratch. They
do, however, provide a more efficient way of utilizing your resources.
The containers themselves are super lightweight (remember, shared OS vs individual OS
per container) and only last as long as the process they are running. Immutable
infrastructure if you will.
VMware was founded in 1998 by five different IT experts. The company officially launched
its first product, VMware Workstation, in 1999, which was followed by the VMware GSX
Server in 2001. The company has launched many additional products since that time.
VMware's desktop software is compatible with all major OSs, including Linux, Microsoft
Windows, and Mac OS X. VMware provides three different types of desktop software:
VMware Workstation: This application is used to install and run multiple copies or
instances of the same operating systems or different operating systems on a single
physical computer machine.
VMware Fusion: This product was designed for Mac users and provides extra
compatibility with all other VMware products and applications.
VMware Player: This product was launched as freeware by VMware for users who do
not have licensed VMware products. This product is intended only for personal use.
VMware's software hypervisors intended for servers are bare-metal embedded hypervisors
that can run directly on the server hardware without the need of an extra primary OS.
VMware’s line of server software includes:
VMware ESX Server: This is an enterprise-level solution, built to provide better
functionality than the freeware VMware Server thanks to lower system overhead.
VMware ESX is integrated with VMware vCenter, which provides additional solutions
to improve the manageability and consistency of the server implementation.
VMware ESXi Server: This server is similar to the ESX Server, except that the service
console is replaced with a BusyBox installation, and it requires very little disk space
to operate.
VMware Server: Freeware software that can be used over existing operating systems
like Linux or Microsoft Windows.
The process of creating virtual versions of physical components, i.e. servers, storage
devices, and network devices, on a physical host is called virtualization.
Virtualization lets you run multiple virtual machines on a single physical machine, which is
called an ESXi host.
Server virtualization: consolidates the physical server and multiple OS can be run on
a single server.
Network Virtualization: Provides complete reproduction of physical network into a
software defined network.
Storage Virtualization: Provides an abstraction layer for physical storage resources to
manage and optimize in virtual deployment.
Application Virtualization: increases the mobility of applications and allows migration
of VMs from one host to another with minimal downtime.
Desktop Virtualization: virtualizes desktops to reduce cost and increase service.
The service console is developed based upon the Red Hat Linux operating system; it is
used to manage the VMkernel.
This agent will be installed on ESX/ESXi when you try to add the ESX host in vCenter.
The VMware kernel is a proprietary kernel of VMware and is not based on any of the
flavors of the Linux operating system.
The VMkernel requires an operating system to boot and manage the kernel; a service
console is provided when the VMware kernel is booted.
VMkernel is a virtualization interface between a Virtual Machine and the ESXi host which
stores VMs.
It is responsible for allocating all available resources of the ESXi host, such as memory,
CPU, and storage, to the VMs.
It also controls special services such as vMotion, Fault Tolerance, NFS, traffic
management, and iSCSI.
To access these services, VMkernel port can be configured on ESXi server using a
standard or distributed vSwitch. Without VMkernel, hosted VMs cannot communicate with
ESXi server.
Hypervisor is a virtualization layer that enables multiple operating systems to share a single
hardware host.
A network of VMs running on a physical server that are connected logically with each other
is called virtual networking.
vSS stands for Virtual Standard Switch, which is responsible for the communication of
VMs hosted on a single physical host.
It works like a physical switch and automatically detects a VM which wants to
communicate with another VM on the same physical server.
A VMkernel adapter provides network connectivity to the ESXi host to handle network
traffic for vMotion, IP storage, NAS, Fault Tolerance, and vSAN.
For each type of traffic, such as vMotion or vSAN, a separate VMkernel adapter should be
created and configured.
A datastore is a storage location where virtual machine files are stored and accessed.
A datastore is based on a file system such as VMFS or NFS.
3. Thin provision: It provides on-demand allocation of disk space to a VM. When data
size grows, the size of disk will grow. Storage capacity utilization can be up to 100%
with thin provisioning.
4. What is Storage vMotion?
Storage vMotion allows live migration of a running virtual machine's disk files from one
datastore to another without downtime. Relatedly, the VMkernel port is used by ESX/ESXi
for vMotion, iSCSI, and NFS communications; ESXi uses a VMkernel port as the
management network since it doesn't have a service console built in.
In this way, each build is tested continuously, allowing Development teams to get fast
feedback so that they can prevent those problems from progressing to the next stage of
Software delivery life-cycle.
Continuous Testing allows any change made in the code to be tested immediately.
This avoids the problems created by having “big-bang” testing left to the end of the
development cycle such as release delays and quality issues.
In this way, Continuous Testing facilitates more frequent and good quality releases.”
Regression Testing: It is the act of retesting a product around an area where a bug was
fixed.
The verify command also checks whether the given condition is true or false. Irrespective
of the condition being true or false, the program execution doesn’t halt, i.e. any failure
during verification will not stop the execution and all the test steps will be executed.
Summary
DevOps refers to a wide range of tools, processes, and practices used by companies to
improve their build, deployment, testing, and release life cycles.
In order to ace a DevOps interview you need to have a deep understanding of all of these
tools and processes.
Most of the technologies and processes used to implement DevOps are not isolated. Most
probably you are already familiar with many of them. All you have to do is prepare for
them from a DevOps perspective.
In this guide I have created the largest set of interview questions. Each section in this guide
caters to a specific area of DevOps.
In order to increase your chances of success in DevOps interview you need to go through
all of these questions.