
- Can you briefly introduce yourself?
- Which version did you use in Databricks?
- What are the steps to create a notebook?
- What is revision history in Databricks?
- Can Databricks be used along with Azure Notebooks?
- What are the various types of clusters present in Azure Databricks?
- Steps to create a job in Databricks.
- What is caching?
- From where will you get the feed?
- How do you convert Parquet to Delta format?
- What are the different file formats you have worked on?
- Do we need to store the results of one action in other variables?
- How do you generate a personal access token in Databricks?
- How do you reuse code in Databricks?
- How do you set up a dev environment in Databricks?
- What is the use of %run?
- What is the use of widgets in Databricks?
- Write the syntax to list secrets in a specific scope.
- Syntax to get the list of mount points.
- What does a cluster do at the network level?
- What is autoscaling?
- How do you manage the Databricks code while working with a team using Team Foundation Server (TFS) or Git?
- What is the control plane in Azure Databricks?
- What is the difference between an instance and a cluster?
- What is a Databricks secret?
- What are some issues you can face with Azure Databricks?

Practical questions:

- How do you connect to external storage?
- How do you run one notebook from another?
- How do you read a secret value from Azure Key Vault in Databricks?
- How do you create Delta data?
- Syntax to remove a particular widget.
- How do you create DataFrames?

Can Databricks be used along with Azure Notebooks?

Both can execute notebooks in a similar way, but data transmission to the
cluster has to be coded manually. Databricks Connect exists to get this
integration done seamlessly.

What are the various types of clusters present in Azure Databricks?


Azure Databricks has four types of clusters: Interactive, Job, Low-priority, and
High-priority.

What is caching?

The cache refers to the practice of storing information temporarily. When you go to a
website that you visit frequently, your browser takes the information from the cache
instead of the server. This helps save time and reduce the server’s load.

What is autoscaling?

Autoscaling is a Databricks feature that automatically resizes your cluster up
or down as the workload requires, within a range you configure.
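In the Clusters API, autoscaling is requested by giving a min/max worker range instead of a fixed worker count. A sketch of such a request body follows; the cluster name, Spark version, node type, and worker counts are all placeholder values, not recommendations:

```python
# Sketch of a Clusters API request body that enables autoscaling.
# All concrete values below are placeholders.
cluster_spec = {
    "cluster_name": "example-autoscaling-cluster",
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {              # used instead of a fixed "num_workers"
        "min_workers": 2,       # the cluster never shrinks below this
        "max_workers": 8,       # and never grows beyond this
    },
}

print("autoscale" in cluster_spec and "num_workers" not in cluster_spec)  # True
```

Databricks then adds or removes workers between `min_workers` and `max_workers` based on the current load.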

What are some issues you can face with Azure Databricks?

You might face cluster creation failures if you don’t have enough credits to
create more clusters. Spark errors appear if your code is not compatible with
the Databricks runtime. You can come across network errors if the network is
not configured properly or if you’re trying to access Databricks from an
unsupported location.

What is a Databricks secret?

A secret is a key-value pair that stores secret content, identified by a unique
key name within a secret scope. Each scope is limited to 1000 secrets, and a
secret value cannot exceed 128 KB in size.
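The limits mentioned above can be illustrated with a small in-memory model of a secret scope. This is only a teaching sketch, not the real API (in a workspace, secrets are managed through the CLI and read with the secrets utility):

```python
# Toy model of a secret scope: unique key names, capped count and value size.
MAX_SECRETS_PER_SCOPE = 1000
MAX_SECRET_BYTES = 128 * 1024   # the 128 KB value limit described above

class SecretScope:
    def __init__(self, name):
        self.name = name
        self._secrets = {}

    def put(self, key, value):
        if key not in self._secrets and len(self._secrets) >= MAX_SECRETS_PER_SCOPE:
            raise ValueError("scope is limited to 1000 secrets")
        if len(value.encode("utf-8")) > MAX_SECRET_BYTES:
            raise ValueError("secret value exceeds 128 KB")
        self._secrets[key] = value

    def get(self, key):
        return self._secrets[key]

scope = SecretScope("my-scope")
scope.put("db-password", "s3cr3t")
print(scope.get("db-password"))  # s3cr3t
```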

What is the control plane in Azure Databricks?

The control plane hosts the backend services that Databricks manages in its own
account, such as the web application, job scheduling, and cluster management;
it is responsible for managing Spark applications.

What is the difference between an instance and a cluster in Databricks?

An instance is a virtual machine that helps run the Databricks runtime. A cluster is a
group of instances that are used to run Spark applications.

What is the use of auto-scaling in Azure Databricks?


Auto-scaling allows the program to run effectively even under high load. Such a
question helps the hiring manager assess your knowledge of auto-scaling in Azure.
While answering, briefly define Databricks's auto-scaling feature and mention its key
benefit.

How do you manage the Databricks code while working with a team
using the team foundation server (TFS) or Git?

Both TFS and Git allow easy code management through effective collaboration among
teams and by utilising version control. Such questions help the hiring manager assess
your capacity to manage the code base of a project effectively and also assess if you
have experience in coding with Databricks. In your answer, mention the key features of
both TFS and Git and briefly explain the major steps you take to manage the Databricks
code.

What do you understand by mapping data flows?

Such a technical question helps the interviewer to assess your domain knowledge. You
can use this question to show your familiarity with the working concepts of Databricks.
In your response, briefly explain what mapping data flow does and how it helps with the
workflow.

Do we need to store the results of one action in other variables?

No, there is no need to store the results of one action in other variables.
How to generate a personal access token in Databricks?

We can generate a personal access token in seven steps:

1. In the upper-right corner of the Databricks workspace, click the "user profile" icon.
2. Choose "User Settings."
3. Navigate to the "Access Tokens" tab.
4. Click the "Generate New Token" button.
5. Optionally enter a comment and a token lifetime.
6. Click "Generate."
7. Copy the token and store it safely; it is shown only once.
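Once generated, the token is typically sent as a bearer token to the Databricks REST API. A sketch of building such a request follows; the workspace URL and token are placeholders, and the request is only assembled here, not sent:

```python
# Assemble (but don't send) an authenticated Databricks REST API request.
# The workspace URL and the token value below are placeholders.
workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"
token = "dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXX"   # the generated personal access token

headers = {"Authorization": f"Bearer {token}"}        # PATs are sent as bearer tokens
endpoint = f"{workspace_url}/api/2.0/clusters/list"   # example endpoint

print(headers["Authorization"].startswith("Bearer "))  # True
```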

How to reuse the code in the Azure notebook?

If we want to reuse code in an Azure notebook, we must import that code into
our notebook. We can import it in two ways:

1. If the code is in a different workspace, we have to package it as a
module/jar and then import that module or jar into the notebook.
2. If the code is in the same workspace, we can directly import and reuse it.
How to set up a dev environment in Databricks?

The five steps to set up a dev environment in Databricks are:

1. Create a branch and check the code out to your PC.
2. Using the CLI, copy the local notebook directory to the Databricks workspace.
3. Using the DBFS CLI, copy the local library directory to DBFS.
4. Using the UI or the API, create a cluster.
5. Finally, using the Libraries API, attach the libraries in DBFS to the cluster.
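Steps 2 and 3 above use the Databricks and DBFS command-line interfaces. The commands might look roughly like the following; the local and target paths are placeholders, and the commands are only assembled here, not executed:

```python
# Assemble (without executing) the CLI calls for steps 2-3 above.
# All paths are placeholders.
copy_notebooks = [
    "databricks", "workspace", "import_dir",
    "./notebooks", "/Users/dev/notebooks",   # local dir -> workspace dir
]
copy_libraries = [
    "databricks", "fs", "cp", "--recursive",
    "./libs", "dbfs:/dev/libs",              # local libs -> DBFS
]

for cmd in (copy_notebooks, copy_libraries):
    print(" ".join(cmd))
```

In practice these would be run in a shell (or via `subprocess`) after configuring the CLI with your workspace URL and a personal access token.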

What is the use of %run?

The %run command runs another notebook inline within the current notebook, so
it is used to modularize code; to run a notebook with parameters, use
dbutils.notebook.run instead.
What is the use of widgets in Databricks?

Widgets enable us to add parameters to our dashboards and notebooks. The widget
API consists of calls to create input widgets of several types, get their bound
values, and remove them.
Write a syntax to list secrets in a specific scope?

The syntax to list secrets in a specific scope is:

databricks secrets list --scope <scope-name>
What is the use of the Secrets utility?

The Secrets utility (dbutils.secrets) is used to read secrets in jobs or
notebooks.
How to delete a Secret?
We can use Azure Portal UI or Azure SetSecret Rest API to delete a Secret from any
scope that is backed by an Azure key vault.
What are the two types of secret scopes?

There are two types of secret scopes:

1. Databricks-backed scopes.
2. Azure Key Vault-backed scopes.

What do clusters do at the network level?

At the network level, clusters connect to the control plane proxy during
cluster creation.
