Data Migration - Cosmos DB - Documentation
3. Create Datasets:
- Define datasets for both your source and destination data.
- Specify the format (e.g., JSON, CSV) and schema information for each dataset.
- Configure dataset properties such as folder path, file name pattern, and partition information.
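As an illustration, a source dataset for a Cosmos DB SQL API collection can be defined as a JSON fragment along these lines (the names `SourceCosmosLinkedService` and `SourceCollection` are placeholders, not values from this walkthrough):

```json
{
  "name": "SourceCosmosDataset",
  "properties": {
    "type": "CosmosDbSqlApiCollection",
    "linkedServiceName": {
      "referenceName": "SourceCosmosLinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "collectionName": "SourceCollection"
    }
  }
}
```

The destination dataset follows the same shape, pointing at the destination linked service and collection.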
Step 8: Finalize
o Optionally, perform any additional testing or validation to ensure the
integrity of the migrated data.
o Congratulations! You have successfully migrated data from one Cosmos DB
instance to another using the Azure Cosmos DB Data Migration Tool.
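One quick validation is to compare document counts between the source and destination containers. A minimal sketch, assuming container clients that expose a `query_items` method as the `azure-cosmos` Python SDK's `ContainerProxy` does (endpoint, key, and container names would come from your own configuration):

```python
def count_documents(container):
    """Return the number of documents in a Cosmos DB container.

    `container` is expected to expose `query_items`, as the
    azure-cosmos SDK's ContainerProxy does.
    """
    results = container.query_items(
        query="SELECT VALUE COUNT(1) FROM c",
        enable_cross_partition_query=True,
    )
    # The VALUE COUNT(1) query yields a single scalar result.
    return next(iter(results))


def migration_counts_match(source_container, destination_container):
    """Compare source and destination document counts after a migration."""
    return count_documents(source_container) == count_documents(destination_container)
```

Matching counts are a necessary but not sufficient check; spot-checking individual documents is still advisable for full validation.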
d) Using Azure Cosmos DB Spark Connector
- Ensure you have an Apache Spark environment set up. You can use Azure
Databricks, an Apache Spark cluster on Azure HDInsight, or a standalone Apache
Spark installation.
- Make sure the Azure Cosmos DB Spark Connector library is added to your Spark
environment. You can add it as a Maven dependency, or download the JAR file and
include it in your Spark configuration.
- Obtain the connection strings or URIs for both the source and destination Cosmos
DB instances.
- Ensure that you have the necessary permissions and access credentials (e.g., master
keys, resource tokens) to read from the source Cosmos DB instance and write to
the destination Cosmos DB instance.
- Use the Azure Cosmos DB Spark Connector to create a Spark DataFrame that reads
data from the source Cosmos DB instance.
- Specify the source Cosmos DB connection options, including the URI, database
name, collection name, and any required authentication credentials.
- Use the Spark DataFrame created in the previous step to write data to the
destination Cosmos DB instance.
- Specify the destination Cosmos DB connection options, including the URI, database
name, collection name, and any required authentication credentials.
- Choose the appropriate write mode based on your requirements. You can
overwrite existing data, append new data, or perform other actions based on the
existing data in the destination Cosmos DB collection.
- After the Spark job completes, verify that the data has been successfully migrated
to the destination Cosmos DB instance.
- Query the destination Cosmos DB collection to ensure that it contains the expected
data from the source Cosmos DB instance.
- Perform any necessary data validation or integrity checks to confirm the accuracy
of the migration.
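Put together, the read and write steps above can be sketched roughly as follows. This sketch assumes the Spark 3 version of the connector (the `cosmos.oltp` data source); the endpoint, key, database, and container names are placeholders, and `spark` is an existing SparkSession with the connector JAR on its classpath:

```python
def cosmos_config(endpoint, key, database, container):
    """Build the connection options understood by the cosmos.oltp data source."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }


def migrate(spark, source_cfg, destination_cfg):
    """Read all documents from the source container and write them
    to the destination container."""
    df = (
        spark.read.format("cosmos.oltp")
        .options(**source_cfg)
        .load()
    )
    (
        df.write.format("cosmos.oltp")
        .options(**destination_cfg)
        .mode("append")  # or "overwrite", depending on your requirements
        .save()
    )
```

A driver script would then call something like `migrate(spark, cosmos_config(src_uri, src_key, "db", "col"), cosmos_config(dst_uri, dst_key, "db", "col"))`.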
e) Using Azure Cosmos DB Functions + Change Feed API
4. Next, note the Application (client) ID on the Overview blade of the app
registration. Copy it for later use
5. Navigate to the Authentication menu
6. Fill in a Front-channel logout URL. Again, this should contain the name you chose
earlier, like "https://tips01-ui.azurewebsites.net/signout-callback-oidc"
7. Check ID tokens (used for implicit and hybrid flows)
8. Click Save
9. Next, go to the Manifest menu
10. Add the required entry to the requiredResourceAccess node of the manifest
11. Click Save
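The exact JSON for that manifest edit is not reproduced above. As an illustration only, a requiredResourceAccess node that requests the Microsoft Graph User.Read delegated permission (using the well-known Graph application ID and scope ID) typically looks like this; the permissions your app actually needs may differ:

```json
"requiredResourceAccess": [
  {
    "resourceAppId": "00000003-0000-0000-c000-000000000000",
    "resourceAccess": [
      {
        "id": "e1fe6dd8-ba31-4d61-89e7-88639da4683d",
        "type": "Scope"
      }
    ]
  }
]
```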
Now that we have an application registration, we can deploy the migration app.
The migration tool will deploy several resources. This includes an Azure App Service Web App that
runs the UI for the tool. Find the Web App in the Azure portal and open the UI in a browser. The
URL will use the name that you provided earlier. So, in my case, it is
https://tips01-ui.azurewebsites.net
3. You can watch the progress of any open migrations by clicking on the List menu and
refreshing your browser
4. When all documents are migrated, click Complete to mark the migration as finished
Conclusion:
a) For simplicity, Azure Data Factory and the Azure Cosmos DB Data Migration Tool are
user-friendly, GUI-based options.