
DIY Exercise 13-1: Create a batch job to process records

Time estimate: 2 hours

Objectives
In this exercise, you create a batch processing system. You will:
• Control batch step processing using batch filters.
• Exchange data between batch steps using variables.
• Trigger batch processing on a schedule.
• Use watermarks to avoid duplicate processing.
• Improve performance by using batch aggregators.
• Improve performance and share data with other Mule applications using VM queues.

Scenario
The Finance department needs to audit certain transactions and requires a Mule application that
consistently retrieves data from a database and writes these transactions as CSV files to a server.

To meet compliance standards, a CSV file can have no more than 50 records and the Mule
application must be deployed to a private server where the Mule application will share the same
Mule domain with other financial compliance Mule applications. You do not, however, need to
create a new Mule domain project yourself; another developer will be responsible for deploying
your project into an existing Mule domain.

Create a project that retrieves new transactions from the database using batch
Create a new Mule application that retrieves data from the flights_transactions table in the
database using the following information:
• Host: mudb.learn.mulesoft.com
• Port: 3306
• User: mule
• Password: mule
• Database: training
• Table: flights_transactions
Schedule the main flow to automatically run every 5 seconds. Retrieve new database records
based on the value of the primary key field transactionID. Use an ObjectStore to save the
maximum transactionID processed for any batch session.
Hint: For test development, limit the query to only retrieve 10 records at a time.
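One possible wiring for the scheduled, watermarked retrieval is sketched below as Mule 4 XML configuration. The flow, config, and object store names (retrieveTransactionsFlow, Database_Config, transactionsObjectStore, and the maxTransactionId key) are illustrative placeholders, not required names:

```xml
<flow name="retrieveTransactionsFlow">
    <scheduler>
        <scheduling-strategy>
            <fixed-frequency frequency="5" timeUnit="SECONDS"/>
        </scheduling-strategy>
    </scheduler>
    <!-- Read the watermark; default to 0 on the first run -->
    <os:retrieve key="maxTransactionId" objectStore="transactionsObjectStore"
                 target="lastId">
        <os:default-value>0</os:default-value>
    </os:retrieve>
    <!-- Only fetch records newer than the watermark; LIMIT 10 for test development -->
    <db:select config-ref="Database_Config">
        <db:sql>SELECT * FROM flights_transactions
                WHERE transactionID &gt; :lastId
                ORDER BY transactionID LIMIT 10</db:sql>
        <db:input-parameters>#[{lastId: vars.lastId}]</db:input-parameters>
    </db:select>
    <!-- Batch Job scope goes here -->
</flow>
```

Because batch steps run asynchronously, one workable approach is to store the new maximum transactionID (for example with os:store and max(payload.transactionID)) before the Batch Job scope consumes the payload.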
Add a flow to mock the financial compliance application logic
Add a new flow to the Mule application with a VM Listener on the VM queue named validate.
Add a Transform Message component to this flow and add DataWeave code to simulate the
transactionID validation logic. It expects one record and returns the value true or false, where
true indicates that the record needs to be audited. In this simple mock flow, return a Boolean
value true if the transactionID is divisible by 4, and false otherwise:
%dw 2.0
output application/java
---
if ((payload.transactionID as Number) mod 4 == 0)
    true
else
    false
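The mock flow itself can be wired roughly as follows (the flow and config names are illustrative); the DataWeave here condenses the if/else above into the equivalent Boolean expression:

```xml
<flow name="validateMockFlow">
    <!-- Consumes one record from the validate queue and replies with a Boolean -->
    <vm:listener queueName="validate" config-ref="VM_Config"/>
    <ee:transform>
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/java
---
(payload.transactionID as Number) mod 4 == 0]]></ee:set-payload>
        </ee:message>
    </ee:transform>
</flow>
```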

Send each transaction record to a VM queue for conditional testing


Publish each transaction record to the validate VM queue and wait to consume the Boolean
response. Save the result in a variable to filter the current record in the next batch step.
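Inside the first batch step, a publish-consume operation can both send the record and wait for the reply; the target attribute saves the Boolean into a variable rather than overwriting the payload. The step and variable names here (validateStep, needsAudit) are illustrative:

```xml
<batch:step name="validateStep">
    <!-- Publishes the record to the validate queue, blocks for the Boolean
         reply, and stores the result in vars.needsAudit -->
    <vm:publish-consume queueName="validate" config-ref="VM_Config"
                        target="needsAudit"/>
</batch:step>
```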

Add batch filters to only process transactions that need auditing


Configure target variables and the accept expression in each batch step to keep track of your
records throughout each batch step.
Hint: For test development, you can create a flow that listens on the validate queue path and
arbitrarily return true or false for each record processed.

Write out transactions as CSV files


In a second batch step, configure an accept expression to only process this second batch step if
the previous VM queue response was true. Inside this batch step, transform the database
results to a CSV file and save this CSV file to this Mule application's file system. Use a property
placeholder for the file location so the file location can be modified by Ops staff at deployment
time. Add a batch aggregator so no more than 50 records at a time are written to each CSV file.
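A sketch of this second batch step, assuming the variable name needsAudit from the previous step and a property named csv.output.path (both illustrative):

```xml
<batch:step name="writeCsvStep" acceptExpression="#[vars.needsAudit == true]">
    <batch:aggregator size="50">
        <!-- The aggregated records arrive as an array; transform them to CSV -->
        <ee:transform>
            <ee:message>
                <ee:set-payload><![CDATA[%dw 2.0
output application/csv
---
payload]]></ee:set-payload>
            </ee:message>
        </ee:transform>
        <!-- File location is externalized so Ops staff can change it at deployment -->
        <file:write path="#['${csv.output.path}' ++ '/audit-' ++ uuid() ++ '.csv']"/>
    </batch:aggregator>
</batch:step>
```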

Log the batch processing summary


In the On Complete phase of the Batch Job, log the batch processing summary.
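In the On Complete phase, the payload is a BatchJobResult object, so a Logger along these lines reports the summary counters:

```xml
<batch:on-complete>
    <!-- payload here is a BatchJobResult with counters for the whole job -->
    <logger level="INFO"
            message="#['Total: $(payload.totalRecords), successful: $(payload.successfulRecords), failed: $(payload.failedRecords)']"/>
</batch:on-complete>
```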

Test your solution


Debug your solution. Step through several polling cycles and verify that some records return
true from the VM queue and are processed by the second batch step, while other records
return false and skip the second batch step. Also verify that the output CSV files contain at
most 50 records each.

Verify your solution


Import the /files/module13/audit-mod13-file-write-solution.jar deployable archive file (in
the MUFundamentals4.x DIY Files zip, which you can download from the Course Resources) and
compare it with your solution.

Going further: Handle errors


Add logic to the first batch step to throw errors.
• Call a flow at the beginning of the first batch step and add event processors to this flow
that would sometimes throw an error, but not for every record.
• Experiment with what happens when you handle the error in the flow versus if you don't
handle the error in the flow.
• Add a third batch step with an ONLY_FAILURES accept policy to report failed messages
to a dead letter VM queue for failed batch steps.
• In the Batch Job scope's general configuration, change the max failed records to 1 and
observe the behavior of subsequent batch records after a record throws an error.
Change this value to 2 and observe any changes in behavior.
• In the Batch Job scope's general configurations, change the scheduling strategies
options to ROUND_ROBIN and observe the behavior, and compare it with the default
ORDERED_SEQUENTIAL option's behavior.
• Look at the logs for the On Complete phase to see how many times the same error is
reported for each record of the batch job.
• In the first batch step's referenced flow, add a Choice router and a sleep timer that
sleeps for a minute. Add logic to the Choice router to only call the sleep timer if the
transactionID ends with 6. Observe if later records in the batch job can skip ahead while
some records are paused by the sleep timer.
Note: For info about handling batch errors, see: https://blogs.mulesoft.com/dev/mule-dev/handle-errors-batch-job/
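For the dead letter step, the accept policy is configured on the batch step itself; the step and queue names here (reportFailuresStep, deadLetter) are placeholders:

```xml
<batch:step name="reportFailuresStep" acceptPolicy="ONLY_FAILURES">
    <!-- Only records that failed an earlier batch step reach this step -->
    <vm:publish queueName="deadLetter" config-ref="VM_Config"/>
</batch:step>
```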

Going further: Refactor the validation logic to another Mule application in a new shared Mule domain
Create a Mule domain project named finance, then change the Mule application's Mule domain
from default to finance. Move the VM connector global element to the finance Mule domain
project.
Move the validation flow to a new Mule application and configure this Mule application to also
use the finance Mule domain.
Deploy the Mule domain and both Mule applications to a customer-hosted Mule runtime. Verify
batch jobs are still processed correctly.
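Moving the VM connector global element into the domain means the shared config lives in the domain project's mule-domain-config.xml, roughly like this (element names shown, but namespaces and schema locations abbreviated; treat this as a sketch, not a complete file):

```xml
<!-- finance domain project: src/main/mule/mule-domain-config.xml (sketch) -->
<domain:mule-domain
        xmlns:domain="http://www.mulesoft.org/schema/mule/ee/domain"
        xmlns:vm="http://www.mulesoft.org/schema/mule/vm">
    <!-- Both Mule applications reference this shared VM config by name -->
    <vm:config name="VM_Config">
        <vm:queues>
            <vm:queue queueName="validate"/>
        </vm:queues>
    </vm:config>
</domain:mule-domain>
```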

Going further: Deploy both Mule applications to CloudHub


Instead of using customer-hosted Mule runtimes, configure both Mule applications to use an
external online JMS server that is accessible over the public internet. Deploy both Mule
applications to CloudHub and verify batch jobs are still processed correctly.
