
Load XML files using a DataStage Parallel job

updated Mar 28, 2008 11:27 pm | 13,361 views

The intention of this tutorial is to give novice developers a quick start with
loading XML data using a DataStage parallel job.
Step 1:
Create a simple XML file named test.xml
<xml> <customer>Mike</customer> <customer>Anna</customer> </xml>
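If you would rather generate the file with a script and confirm it is well-formed before pointing the job at it, a minimal sketch in Python follows. It runs outside DataStage; the helper names write_sample and is_well_formed are made up for this illustration, and the path is whatever you pass in.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Same content as the test.xml shown in Step 1.
SAMPLE_XML = "<xml> <customer>Mike</customer> <customer>Anna</customer> </xml>"


def write_sample(path):
    """Write the sample document to `path` and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE_XML)
    return path


def is_well_formed(path):
    """Return True if the file parses as XML, False otherwise."""
    try:
        ET.parse(path)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Hypothetical location; use the directory your job reads from.
    target = write_sample(os.path.join(tempfile.gettempdir(), "test.xml"))
    print(target, is_well_formed(target))
```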
Step 2:
Create a new DataStage parallel job with 3 stages linked together: a sequential
file stage, an XML input stage (located under the Real Time category), and a
peek stage.
Step 3:
The first trick is to load the entire XML file into a single column of a single
row. You do this by creating a column in the sequential file stage of type
LongVarChar[Max=9999]. In this example the max size is arbitrary; it only needs
to be large enough to hold the whole file. Set the input file to test.xml.
Next, remove all properties in the [Format] tab and add these two:
In the Record level:
Record type=implicit
In the Field defaults:
Step 4:
Now that the XML is in a single column, set the XML input stage properties. In
the [Transformation settings] tab under the [Stage] tab, check the
[Repetition element required] box. In the [Input] tab, select the column that
you defined in step 3 and check the [XML document] box. In the [Output] tab,
define a column named [customer] of type varchar[max=255]. Set it as the key.
In the description box enter the XML path. In this case /xml/customer/text()
Tip: To reference XML attributes, use @. For example: /xml/customer/@id
would equal 1 when using this XML: <xml><customer id="1">Mike</customer></xml>
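To see what the two paths in this step select, here is a rough Python equivalent that runs outside DataStage. It is only an illustration of the idea (whole document held in one string, one output value per repeating customer element), not how the XML input stage is implemented. The function names are made up for this sketch, and because Python's ElementTree does not support text() or @id as path steps, the sketch reads the element text and the attribute directly.

```python
import xml.etree.ElementTree as ET


def extract_customers(document):
    """Rough analogue of /xml/customer/text(): one value per <customer>."""
    root = ET.fromstring(document)  # root is the <xml> element
    return [customer.text for customer in root.findall("customer")]


def extract_customer_ids(document):
    """Rough analogue of /xml/customer/@id: the id attribute of each <customer>."""
    root = ET.fromstring(document)
    return [customer.get("id") for customer in root.findall("customer")]


if __name__ == "__main__":
    # The whole document sits in a single string, like the single
    # LongVarChar column from step 3.
    doc = "<xml> <customer>Mike</customer> <customer>Anna</customer> </xml>"
    print(extract_customers(doc))

    doc_with_id = '<xml><customer id="1">Mike</customer></xml>'
    print(extract_customer_ids(doc_with_id))
```

Each list entry corresponds to one output row of the stage, which is why the job in this tutorial ends up with one row per customer.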
Step 5:
Compile and run. Peek will produce log records that list the customers from the
XML file.
That's it. For more details on processing XML read the XML Pack documentation
that comes with DataStage. Here is a more extensive XML tutorial for server
jobs from IBM:
Transform and integrate data using WebSphere DataStage XML and Web services pack
The biggest difference is that in parallel jobs you do not have a folder stage,
so you need to use the sequential file stage with the settings mentioned above.
