
Talend

Tutorial Task Aid >


Sorting a File

This tutorial uses Talend Open Studio for Data Integration version 6.

1. Create a new Job


a. Ensure that the Integration perspective is selected.
b. Create a new Job and name it SortCSVFile.

The Job Designer opens an empty Job.


2. Add and configure a tFileInputDelimited component


a. Add a tFileInputDelimited component to the Job.
b. To configure the tFileInputDelimited_1 component, in the Component view of the
component, click [...] next to the File Name field, select the CSV file on the local disk, and
click Open.
c. To describe the structure of the file, in the Component view of tFileInputDelimited_1,
click [...] next to the Edit schema field. The Schema wizard opens.
d. Click the [+] icon to add the first column and enter the details for the column.
e. Repeat step d for each column in the CSV file and close the Schema wizard.
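As a rough illustration of what the schema in step 2 describes, the sketch below reads delimited lines into typed rows in plain Java. It is not the code that Talend generates. The two columns (title and releaseYear), the ";" separator, and the single header row are assumptions made for the example, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DelimitedSchemaSketch {

    /** One row of the file. The two columns are assumptions for illustration. */
    static final class MovieRow {
        final String title;
        final int releaseYear;

        MovieRow(String title, int releaseYear) {
            this.title = title;
            this.releaseYear = releaseYear;
        }
    }

    /** Splits one line on the separator and converts each field to its schema type. */
    static MovieRow parseLine(String line, String separator) {
        // Pattern.quote treats the separator as literal text, not as a regular expression.
        String[] fields = line.split(Pattern.quote(separator), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException("Expected 2 columns, got " + fields.length);
        }
        return new MovieRow(fields[0].trim(), Integer.parseInt(fields[1].trim()));
    }

    /** Parses all lines, skipping the first headerRows lines. */
    static List<MovieRow> parseAll(List<String> lines, String separator, int headerRows) {
        List<MovieRow> rows = new ArrayList<>();
        for (int i = headerRows; i < lines.size(); i++) {
            rows.add(parseLine(lines.get(i), separator));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical file content: one header line followed by two data lines.
        List<String> lines = Arrays.asList("title;releaseYear", "Movie A;2001", "Movie B;1999");
        for (MovieRow row : parseAll(lines, ";", 1)) {
            System.out.println(row.title + " | " + row.releaseYear);
        }
    }
}
```

Each column defined in the Schema wizard plays the role of one typed field here. A value that does not match its declared type (for example, text in releaseYear) fails during conversion.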

3. Sort the data in your Job


a. Add a tSortRow component to the Job and link the two components.

Note: The schema of the tFileInputDelimited_1 component is inherited by the linked tSortRow
component, so you do not need to configure it.
b. To view the schema that has been inherited, in the Component view of the tSortRow_1
component, click [...] next to Edit schema.
c. To create a new sorting rule based on the movie release year, click [+] and in the Schema
column, click releaseYear and specify the sort order by clicking desc.
d. To view the result of the sort rule, in the Job Designer, add a tLogRow component and link
the tSortRow_1 and the tLogRow_1 components.
e. To run the Job, in the Run view for the Job SortCSVFile, click Run.

Talend takes the complexity out of integration


Based on open source Scalable Future-proof Predictable cost
Visit www.talend.com Follow us on Twitter @Talend

The Run view now displays the movies from the source file sorted by year of release, most
recent first. The source file itself is not changed.

4. Add a second sort rule


a. To add a second sorting rule, in the Component view of the tSortRow_1 component, click
[+] and, in the Schema column, choose title. Then, in the sort column, choose alpha.
b. To run the Job, in the Run view, click Run.

The movies are now sorted by year of release and, within each year, alphabetically by
movie title.
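The effect of the two sorting rules can be sketched in plain Java as a two-key comparison. This is only an illustration of the ordering, not the code that tSortRow generates. The Movie class, the sample titles, and the method names are hypothetical, and the title comparison uses Java's default case-sensitive String ordering as a stand-in for the alpha sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TwoKeySortSketch {

    /** Minimal row type with the two columns used by the sorting rules. */
    static final class Movie {
        final String title;
        final int releaseYear;

        Movie(String title, int releaseYear) {
            this.title = title;
            this.releaseYear = releaseYear;
        }
    }

    /** Returns a sorted copy: releaseYear descending, then title ascending. */
    static List<Movie> sortMovies(List<Movie> movies) {
        // Rule 1: numeric sort on releaseYear, descending.
        Comparator<Movie> byYearDesc =
                Comparator.comparingInt((Movie m) -> m.releaseYear).reversed();
        // Rule 2: alphabetical sort on title, applied only when the years are equal.
        Comparator<Movie> byTitleAsc = Comparator.comparing((Movie m) -> m.title);

        List<Movie> sorted = new ArrayList<>(movies);
        sorted.sort(byYearDesc.thenComparing(byTitleAsc));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows.
        List<Movie> movies = Arrays.asList(
                new Movie("Movie B", 2001),
                new Movie("Movie C", 1999),
                new Movie("Movie A", 2001));
        for (Movie m : sortMovies(movies)) {
            System.out.println(m.releaseYear + " | " + m.title);
        }
    }
}
```

The order of the rules matters: the second rule only breaks ties left by the first, which is why the titles are alphabetical within each year and not across the whole list.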

5. Store the result of the Job in a file


a. Add a tFileOutputExcel component to the Job and link tLogRow_1 to it.
b. To configure the output component, in the Component view of the component, specify
the path and name for the output file.
c. To include the header row in the output file, select the Include Header checkbox.
d. To run the Job, in the Run view, click Run.
e. To check the moviesSorted.xls file, navigate to the folder in which the file was created and
open the file. The file with the sorted data will be displayed.
f. To prevent the sorted data from being displayed in the Run view, right-click tLogRow_1
and click Deactivate tLogRow.
g. To run the Job, in the Run view, click Run.

The Job is run again. However, no data is displayed in the Run view.

