DataStage Essentials

Hashed Files Server Edition

Module Objectives
Upon module completion, students will be able to:
- Define what a hashed file is
- List different types of hashed files
- List various uses for hashed files
- Create hashed files
- Source hashed files
- Use caching attributes to accelerate reads and writes
- Import metadata from hashed files
- Delete hashed files

What is a Hashed File?
- A DataStage file written to the file system
- Most use a hashing algorithm based on key column values
- Files can be cataloged in the project’s VOC file for easy retrieval

Types of Hashed Files
- 21 different types
- Types 1 and 19 do not use a hashing algorithm
- Types 2-18 are static hashed files and use a hashing algorithm
- Type 25 is static and uses a B-tree algorithm
- Type 30 is dynamic and uses a hashing algorithm

Uses of Hashed Files
- Good for locally storing tables of a remote database that will be read from frequently
- Good as an intermediate file location in sequences of jobs
- Main use is as a reference lookup table

Hashing Algorithms (Static File Types)
File type, by character type of the key column and by the location where most variation occurs in the key column:

Character type        Right  Middle  Left  Any
Wholly numeric          2      6      10    14
Numeric & separators    3      7      11    15
ASCII                   4      8      12    16
Any                     5      9      13    17

Inserting Records into Hashed Files
[Diagram: Hashing Algorithm feeding Group 1, Group 2, Group 3, Group 4, and Group 5]
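The idea in the diagram can be sketched in a few lines of Python. This is only an illustration: the checksum used here (CRC-32) stands in for whatever algorithm the chosen file type applies, and the names `group_for_key` and `insert_record` are hypothetical.

```python
import zlib


def group_for_key(key: str, modulus: int) -> int:
    """Map a key value to a 1-based group number.

    CRC-32 is an assumption made for this sketch; it is not the
    algorithm DataStage uses for any particular file type.
    """
    checksum = zlib.crc32(key.encode("utf-8"))
    return checksum % modulus + 1


def insert_record(groups: list, key: str, record: dict) -> int:
    """Store the record in the group chosen by the hash; return the group number."""
    group_no = group_for_key(key, len(groups))
    groups[group_no - 1][key] = record
    return group_no


# Five empty groups, as in the diagram.
groups = [dict() for _ in range(5)]
for key in ["1001", "1002", "1003", "1004"]:
    insert_record(groups, key, {"id": key})
```

The same key always lands in the same group, which is why a later lookup only has to search one group rather than the whole file.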

Overflow Groups
- When there is not enough space remaining in a group, the group overflows

[Diagram: file layout. A header is followed by the groups at these group addresses.]

Group               Group address
Group 1              2048
Group 2              4096
Group 3              6144
Group 4              8192
Group 5             10240
Overflow (Group 2)  12288
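A rough sketch of the overflow behavior, assuming 2048-byte groups (the step between the group addresses in the diagram) and a header occupying the first 2048 bytes. The class `StaticFile` and its fields are hypothetical, and the sketch ignores record headers and other on-disk detail.

```python
GROUP_SIZE = 2048  # bytes per group; assumed from the 2048-byte address steps


def group_address(group_no: int) -> int:
    """Byte offset of a 1-based group, assuming the header fills offset 0..2047."""
    return group_no * GROUP_SIZE


class StaticFile:
    """Tracks bytes used per group and allocates overflow buffers when needed."""

    def __init__(self, modulus: int):
        self.used = {g: 0 for g in range(1, modulus + 1)}
        self.overflow = {g: [] for g in range(1, modulus + 1)}
        # Overflow buffers are appended after the last primary group.
        self.next_free = group_address(modulus + 1)

    def write(self, group_no: int, record_size: int) -> None:
        """Add a record (at most GROUP_SIZE bytes) to a group."""
        capacity = GROUP_SIZE * (1 + len(self.overflow[group_no]))
        if self.used[group_no] + record_size > capacity:
            # Not enough space remaining: the group overflows.
            self.overflow[group_no].append(self.next_free)
            self.next_free += GROUP_SIZE
        self.used[group_no] += record_size


hashed_file = StaticFile(modulus=5)
for _ in range(3):
    hashed_file.write(group_no=2, record_size=800)
```

With three 800-byte records in Group 2, the third no longer fits, so an overflow buffer is allocated at address 12288, as in the diagram. Every read of Group 2 now touches two buffers, which is why heavily overflowed files read more slowly.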

Creating Hashed Files
- Insert the name of the hashed file (can use parameters)
- Let the job create the hashed file
- Can optionally use the CREATE.FILE command from a DataStage command shell or program

Options for Creating Hashed Files
- How many groups to create initially
- What percentage of file capacity to create a new group
- Similar to dropping a table
- Which type of file to create
- How large the groups will be
- What percentage of file capacity to remove a group
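The two percentage options can be read as thresholds on how full the file is. The sketch below is one way to picture that decision for a dynamic file; the function name, parameter names, and the threshold values of 80 and 50 are assumptions for illustration, not values taken from this module.

```python
def resize_action(
    bytes_used: int,
    groups: int,
    group_size: int = 2048,   # assumed group size in bytes
    split_load: int = 80,     # assumed percentage at which a group is added
    merge_load: int = 50,     # assumed percentage at which a group is removed
    minimum_groups: int = 1,  # never shrink below the initial group count
) -> str:
    """Return 'split', 'merge', or 'none' for the current file load."""
    load = 100 * bytes_used / (groups * group_size)
    if load > split_load:
        return "split"
    if load < merge_load and groups > minimum_groups:
        return "merge"
    return "none"


actions = [
    resize_action(9000, groups=5),  # about 88% full
    resize_action(6000, groups=5),  # about 59% full
    resize_action(4000, groups=5),  # about 39% full
]
```

Keeping a gap between the two thresholds stops the file from adding and removing groups repeatedly when the load hovers around a single value.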

Hashed File Locations
- Create and/or write records to a hashed file in a specific project (doesn’t have to be yours!)
- Create and/or write records to a hashed file in a specific directory (can present issues)

Write Caching
- Enabled from the Input page of the hashed file stage
- Allows records to be written to a memory buffer initially and flushed to disk once
- I/O is extremely expensive; this minimizes writes to disk
- Can present issues when reading from a hashed file at the same time
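A minimal sketch of the write-caching idea, with a plain dictionary standing in for the file on disk. The class `WriteCache` and its counters are hypothetical and say nothing about how the stage implements its buffer.

```python
class WriteCache:
    """Buffers writes in memory and pushes them to the backing store in one flush."""

    def __init__(self, backing_store: dict):
        self.backing_store = backing_store  # stands in for the file on disk
        self.buffer = {}
        self.disk_writes = 0

    def write(self, key: str, record: dict) -> None:
        """Store the record in memory only; nothing reaches disk yet."""
        self.buffer[key] = record

    def flush(self) -> None:
        """Write every buffered record to the backing store in one operation."""
        self.backing_store.update(self.buffer)
        self.disk_writes += 1
        self.buffer.clear()


disk = {}
cache = WriteCache(disk)
for key in ["1001", "1002", "1003"]:
    cache.write(key, {"id": key})

visible_before_flush = "1001" in disk  # a concurrent reader would miss this row
cache.flush()
visible_after_flush = "1001" in disk
```

The `visible_before_flush` check shows the issue named above: a job reading the file while the writer is still buffering does not see records that have not been flushed.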

Importing Metadata from a Hashed File
- Choose the project from the drop-down list
- Available hashed files within the project appear; select one or more

Sourcing a Hashed File
- Insert the name of the hashed file or use the drop-down list (can use parameters)
- Enable or disable read caching (four methods)

Read Caching
- Enabled from the Output page of the hashed file stage
- Four methods are available; be sure to choose the correct method
- Allows records to be read into a memory buffer
- I/O is extremely expensive; this minimizes reads from disk
- Can present issues when writing to a hashed file at the same time
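The effect of read caching on a reference lookup can be sketched as follows. This shows a single preload-everything approach and does not correspond to any particular one of the four methods; the class `ReadCache` is hypothetical, and a dictionary again stands in for the file on disk.

```python
class ReadCache:
    """Loads the whole reference table into memory once, then serves lookups from it."""

    def __init__(self, backing_store: dict):
        self.backing_store = backing_store  # stands in for the file on disk
        self.cache = dict(backing_store)    # one bulk read at start-up
        self.disk_reads = 1

    def lookup(self, key: str):
        """Return the cached record, or None if the key was not present at load time."""
        return self.cache.get(key)


disk = {"1001": {"name": "Widget"}, "1002": {"name": "Gadget"}}
reference = ReadCache(disk)

first = reference.lookup("1001")

# Another job writes to the file after the cache was loaded.
disk["1003"] = {"name": "Sprocket"}
stale = reference.lookup("1003")
```

Every lookup is served from memory after the single initial read, but `stale` comes back as `None`: rows written after the cache was loaded are invisible to the reader, which is the issue named above.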

Deleting Hashed Files
- Not advisable to manually delete operating system directories and files
- Issue the DELETE.FILE command from DataStage Administrator or a DataStage command shell
- Create a shell script to run the delete_file.exe program and supply the hashed file name
  - The delete_file.exe program is installed when the server is installed and is located in the server’s engine directory in the bin subdirectory