• Configure datasets and components along the way; let the yellow “To Do” cues guide you.
• Generally, you should configure your input and output metadata (record formats) before adding flows.
Confidential & Proprietary
Adding an Input Dataset
1. Click on Component Button
2. Open Datasets Category
3. Choose InputFile
Configuring the Input Dataset
1. Browse to find simple.dat
2. Browse to find simple.dml
3. Change label to something descriptive
Adding a Filter by Expression Component
1. Open Transform Category
2. Choose Filter by Expression
Adding an Output Dataset
Choose OutputFile
Configuring the Output Dataset
1. Browse to see directory
2. Enter name of output file
Adding Flows
1. Click on source (hold)
2. Drag to destination (release)
Configuring Filter by Expression
Enter expression
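As a rough mental model (this is Python, not Ab Initio code), the component tests the expression against each record and keeps the records for which it is true. The sketch below assumes hypothetical field names (`id`, `amount`) and a hypothetical expression; the real fields of simple.dat are whatever simple.dml defines.

```python
# Conceptual model only: the field names and the expression are hypothetical.
def filter_by_expression(records, expression):
    """Split records into (selected, deselected) using a per-record test."""
    selected, deselected = [], []
    for record in records:
        (selected if expression(record) else deselected).append(record)
    return selected, deselected


records = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 150},
    {"id": 3, "amount": 300},
]

# The lambda stands in for the expression you would enter in the component.
selected, deselected = filter_by_expression(records, lambda r: r["amount"] > 100)
```

Here `selected` holds the records that pass the test and `deselected` holds the rest.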
Flows Can Propagate Configuration
• One way to “get rid of yellow” is to configure datasets or components.
• Hooking up flows allows the GDE to automatically propagate many kinds of information, like record format metadata; sometimes, connecting things is all you need to do to “get rid of yellow.”
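The idea of propagation can be pictured with a toy model. The Python sketch below is not how the GDE is implemented; the `Node` class and the `connect` and `propagate` functions are hypothetical names, and a record format is represented only by its file name. It assumes that a node without a format of its own takes the format of the node feeding it.

```python
# Toy model of record-format propagation; all names here are hypothetical.
class Node:
    def __init__(self, name, record_format=None):
        self.name = name
        self.record_format = record_format  # None means "still yellow"
        self.downstream = []


def propagate(node):
    """Copy a node's format to any unconfigured downstream neighbors."""
    for neighbor in node.downstream:
        if node.record_format is not None and neighbor.record_format is None:
            neighbor.record_format = node.record_format
            propagate(neighbor)


def connect(source, destination):
    """Add a flow from source to destination, then propagate along it."""
    source.downstream.append(destination)
    propagate(source)


input_file = Node("InputFile", record_format="simple.dml")
filter_component = Node("Filter by Expression")
output_file = Node("OutputFile")

connect(filter_component, output_file)  # nothing to propagate yet
connect(input_file, filter_component)   # the format now reaches the output too
```

Only the input dataset was configured explicitly; connecting the flows was enough to give the other two nodes a format.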
Tip: Let Propagation Do the Work!
• Define record formats for input datasets.
• Define record formats for output datasets only when they differ from input datasets; let propagation do as much as possible.
• If record formats change, this minimizes the impact on the graph.
• Sometimes you will need to set record formats on components. In such cases, you should usually set the format on the output port.
Tip: Look Before Deleting Components!
• Before deleting a component in a graph, check whether the component defines record formats for any of its ports. If you delete a component with record format definitions, you may lose the definitions.
• To safely delete such a component: for each port with a record format definition, go to the other end of the flow for that port (which will be some other component or dataset) and uncheck the ‘propagate from neighbor’ box for the associated port.
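One way to picture why the order of these steps matters is the toy Python model below. It is a sketch under a stated assumption, not a description of GDE internals: it assumes that unchecking the box makes the neighboring port keep its own copy of the format. The `Port` class and its methods are hypothetical names.

```python
# Toy model; class and attribute names are hypothetical, not GDE APIs.
class Port:
    def __init__(self, record_format=None, propagate_from=None):
        self.own_format = record_format       # format defined on this port
        self.propagate_from = propagate_from  # neighbor port while box is checked

    def resolve(self):
        """Return this port's effective record format, or None if it is lost."""
        if self.own_format is not None:
            return self.own_format
        if self.propagate_from is not None:
            return self.propagate_from.resolve()
        return None

    def uncheck_propagate(self):
        """Keep a local copy of the format, then stop following the neighbor."""
        self.own_format = self.resolve()
        self.propagate_from = None


def delete_component(port, neighbors):
    """Remove a component's port; neighbors can no longer read its format."""
    for neighbor in neighbors:
        if neighbor.propagate_from is port:
            neighbor.propagate_from = None


# Unsafe order: delete first, and the neighbor's format disappears.
defining_port = Port(record_format="simple.dml")
neighbor = Port(propagate_from=defining_port)
delete_component(defining_port, [neighbor])
lost = neighbor.resolve()

# Safe order: uncheck propagation on the neighbor first, then delete.
defining_port = Port(record_format="simple.dml")
safe_neighbor = Port(propagate_from=defining_port)
safe_neighbor.uncheck_propagate()
delete_component(defining_port, [safe_neighbor])
kept = safe_neighbor.resolve()
```

In the unsafe order `lost` ends up empty, while in the safe order `kept` still holds the format.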