
DataFlux Expression Language Reference Guide

Leader in Data Quality and Data Integration

www.dataflux.com 877–846–FLUX

International +44 (0) 1753 272 020


DataFlux Expression Language Reference Guide Version 8.1.1

Updated: November 13, 2008


DataFlux - Contact and Legal Information

Contact DataFlux

Corporate Headquarters
DataFlux Corporation
940 NW Cary Parkway, Suite 201
Cary, NC 27513-2792
Toll Free Phone: 1-877-846-FLUX (3589)
Toll Free Fax: 1-877-769-FLUX (3589)
Local Telephone: 1-919-447-3000
Local Fax: 1-919-447-3100
Web: www.dataflux.com

European Headquarters
DataFlux UK Limited
59-60 Thames Street
WINDSOR
Berkshire
SL4 1TX
United Kingdom
UK (EMEA): +44(0) 1753 272 020

Contact Technical Support
Phone: 1-919-531-9000
Email: techsupport@dataflux.com
Web: www.dataflux.com/techsupport

Legal Information

Copyright © 1997 - 2008 DataFlux Corporation LLC, Cary, NC, USA. All Rights Reserved.

DataFlux and all other DataFlux Corporation LLC product or service names are registered trademarks or trademarks of, or licensed to, DataFlux Corporation LLC in the USA and other countries. ® indicates USA registration.

Apache Portable Runtime License Disclosure

Copyright © 2008 DataFlux Corporation LLC, Cary, NC USA.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Apache/Xerces Copyright Disclosure

The Apache Software License, Version 1.1

Copyright (c) 1999-2003 The Apache Software Foundation. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. The end-user documentation included with the redistribution, if any, must include the following acknowledgment: "This product includes software developed by the Apache Software Foundation (http://www.apache.org/)." Alternately, this acknowledgment may appear in the software itself, if and wherever such third-party acknowledgments normally appear.
4. The names "Xerces" and "Apache Software Foundation" must not be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact apache@apache.org.
5. Products derived from this software may not be called "Apache", nor may "Apache" appear in their name, without prior written permission of the Apache Software Foundation.

THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This software consists of voluntary contributions made by many individuals on behalf of the Apache Software Foundation and was originally based on software copyright (c) 1999, International Business Machines, Inc., http://www.ibm.com. For more information on the Apache Software Foundation, please see http://www.apache.org/.

DataDirect Copyright Disclosure

Portions of this software are copyrighted by DataDirect Technologies Corp., 1991 - 2008.

Expat Copyright Disclosure

Part of the software embedded in this product is Expat software.

Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

gSOAP Copyright Disclosure

Part of the software embedded in this product is gSOAP software.

Portions created by gSOAP are Copyright (C) 2001-2004 Robert A. van Engelen, Genivia inc. All Rights Reserved.

THE SOFTWARE IN THIS PRODUCT WAS IN PART PROVIDED BY GENIVIA INC AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Microsoft Copyright Disclosure

Microsoft®, Windows, NT, SQL Server, and Access, are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

Oracle Copyright Disclosure

Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle Corporation and/or its affiliates.

PCRE Copyright Disclosure

A modified version of the open source software PCRE library package, written by Philip Hazel and copyrighted by the University of Cambridge, England, has been used by DataFlux for regular expression support. More information on this library can be found at: ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/

Copyright (c) 1997-2005 University of Cambridge. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

• Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
• Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

• Neither the name of the University of Cambridge nor the name of Google Inc. nor the names of their contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Red Hat Copyright Disclosure

Red Hat® Enterprise Linux®, and Red Hat Fedora™ are registered trademarks of Red Hat, Inc. in the United States and other countries.

SQLite Copyright Disclosure

The original author of SQLite has dedicated the code to the public domain. Anyone is free to copy, modify, publish, use, compile, sell, or distribute the original SQLite code, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

Sun Microsystems Copyright Disclosure

Java™ is a trademark of Sun Microsystems, Inc. in the U.S. or other countries.

Tele Atlas North American Copyright Disclosure

Portions © 2006 Tele Atlas North American, Inc. All rights reserved. This material is proprietary and the subject of copyright protection and other intellectual property rights owned by or licensed to Tele Atlas North America, Inc. The use of this material is subject to the terms of a license agreement. You will be held liable for any unauthorized copying or disclosure of this material.

USPS Copyright Disclosure

National ZIP®, ZIP+4®, Delivery Point Barcode Information, DPV, RDI. © United States Postal Service 2005. ZIP Code® and ZIP+4® are registered trademarks of the U.S. Postal Service.

DataFlux holds a non-exclusive license from the United States Postal Service to publish and sell USPS CASS, DPV, and RDI information. This information is confidential and proprietary to the United States Postal Service. The price of these products is neither established, controlled, or approved by the United States Postal Service.

Table of Contents

DataFlux - Contact and Legal Information
Table of Contents
The Expression Engine Language
Overview of the Expression Engine Language
Declaration of Symbols
Statements
Expressions
Functions
Objects
dfPower Studio Expressions Node
Getting Started
Testing and Evaluating
Selecting Output Fields
Sub-Setting
Initializing and Declaring Variables
Saving Expressions
Counting Records
Debugging and Printing Error Messages
Data Types
Integers and Real Type
Strings
Dates and Times
Boolean
Data Structures and File Manipulation
Arrays
Groups
Files
List of File Functions
Binary Data
COBOL Support
Databases
Regular Expressions
Encoding and Decoding
Blue Fusion Functions
Macro Variables
Evaluating Incoming Data
Appendix A: ASCII Values
ASCII Printable Characters
ASCII Control Characters
Appendix B: List of Functions
Glossary

The Expression Engine Language

DataFlux® dfPower® Studio is a powerful suite of data cleansing and data integration software applications. You can use dfPower Architect's Expression node to run a Microsoft® Visual Basic®-like language to process your data sets in ways that are not built into dfPower Studio. The Expression Engine Language (EEL) provides many statements, functions, and variables for manipulating data.

Caution: It is recommended that you have some programming experience before using EEL.

Note: Some of the functions described in this reference guide were introduced in dfPower Studio version 8.0 and so are only available in versions 8.0 and later.

This manual will guide you through solutions to address some common EEL tasks. Most examples use the Expression node in dfPower Architect. All of the examples illustrated here also apply to other nodes where EEL is used in dfPower Studio. The results from Expression are determined by the code in the Expression Properties dialog.

You can choose to use variables for data input paths and filenames in the dfPower Architect Expression node. Variables are set off by %%, for example %%FILENAME%%. You can declare macro variables by entering them in the Macros folder of dfPower Navigator, by editing the architect.cfg file directly, or by specifying a file location when you launch a job from the command line. When you add macros using Navigator, dfPower edits the architect.cfg file for you. If you choose to edit the architect.cfg file directly, you can add multiple comments.

If you use archbatch.exe from the command line, the macro declaration has the following format (for example):

    archbatch.exe myjob.dmc -VAR my_field="address"

Command line declarations override the macro variable values declared in architect.cfg. Refer to the dfPower Studio Online Help - Using dfPower Batch - Command Line Batch for more information on using macro variables.

This chapter covers the following topics:

• Overview of the Expression Engine Language
• Declaration of Symbols
• Statements
• Expressions
• Functions
• Objects

Overview of the Expression Engine Language

Operations in the EEL are processed in symbols. Symbols are similar to variables; they are either fields passed from the node above or are variables declared in the code. EEL code consists of declarations, statements, and labels.

• Declarations: Declarations establish the existence of variables in memory. Declared variables are available only after their declaration, so it is better to make all declarations at the beginning of a code segment. Declarations must be located in the code outside of programmatic constructs, so declaring a variable in a for loop is illegal.
• Statements: Statements are either assignments (for example: x=y) or keywords (for example: goto) followed by parameters. Statements can be located anywhere in a code segment.
• Labels: Labels are named locations in a code segment and can be located anywhere in the code segment.

Pieces of Expression code do not need to be separated by anything, but it is best to use white space and new-line characters (or semi-colons, to make the code compatible with other systems such as DBMS/Copy) for readability.

Code may include comments. Comments are text within a code segment that are not executed. Comments can be either C-style (starts with /* and ends with */) or C++ style (starts with // and continues to the end of a line).

Assume there are two symbols (output fields from the previous step) named "x" and "y." Following is an example of Expression code:

    // Declaration of integer z
    integer z
    // Assignment statement
    z=x+y

This example creates another symbol/field, "z," and sets the value of z to x + y, making z ready for the next step.

A segment of Expression code can also be a straight expression. In the context of the Expression main code area, if a straight expression value is false, then the row is not sent to output. For example, assume the same fields from the previous example, "x" and "y." Consider the following straight expression in the Expression code area:

    x<=y

EEL in the Expression code area executes on each record of your data set. Only records where the value of x is less than or equal to y are output to the next node. If you have more than one function in the main code area, the last function to execute determines the overall expression value. For example, if the last function returns a true value, then the entire expression returns true.

The following example includes several of the concepts discussed above:

    // declarations
    integer x; /*semicolon is safely ignored, and can use C-style comments*/
    real y
    // statements
    x=10 y=12.4 /* more than one statement can be on a line */

Declaration of Symbols

Declarations have the following syntax:

    ["static"]["private"|"public"]["hidden"|"visible"] type[(*size)] ["array"] identifier

where type is:

    "integer"|"string"|"real"|"boolean"|"date"|"pointer"

and identifier is a non-keyword word starting with an alphabetic character or any string delimited by back quotes (`).

Note: Size is applicable to the string type only.

Note: The global symbol type is deprecated but is equivalent to static public.

Additional information about declaring symbols:

• The default symbol type is public (more on private, public, and global symbols below).
• Private symbols are only visible within the code block in which they are declared.
• In dfPower® Architect, all symbols declared in code are available in the output unless they are declared private or hidden.
• Before code is executed, symbols are reset to null, unless they are declared static or have been declared in the pre-processing step, in which case they retain their value from the previous execution.
• Static symbols are public by default. You can also declare a symbol that is static private.
• String symbols may be declared with a size. If you assign a value to a string symbol, it is truncated to this size. The default size (if none is specified in parentheses) is 255.
• The keyword, bytes, qualifies a string size in bytes.
• Symbols may be declared anywhere in code except within programmatic constructs such as for loops. It is good practice to declare symbols at the beginning of the code block.

• The static keyword can be used when declaring symbols. It specifies that the symbol's value is not reset between calls to the expression (this means between rows read in dfPower Architect). This replaces the global keyword. The pre-processing expression defaults all symbols to static public whether they are declared static or not.
• The keywords hidden and visible can be used when declaring symbols. The default is visible if none is specified. Hidden symbols are not output from the expression step in Architect. Note that this differs from public and private. Private variables are not output either, but they are not visible outside the expression block. Hidden variables are visible outside the expression block but are not output.
• To declare a variable with spaces in the name, write your variable name between back quotes (`). For example:

    string `my var`
    `my var`="Hello"

Note: It is the grave accent character (`), also known as the back quote, that is employed, and not the apostrophe (') or quotation marks ("). The grave accent is found above the tab key on standard keyboards.

Here are some sample declarations:

    // a 30-character string available
    // only to the code block
    private string(30) name
    // a 30-byte string
    string(30 bytes) name
    // a 255 character public string
    string address
    // a global real number
    global real number
    // a public date field. Use back
    // quotes if symbols include spaces
    date `birth date`

Public or global symbols declared in one area are available to other areas as follows:

• Symbols declared in Pre-Processing are available to any block.
• Symbols declared in Expression are available to Expression and Post-Processing.
• Symbols declared in Post-Processing are only available to Post-Processing.
• Automatic symbols (symbols from the previous step) are available to any of the three blocks.
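The interaction between static, private, and hidden can be easier to see in one short block. The following is a minimal sketch rather than an example from the product documentation: the symbol names row_count, tag, and scratch are hypothetical, and the sketch assumes that a static symbol holds null until it is first assigned.

```
// hypothetical symbol names, for illustration only
static integer row_count
private string(10) tag
hidden integer scratch

// a static symbol is not reset between rows; it is assumed
// here to be null before the first row is processed
if isnull(row_count) row_count=0
row_count=row_count+1

// tag and scratch can be used inside this block,
// but neither is written to the output
tag="row"
scratch=row_count*2
```

Under these assumptions, only row_count would appear as an output field, because static symbols are public by default while tag is private and scratch is hidden.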

Statements

Statements have the following syntax:

    statement:
      | "goto" label
      | identifier "=" expression
      | "return" expression
      | "if" expression ["then"] statement ["else" statement]
      | "for" identifier ["="] expression ["to"] expression ["step" expression] statement
      | "begin" statement [statement...] "end"
      | ["call"] function
    label: identifier ":"
    expression: described later
    function: identifier "(" parameter [,parameter...] ")"

Statements may optionally be separated by a semicolon or new-line character. To group more than one statement together (for example, in a for loop), use begin/end.

Goto and Label

A goto statement jumps code control to a label statement. A label can occur anywhere in the same code block. For example:

    integer x
    x=0
    // label statement called start
    start:
    x=x+1
    if x < 10 goto start

Assignment

Assigns the value of an expression to a symbol.

• Only read-write symbols may be assigned a value.
• In dfPower Architect, all symbols are read-write.
• A symbol assigned an expression of a different type receives the converted (or coerced) value of that expression. For example, if you assign a number to a string-type symbol, the symbol contains a string representation of that number.
• If the expression cannot be converted into the type of symbol, the symbol's value is null. For example, if you assign a non-date string to a date symbol, the symbol's value is null.

Example:

    integer num
    string str
    date dt
    boolean b
    real r
    // assign 1 to num
    num=1
    // assign Jan 28 '03 to the date symbol
    dt=#01/28/03#
    // sets boolean to true
    b=true
    // also sets boolean to true
    b='yes'
    // sets real to 30.12 (converting from string)
    r="30.12"
    // sets string to the string representation of the date
    str=dt
    // sets num to the rounded value of r
    num=r

Arrays

In the expression engine, the user can create arrays of primitive types such as integers, strings, reals, dates, and booleans. It is not possible to create arrays of objects such as dbCursor, dbconnection, regex, file, and so on. The syntax to create arrays of primitive types is as follows:

• string array string_list
• integer array integer_list
• date array date_list
• boolean array boolean_list
• real array real_list

There are three supported functions on arrays: dim, set, and get.

DIM Function

The dim() function is used to size and, if necessary, to resize the array. In essence it creates, resizes, or determines the size of the array. If a parameter is specified, the array is created and resized. The new size is returned. Example:

    // declare the array
    string array string_list
    // an array of size 5
    string_list.dim(5)
    // code to work on the array

    // the array is later sized to 10
    string_list.dim(10)

Note: The words string_array, integer_array, data_array, boolean_array, and date_array are reserved.

SET Function

The set() function sets the value of an entry in the array. It returns the old value of the entry. Example:

    string array string_list
    string_list.dim(5)
    // sets the first string element in the array to hello
    string_list.set(1,"hello")

GET Function

The get() function returns the value of a particular element in the array. For example:

    string first_string
    first_string=string_list.get(1)

The following is an example of using arrays in expressions:

    string array string_list
    integer i
    // set the dimension
    string_list.dim(5)
    i=1
    // set and print each entry in the array
    while(i<=5)
    begin
        string_list.set(i,"Hello")
        print(string_list.get(i))
        i=i+1
    end
    // resize the array to 10
    string_list.dim(10)
    while(i<=10)
    begin
        string_list.set(i,"Goodbye")
        print(string_list.get(i))
        i=i+1
    end

For more on arrays, see Arrays.
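The same three array functions can also be driven by a for loop. The sketch below is illustrative only: the array name squares is hypothetical, and it assumes that calling dim() with no parameter returns the current size, which is how the statement that dim() "determines the size of the array" is read here.

```
// hypothetical array name, for illustration only
integer array squares
integer i
integer n

squares.dim(4)
// assumption: dim() without a parameter returns the current size
n=squares.dim()

// declarations are made above because they are
// not allowed inside the loop itself
for i = 1 to n
begin
    squares.set(i, i*i)
    call print('Entry ' & i & ' is ' & squares.get(i))
end
```

Reading the size back from the array, rather than repeating the literal 4, keeps the loop bounds correct if the array is resized later.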

Using the following example. use begin/end. The then keyword is optional. Example: // only include rows where ID >= 200 if id < 200 return false If/Else If/else statements branch to one or more statements depending on the expression. the record is not included in the output. If a false value is returned from Expression. If you need to execute more than one statement in a branch. If you nest if/else statements. • • In dfPower Architect. • • • • Use this to execute code conditionally. returning a value. as it makes the code more readable. the else statement corresponds to the closest if statement (see the previous example. return type is converted to boolean. change the value of Age to see different outcomes: string(20) person integer x integer y integer age Age=10 if Age < 20 then person="child" else person="adult" if Age==10 begin x=50 y=20 end // nested if/else if Age <= 60 if Age < 40 call print("Under 40") // this else corresponds to the inner if statement else call print("Age 40 to 60") 8 DataFlux Expression Language Reference Guide .) It is better to use begin/end statements if you do this.Return Return exits the code block immediately.

    // this else corresponds to the outer if statement
    else
        call print("Over 60")

For

For loops execute one or more statements multiple times.

• For loops are based on a symbol which is set to some value at the start of the loop, and changes with each iteration of the loop.
• A for loop has a start value, an end value, and an optional step value.
• The start, end, and step value can be any expression.
• The expressions are only evaluated before the loop begins.
• If the step value is not specified, it defaults to "1."
• If you are starting at a high number and ending at a lower number, you must use a negative step.
• If you need to execute more than one statement in the loop, use begin/end.

Example:

    integer i
    for i = 1 to 10 step 2
        call print('Value of i is ' & i)
    integer x
    integer y
    x=10
    y=20
    for i = x to y
        call print('Value of i is ' & i)
    for i = y to x step -1
    begin
        call print('Value of i is ' & i)
        x=i /*does not affect the loop since start/end/step expressions are only evaluated before loop*/
    end

While

While loops allow you to execute the same code multiple times while a condition remains true.

Example:

    integer i
    i=1000
    // keep looping while the value of i is > 10
    while i > 10

        i=i/2
    // you can use begin/end to enclose more than one statement
    while i < 1000
    begin
        i=i*2
        call print('Value of i is ' & i)
    end

Begin/End

Begin/End statements group multiple statements together. If you need to execute multiple statements in a for or while loop or in an if/then/else statement, you must use begin/end. These may be nested as well.

Call

Use this statement to call a function and discard the return value.

Expressions

• An expression can be a number, a function, or a function which uses other functions as parameters.
• An expression always has a resulting value.
• The resulting value can be one of the following: string, integer, real, date, boolean, and pointer.
• The resulting value can also be null (a special type of value).

Following are some different types of expressions:

Operators

The following table lists operators in order of precedence:

    Operators                  Description
    ( )                        Parentheses (can be nested to any depth)
    * / %                      Multiply, divide, modulo
    + -                        Add, subtract
    &                          String concatenation
    != <> == > < >= <=         Equivalence ("!=" and "<>" are the same)
    and                        Boolean and
    or                         Boolean or

String Expressions

A string expression is a string of undeclared length. Strings can be concatenated using & or operated upon with built-in functions.

Example:

    string str
    // simple string
    str="Hello"
    // concatenate two strings
    str="Hello" & " There"

Note: Setting a string variable to a string expression results in a truncated string if the variable was declared with a shorter length than the expression.

For more on string expressions, see Strings.

Integer and Real Expressions

Integer and real expressions result in an integer or real value, respectively.

Example:

    integer x
    real r
    // order of precedence starts with parentheses,
    // then multiplication, then addition
    x=1+(2+3)*4
    // string is converted to value 10
    x=5 + "10"
    r=3.14
    // x will now be 3
    x=r

For more on integer and real expressions, see Integers and Real Type.
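The truncation note and the precedence table interact when strings and numbers are mixed in one expression. The following is a small sketch under the rules stated above, not an example from the product documentation; the symbol names brief, msg, and n are hypothetical.

```
// hypothetical symbol names, for illustration only
string(5) brief
string msg
integer n

// the 11-character result is truncated to the declared size of 5
brief="Hello" & " There"

// + ranks above & in the precedence table, so 1+2 is evaluated
// first and the sum is then concatenated as a string
msg="Total: " & 1+2

// remainder of 17 divided by 5
n=17 % 5
```

When the intended grouping differs from the table, parentheses make it explicit, for example ("Total: " & 1) & 2.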

Date Expressions

• A date value is stored as a real value with the whole portion representing number of days since Jan 1, 1900, and the fraction representing the fraction of a day.
• If a whole number is added to a date, the resulting date is advanced by the specified number of days.
• A date constant is denoted with #.

Example:

    date dt
    // Jan 10 2003
    dt=#01/10/03#
    dt=#10 January 2003#
    dt=#Dec 12 2001 11:59:20#
    // date is now Dec 15
    dt=dt + 3
    // prints "15 December '01"
    call print(formatdate(dt,"DD MMMM 'YY"))
    // sets dt to the current date and time
    dt=today()

For more on date expressions, see Dates and Times.

Boolean Expressions

• A boolean expression can either be true or false.
• Results of comparisons are always boolean.
• Using and or or in an expression also results in a boolean value.

Example:

    boolean a
    boolean b
    boolean c
    a=true
    b=false
    // c is true
    c=a or b
    // c is false
    c=a and b
    // c is true
    c=10<20
    // c is false
    c=10==20
    // c is true
    c=10!=20
    // c is true
    c='yes'
    // c is false
    c='no'
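Boolean expressions are most often used to decide whether a record is passed on. The sketch below combines comparisons with and/or and returns the result from the Expression main code area. It assumes that x and y are numeric fields arriving from the previous step; the symbol names in_range and keep are hypothetical.

```
// x and y are assumed to be fields from the previous step
boolean in_range
boolean keep

// comparisons rank above and/or in the precedence table,
// so no parentheses are needed here
in_range = x >= 0 and x <= 100
keep = in_range or y == 0

// a false return value excludes the record from the output
return keep
```

Storing the intermediate result in a named boolean is a readability choice; the whole condition could equally be written as a single straight expression.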

Null Propagation
If any part of a mathematical expression has a null value, the entire expression is usually null. The following table shows how nulls are propagated:

Expression        Result
null == value     Null (applies to all comparison operators)
null & string     String
null & null       Null
number + null     Null (applies to all arithmetic operations)
null + null       Null (applies to all arithmetic operations)
null AND null     Null
null AND true     Null
null AND false    False
null OR null      Null
null OR true      True
null OR false     False
not null          Null
if null           The statement following if is not executed
for loop          Runtime error if any of the terms are null
while null        The statement following while is not executed

Example:
    integer x
    integer y
    integer z
    boolean b
    string s
    x=10
    y=null
    // z has a value of null
    z=x + y
    // b is true
    b=true or null
    // b is null
    b=false or null
    // use isnull function to determine if null
    if isnull(b)
        call print("B is null")
    // s is "str"
    s="str" & null
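Some rows of the propagation table can be modeled in a few lines of Python, using None for null. This is a minimal sketch under stated assumptions: the function names eel_add, eel_concat, and eel_and are hypothetical helpers invented for illustration, and only the arithmetic, concatenation, and AND rows are covered.

```python
def eel_add(a, b):
    """Arithmetic: the result is null if either operand is null."""
    if a is None or b is None:
        return None
    return a + b


def eel_concat(a, b):
    """Concatenation: null only when both operands are null."""
    if a is None and b is None:
        return None
    return (a or "") + (b or "")


def eel_and(a, b):
    """Boolean AND: a false operand decides the result; otherwise null propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


print(eel_add(10, None))         # z=x + y with y=null
print(eel_concat("str", None))   # s="str" & null
print(eel_and(None, False))
print(eel_and(None, True))
```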

Coercion
If a part of an expression is not the type expected in that context, it is converted (or coerced) into the correct type.
• A type can be coerced into some other types.
• If a value cannot be coerced, it results in null.

The following table shows the rules for coercion:

Coercion Type   To String   To Integer   To Real   To Date   To Boolean   To Pointer
from String                 yes          yes       yes       yes          no
from Integer    yes                      yes       yes       yes          no
from Real       yes         yes                    yes       yes          no
from Date       yes         yes          yes                 no           no
from Boolean    yes         yes          yes       no                     no
from Pointer    no          no           no        no        no

The following table shows special considerations for coercion:

Coercion Type                 Resulting Action
date to string                A default date format is used: YYYY/MM/DD hh:mm:ss. Use the formatdate() function for a more flexible conversion.
date to number                The number represents days since 12/30/1899.
string to date                Most date formats are recognized and are intelligently converted. Years between 50 and 99 are assumed to be in the 1900s; others are in the 2000s.
string to boolean             The values yes, no, true, false, y, n, t, and f are recognized.
integer or real to boolean    Any non-zero value is true. Zero is false.

Functions
• A function may be part of an expression.
• If you need to call a function but do not want the return value, use call.
• Each function has a specific return type and parameter types.
• If the parameters provided to the function are not of the correct type, they are sometimes coerced.
• A function sometimes requires a parameter to be a specific type. If you pass a parameter of the wrong type, it is not coerced and you get an error.
• dfPower Architect provides a print() function that prints the parameter to the log file for the step.
• Functions normally propagate null (there may be exceptions).
• Some functions might modify the value of their parameters if they are documented to do so.
• Some functions might accept a variable number of parameters.
• See Appendix B: List of Functions for a complete list of built-in functions.

Example:
    string str
    integer x
    str="Hello there"
    // calls the upper function
    if upper(str)=='HELLO THERE'
        // calls the dfPower Architect print function
        call print("yes")
    // x is set to 7 (position of word 'there')
    x=instr(str,"there",1)

Objects
The expression engine supports a number of objects. Generally speaking, an object is a type of code in which not only the data type of a data structure is defined, but also the types of operations that can be applied to the data structure. In particular, the expression engine supports objects for:
• Blue Fusion: Expressions and Functions
• Databases: Database connectivity (DBConnect Object)
• Files: Text file reading and writing (File Object)
• Regular Expressions: Regular expression searches (Regex Object)

dfPower Studio Expressions Node
The dfPower® Studio Expressions node is a utility that allows you to create your own nodes using the Expression Engine Language (EEL) scripting language.

To access the dfPower Architect Expression Properties dialog:
1. Open dfPower Architect: from dfPower Navigator, click Tools > Base > Architect.
2. Select Utilities > Expression.
3. Double-click the Expression node that appears in the right-hand panel.

Figure: dfPower Studio Expression Properties dialog

Note: To access the properties dialog for a node that is already part of your job flow, double-click the icon or right-click and select Properties.

This section describes the options available on the dfPower Studio Expression Properties dialog.

Name
Enter a name for the node.

Notes
Click to enter details or any other relevant information for the node.

Expression Tab
The Expression tab is where you assign values to variables, use functions to manipulate data, and loop through records. Code in the Expression area is processed once for each record in your data set. You can type your code here and use the Functions and Fields buttons below to copy and paste code. Output is displayed on the Preview tab.

Pre-processing Expression Tab
The editing area is for any Expression code that should be executed before executing the main Expression code, such as declaring and initializing variables. Code in the Pre-processing Expression area is processed before dfPower Architect accesses the record set. Type your code here and click the Functions and Fields buttons to copy and paste code.

Post-processing Expression
This is the editing area for any Expression code that should be executed after executing the main Expression code, such as assigning final values and controlling output. Type your code here and use the Functions and Fields buttons to copy and paste code.

Functions
Click to display the Functions screen, which lists all available Expression functions. Short descriptions are included for each function. On the Functions screen, select a function, click OK, and then click Close. The syntax is copied to the Windows Clipboard so that it can be pasted into your expression screen: on the Expression Properties screen, right-click in one of the Expression editing areas and select Paste to paste the selected function and its parameters. You can also double-click directly on the functions to copy them into the code area.

Fields
Click to display the Fields screen. Use the Fields screen the same way you use the Functions screen described above. You can double-click the fields to copy them directly into the code area.

Save
Click to save the code in all three Expression editing areas to a special Expression file.

Load
Click to insert code from an existing Expression file.

Generate rows when no parent is specified
This setting allows the expression node to act as a data source and generate its own rows. This option is also useful in allowing you to test an expression without using a table to create rows. The seteof() function can be used at some point in your code to stop generating rows.

Pushed status field
If a field name is specified here, then that field is created in the output. It holds the value "true" if the row resulted from a pushrow() action, and "false" otherwise. To indicate that a row is pushed on your expression step, select Pushed status field and enter the name of the field.

Return status field
If a field name is specified here, then the field is created in the output. The field is set to true or false according to the expression returned. Normally, returning false from an expression results in no row appearing in the output. When a field name is specified here, the row is returned anyway, with the field set to false.

Grouping
Click Grouping to set up your incoming data to group by a field, such as state. When the Expression node is activated, a new group is created each time a new state value appears.

Figure: Grouping dialog

Note: If you plan to group your data, you must first sort your data. For example, in order to group your data by state, click Sort in your Data Source or SAS Data Set node.

Figure: Example of grouping by state

Advanced Properties
Right-click on the node icon in the job flow and select Advanced to open the Advanced Properties dialog, where you can view a list of Advanced Properties and their values.

Note: When using the Advanced Properties dialog, no error checking or validation is done on the values you enter for each property. This can lead to unexpected results. Entering information through the main Properties dialog is the preferred method for completing information for each step, as all of the information is validated before the node will run successfully.

Getting Started
Following are some examples that illustrate specific concepts related to the Expression Engine Language (EEL).
• Testing and Evaluating
• Selecting Output Fields
• Sub-Setting
• Initializing and Declaring Variables
• Saving Expressions
• Counting Records
• Debugging and Printing Error Messages

Testing and Evaluating
In order to test an expression prior to running an Architect job, you must create sample rows.

Exercise 1: How do I test an expression without using a table to create rows?
In the Expression Properties dialog, select Generate rows when no parent is specified. This creates sample empty rows in the Preview tab.

Figure: Expression Properties dialog

Note: If you do not select Generate rows when no parent is specified, and you do not have output specified in the post-processing step, no data is output.

Exercise 2: Is it possible to create test rows with content rather than empty rows?
This involves creating extra rows with the pushrow() function in the Pre-expression section.

Note: To use the pushrow() function, Generate rows when no parent is specified must not be selected.

Consider the code example below:
    // Pre-Expression
    string name // the name of the person
    string address // the address of the person
    integer age // the age of the person

    // Expression
    name="Bob"
    address="106 NorthWoods Village Dr"
    age=30
    // Create an extra row for the
    // fields defined above
    pushrow()
    // The content for the extra row
    name="Adam"
    address="100 RhineStone Circle"
    age=32
    // Create an extra row for the
    // fields defined above
    pushrow()
    // The content for extra row
    name="Mary"
    address="105 Liles Rd"
    age=28
    // Create an extra row for the
    // fields defined above
    pushrow()

The pushrow() function creates the rows.

Figure: Creating test rows with output

Selecting Output Fields
Some fields are used for calculation or to contain intermediate values, but are not meaningful in the output. As you are testing or building scripts, there may be a need to exclude fields from the output.

Exercise: How do I exclude some fields in the expression from being listed in the output?
You accomplish this by using the hidden keyword before declaring a variable. Consider the following example:
    // Pre-Expression
    // This declares a string
    // type that will be hidden
    hidden string noDisplay

    // Expression
    // Assigns any value to the string type
    noDisplay='Hello World But Hidden'

The string field, noDisplay, is not output to the Preview tab.

Figure: Hidden variable does not display in output

Verify this by removing the hidden keyword from the string noDisplay declaration. Observe that noDisplay is now output.

Figure: Variable displays in output when not hidden

Sub-Setting
In working with large record sets in Architect, testing new jobs can be time consuming. Shorten the time to build your expression by testing your logic against a subset of large record sets.

Exercise 1: Apply your expression to a subset of your data by controlling the number of records processed.
Consider the following example:
    // Pre-Expression
    // We make this variable hidden so it is
    // not output to the screen
    hidden integer count
    count=0
    hidden integer subset_num
    // the size of the subset
    subset_num=100
    // This function estimates and sets the # of
    // records that this step will report
    rowestimate(subset_num)

    // Expression
    if(count==subset_num)
        seteof()
    else
        count=count + 1

Keep track of the number of records output with the integer variable count. Once count matches the size of the subset, use the seteof() function to prevent any more rows from being created. The rowestimate() function is employed by Architect to estimate the number of records that will be output from this step. If you remove the hidden keyword from the integer count declaration, integers 1-100 are output.

The exact syntax for the seteof() function is:
    boolean seteof(boolean)

When seteof() is called, the node does not read any more rows from the parent node. If Generate rows when no parent is specified is checked, the node stops generating rows. Furthermore, if any rows have been pushed using pushrow(), they are discarded, and further calls to pushrow() have no effect. The exception to this is if seteof(true) is called. In this case, any pushed rows (whether pushed before or after the call to seteof()) are still returned to the node below. Notably, if further pushrow() calls occur after seteof(true) is called, these rows are returned as well. Also note that after seteof() is called, the postgroup expression and the post expression are still executed.
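The counting logic of this exercise can be sketched outside of Architect. The Python generator below is only a model of the idea, not EEL: take_subset and parent_rows are hypothetical names, and the break statement stands in for seteof() stopping the node from reading further rows.

```python
def take_subset(rows, subset_num):
    """Yield at most subset_num rows, numbered like the count variable."""
    count = 0
    for row in rows:
        if count == subset_num:
            break                  # plays the role of seteof()
        count += 1
        yield count, row


# Stand-in for a large parent record set.
parent_rows = (f"record {i}" for i in range(1, 1001))
subset = list(take_subset(parent_rows, 100))
print(len(subset))
print(subset[0], subset[-1])
```

Because the generator stops as soon as the counter reaches the subset size, the remaining 900 stand-in records are never read, which is the time saving the exercise is after.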

Figure: Controlling the number of rows with seteof() and rowestimate()

Another approach to solving this problem is shown in the following example:

Exercise 2: Apply your expression to a subset of your data by filtering out rows of data.
    // Pre-Expression
    integer counter
    counter=0
    integer subset_num
    subset_num=50

    // Expression
    if counter < subset_num
    begin
        counter=counter + 1
    end
    else
        return false

By setting the return value to true or false you can use this approach as a filter to select which rows go to the next step.

Figure: Controlling the number of rows by returning a true or false value

Note: If you always return false, you get no output and your expression enters an infinite loop.

Initializing and Declaring Variables
As the expression is being evaluated, each row updates with the values of the fields in the expression. This may lead to re-initialization of certain variables in the expression. You may want to initialize a variable only once and then use its value for the rest of the expression script.

Exercise: How do I initialize a variable only once and not with each iteration of a loop?
Declare the variable in the pre-expression step, and it is initialized only once before the expression process takes over.

Saving Expressions
Exercise: How do I save my expressions?
You can save an expression without saving the entire Architect job. Click Save. Your expression is saved in an .exp text file format that you may load using Load.

Counting Records
Exercise 1: How do I count the number of records in a table using EEL?
In this example, a connection is made to the Contacts table in the DataFlux Sample database, and output to an HTML report. For more information on connecting to a data source and specifying data outputs, see Architect - Introduction in the dfPower Studio Online Help.

Start by first defining an integer type in the pre-expression step that contains the count.
    // Pre-Expression
    // Declare and initialize an integer
    // variable for the record count
    integer recordCount
    recordCount=0

    // Expression
    // Increment recordCount by one
    recordCount=recordCount+1

The value of recordCount increments one by one until the final count is reached. If you want to increase the count for only those values that do not have a null value, you would enter the following in the expression:
    // Check if the value is null
    if(NOT isnull(`address`) ) then
        recordCount=recordCount+1

Note: Field names must be enclosed in grave accents (`, ASCII 96) rather than apostrophes (', ASCII 39).

In our example the value recordCount is getting updated after each row iteration.

Exercise 2: How do I see the final count of the records in a table instead of seeing it get incremented one by one on every row?
Declare a count variable as hidden, and in the post expression step assign the value of count to another field that you want to display in the output (finalCount). Using pushrow() add an extra row to the output to display finalCount. Add the final row in the post processing step, so that finalCount is assigned only after all of the rows are processed in the expression step.

Here is the EEL code:
    // Preprocessing
    hidden integer count
    count=0

    // Expression
    if(NOT isnull(`address`) ) then
        count=count+1

    // Post Processing
    // Create a variable that will contain
    // the final value and assign it a value
    integer finalCount
    finalCount=count
    // Add an extra row to the output
    pushrow()

When you enter this code and run it, you should receive in the very last row the total number of records in the table that are not null.

Figure: Displaying the final record count

Exercise 3: How do I get just one row in the end with the final count instead of browsing through a number of rows until I come to the last one?
A simple way to do this is to return false from the main expression. The only row that is output is the one that was created with pushrow().

Or, you can devise a way to indicate that a row is being pushed. The final row displayed is an extra pushed row on top of the stack of rows that is being displayed. Therefore you can filter all the other rows from your view except the pushed one.

To indicate that a row is pushed on your expression step, check Pushed status field and enter a new name for the field.

Once you indicate with a boolean field whether a row is pushed or not, add another expression step that filters rows that are not pushed:
    // Preprocessing
    hidden integer count
    count=0
    // Add a boolean field to indicate
    // if the row is pushed
    boolean pushed

    // Expression
    if(NOT isnull(`address`) ) then
        count=count+1
    // Name the pushed status field "pushed"
    if (pushed) then
        return true
    else
        return false

    // Post Processing
    integer finalCount
    finalCount=count
    pushrow()

Figure: Displaying the last row with the final record count
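The overall effect of the counting exercises (count the non-null values, then output a single final row) can be modeled in Python. This is a rough sketch, not EEL: count_non_null is a hypothetical helper, the sample rows are made up for illustration, and None stands in for a null field value.

```python
def count_non_null(rows, field):
    """Return one summary row holding the number of non-null values of field."""
    count = 0
    for row in rows:
        if row.get(field) is not None:    # NOT isnull(`address`)
            count += 1
    # Every input row is filtered out; only the extra row added after
    # processing (the pushed row) reaches the output.
    return [{"finalCount": count, "pushed": True}]


contacts = [
    {"address": "123 Main St."},
    {"address": None},
    {"address": "300 Chatham Dr."},
]
print(count_non_null(contacts, "address"))
```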

Debugging and Printing Error Messages
Exercise 1: Is there a way to print error messages or to get debugging information?
You may use the print() function that is available to print messages to the Architect.sts file. To view this file, go from the main menu in dfPower Architect to Tools > View Statistics. When previewing output, these messages print to the Log tab.

In the previous example of calculating the number of records in a table, in the end we could output the final count to the statistics file. In the Post-processing section you would have:
    // Post Processing
    // Integer to have the final count
    integer finalCount
    finalCount=count
    // Add one extra row for post processing
    pushrow()
    // Print result to Architect.sts file
    print('The final value for count is: '& finalCount)

Data Types
The Expression Engine Language (EEL) uses the following data types:
• Integer and Real
• String
• Date
• Boolean

Integers and Real Type
Integers and real types are basic data types in EEL. An integer can be converted to real type, and a real type value can be converted to an integer. This section focuses on available functions in EEL that work on integer/real types.
• Determining Type
• Assigning Values
• Casting
• Range and Precision
• List of Operations
• Rounding
• List of Integer Functions

Determining Type
Determine the type of a variable before doing calculations.

Exercise: How do I determine if the variable has a numeric value?
Use the isnumber() built-in function to determine if a variable has a numeric value. It takes a variable as a parameter and returns true if the expression is a number.
    // Expression
    string str
    string input
    input=8 // although a string, input is coerced into an integer
    if(isnumber(`input`))
        str="this is a number" // input is a number
    else
        str="this is a string"

Assigning Values
Exercise: Can integers/real types have negative values?
Yes. Integers and real types are not limited to positive values. Add a negative sign in front of the value to make it negative.
    // Expression
    integer positive
    integer negative
    positive=1
    negative=-1 // negative is equal to -1

Casting
The need to coerce from one type to another may happen frequently with integer/real/string types. From the user's point of view there is no need to do anything. EEL handles the casting automatically.

Exercise 1: Can I assign the value of a real to an integer? What about assigning an integer to a real value?
Yes. Integer/Real types can be changed from one type to another. Simply assign one type to another.

Exercise 2: Is it possible to combine integer/real data types with strings?
Yes, a string type can be changed to an integer/real type. For example:
    integer x
    // string is converted to value 10
    // x will have the value 15
    x=5 + "10"

Exercise 3: Is it possible to assign the integer value 0 to a boolean to represent false?
In EEL, boolean values can have an integer value of 0, which is interpreted as false. Any non-zero integer value is interpreted as true.

Range and Precision
When working with scientific data with either very small or very large values, the range and precision of the integer and real types may be important.

Exercise: What is the range/precision for real and integer values?
Integer types are stored as 32-bit signed integers. They can have a range of values from -2^31 to 2^31-1.
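The 32-bit signed range and the zero/non-zero boolean rule are easy to check numerically. The Python sketch below is illustrative only: fits_integer and to_boolean are hypothetical helper names, and Python's own integers are unbounded, so the range is enforced by an explicit comparison.

```python
INT_MIN = -2**31        # smallest 32-bit signed integer
INT_MAX = 2**31 - 1     # largest 32-bit signed integer


def fits_integer(value):
    """True if value is representable as a 32-bit signed integer."""
    return INT_MIN <= value <= INT_MAX


def to_boolean(value):
    """Zero is false; any non-zero integer or real value is true."""
    return value != 0


print(INT_MIN, INT_MAX)
print(fits_integer(INT_MAX), fits_integer(INT_MAX + 1))
print(to_boolean(0), to_boolean(-1))
```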

Real types are stored as double precision 64-bit floats. They have a precision of 15-16 digits and a range of 5.0 x 10^-324 to 1.7 x 10^308. Real types are based on the IEEE 754 definition.

List of Operations
In EEL, the following operations can be performed on real/integer types.

Exercise: What operations can I do on real/integer types?
The list of operations for real/integer types includes:
• Multiplication (*)
• Division (/)
• Modulo (%)
• Addition (+)
• Subtraction (-)

Currently, it is not possible to perform trigonometric or logarithmic calculations. You can perform exponential calculations using the pow() function:
    real pow(real, real)
Returns a number raised to the power of another number.
    // Expression
    real exponential
    // exponential is 8
    exponential=pow(2,3)

Rounding
Integers/real values in EEL can be rounded using the round() function. The second parameter is an integer value that determines how many significant digits to use for the output. A positive value rounds to the right of the decimal point. A negative value rounds to the left of the decimal point.

Exercise: Can integer/real types be rounded?
Use the round() function to do this. Consider the following code example:
    // Expressions
    integer integer_value
    integer_value=1274
    real real_value
    real_value=10.126
    integer ten
    integer hundred
    integer thousand
    // the value for ten will be 1270
    ten=round(integer_value,-1)
    // the value for hundred will be 1300
    hundred=round(integer_value,-2)
    // the value for thousand will be 1000
    thousand=round(integer_value,-3)
    real real_ten
    real real_hundred
    // the value for real_ten will be 10.1
    real_ten=round(real_value, 1)
    // the value for real_hundred will be 10.13
    real_hundred=round(real_value, 2)

List of Integer Functions
Below is a summary of available functions for integer/real types.

Function    Syntax                       Description
abs         real abs(real)               Returns absolute value
ceil        real ceil(real)              Returns the ceiling value of a number
floor       real floor(real)             Returns the floor value of a number
isnumber    boolean isnumber(any)        Returns true if expression is number
round       real round(real, integer)    Rounds a number to specified decimal places
pow         real pow(real, real)         Returns a number raised to the power of another number

Strings
There are several functions available in EEL that affect the built-in string data type. These functions can be grouped into the following categories:
• Determining Type
• Extracting Substrings
• Parsing
• ASCII Conversions
• String Manipulations
• Comparing and Matching

• Replacing Strings
• Finding Patterns
• Control Characters
• Evaluating Strings
• List of String Functions

Determining Type
The following exercises demonstrate how to determine the data type of a string.

Exercise 1: How do I determine if an entry is a string?
Use the typeof() function to determine if the string is a string type.
    string typeof(any)
The typeof() function returns the type of data the expression converts to. For example:
    // Expression
    string hello
    hello="hello"
    boolean error
    error=false
    // variable that will contain the type
    string type
    type=typeof(hello)
    // type should be string
    if(type<>"string") then
        error=true

Exercise 2: How do I determine if a string is made up of alphabetic characters?
Use the isalpha() function to determine if a string is made up entirely of alphabetic characters.
    boolean isalpha(any)
Isalpha() returns true if the string is made up entirely of alphabetic characters. Consider the following example:
    // Expression
    string letters
    letters="lmnop"
    string mixed
    mixed="1a2b3c"
    string alphatype
    alphatype=isalpha(letters) // true
    string mixedtype
    mixedtype=isalpha(mixed) // false

Exercise 3: How can I retrieve all values that are either not equal to X or null values?
To accomplish this, use the isnull() function. For example:
    // Expression
    if State <> "NC" OR isnull(State)
        return true
    else
        return false

Extracting Substrings
Exercise: How do I get substrings from an existing string?
To do this, there are three available functions: left(), right(), and mid().
    string left(string, integer)
Left() returns the leftmost characters of a string.
    string right(string, integer)
Right() returns the rightmost characters of a string.
For example:
    // Expression
    string greeting
    greeting="Hello Josh and John"
    string hello
    string John
    string inbetween
    hello=left(greeting,5) // "Hello"
    John=right(greeting, 4) // "John"
    inbetween=left(greeting, 10) // "Hello Josh"
    inbetween=right(inbetween, 4) // "Josh"

Another approach is to use the mid() function:
    string mid(string, integer p, integer n)
Mid() returns a substring starting at position p for n characters.

Example:
    string substring
    // substring will be the string "Josh"
    substring=mid(greeting, 7, 4)

Parsing
Exercise: How do I parse an existing string into smaller strings?
To do this, use the aparse() function.
    integer aparse(string, string, array)
Aparse() parses a string into a string array. The number of elements in the array is returned. For example:
    // Expression
    string dataflux
    dataflux="DataFlux:dfPower:Architect"
    // An array type to contain the parsed words
    string array words
    // integer to count the number of words
    integer count
    // count will have a value of 3
    count=aparse(dataflux, ":", words)
    string first_word
    first_word=words.get(1) // First word will be "DataFlux"
    string second_word
    second_word=words.get(2) // Second word will be "dfPower"
    string third_word
    third_word=words.get(3) // Third word will be "Architect"
    string last_entry // This will have the last entry.
    last_entry=words.get(count)

The aparse() function is useful if you want to retrieve the last entry after a given separator.

Similar to the aparse() function is the parse() function. The syntax for the parse() function is:
    integer parse(string, string, ...)
Parse() parses a string using another string as a delimiter. Results are stored starting from the third parameter. It returns the total number of parameters.

You would employ the parse() function in the following situation:
    // Expression
    integer count
    string first
    string second
    string third
    // first contains "DataFlux"
    // second contains "dfPower"
    // third contains "Architect"
    count=parse("DataFlux:dfPower:Architect", ":", first, second, third);

The main difference between the two functions is that aparse() is suited for arrays, and parse() is useful for returning individual strings.

ASCII Conversions
EEL has the ability to convert characters to their ASCII values, and to convert ASCII values to characters.

Exercise: Is it possible to convert between ASCII characters and values?
Yes. To do this, use the chr() and asc() functions. For example:
    // Expression
    integer ascii_value
    string character_content
    ascii_value=asc("a"); // ascii_value is 97
    character_content=chr(97) // returns the letter "a"

For a complete list, see Appendix A: ASCII Printable Characters.

String Manipulations
Frequently, when working with strings, you may want to perform manipulations such as adjusting the case, removing spaces, concatenating, and getting the length of a string. EEL has built-in functions to perform these actions.

Exercise 1: How do I concatenate strings?
Use the & symbol to concatenate strings.
    // Expression
    string Hello
    Hello="Hello "
    string World
    World="World"
    string Hello_World
    Hello_World=Hello & World // outputs "Hello World"
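For readers more familiar with general-purpose languages, the parsing and ASCII conversion examples above map onto a few standard Python operations. This is an analogy rather than EEL: the Python aparse helper below is a hypothetical stand-in that returns the word list instead of filling an array parameter, and Python lists are 0-based where words.get() is 1-based.

```python
def aparse(text, delimiter):
    """Split text on delimiter and return (count, words)."""
    words = text.split(delimiter)
    return len(words), words


count, words = aparse("DataFlux:dfPower:Architect", ":")
print(count)
print(words[count - 1])      # last entry, like words.get(count) in EEL

# parse()-style: unpack the pieces into individual variables.
first, second, third = "DataFlux:dfPower:Architect".split(":")
print(first, second, third)

# asc() and chr() correspond to Python's ord() and chr().
print(ord("a"), chr(97))
```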

Exercise 2: How do I get the length of a string and remove spaces?
Use the len() function to get the length of the string, and then use the trim() function to remove the spaces.
    integer len(string)
Len() returns the length of a string.
    string trim(string)
Trim() returns the string with the leading and trailing white-space removed.
    // Expression
    string content
    content=" Spaces "
    integer content_length
    content=trim(content) // Remove spaces
    // returns 6
    content_length=len(content)

Exercise 3: How does one convert a string type to lower or uppercase?
Use the lower() and upper() functions.
    string lower(string)
Lower() returns the string in lowercase.
    string upper(string)
Upper() returns the string in uppercase.
Example:
    // Expression
    string changeCase
    changeCase="MixedCase"
    string newCase
    newCase=upper(changeCase)

Comparing and Matching
EEL lets you compare strings, find differences between strings, and search for substrings.

Exercise 1: How do I compare two strings?
Use an equal comparison (==) between strings to compare them.

Example:
    // Expression
    string test
    boolean match
    // initialize
    test="Hello"
    match=false
    // compare string values
    if(test=="Hello") then
        match=true

To get a more in-depth comparison, consider compare() and edit_distance():
    integer compare(string, string, boolean)
Compare() compares two strings. It returns:
• -1 if first < second
• 0 if first equals second
• 1 if first > second
If the third parameter is true, they are compared case insensitively. The comparison between two strings is done lexicographically.

Along the same lines, another function is edit_distance():
    integer edit_distance(string, string)
This function returns the edit distance between two strings. Specifically, this function returns the number of corrections that would need to be applied to turn one string into the other.

The following examples use these functions:
    // Expression
    integer difference
    integer comparison
    string hello
    hello="hello"
    string hey
    hey="hey"
    // comparison is -1 because hello comes before hey
    comparison = compare(hello, hey, true);
    // difference is 3 because there are three different letters
    difference = edit_distance(hello, hey);
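To make the two return values in the example concrete, here is a small Python model of both functions. It is a sketch under an explicit assumption: the guide does not say which edit-distance metric EEL uses, so the code below uses the common Levenshtein distance (insertions, deletions, and substitutions each cost 1), which does yield 3 for this pair of strings. The function bodies are illustrative, not the engine's implementation.

```python
def compare(first, second, ignore_case=False):
    """Return -1, 0, or 1 from a lexicographic comparison."""
    if ignore_case:
        first, second = first.lower(), second.lower()
    return (first > second) - (first < second)


def edit_distance(a, b):
    """Levenshtein distance computed one row at a time (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


print(compare("hello", "hey", True))
print(edit_distance("hello", "hey"))
```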

boolean match_string(string. that many replacements are made. search) if (match) then begin // Will find the first occurrence of day found_first=instr(content. integer) Consider the following example: // Expression string starter string replace 42 DataFlux Expression Language Reference Guide . string replace(string. or if it is a substring inside another string? The following built-in EEL functions handle this situation. boolean instr(string. string. Tuesday is rainy & Wednesday is windy" string search search="*Wednesday is windy" // note the * wildcard integer found_first integer found_next boolean match // Check if the search string is in the content match=match_string(content. string. which may contain wildcards. and returns the string with the replacement made. Consider the following code example with these functions: // Expression string content content="Monday is sunny. "day". If set to another number. stating the occurrence of the string. "day". all occurrences will be replaced in the string.Exercise 2: How do I check if a string matches. 2) end Replacing Strings The replace() function replaces the first occurrence of one string with another string. 1) // Will find the second occurrence of day found_next=instr(content. If the fourth parameter is omitted or set to 0. integer) The instr() function returns the location of one string within another string. string. string) The match_string() function determines if the first string matches the second string.

    string replaceWith
    string final
    starter="It's a first! This is the first time I came in first place!"
    replace="first"
    replaceWith="second"
    final =replace(starter, replace, replaceWith, 2)
    seteof()

This example produces the following results:

    starter      It's a first! This is the first time I came in first place!
    replace      first
    replaceWith  second
    final        It's a second! This is the second time I came in first place!

Finding Patterns

It is possible to extract patterns out of strings using EEL. EEL identifies the following as part of a string's pattern:

• 9 = numbers
• a = lowercase letters
• A = uppercase letters

Exercise: How do I get a string pattern?

To determine the string pattern, use the pattern() function:

    string pattern(string)

Pattern() generates a pattern from the input string. The pattern() function indicates if a string has numbers, upper- and lower-case characters. Consider the following example with this function:

    // Expression
    string result;
    string pattern_string;
    pattern_string="abcdeABCDE98765";
    // The result will be aaaaaAAAAA99999
    result=pattern(pattern_string);
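The mapping rule behind pattern() can be sketched in a few lines of Python. This is a model of the three rules listed above, not the EEL implementation. How EEL treats characters that are neither digits nor letters is not stated in this guide, so passing them through unchanged is an assumption.

```python
def pattern(text: str) -> str:
    """Map digits to 9, lowercase letters to a, uppercase letters to A."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append("9")
        elif ch.islower():
            out.append("a")
        elif ch.isupper():
            out.append("A")
        else:
            # Assumption: punctuation and spaces are copied as they are.
            out.append(ch)
    return "".join(out)


print(pattern("abcdeABCDE98765"))
```

For the input used in the EEL example, the sketch produces the same pattern string, aaaaaAAAAA99999.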

Control Characters

EEL can identify control characters such as a horizontal tab, line feed, etc.

Exercise: How can I detect control characters in a string?

Use the has_control_chars() function.

    boolean has_control_chars(string)

This function determines if the string contains control characters. See a list of control characters at Appendix A: ASCII Control Characters.

Evaluating Strings

The vareval() function evaluates a string as though it was a variable. In this way, you can dynamically select the value of a field.

    string vareval(string)

Note: Since it has to look up the field name each time it is called, vareval() is a slow function and should be used sparingly.

Exercise: How can I convert field names into values?

In this example, suppose you have incoming data from five fields, field_1 through field_5, as shown in the following table.

C:\varevalExample.txt

    field_1  field_2          field_3              field_4     field_5
    1        Bob Brauer       123 Main St.         Cary        NC
    2        Don Williams     4 Clover Ave.        Raleigh     NC
    3        Mr. Jim Smith    44 E. Market Street  Wilmington  NC
    4        Ms. Amber Jones  300 Chatham Dr.      Durham      NC
    5        I Alden          99 A Dogwood Ave.    Apex        NC

We will write a for loop that builds the string ('field_' & n), and uses vareval() to get the value of the field.

    // Pre-expression
    string field_number
    string field_value

    // Expression
    hidden integer n
    for n=1 to 5
    begin
        field_number='field_' & n
        field_value=vareval(field_number)

        n=n+1
        pushrow()
    end
    // this next statement prevents the last row from showing up twice
    return false

A preview of the job yields the following results:

[Figure: Converting Field Names to Values]

List of String Functions

This is a list of available functions that a user can use while working with string types.

Function: aparse
Syntax: integer aparse(string, string, object array)
Description: Parses a string into a string array. The number of elements is returned.

Function: asc
Syntax: integer asc(string)
Description: Returns an ASCII character code for a character.

Function: chr
Syntax: string chr(integer)
Description: Returns a character for an ASCII character code.

Function: compare
Syntax: integer compare(string, string, boolean)
Description: Compares two strings. Returns -1 if first < second, 0 if they are equal, or 1 if first > second. If the third parameter is true, they are compared insensitive to case.

Function: edit_distance
Syntax: integer edit_distance(string, string)
Description: Returns the distance between two strings.

Function: has_control_chars
Syntax: boolean has_control_chars(string)
Description: Determines if the string contains control characters.

Function: instr
Syntax: integer instr(string, string, integer)
Description: Returns the location of one string within another, stating the occurrence of the string. Returns 0 if not found.

Function: isalpha
Syntax: boolean isalpha(any)
Description: Returns true if the string is made up entirely of alphabetic characters.

Function: left
Syntax: string left(string, integer)
Description: Returns the leftmost chars of a string.

Function: len
Syntax: integer len(string)
Description: Returns the length of a string.

Function: match_string
Syntax: boolean match_string(string, string)
Description: Determines if the first string matches the second string, which may contain wildcards.

Function: parse
Syntax: integer parse(string, string, any)
Description: Parses a string using another string as a delimiter. Returns the total number of tokens.

Function: pattern
Syntax: string pattern(string)
Description: Generates a pattern from the input string.

Function: right
Syntax: string right(string, integer)
Description: Returns the rightmost characters of a string.

Function: trim
Syntax: string trim(string)
Description: Returns the string with leading and trailing white-space removed.

Function: vareval
Syntax: string vareval(string)
Description: Evaluates and returns the value of a variable with the given name.

Dates and Times

Dates, along with integers, reals, booleans, and strings, are considered basic data types in EEL. Similar to other basic data types, EEL provides functions to perform operations on dates.

• Finding Today's Date
• Formatting a Date
• Extracting Parts from a Date
• Adding or Subtracting from a Date
• Comparing Dates
• List of Date Functions

Finding Today's Date
Exercise 1: How do I find the year, month and day values for today's date?

Use the today() function.

    date today()

This function returns the current date/time.

    // Expression
    date localtime
    localtime=today()

Formatting a Date
Exercise 2: What formats can a date have?

Dates should be in the format specified by ISO 8601 (YYYY-MM-DD hh:mm:ss) to avoid ambiguity. Remember that date literals must start and end with the # sign. For example:

Date only:

    // Expression
    date dt
    dt=#2007-01-10# //Jan 10 2007

Date with time:

    // Expression
    date dt
    dt=#2007-01-10 12:27:00# //Jan 10 2007 at 12:27:00

Exercise 3: How do I format the date?

In EEL, use the formatdate() function to specify a format for the date.

    string formatdate(date, string)

Formatdate() returns a date formatted as a string. For example:

    // Expression
    // all have the same output until formatted explicitly
    date dt
    dt=#2007-01-13#
    string formata
    string formatb
    string formatc
    formata=formatdate(dt, "MM/DD/YY") // outputs 01/13/07
    formatb=formatdate(dt, "DD MMMM YYYY") // outputs 13 January 2007
    formatc=formatdate(dt, "MMM DD YYYY") // outputs Jan 13 2007
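To see how the format tokens in these examples behave, the following Python sketch translates them to strftime codes. It covers only the six tokens that appear in this section (YYYY, YY, MMMM, MMM, MM, DD). The token table and the format_date name are illustrative assumptions, not part of EEL, and month names depend on the active locale (English by default).

```python
import re
from datetime import date

# Assumed mapping for the tokens shown in this section only.
_TOKENS = {"YYYY": "%Y", "MMMM": "%B", "MMM": "%b", "MM": "%m", "DD": "%d", "YY": "%y"}
# Longer tokens are listed first so that YYYY is not read as two YY tokens.
_TOKEN_RE = re.compile("YYYY|MMMM|MMM|MM|DD|YY")


def format_date(value: date, fmt: str) -> str:
    """Translate the tokens in a single pass, then delegate to strftime."""
    return value.strftime(_TOKEN_RE.sub(lambda m: _TOKENS[m.group(0)], fmt))


dt = date(2007, 1, 13)
print(format_date(dt, "MM/DD/YY"))
print(format_date(dt, "DD MMMM YYYY"))
print(format_date(dt, "MMM DD YYYY"))
```

A single-pass substitution is used so that text produced for one token is never rescanned as another token.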

Extracting Parts from a Date
Exercise 4: How do I get individual components out of a date?

The formatdate() function can also be used to extract parts of a date. For example:

    // Expression
    date dt
    dt=#10 January 2003#
    string year
    string month
    string day
    // year should be 03
    year=formatdate(dt, "YY")
    // month should be January
    month=formatdate(dt, "MMMM")
    // day should be 10
    day=formatdate(dt, "DD")

Note that if the date format is ambiguous, EEL will parse the date as MDY.

Adding or Subtracting from a Date
Exercise 5: Can I do arithmetic with dates?

EEL offers the ability to add or subtract days from an existing date. For example:

    // Expression
    date dt // variable that will contain the date
    dt=#10 January 2003#
    date later
    date earlier
    // add three days to the original date
    later=dt+3
    // subtract three days from the original date
    earlier=dt-3
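The same day arithmetic can be checked in Python, where the whole-number offset has to be wrapped in a timedelta. This is only a cross-check of the expected values, not EEL code.

```python
from datetime import date, timedelta

dt = date(2003, 1, 10)
later = dt + timedelta(days=3)     # corresponds to dt+3 in the EEL example
earlier = dt - timedelta(days=3)   # corresponds to dt-3 in the EEL example

print(later.isoformat(), earlier.isoformat())
```

Here later is 13 January 2003 and earlier is 7 January 2003.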


Comparing Dates
To compare dates, use the formatdate() function.

Exercise 6: How do I check if two dates match and are the same?

Convert the date to a string type using the formatdate() function and then check for the string's value. For example:

    date dt
    // the variable that will contain the date
    // that we want to compare against
    dt=#1/1/2007#
    // The string variable that will contain the
    // dt date in a string format
    string dt_string
    // The variable that will convert the
    // incoming date fields to string
    dt_string=formatdate(dt, "MM/DD/YY")
    string Date_string
    // Notice that `DATE` is the incoming field
    // from the data source. It is written between backquotes so
    // it does not conflict with the date data type
    Date_string=formatdate(`DATE`, "MM/DD/YY")
    // boolean variable to check if the dates matched
    boolean date_match
    // Initialize the variable to false
    date_match=false
    if(compare(dt_string, Date_string)==0) then
        date_match=true
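Comparing formatted strings means that only the parts of the date present in the format take part in the comparison. The Python sketch below uses the strftime equivalent of "MM/DD/YY" to illustrate two consequences: the time of day is ignored, and so is the century. The same_day name is a hypothetical helper.

```python
from datetime import datetime


def same_day(first: datetime, second: datetime, fmt: str = "%m/%d/%y") -> bool:
    """Compare two date/time values through their formatted strings."""
    return first.strftime(fmt) == second.strftime(fmt)


morning = datetime(2007, 1, 1, 8, 0)
evening = datetime(2007, 1, 1, 17, 30)
print(same_day(morning, evening))                           # time of day is dropped
print(same_day(datetime(1907, 1, 1), datetime(2007, 1, 1)))  # two-digit year hides the century
```

If the century matters for your data, a four-digit year in the format string avoids the second effect.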

List of Date Functions
Following is a list of date functions.

Function: formatdate
Syntax: string formatdate(date, string)
Description: Returns a date formatted as a string.

Function: today
Syntax: date today()
Description: Returns current date/time.


Boolean

Boolean is a basic data type representing a true or false value. Boolean variables can be declared in the following formats:

    boolean b
    b=true  //sets boolean b to true
    b='yes' //also sets boolean b to true
    b=0     //sets boolean b to false

This data type is used when comparisons are made. Using AND or OR in an expression also results in a boolean value.

List of Boolean Functions

This is a list of available functions that a user can use while working with boolean types.

Function: deletefile
Syntax: boolean deletefile(string)
Description: Deletes a file from disk.

Function: fileexists
Syntax: boolean fileexists(string)
Description: Returns true if the specified file exists.

Function: has_control_chars
Syntax: boolean has_control_chars(string)
Description: Determines if the string contains control characters.

Function: inlist
Syntax: boolean inlist(any)
Description: Returns true if any of the check parameters match the value parameter. "Null" is propagated for the value parameter, but not for the check parameters.

Function: isalpha
Syntax: boolean isalpha(any)
Description: Returns true if the expression is a string made up entirely of alphabetic characters.

Function: isblank
Syntax: boolean isblank(any)
Description: Returns true if a string is empty, null, or contains only space-like characters.

Function: isnull
Syntax: boolean isnull(any)
Description: Returns true if the expression is null.

Function: isnumber
Syntax: boolean isnumber(any)
Description: Returns true if the expression is a number.

Function: match_string
Syntax: boolean match_string(string, string)
Description: Determines if the first string matches the second string, which may contain wildcards.

Function: print
Syntax: boolean print(string, boolean)
Description: Prints the string to the log.

Function: pushrow
Syntax: boolean pushrow(boolean)
Description: This function "pushes" the current values of all symbols (this includes both field values for the current row and defined symbols in the code) to a stack. When the next row is requested, it is given from the top of the stack instead of being read from the step above. When a row is given from the stack, it is not processed through the expression again. When the stack is empty, the rows are read from above as usual. This function always returns true. The parameter of pushrow() is reserved and should not be used.

Function: raiseerror
Syntax: boolean raiseerror(string)
Description: Raises a user-defined error. Users can define a condition, and then use raiseerror to stop the job and return a user-defined error message when that condition occurs. Useful for evaluating problems unique to an installation. The user can then search for the error message to see if the associated condition was responsible for stopping the job.

Function: readrow
Syntax: boolean readrow()
Description: This is a dfPower Architect-only function. It reads the next row of data from the step above and fills the variables that represent the incoming step's data with the new values. It returns false if there are no more rows to read. For example, assume that this step is below a step with a name field and the step outputs four rows, "John", "Paul", "George", and "Igor":

    string oldname
    // oldname now contains "John" and name contains "John"
    oldname=name
    readrow()
    // oldname still contains "John" but name now contains "Paul"

Note that readrow has no effect when called from a pre or post expression. When called from a pre-group or post-group expression it may cause undesirable results.

Function: rowestimate
Syntax: boolean rowestimate(integer)
Description: Sets the total number of rows that this step reports. This is only an estimate.

Function: seteof
Syntax: boolean seteof(boolean)
Description: See the description in the Sub-Setting section.

Function: setvar
Syntax: boolean setvar(string, string)
Description: Sets the Architect macro variable value specified by the first parameter. Returns true.

Function: sleep
Syntax: boolean sleep(integer)
Description: Sleeps the specified number of milliseconds and invokes the interrupt handler.

Data Structures and File Manipulation
The Expression Engine Language (EEL) allows you to format and alter data through built-in functions and external processes. Specifically, you can use the following to structure and manipulate data:

• Arrays
• Groups
• Files
• Database
• Regular Expressions
• Blue Fusion
• Macro Variables
• Evaluating Incoming Data

Arrays
In EEL, it is possible to create arrays of simple types such as string, integer, date, boolean, and real. Currently there are three functions that apply to array types: set(), get(), and dim().

• Creating an Array
• Retrieving Elements from an Array
• Changing an Array Size
• Determining an Array's Size
• Finding Common Values Between Columns Using Arrays
• List of Array Functions

Creating an Array
Exercise: How do I create an array and provide values for the items in the array?

Use the reserved key word array to declare an array.

    string array variable_name
    integer array variable_name
    boolean array variable_name
    date array variable_name
    real array variable_name

For example:

    // declare an array of integer types
    integer array integer_list
    // set the size of the array to 5
    integer_list.dim(5)
    // the index that will go through the array
    integer index
    index=0
    // Set the values of the items inside the
    // array to their index number
    for index=1 to 5
    begin
        integer_list.set(index, index);
    end

Retrieving Elements from an Array
Exercise: How do I retrieve elements from an array?

This example builds on the previous example:

    integer first
    integer last
    // Getting the first item from integer array
    first=integer_list.get(1);
    // Getting the last item from integer array
    last=integer_list.get(5)

Changing an Array Size
Exercise: How do I change the size of an array?

Use the dim() function to change the size of an array.

    // array is originally initialized to 5
    string array string_container
    string_container.dim(5)
    ...
    ...
    // the array is sized now to 10
    string_container.dim(10)


Determining an Array's Size
Exercise: How do I determine the size of an array?

Use the dim() function to determine the size of an array. Remember that the dim() function is also used to set the size of an array. If no parameter is specified, the array size does not change. For example:

    // Expression
    integer array_size
    string array array_lister
    ...
    ...
    // after performing some operations on the array
    // array_size will then contain
    // the size of the array
    array_size=array_lister.dim()
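The three array functions can be modeled with a small Python class to make the 1-based indexing and the two roles of dim() concrete. This is a toy model rather than the EEL implementation. It assumes that resizing keeps existing items and that newly added slots are empty (None here), and the EelArray name is hypothetical.

```python
class EelArray:
    """Toy model of an EEL array with 1-based get/set and a dual-purpose dim()."""

    def __init__(self):
        self._items = []

    def dim(self, size=None):
        """Resize when a size is given; in every case return the current size."""
        if size is not None:
            if size < len(self._items):
                del self._items[size:]
            else:
                # Assumption: new slots start out empty.
                self._items.extend([None] * (size - len(self._items)))
        return len(self._items)

    def _check(self, index):
        if not 1 <= index <= len(self._items):
            raise IndexError("index %d is outside 1..%d" % (index, len(self._items)))

    def set(self, index, value):
        self._check(index)
        self._items[index - 1] = value

    def get(self, index):
        self._check(index)
        return self._items[index - 1]


integer_list = EelArray()
integer_list.dim(5)
for index in range(1, 6):
    integer_list.set(index, index)
print(integer_list.get(1), integer_list.get(5), integer_list.dim())
```

Calling dim() with no argument only reports the size, which matches the statement above that the array size does not change when no parameter is specified.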

Finding Common Values between Columns Using Arrays
Exercise: How do I find out if entries in one column occur in another column regardless of row position and number of times they occur?

One way to address this problem is to create two arrays for storing two columns, then check if the values in one array exist in the other array. Find those values that match and store them in a third array for output. Begin with the following text file:

C:\arrayTextDocument.txt

    A_ID  B_ID
    0     1
    1     2
    3     4
    5     6
    6     0

Create a Data Input node as Text File Input and set the text file to C:\arrayTextDocument.txt in Architect. Create an Expression node, and declare the variables in the pre-expression step:

    // Pre-Expression
    // This is where we declare and
    // initialize our variables.
    hidden string array column_A
    hidden string array column_B
    hidden string array commun
    hidden integer column_A_size
    column_A_size=1
    column_A.dim(column_A_size)
    hidden integer column_B_size
    column_B_size=1
    column_B.dim(column_B_size)
    hidden integer commun_size
    commun_size=1
    commun.dim(commun_size)

All the variables are hidden and are not displayed on the output. All the arrays are defined in the beginning to be of size 1. Later we will expand them to accommodate the number of rows that are added.

In this step we retrieve input into the arrays and expand the size of the arrays as necessary. The size of the array may become quite large depending on the size of the column, so it is recommended you use this example with small tables.

    // Expression
    // Name your First_Column field as you need
    column_A.set(column_A_size, `A_ID`)
    column_A_size=column_A_size+1
    column_A.dim(column_A_size)
    // Name the Second_Column field as you need
    column_B.set(column_B_size, `B_ID`)
    column_B_size=column_B_size+1
    column_B.dim(column_B_size)

    // Post Expression
    // This is the step where most of the
    // logic will be implemented
    // index to iterate through column_A
    hidden integer index_column_A
    // index to iterate through column_B
    hidden integer index_column_B
    // index to iterate through commun array
    hidden integer index_commun
    // index to display the commun values that were found
    hidden integer commun_display_index
    // string that will contain the items
    // from column A when retrieving
    hidden string a
    // string that will contain the items
    // from column B when retrieving
    hidden string b

    // String that will contain the contents of the
    // commun array when retrieving
    hidden string commun_content
    // This boolean variable
    // is to check if a commun entry has already
    // been found. If so, don't display it again
    hidden boolean commun_found
    // This is the variable
    // that will display the common entries in the end
    string commun_display
    // Retrieves the entries in column A
    for index_column_A=1 to column_A_size Step 1
    begin
        a=column_A.get(index_column_A)
        for index_column_B=1 to column_B_size Step 1
        begin
            b=column_B.get(index_column_B)
            // Compare the entries from column A with
            // the entries from column B
            if(compare(a,b)==0)
            begin
                // Check if this entry was already found once
                commun_found=false
                for index_commun=1 to commun_size Step 1
                begin
                    commun_content=commun.get(index_commun)
                    if(compare(commun_content,a)==0 ) then
                        commun_found=true
                end
                // It is a new entry. Add it to the
                // commun array and increment its size
                if(commun_found==false)
                begin
                    commun.set(commun_size, a)
                    commun_size=commun_size+1
                    commun.dim(commun_size)
                end
            end
        end
    end
    // Display the contents of the commun array
    // to the screen output
    for commun_display_index=1 to commun_size Step 1
    begin
        pushrow()
        commun_display=commun.get(commun_display_index)
    end

A preview of the job results in the following output:

[Figure: Values of Two Arrays]

If you want to see the output limited to the common values, add another Expression node and add the following filtering code:

    // Expression
    if(isnull(`commun_display`)) then
        return false
    else
        return true

A preview of the job now shows only the values that the two arrays have in common.
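The logic of this exercise, stripped of the job-flow details, is "keep each value of column A that also appears somewhere in column B, once". The Python sketch below expresses that with a set lookup instead of the nested loops used in the EEL code. The column values are illustrative sample data and the common_values name is hypothetical.

```python
def common_values(column_a, column_b):
    """Return the values of column_a that also occur in column_b, without repeats."""
    values_in_b = set(column_b)
    common = []
    for value in column_a:
        # Skip values already recorded so that each one is reported once.
        if value in values_in_b and value not in common:
            common.append(value)
    return common


a_ids = ["0", "1", "3", "5", "6"]   # sample values for the first column
b_ids = ["1", "2", "4", "6", "0"]   # sample values for the second column
print(common_values(a_ids, b_ids))
```

The set lookup replaces the inner loop over column B, which matters mainly for larger tables. The result keeps the order in which values appear in the first column.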

dim(integer) arrayName.any) Description Used for declaring an array of the specified size.get(integer) arrayName. Sets values for items within an array. Note: The use of grouping in EEL is similar to the use of the Group By clause in SQL.Values Two Arrays Have in Common List of Array Functions Following is a list of array functions. Groups Expressions provide the ability to organize content into groups. Retrieves the value of the specified item within an array. Once data is grouped.set(integer. The Expression Engine Language (EEL) has built-in grouping functionality that contains this logic. you can use other functions to perform actions on the grouped data. Function dim get set Syntax arrayName. DataFlux Expression Language Reference Guide 59 .

Exercise 1: Can EEL group my data and then count the number of times each different entry occurs?

Yes. For example:

Problem Statement: We want to count the number of different states that contacts are coming from, using the contacts table from a DataFlux® sample database.

Step 1: Preliminary Steps - Setting Up The Contacts Data Source

Select the Contacts table from the DataFlux sample database as the data source.

1. Select Data Inputs > Data Source from the left panel of dfPower Architect.
2. In the Data Source Properties dialog, browse for the Contacts table under Input Table.
3. Click the right double arrow to select all.

   [Figure: Data Source Properties dialog]

4. Click Sort.
5. In the Sort dialog, select State and click the right single arrow.
6. Set the Sort Order to Ascending.

   [Figure: Sort dialog]

7. Click OK.

Step 2: Creating the Grouping

1. Under Utilities, select the Expression node.
2. On the Expression tab, click Grouping.

   [Figure: Grouping dialog]

   a. On the Grouping dialog, select the Group Fields tab. Select State and click the right single arrow.
   b. On the Group Post Expression tab, declare any steps that you want to perform after grouping is complete. Here, we want to push a row so that we can have an extra row to store the final count for each group state. Add the following:

          pushrow()

   c. Click OK.
3. Furthermore, we can indicate that a row is pushed by creating a boolean variable. In the Expression Properties dialog, on the Expression tab, select Pushed Status Field. Name the field pushed. Click OK. The advantage of indicating that a row is pushed is that later we can add another expression step where we filter only the pushed rows. This allows us to see the final state count rather than seeing the statecount incremented by one for each iteration.

4. On the Expression Properties dialog, on the Pre-Expression tab, declare any variables that you want to create before grouping any data. In this case we declare an integer statecount that counts the states in each group after the data has been grouped by state. Enter the following:

       integer statecount
       // Variable that will be used to count each state
       statecount=0

Exercise 2: How can I count each state in the input so NC, North Carolina, and N Carolina are grouped together?

A convenient way to accomplish this is to add an expression node or a standardization node in Architect where you can standardize all entries prior to grouping. Building on the previous example, we add a Standardization step:

1. Under Architect - Quality, select Standardization.
2. In the Standardization Properties dialog, select State and specify the State (Two Letter) Definition. This creates a new field called STATE_Stnd.

   [Figure: Standardization Properties dialog]

3. Click Additional Outputs and select all.

   [Figure: Additional Outputs dialog]

4. Click OK.
5. Go back to the Expression Properties dialog. Click Grouping. Make certain that Grouping is now by STATE_Stnd and not STATE.
6. Click OK.

The statecount now increments by each standardized state name rather than by each permutation of state and province names.

Exercise 3: How do I group my data and find averages for each group?

To illustrate how this can be done, we use sample data.

Step 1: Connect to a Data Source

1. Connect to the Purchase table in the DataFlux sample database.
2. In the Data Source Properties dialog, click Add All.
3. Find the Field Name for ORDER DATE. Change the Output Name to ORDER_DATE. Click OK.

Step 2: Sorting the Data

Now that we have connected to the Purchase table, we sort on the data field that we use for grouping. In this case, we sort by DEPARTMENT.

1. In Architect, select Utilities > Data Sorting to add a Data Sorting node after the Expression node.
2. In the Data Sorting Properties dialog, select DEPARTMENT and set the Sort Order to Ascending.

   [Figure: Data Sorting Properties dialog]

Step 3: Creating Groups

To create groups out of the incoming data we add another expression step to our job flow after the sorting step.

1. In the Expression node, click on Grouping. We are presented with the three tabs: Group Fields, Group Pre-expression, and Group Post-expression.
2. In the Group Fields tab select DEPARTMENT.
3. On the Group Pre-Expression tab, declare the following fields:

       // Group Pre-Expression
       // This variable will contain the total
       // sales per department
       real total
       total=0
       // This variable will keep track of the
       // number of records for each department
       integer count
       count=0
       // This variable will contain the
       // running average total
       real average
       average=0

4. After declaring our variables on the Group Pre-Expression tab, we can go to the Expression tab and update our variables with each upcoming new row.

       // Expression
       // increase the total sales
       total=total+ITEM_AMOUNT
       // increase the number of entries
       count=count+1

       // error checking that the count of entries is not 0
       if count !=0 then
       begin
           average=total/count
           average=round(average,2)
       end

When you do a preview, by clicking on the Preview tab in the bottom panel of Architect, you should see the following in the last four columns:

    DEPARTMENT  total    count  average
    1           8497     1      8497
    1           12288.7  2      6144.35
    1           18788.7  3      6262.9
    1           26399.2  4      6599.8
    ...         ...      ...    ...

Files

Use the file object to work with files in EEL. Read and write operations are supported in the file object and there are additional functions for manipulating and working with files. This section is organized into the following divisions:

• Overview of the File Object
• File Operations
• Manipulating Files
• Executing Programs and File Commands
• List of File Functions

Overview of the File Object

The file object can be used to open, read, and write files. A file is opened using the File object. For example:

    File f
    f.open("c:\filename.txt","r")

The open function opens filename.txt. The mode for opening the file is read. Other modes are "a" (append to end of file) and "w" (write). A combination of these switches may be used.

File Operations

Exercise 1: How do I open a file?

To open a file in EEL, use this expression:

    // Expression
    File f
    f.open("C:\filename.txt","r")

The second parameter to the file object indicates the mode for opening the file (read, write, or read/write).

Exercise 2: How do I read lines from a file, treating each as a single row from a data source?

After opening a file use the following code to read a string line of input:

    // Pre-Expression
    File f
    string input
    f.open("C:\filename.txt", "rw")

    // Expression
    input=f.readline()

    // Post Expression
    f.close()

Make sure that you have checked Generate rows when no parent is specified. The file cursor advances one line in the text file for each row of input from the data source.

Exercise 3: How do I read lines from a text file, and create one output line for each line in the text file?

Make a while loop that iterates through each line in the file with every row. Given the following text files:

C:\filename.txt

    Name
    Jim
    Joan
    Pat

C:\filepet.txt

    Pet
    Fluffy
    Fido
    Spot

For example:

    // Expression
    File f
    File g
    string input
    input='hello'
    f.open("C:\filename.txt")
    g.open("C:\filepet.txt")
    while (NOT isnull(input))
    begin
        input=f.readline()
        print('The value of input is ' & input)
        input=g.readline()
        print('The value of input is ' & input)
    end
    seteof()

    // Post Expression
    f.close()

This prints the contents of the two files to the log. If you preview the job, you see null for the input string since at the completion of the loop, the input string has a null value. A good way to see how this example works in your job is to add an expression step that sets the end of file:

    // Expression
    seteof()

The preview pane shows the value of input as null, but the log pane shows each of the possible values listed in the filename.txt and filepet.txt files.

Exercise 4: How do I write to a file?

Use the writeline() function in the file object to write to a file.

    // Expression
    File f
    f.open("C:\filename.txt", "w")
    f.writeline("Hello World ")

    // Post Expression
    f.close()

Caution: This function overwrites the current contents of your text file.

Exercise 5: How do I move from one position to another in a file?

There are three functions in the file object that can accomplish this.

    Seekbegin([position])

Seekbegin() sets the file pointer to a position starting at the beginning of the file. It returns true on success, false otherwise. The parameter specifies the position.

    Seekcurrent([position])

Seekcurrent() sets the file pointer to a position starting at the current position. It returns true on success, false otherwise. The parameter specifies the number of bytes from the current position.

    Seekend([position])

Seekend() sets the file pointer to a position starting at the end of the file. It returns true on success, false otherwise. The parameter specifies the position from the end of the file.

Each of these functions takes as its parameter a number of bytes, measured from that function's reference point (the beginning of the file, the current position, or the end of the file). Specify 0 in the seekbegin() or the seekend() functions to go directly to the beginning or the end of the file. As an example, in order to append to the end of a file you would select Generate rows when no parent is specified, and enter:

    // Expression
    File f
    f.open("C:\Text_File\file_content.txt", "rw")
    f.seekend(0)
    f.writeline("This is the end ")
    seteof()

This adds the text "This is the end" to the end of the file. If you move to the beginning of the file, using writeline() overwrites existing content. Close the file with f.close() in the post-processing step:

    // Post Processing
    f.close()

Exercise 6: How do I copy the contents of a file to another file?

To copy the contents of one file to another, use the boolean function, copyfile(), which takes the originating file as the first parameter and the destination file as the second parameter. The destination file can originate or be amended by this function.

    // Expression
    string names
    string pets
    names="C:\filename.txt"
    pets="C:\filecopy.txt"
    copyfile(names, pets)
    seteof()
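The seek-then-write and copy steps above have close counterparts in most languages. The Python sketch below runs them on temporary files so it can be tried safely. The append_line and demo names are hypothetical, and the comparison with EEL is by analogy only: seek(0, os.SEEK_END) plays the role of seekend(0) and shutil.copyfile plays the role of copyfile().

```python
import os
import shutil
import tempfile


def append_line(path, text):
    """Open an existing file for update, seek to its end, and write one line."""
    with open(path, "r+", encoding="utf-8", newline="") as handle:
        handle.seek(0, os.SEEK_END)  # zero bytes from the end of the file
        handle.write(text + "\n")


def demo():
    """Append to a throwaway file, copy it, and return both files' contents."""
    with tempfile.TemporaryDirectory() as folder:
        names = os.path.join(folder, "filename.txt")
        copy = os.path.join(folder, "filecopy.txt")
        with open(names, "w", encoding="utf-8", newline="") as handle:
            handle.write("Name\nJim\nJoan\nPat\n")
        append_line(names, "This is the end")
        shutil.copyfile(names, copy)
        with open(names, encoding="utf-8", newline="") as source:
            original = source.read()
        with open(copy, encoding="utf-8", newline="") as target:
            copied = target.read()
        return original, copied


original, copied = demo()
print(original == copied)
```

Seeking to the beginning instead of the end before writing would overwrite the first bytes of the file rather than extend it, which is the behavior the text above warns about.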

Exercise 7: How do I read or write a certain number of bytes from a text file?

To read a specified number of bytes from a text file, use the readbytes() function.

    string input
    File a
    a.open("C:\filename.txt", "r")
    a.readbytes(10, input)

To write a specified number of bytes to a text file, use the writebytes() function.

    string input
    input="This string is longer than it needs to be."
    File b
    b.open("C:\filename.txt", "rw")
    b.writebytes(10, input)

By overwriting existing data, this expression produces the following:

C:\filename.txt

    This string
    Joan
    Pat

Manipulating Files

Exercise 1: How do I retrieve information about the file?

To check if a file exists use this function:

    boolean fileexists(string)

The fileexists() function returns true if the specified file exists. The string parameter is the path to the file.

To find the dates the file was created and modified use:

    date filedate(string, boolean)

The filedate() function returns the date a file was created. If the second parameter is true, it returns the modified date.

To summarize:

    // Expression
    boolean file_test
    date created
    date modified
    file_test=fileexists("C:\filename.txt")
    created=filedate("C:\filename.txt", false)
    modified=filedate("C:\filename.txt", true)
    seteof()
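For comparison, the existence check and the modification date can be obtained in Python as sketched below. Only the modified date is shown, because a portable creation date is not available through os.path on every platform. The file_info and demo names are hypothetical and the files are temporary.

```python
import os
import tempfile
from datetime import datetime


def file_info(path):
    """Return (exists, modified) where modified is None for a missing file."""
    if not os.path.exists(path):
        return False, None
    return True, datetime.fromtimestamp(os.path.getmtime(path))


def demo():
    """Check one file that exists and one path that does not."""
    with tempfile.TemporaryDirectory() as folder:
        present = os.path.join(folder, "filename.txt")
        with open(present, "w", encoding="utf-8") as handle:
            handle.write("Name\n")
        missing = os.path.join(folder, "no_such_file.txt")
        return file_info(present), file_info(missing)


found, not_found = demo()
print(found[0], not_found)
```

Checking for existence first avoids an error from the date lookup when the path is wrong.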

Note: If the filedate() function returns a null value but fileexists returns true, you have most likely entered the file path incorrectly.

To get the size of a file you can open the file, seek to the end of the file and then use the position() function to give you the size in bytes. To illustrate:

    // Expression
    File f
    integer byte_size
    f.open("C:\filename.txt", "rw")
    f.seekend(0)
    // The integer variable byte_size will have
    // the size of the file in bytes
    byte_size=f.position()

Exercise 2: Is it possible to perform operations such as renaming, copying, or deleting a file?

Use the deletefile() function to delete a file:

    boolean deletefile(string)

This action deletes a file from the disk. The string parameter is the path to the file.

Note: Use care when employing this function. Once you delete a file it is gone.

Use movefile() to move or rename a file:

    boolean movefile(string, string)

For example, the following code moves filename.txt from the root to the Names folder.

    boolean newLocation
    newLocation=movefile("C:\filename.txt","C:\Names\filename.txt")
    seteof()

Note: The directory structure must already be in place for the function to move the file to its new location.

Executing Programs and File Commands

Use the execute() function to execute programs:

    integer execute(string)

For example, the following code changes the default permissions of a text file created by Architect.

To execute the command directly, type:

    execute("/bin/chmod", "777", "file.txt")

or to execute from the UNIX/Linux shell, type:

    execute("/bin/sh", "-c", "chmod 777 file.txt")

Running a Batch File by Using Execute Function

    //Expression
    execute("cmd.exe", "/q", "/c", "C:\BatchJobs.bat");

We invoke the MS DOS command prompt by calling cmd.exe.

Parameters:

•  /q = turns echo off.
•  /c = carries out the command specified for the MS DOS prompt and then closes the prompt.

Note: There has been a change to the way the expression engine handles the backslash character. Beginning with dfPower Studio version 7.1, the backslash character does not need to be escaped. For example:

    C:\\Program Files\\DataFlux

should now be entered as

    C:\Program Files\DataFlux

To support existing expressions that escape the backslash, there is a global compatibility mode that is checked every time the expression engine is started. In Windows, if you need the old behavior (with escaped backslashes) you must enter the following setting in the registry. Either:

    HKEY_LOCAL_MACHINE\Software\DataFlux Corporation\expression\escaped_string

-or-

    HKEY_CURRENT_USER\Software\DataFlux Corporation\expression\escaped_string

One of these should be set to either true, yes, or 1 in order for the old functionality to be in effect. In UNIX, set the environment variable ESCAPED_STRING to one of the above values.

List of File Functions

This is a summary of file object functions and other functions that relate to file types.

Function: close
Syntax: close(string)
Description: Closes a file with the string equal to the file's path. If left blank, the currently open file is closed.

Function: copyfile
Syntax: copyfile(string, string)
Description: Copies one file, specified by the first parameter, to the location specified in the second parameter. Returns true on success, false otherwise.

Function: deletefile
Syntax: boolean deletefile(string)
Description: This deletes the file specified from disk. The string parameter is the path to the file. Returns true on success, false otherwise.

Function: execute
Syntax: integer execute(string)
Description: Executes a program with zero or more arguments. The exit status of the program, which is a positive integer, is returned. If an error occurs, such as the program not being found, then -1 is returned.

Function: filedate
Syntax: date filedate(string, boolean)
Description: The filedate() function returns the date a file was created. If the second parameter is true, it returns the modified date.

Function: fileexists
Syntax: boolean fileexists(string)
Description: The fileexists() function returns true if the specified file exists.

Function: movefile
Syntax: boolean movefile(string, string)
Description: Moves or renames a file. It returns true on success, false otherwise.

Function: open
Syntax: open(filename, mode)
Description: Opens a file where the path is the first parameter. The mode for opening the file (append, read, write) is the second parameter. This is indicated by a, r, w or a combination of these. If the file does not exist, it is created.

Function: position
Syntax: position()
Description: Returns the current position of the cursor in the file.

Function: readbytes
Syntax: readbytes(integer, string)
Description: Reads a set number of bytes from an open file. The readbytes method accepts two parameters: the number of bytes to read, and the target string that holds the bytes that are read. The number of characters actually read is returned.

Function: readline
Syntax: readline()
Description: Reads the next line of data from an open file. A maximum of 1024 bytes are read. The text is returned. Null is returned if there was a condition such as end of file.

Function: seekbegin
Syntax: seekbegin(integer)
Description: Seekbegin() sets the file pointer to a position starting at the beginning of the file. The parameter specifies the position. Returns true on success, false otherwise.

Function: seekcurrent
Syntax: seekcurrent(integer)
Description: Seekcurrent() sets the file pointer to a position starting at the current position. The parameter specifies the number of bytes from the current position. Returns true on success, false otherwise.

Function: seekend
Syntax: seekend(integer)
Description: Seekend() sets the file pointer to a position starting at the end of the file. The parameter specifies the position from the end of the file. Returns true on success, false otherwise.

Function: writebytes
Syntax: writebytes(integer, string)
Description: Writes a set number of bytes to an open file. It accepts two parameters: the number of bytes to write is an integer, and the data to write to the file is a string. Returns the number of bytes actually written. This function overwrites existing data.

Function: writeline
Syntax: writeline(string)
Description: Writes a line of text to an open file. It accepts a string parameter which is the string to write to the file. This returns true on success, and false otherwise.

Binary Data

DataFlux® expressions provide the ability to retrieve data in binary format. This section describes how to retrieve and convert binary data in Big Endian or Little Endian formats, as well as mainframe and packed data formats.

•  Big Endian and Little Endian Formats
•  Converting Binary Data To A Certain Format

Big Endian and Little Endian Format

Exercise 1: How do I retrieve binary data in either big endian or little endian format?

Use the ib() function, which allows you to retrieve binary data, and also determines the byte order based on your host or native system. The function syntax is:

    real = ib(string, format_str)

where:

•  string = octet array containing binary data to convert.
•  format_str = string containing the format of the data, expressed as w.d, where the width (w) must be between 1 and 8, inclusive, with the default being 4. The optional decimal (d) must be between 0 and 10, inclusive.

The w.d formats/informats specify the width of the data in bytes, and the optional decimal portion as an integer which represents the power of ten by which to divide (when reading) or multiply (when formatting) the data.

Example:

    //Expression
    //File handler to open the binary file
    file input_file
    //The binary value to be retrieved
    real value
    //The number of bytes that were read
    integer bytes_read
    //4-byte string buffer
    string(4) buffer
    input_file.open("C:\binary_file", "r")
    //This reads the 4 byte string buffer
    bytes_read=input_file.readbytes(4, buffer)
    //The width (4) specifies 4 bytes read
    //The decimal (0) specifies that the data is not divided by any power of ten
    value = ib(buffer,"4.0")

Exercise 2: How do I force my system to read big endian data regardless of its endianness?

Use the s370fib() function. The syntax is:

    real = s370fib(string, format_str)

where:

•  string = octet array containing IBM mainframe binary data to convert.
•  format_str = string containing the w.d format of the data.

Use this function just like the function ib(). This function always reads binary data in big endian format. The s370fib() function has been incorporated for reading IBM mainframe binary data.

Exercise 3: How do I read little endian data regardless of the endianness of my system?

Currently there are no functions available for this purpose.

Exercise 4: How do I read IBM mainframe binary data?

Use the s370fib() function described earlier.

Exercise 5: How do I read binary data on other non IBM mainframes?

Currently there are no functions available for this purpose.

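The w.d arithmetic described for ib() and s370fib() can be illustrated outside the expression engine. The following is a minimal Python sketch, not DataFlux code: it takes the first w bytes as an integer and divides by ten to the power d. The helper names parse_wd and read_integer_binary are hypothetical, and treating the integer as signed two's complement is an assumption, since the guide does not state how negative values are stored.

```python
import sys
from typing import Tuple


def parse_wd(format_str: str = "4.0") -> Tuple[int, int]:
    """Split a w.d format string into (width in bytes, power of ten)."""
    width_text, _, decimal_text = format_str.partition(".")
    width = int(width_text) if width_text else 4  # documented default width
    decimal = int(decimal_text) if decimal_text else 0
    if not 1 <= width <= 8:
        raise ValueError("width must be between 1 and 8")
    if not 0 <= decimal <= 10:
        raise ValueError("decimal must be between 0 and 10")
    return width, decimal


def read_integer_binary(data: bytes, format_str: str,
                        byteorder: str = sys.byteorder) -> float:
    """Decode the first w bytes as an integer and divide it by 10**d.

    Assumption: the stored integer is signed two's complement.
    """
    width, decimal = parse_wd(format_str)
    raw = int.from_bytes(data[:width], byteorder=byteorder, signed=True)
    return raw / 10 ** decimal


# 10125 stored big endian in 4 bytes, read with three implied decimal places.
big_endian_value = read_integer_binary(bytes([0x00, 0x00, 0x27, 0x8D]), "4.3", "big")

# The same helper using the byte order of the machine running the sketch.
native_value = read_integer_binary((42).to_bytes(4, sys.byteorder), "4.0")

print(big_endian_value, native_value)
```

Passing "big" mirrors what the guide says about s370fib(), while the default mirrors the native byte order behavior described for ib().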
Exercise 6: Is there support for reading binary packed data on IBM mainframes?

Use the function s370fpd(). The syntax is:

    real = s370fpd(string, format_str)

where:

•  string = octet array containing IBM mainframe packed decimal data to convert.
•  format_str = string containing the w.d format of the data.

This function retrieves IBM mainframe packed decimal values. The width (w) must be between 1 and 16, inclusive, with the default being 1. The optional decimal (d) must be between 0 and 10, inclusive. This function treats your data in big endian format.

Exercise 7: How do I read non-IBM mainframe packed data?

Use the function pd(). The syntax is:

    real = pd(string, format_str)

where:

•  string = octet array containing non-IBM mainframe packed data to convert.
•  format_str = string containing the w.d format of the data.

Converting Binary Data To A Certain Format

Just as it is possible to retrieve data in a special binary format, it is also possible to format data to a special binary format.

Exercise 8: How do I format binary data to the native endianness of my system?

Use the formatib() function. The syntax is:

    integer = formatib(real, format_str, string)

where:

•  real = numeric to convert to a native endian binary value.
•  format_str = string containing the w.d format of the data.
•  string = octet array in which to place the formatted native endian binary data.

returns:

    integer = byte length of formatted binary data

This function produces native endian integer binary values. The width (w) must be between 1 and 8, inclusive, with the default being 4.

Example:

    //Expression
    //The byte size of the buffer that contains the content
    real format_size
    //The real type number
    real number
    //The real number that is retrieved
    real fib_format
    number=10.125
    //The buffer that contains the formatted data
    string(4) buffer
    format_size= formatib(number, "4.3", buffer)
    //4.3 is to specify 4 bytes to read the entire data and 3 to multiply it by 1000
    //The reason to multiply it by a 1000 is to divide it later by 1000
    //To restore it back to a real number
    fib_format= ib(buffer, "4.3")
    //Verify that the formatting worked
    //Fib_format should be 10.125

Exercise 9: How do I change to other formats?

Use the following functions:

Non-IBM mainframe packed data

    integer = formatpd(real, format_str, string)

where:

•  real = numeric to convert to a native-packed decimal value.
•  format_str = string containing the w.d format of the data.
•  string = octet array in which to place the formatted native-packed decimal data.

returns:

    integer = byte length of formatted packed decimal data

IBM mainframe binary data

    integer = formats370fib(real, format_str, string)

where:

•  real = numeric to convert to an IBM Mainframe binary value.
•  format_str = string containing the w.d format of the data.
•  string = octet array in which to place the formatted IBM Mainframe binary data.

returns:

    integer = byte length of formatted binary data

IBM mainframe packed decimal data

    integer = formats370fpd(real, format_str, string)

where:

•  real = numeric to convert to an IBM mainframe-packed decimal value.
•  format_str = string containing the w.d format of the data.
•  string = octet array in which to place the formatted IBM mainframe-packed decimal data.

returns:

    integer = byte length of formatted packed decimal data

COBOL Support

Using expressions, it is possible to read binary data in specified COBOL COMP, COMP-3, and COMP-5 data formats. The following examples demonstrate how to do this.

•  Reading binary data
•  Formatting binary data

Reading

Exercise 1: How do I read native endian binary data for COBOL?

Use the function piccomp() for this purpose. The following example demonstrates one solution:

    //Expression
    //file handler to open files
    File pd
    integer rc
    string(4) buffer
    real comp
    if (pd.open("binary_input.out", "r")) begin
        rc = pd.readbytes(4, buffer)
        if (4 == rc) then
            comp = piccomp(buffer, "S9(8)")
        pd.close()
    end

The piccomp() function determines the number of bytes (2, 4, or 8) to consume by comparing the sum of the 9s in the integer and fraction portions to fixed ranges. If the sum is less than 5, 2 bytes are consumed. If the sum is greater than 4 and less than 10, 4 bytes are consumed. If the sum is greater than 9 and less than 19, 8 bytes are consumed.

In the preceding case, because the format of the string is S9(8), 4 bytes were consumed. Notice that all of the COBOL data functions support a PIC designator of the long form:

    [S][9+][V9+] (ex: S99999, 99999V99, S999999V99, SV99)

or of the shortened count form:

    [S][9(count)][V9(count)] (ex: S9(5), 9(5)v99, S9(6)v9(2), sv9(2))

The signature for function piccomp is:

    real piccomp(string, format_str)

where:

•  string = octet array containing COBOL formatted packed decimal data to convert.
•  format_str = string containing the PIC 9 format of the data.

Exercise 2: How do I read packed decimal numbers?

Expression language has the convenient function piccomp3() for this purpose. The function's signature is:

    real piccomp3(string, format_str)

where:

•  string = octet array containing COBOL formatted packed decimal data to convert.
•  format_str = string containing the PIC 9 format of the data.

The piccomp3() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions and adding 1. If the new value is odd, 1 is added to make it even. The result is then divided by 2. As such, S9(7) would mean 4 bytes to consume. Notice that packed data will always be in big endian form.

It is used in the same manner as the previous piccomp() function. See the example above.

Exercise 3: How do I read signed decimal numbers in COBOL format?

Use the picsigndec() function.

    //Expression
    //file handler to open files
    file pd
    integer rc
    string(6) buffer
    real comp
    if (pd.open("binary_input.out", "r")) begin
        rc = pd.readbytes(6, buffer)
        //S9(4)V99 holds 6 digits, so 6 bytes are expected
        if (6 == rc) then
            comp = picsigndec(buffer, "S9(4)V99", 1, 1)
        pd.close()
    end

The picsigndec() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions of format_str.

    real picsigndec(string buffer, string format_str, boolean ebcdic, boolean trailing)

where:

•  string buffer = octet array containing a COBOL formatted signed decimal number to convert.
•  string format_str = string containing the PIC 9 format of the data. The default format_str is S9(4).
•  boolean ebcdic = boolean when set to non-zero indicates the string is EBCDIC. The default ebcdic setting is false.
•  boolean trailing = boolean when set to non-zero indicates the sign is trailing. The default trailing setting is true.

Formatting

It is also possible to format data to a specific COBOL format, as demonstrated by the following example:

Exercise 4: How do I format from a real to COBOL format?

    //Expression
    real comp
    comp = 10.125
    integer rc
    //The buffer that receives the formatted data
    string(4) buffer
    rc = formatpiccomp(comp, "s99V999", buffer)
    //The string buffer will contain the real value comp formatted to platform COBOL COMP native endian format.

The signature of the function is:

    integer = formatpiccomp(Real number, string format_str, string result)

where:

•  real number = numeric to convert to a COBOL native endian binary value.
•  string format_str = string containing the PIC 9 format of the data.
•  result = octet array in which to place the COBOL formatted native endian binary data.

returns:

    integer = byte length of formatted binary data.

In essence the function formatpiccomp() does the reverse of piccomp(). As with the picsigndec() function, the formatpicsigndec() function determines the number of bytes to consume by taking the sum of the 9s in the integer and fraction portions.
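The byte-count rules stated for piccomp(), piccomp3(), and picsigndec() are simple arithmetic on the number of 9s in the PIC string. The following Python sketch works through that arithmetic; it is an illustration of the rules as written in this guide, not the expression engine's own code, and the function names (pic_digits, comp_bytes, comp3_bytes, signdec_bytes) are hypothetical.

```python
import re
from typing import Tuple

# A single 9, optionally followed by a repeat count such as 9(5).
_NINES = re.compile(r"9(?:\((\d+)\))?")


def pic_digits(pic: str) -> Tuple[int, int]:
    """Return (integer digits, fraction digits) for a PIC string like S9(4)V99."""
    body = pic.upper().lstrip("S")  # the optional sign takes no digit position
    integer_part, _, fraction_part = body.partition("V")

    def count(part: str) -> int:
        return sum(int(m.group(1)) if m.group(1) else 1
                   for m in _NINES.finditer(part))

    return count(integer_part), count(fraction_part)


def comp_bytes(pic: str) -> int:
    """Bytes consumed under the piccomp() rule: 2, 4, or 8 by digit count."""
    total = sum(pic_digits(pic))
    if total < 5:
        return 2
    if total < 10:
        return 4
    if total < 19:
        return 8
    raise ValueError("more than 18 digits is outside the documented ranges")


def comp3_bytes(pic: str) -> int:
    """Bytes consumed under the piccomp3() rule: add 1, round up to even, halve."""
    total = sum(pic_digits(pic)) + 1
    if total % 2:
        total += 1
    return total // 2


def signdec_bytes(pic: str) -> int:
    """Bytes consumed under the picsigndec() rule: one byte per digit."""
    return sum(pic_digits(pic))


print(comp_bytes("S9(8)"), comp3_bytes("S9(7)"), signdec_bytes("S9(4)V99"))
```

The three printed values correspond to the cases discussed in the text: S9(8) for piccomp(), S9(7) for piccomp3(), and the S9(4)V99 format used in the picsigndec() example.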

Exercise 5: What is the list of functions available for COBOL formatting?

The COBOL format functions work in a very similar manner as the previous example.

To write a COBOL packed decimal value:

    integer = formatpiccomp3(Real number, string format_str, string result)

where:

•  real number = numeric to convert to a COBOL packed decimal value.
•  string format_str = string containing the PIC 9 format of the data.
•  string result = octet array in which to place the COBOL formatted packed decimal data.

returns:

    integer = byte length of formatted packed decimal data.

To write a COBOL signed decimal value:

    integer = formatpicsigndec(real number, string format_str, string buffer, boolean ebcdic, boolean trailing)

where:

•  real number = numeric to convert to a COBOL signed decimal value.
•  string format_str = string containing the PIC 9 format of the data.
•  string buffer = octet array in which to place the COBOL formatted packed decimal data.
•  boolean ebcdic = boolean when non-zero indicates to format in EBCDIC.
•  boolean trailing = boolean when non-zero indicates to set the sign on the trailing byte.

returns:

    integer = byte length of the formatted signed decimal.

Databases

Use the DBConnect object in EEL to work with databases. You can connect to data sources using built-in functions that are associated with the DBConnect object. You can also return a list of data sources, and evaluate data input from parent nodes.

•  Overview of the DBConnect Object
•  Connecting to a Database
•  Listing Data Sources
•  List of Database Functions

Overview of the DBConnect Object

The DBConnect object allows you to use the expression engine to connect directly to a relational database system and execute commands on that system as part of your expression code. There are three objects associated with this functionality:

•  DBConnection — A connection to the database
•  DBStatement — A prepared statement
•  DBCursor — A cursor for reading a result set

Global Functions

•  dbconnect([connect_string]) — Connects to a database, returns a DBConnection object.
•  dbdatasources() — Returns a list of data sources as a DBCursor.

DBConnection Object Methods

•  execute([sql_string]) — Executes an SQL statement and returns the number of rows affected.
•  tableinfo([table],[schema]) — Gets a list of fields for a table. The second parameter is optional.
•  tablelist() — Gets a list of tables.
•  prepare([sql_string]) — Prepares a statement and returns a DBStatement object.
•  select([sql_string]) — Runs SQL and returns a DBCursor object.
•  begintransaction() — Starts a transaction.
•  endtransaction([commit]) — Ends a transaction. If commit is true, the transaction is committed, otherwise it is rolled back.
•  release() — Releases the connection explicitly.

DBStatement Object Methods

•  setparaminfo([param_index],[string_type],[size]) — Sets information for a parameter. String_type can be string, integer, real, date, or boolean. If string_type is string, size is the string length.
•  setparameter([param_index],[value]) — Sets a parameter's value.
•  execute() — Executes the statement, and returns number of rows affected.
•  select() — Executes the statement returning results as a DBCursor.
•  release() — Explicitly releases the statement.

DBCursor Object Methods

•  next() — Retrieves the next record.
•  release() — Explicitly releases the cursor.
•  columns() — Returns the number of columns.
•  columnname([index]) — Returns the name of the specified column (0-based index).
•  columntype([index]) — Returns the type of the specified column (0-based index).
•  columnlength([index]) — Returns the length of the specified column (0-based index).
•  valuestring([index]) — Returns the value of the specified column as a string (0-based index).
•  valueinteger([index]) — Returns the value of the specified column as an integer (0-based index).
•  valuereal([index]) — Returns the value of the specified column as a real (0-based index).

Connecting to a Database

Exercise: How do I connect to a database?

Use the dbconnect() function to connect to a database. This function returns a DBConnection object.

Declare a database connection object:

    dbconnection test_database

Connect to the database:

    // Set connection object to desired data source
    // Saved DataFlux connections can also be used
    test_database=dbconnect("DSN=DataFlux Sample")

Listing Data Sources

Exercise 1: How do I return a list of data sources?

The dbdatasources() function returns a list of data sources as a DBCursor.

Example:

The following example works with the Contacts table in the DataFlux sample database. Make sure you have some match codes in that table in a field called CONTACT_MATCHCODE. In the step before your expression step, use a match code generation node and have match codes created for some sample names in a text file. This text file is your job input step. Call this new field "Name_MatchCode." This example queries the Contacts table in the DataFlux sample database to see if there are any names that match the names you provided in your text file input.

Pre-processing window

    // Declare Database Connection Object
    dbconnection db_obj
    // Declare Database Statement Object
    dbstatement db_stmt
    // Set connection object to desired data source
    // Saved DataFlux connections can also be used
    db_obj=dbconnect("DSN=DataFlux Sample")
    // Prepare the SQL statement and define parameters
    // to be used for the database lookup
    db_stmt=db_obj.prepare("Select * from Contacts where Contact = ?")
    db_stmt.setparaminfo(0,"string",30)

Expression window

    // Declare Database Cursor and define fields returned from table
    dbcursor db_curs
    string Database_ID
    string COMPANY
    string CONTACT
    string ADDRESS
    // Set parameter values and execute the statement
    db_stmt.setparameter(0,Name)
    db_curs=db_stmt.select()
    // Move through the result set adding rows to output
    while db_curs.next()
    begin
        Database_ID=db_curs.valuestring(0)
        COMPANY=db_curs.valuestring(1)
        CONTACT=db_curs.valuestring(2)
        ADDRESS=db_curs.valuestring(3)
        pushrow()
    end
    db_curs.release()
    // Prevent the last row from occurring twice
    return false

List of Database Functions

Following is a list of database functions.

Function: dbconnect
Syntax: object/array dbconnect(string)
Description: Connect to a data source name (DSN).

Function: dbdatasources
Syntax: object/array dbdatasources()
Description: Returns a list of data sources as a cursor.
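For readers more familiar with other database APIs, the prepare / setparameter / select / next pattern above corresponds to an ordinary parameterized query followed by cursor iteration. The following Python sketch shows the same flow using the standard sqlite3 module as a stand-in. It is an analogy only: the in-memory table, its column names, and its rows are made-up sample data, and nothing here is part of the DBConnect object.

```python
import sqlite3

# Stand-in for dbconnect(): an in-memory database with hypothetical sample rows.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Contacts (ID TEXT, Company TEXT, Contact TEXT, Address TEXT)"
)
connection.executemany(
    "INSERT INTO Contacts VALUES (?, ?, ?, ?)",
    [
        ("1", "Acme", "Joan", "1 Main St"),
        ("2", "Widgets", "Pat", "2 Oak Ave"),
    ],
)


def lookup(name):
    """Run the parameterized lookup once and collect every matching row."""
    # The ? placeholder plays the role of parameter 0 in the EEL example.
    cursor = connection.execute(
        "SELECT * FROM Contacts WHERE Contact = ?", (name,)
    )
    try:
        # Iterating the cursor corresponds to calling next() until it is false.
        return [tuple(row) for row in cursor]
    finally:
        cursor.close()  # corresponds to releasing the cursor


rows = lookup("Joan")
print(rows)
```

Binding the name as a parameter instead of concatenating it into the SQL text is the design point shared with the EEL example: the statement is prepared once and reused for each input row.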

Getting Table List from Database

The following code lets a user get a list of tables in a particular database.

    //Expression
    string DSN
    DSN="DataFlux Sample"
    string connectStr
    //Preparing the connection string
    connectStr = "DSN=" & DSN
    DBConnection dbConn
    dbConn = dbConnect( connectStr )
    string tablename
    string datapath
    string tcatalog
    string tabletype
    DBCursor cursTables
    //Retrieve table information in a cursor
    cursTables = dbConn.tablelist();
    //Iterate through the cursor
    while( cursTables.next() )
    begin
        datapath = cursTables.valuestring(0);
        tcatalog = cursTables.valuestring(1);
        tablename = cursTables.valuestring(2);
        tabletype = cursTables.valuestring(3)
        pushrow()
    end
    cursTables.release();

Regular Expressions

The regular expression (regex) object allows you to perform regular expression searches of strings.

•  Using Regular Expressions
•  List of Regular Expression Functions

Using Regular Expressions

For a regex to work, you must first compile it. In Architect, this is best done in the preprocessing step. Here are some examples.

Exercise 1: How do I find matches within a string?

Use findfirst() to find the first match in the string, and findnext() to find subsequent matches in the string.

    regex r
    r.compile("a.c")
    if r.findfirst("abcdef")
        print("Found match starting at " & r.matchstart() & " length " & r.matchlength())

Exercise 2: How do I know if my regex pattern matches part of my input?

Check to see if your regex pattern finds a match in the input string:

    regex a
    boolean myresult
    a.compile("a","ISO-8859-7")
    myresult=a.findfirst("abc")

Exercise 3: How do I find the regex pattern I want to match?

Find the first instance of the regex pattern you want to match:

    integer startingPosition
    regex r
    r.compile("a.c")
    if r.findfirst("abcdef")
        startingPosition=r.matchstart()

Exercise 4: How do I replace a string within my regex?

Compile the regex and use the replace function:

    regex r
    r.compile("xyz")
    r.replace("abc","def")

This exercise replaces "abc" with "def" within the compiled "xyz."
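The compile, findfirst(), findnext(), matchstart(), and matchlength() sequence has a close counterpart in most regex libraries. As a rough comparison, here is the same idea in Python's standard re module. This is a sketch for orientation, not EEL: the function name find_all is hypothetical, Python positions are 0-based, and this section does not state which base the expression engine uses.

```python
import re

# Compile once, as the guide recommends doing in the preprocessing step.
pattern = re.compile("a.c")


def find_all(text):
    """Return (start, length) for every match, first match first."""
    # finditer walks the string the way findfirst() followed by repeated
    # findnext() calls would.
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(text)]


first = pattern.search("abcdef")       # counterpart of findfirst()
matches = find_all("abcdef axc")       # counterpart of findfirst() + findnext()

print(first is not None, matches)
```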

List of Regular Expressions Functions

Following is a list of functions related to regular expressions.

Function: compile
Syntax: r.compile(string, encoding)
Description: Compiles the regular expression string. Returns false if there was an error. The second parameter indicates the encoding to use. This parameter is optional. If it is not specified, the default ASCII encoding is used. Here are the possible values: IBM1047, ISO-8859-1, ISO-8859-2, ISO-8859-7, ISO-8859-11, windows-1250, windows-1252, windows-1253, windows-874.

Function: findfirst
Syntax: findfirst(string)
Description: Searches the specified string for a match. Returns true if a match was found.

Function: findnext
Syntax: findnext(string)
Description: Continues searching the string for the next match. Returns true if a match was found.

Function: matchstart
Syntax: matchstart()
Description: Returns the location of the last match. Null is returned if there was no match made.

Function: matchlength
Syntax: matchlength()
Description: Returns the length of the last match. Null is returned if there was no match made.

Function: replace
Syntax: r.replace(string, string)
Description: Searches for the first string, and replaces it with the second. This differs from the replace() function in that it makes the replacement within a compiled regex.

Function: substringcount
Syntax: substringcount()
Description: Returns the number of captured substrings.

Function: substringstart
Syntax: substringstart(integer)
Description: Returns the start location of the nth captured substring.

Function: substringlength
Syntax: substringlength(integer)
Description: Returns the length of the nth captured substring.
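The substringcount(), substringstart(), and substringlength() functions report on captured substrings, meaning the parts of a match enclosed in parentheses in the pattern. The Python sketch below shows the equivalent information from the standard re module, using a made-up pattern and input. It is an illustration only: in Python, group numbers start at 1 and positions are 0-based, and this section does not say how the expression engine numbers either.

```python
import re

# Two capture groups: a three-digit exchange and a four-digit line number.
pattern = re.compile(r"(\d{3})-(\d{4})")
match = pattern.search("call 447-3000 today")

# Number of capture groups in the pattern (compare substringcount()).
substring_count = pattern.groups

# Start and length of each captured substring
# (compare substringstart(n) and substringlength(n)).
spans = [
    (match.start(n), match.end(n) - match.start(n))
    for n in range(1, substring_count + 1)
]

print(substring_count, spans, match.group(1), match.group(2))
```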

result) string. sensitivity.decode_string. Some of the advantages of using Blue Fusion functions within the Expression Engine Language include dynamically changing match definitions. Loads the desired QKB specified. reading them from another column.Encoding and Decoding You can encode and decode text strings from different formats using generic functions. which means that users can use Blue Fusion to perform the listed functions (object methods) from within the Expression Engine Language node. Generates a matchcode for an input input. Question: How do I transcode a given expression string from its native encoding into the specified encoding? Use the encode and decode functions. matchcode matchcode(match_def. as shown in this section. decode_string) //Decode to IBM1047 EBCDIC encode_return = encode("IBM1047". input (string) is the actual string for which the matchcode DataFlux Expression Language Reference Guide 87 . Four arguments: match_def (string) is the name of the match definition. or setting different definitions. and navigate to Base > Architect > User Interface > Encoding. The Blue Fusion functions supported within the Expression node are: Function getlasterror loadqkb getlasterror() loadqkb(locale) Syntax Description Returns the error string. Example: //Expression string expression_string expression_string="Hello World" string decode_string string encode_string integer decode_return integer encode_return decode_return = decode("IBM1047". sensitivity (integer) is the sensitivity level to apply. such as ENUSA. Blue Fusion Functions The Expression Engine Language (EEL) supports the Blue Fusion object.encode_string) //Encode string should be "Hello World" Question: What are the available encodings? See the online help for DataFlux® dfPower® Architect. expression_string. One argument: locale (string) which is the designation for the locale. or FRFRA.

input. then an error occurred. input. identify identify (ident_def. input (string) is the string to be identified. casing_type (integer) which is a numerical designation that determines the type of case [1: upper case. 2: lower case. input (string) which is the string input to be determined. then an error occurred. If a 0 is returned. input (string) which is the string to be converted. result) Converts a string to the correct case. then an error occurred. input. standardize standardize (stdzn_def. case (case_def. result (string) is the generated matchcode. result (string) is the result. input. Three arguments: stdzn_def (string) which is the name of the standardization algorithm. result) gender gender (gender_def. result (string) which is the output result.Function Syntax Description is to be generated. Three arguments: pattern_def (string) which is the name of the pattern definition. If a 0 is returned. then an error occurred. result (string) which is the output result. Three arguments: ident_ref (string) is the name of the identification definition. If a 0 is returned. result (string) which is the output result. result (string) is the resulting pattern output. casing_type. input (string) which is the input string to be standardized. 3: proper case]. Determines if an input string represents an individual or organization. then an error occurred. If a 0 is returned. Four arguments: case_def (string) which is the name of the case definition. case pattern pattern (pattern_def. Three arguments: gender_def (string) which is the name of the gender definition. input. If a 0 is returned. If a 0 is returned. result) Standardizes an input string. input (string) which is the string to be analyzed. result) Analyzes an input string and reports a pattern. result) Determines whether the input string represents an male or female individual. then an error occurred. 88 DataFlux Expression Language Reference Guide .

Global Functions

There is one global function, and it is used to initialize the Blue Fusion object.

Function: bluefusion_initialize
Syntax: bluefusion_initialize()
Description: Initializes the Blue Fusion object for use in the Expression Language node.

This global function goes into the Pre-processing tab as shown in the following example:

    //Pre-processing
    //defines a bluefusion object called bf
    bluefusion bf;
    //initializes the bluefusion object bf
    bf = bluefusion_initialize();

Once a Blue Fusion object is defined and initialized, the methods listed can be used within the Expression Engine node.

Exercises

The following exercises demonstrate how the Blue Fusion object methods can be used in the Expression Engine node.

Exercise 1: How do I start a Blue Fusion instance and load a QKB?

Remember that these go into the Pre-processing tab.

    // Pre-processing
    // defines a bluefusion object called bf
    bluefusion bf;
    // initializes the bluefusion object bf
    bf = bluefusion_initialize()
    // loads the English USA Locale
    bf.loadqkb("ENUSA");

To load other QKBs besides ENUSA, refer to their abbreviation. Go to the dfPower Navigator and click Quality Knowledge Base to see which QKBs are available for your system.

Exercise 2: How do I create match codes?

After you initialize the Blue Fusion object with a QKB in the Pre-processing tab, enter the following expressions:

    // Expression
    // define mc as the return string that contains the matchcode
    string mc
    // define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // generate a matchcode for the string Washington D.C.,
    // using the City definition at a sensitivity of 85, and
    // put the result in mc
    ret = bf.matchcode("city", 85, "Washington DC", mc);
    // if an error occurs, display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Exercise 3: How do I use Blue Fusion standardize?

After you initialize the Blue Fusion object in the Pre-processing tab, enter the following expressions:

    // Expression
    // define stdn as the return string that contains the standardization
    string stdn
    // define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // standardize the phone number 9195550673,
    // and put the result in stdn
    ret = bf.standardize("phone", "9195550673", stdn);
    // if an error occurs display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Exercise 4: How do I use Blue Fusion identify?

After you initialize the Blue Fusion object in the Pre-processing tab, enter the following expressions:

    // Expression
    // define iden as the return string that contains the identification
    string iden
    // define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // generate an Ind/Org identification for IBM and put
    // the result in iden
    ret = bf.identify("Individual/Organization", "IBM", iden);
    // if an error occurs display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Exercise 5: How can I perform gender analysis?

After you initialize the Blue Fusion object in the Pre-processing tab, enter the following expressions:

    // Expression
    // define gend as the return string that contains the gender
    string gend
    // define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // generate a gender identification for Michael Smith,
    // and put the result in gend
    ret = bf.gender("name","Michael Smith",gend);
    // if an error occurs display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Exercise 6: How can I do string casing?

After you initialize the Blue Fusion object in the Pre-processing tab, enter the following expressions:

    // Expression
    // define case as the return string that contains the case
    string case
    // define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // convert the upper case NEW YORK to proper case
    ret = bf.case("Proper", 3, "NEW YORK",case);
    // if an error occurs display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Exercise 7: How can I do pattern analysis?

After you initialize the Blue Fusion object in the Pre-processing tab, enter the following expressions:

    //Expression
    //define pattern as the return string
    string pattern
    //define the return code ret as an integer
    integer ret
    // define a string to hold any error message that is returned
    string error_message
    // analyze the pattern 919-447-3000 and output the result
    // as pattern
    ret = bf.pattern("character", "919-447-3000", pattern);
    // if an error occurs display it; otherwise return a success message
    if ret == 0 then
        error_message = bf.getlasterror()
    else
        error_message = 'Successful'

Macro Variables

Macros (or variables) are used to substitute values in a job. This may be useful if you want to run a job in different modes at different times. For example, you may want to have the job run every week, but read from a different file every time it runs. In this situation, you would specify the filename with a macro rather than the actual name of the file. Then set that macro value either on the command line (if running Architect in batch) or using some other method.

•  Using Macro Variables
•  Using getvar() and setvar()
•  List of Macro Variable Functions

Using Macro Variables

All of the settings in the DataFlux configuration file are represented by macros in Architect. For example, the path to the Quality Knowledge Base (QKB) is represented by the macro BLUEFUSION/QKB. To use a macro in a job, enter it with double percent signs before and after the value, for example:

    Old value: C:\myfiles\inputfile01.txt
    Using macros, you enter instead: %%MYFILE%%

You can also use the macro to substitute some part of the parameter, for example: c:\myfiles\%%MYFILE%%. A macro can be used anywhere in a job where text can be entered. If an Architect step (such as a drop-down list) prevents you from entering a macro, you can go to the Advanced tab and enter it there. After you have entered a macro under the Advanced Properties tab, you get a warning if you try to return to the standard property dialog. Depending on your macro, you may need to avoid the standard property dialog and use the advanced dialog thereafter. If the property value is plain text, you can return to the standard dialog.

You can choose to use variables for data input paths and filenames in the dfPower Architect Expression node. You can declare macro variables by entering them in the Macros folder of dfPower Navigator, by editing the architect.cfg file directly, or by specifying a file location when you launch a job from the command line. When you add macros using Navigator, dfPower directly edits the architect.cfg file. If you choose to edit the architect.cfg file directly, you can also add multiple comments. Command line declarations override the macro variable values declared in architect.cfg. The results from Expression are determined by the code in the Expression Properties dialog.

Specifically, the value of a macro is determined in the following ways:

• If running in batch in Windows, the -VAR or -VARFILE option of archbatch lets you specify the values of the macros, for example:

    -VAR "key1=value1, key2=value2"
    -VARFILE "C:\mymacros.txt"

  In the second case, the file contains each macro on its own line, followed by an equal sign and the value.
• If running in batch on UNIX, all current environment variables are read in as macros.
• If running Architect in Windows, the values specified in Tools > Options > Global are used.
• If running the DataFlux Integration Server, the values can be passed in the SOAP request packet.
• If using an embedded job, the containing job can specify the values as parameters.
• The architect.cfg file can be used to store additional values, which are always read regardless of which mode is used.

If you are using archbatch.exe from the command line, an example of a macro declaration has the following format:

    archbatch.exe myjob.dmc -VAR my_field="address"

See the dfPower Studio Online Help topic, Using dfPower Batch - Command Line Batch, for more information on using macro variables.

Predefined Macros:

    Predefined Macro     Description
    _JOBFILENAME         The name of the current job.
    _JOBPATH             The path of the current job.
    _JOBPATHFILENAME     The path and filename to the current job.
    TEMP                 The path to the temporary directory.

Using getvar() and setvar()

Macro variable values can be read within a single function using the %my_macro% syntax. If you are using more than one expression in your job, use getvar() to read variables and setvar() to read and modify variables. With getvar() and setvar(), changes to the value persist from one function to the next. Note that changes affect only that session of Architect, and are not written back to the configuration file.

List of Macro Variable Functions

The expression step has two functions for dealing with macros:

    Function: getvar
    Syntax: string getvar(string, string)
    Description: Returns dfPower Architect runtime variables. These variables are passed to dfPower Architect on the command line using -VAR or -VARFILE. Variable is the name of the variable (case-insensitive). The second parameter is a value returned if the variable does not exist.

    Function: setvar
    Syntax: boolean setvar(string, string)
    Description: Sets the Architect macro variable value specified by the first parameter. Returns true.

Note: For setvar(), if you set a macro in an expression step on a page, the new value may not be reflected in nodes on that page. This is because those nodes have already read the old value of the macro and may have acted upon it (such as opening a file before the macro value was changed). This issue arises only from using setvar, and thus setvar is useful only for setting values that are read on following pages, or from other expression nodes with getvar().
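The %%NAME%% substitution and the rule that command line declarations override architect.cfg values can be modeled outside of Architect. The following is a minimal Python sketch of that behavior, not DataFlux code: the function names parse_varfile and expand_macros are hypothetical, and treating macro names as case-insensitive and leaving unknown macros untouched are assumptions made for illustration only.

```python
import re


def parse_varfile(text):
    """Parse macro definitions laid out one per line as name=value.

    This mirrors the -VARFILE layout described in this guide. Upper-casing
    the names (case-insensitive lookup) is an assumption of this sketch.
    """
    macros = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        macros[name.strip().upper()] = value.strip()
    return macros


def expand_macros(text, cfg_macros, cmdline_macros):
    """Replace every %%NAME%% in text with its macro value.

    Command line values take precedence over configuration file values.
    Unknown macros are left unchanged (an assumption of this sketch).
    """
    merged = {k.upper(): v for k, v in cfg_macros.items()}
    merged.update({k.upper(): v for k, v in cmdline_macros.items()})

    def substitute(match):
        return merged.get(match.group(1).upper(), match.group(0))

    return re.sub(r"%%([^%]+)%%", substitute, text)


if __name__ == "__main__":
    cfg = parse_varfile("MYFILE=inputfile01.txt\n")
    print(expand_macros(r"c:\myfiles\%%MYFILE%%", cfg, {}))
    print(expand_macros("%%MYFILE%%", cfg, {"MYFILE": "inputfile02.txt"}))
```

The second call shows the override order: the same macro name resolves to the command line value when both sources define it.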

Evaluating Incoming Data

The Expression Language provides built-in functions that allow you to evaluate data coming in from a parent node, as well as set the value of a field, and determine a field's type and maximum length. Fieldcount(), fieldname(n), and fieldvalue(n) allow you to dynamically access values from the parent node without knowing the names of the incoming fields.

List of Data Input Functions

Following is a list of data input functions.

    Function: fieldcount
    Syntax: integer fieldcount()
    Description: Returns the number of incoming fields.

    Function: fieldname
    Syntax: string fieldname(integer)
    Description: Returns the name of a specific field. Parameter is the index into the incoming fields.

    Function: fieldtype
    Syntax: string fieldtype(integer [, integer])
    Description: Determine the type and, optionally, the maximum length in chars (for string fields) based upon its index in the list of fields coming from the parent node. Returns a string representation of the field type, for example, integer. First parameter is the index into the incoming fields. Optional second parameter is set to the maximum string length in chars if the field type is string.

    Function: fieldvalue
    Syntax: string fieldvalue(integer)
    Description: Returns the value of a specified field as a string. Parameter is the index into the incoming fields.

    Function: setfieldvalue
    Syntax: boolean setfieldvalue(integer, any)
    Description: Set the value of a field based upon its index in the list of fields coming from the parent node. This is useful for enumerating through the fields and setting values.

Appendix A: ASCII Values

ASCII printable and control characters can be represented by the following decimal values.

ASCII Printable Characters

    Value  Character     Value  Character     Value  Character
    32     (space)       64     @             96     `
    33     !             65     A             97     a
    34     "             66     B             98     b
    35     #             67     C             99     c
    36     $             68     D             100    d
    37     %             69     E             101    e
    38     &             70     F             102    f
    39     '             71     G             103    g
    40     (             72     H             104    h
    41     )             73     I             105    i
    42     *             74     J             106    j
    43     +             75     K             107    k
    44     ,             76     L             108    l
    45     -             77     M             109    m
    46     .             78     N             110    n
    47     /             79     O             111    o
    48     0             80     P             112    p
    49     1             81     Q             113    q
    50     2             82     R             114    r
    51     3             83     S             115    s
    52     4             84     T             116    t
    53     5             85     U             117    u
    54     6             86     V             118    v
    55     7             87     W             119    w
    56     8             88     X             120    x
    57     9             89     Y             121    y
    58     :             90     Z             122    z
    59     ;             91     [             123    {
    60     <             92     \             124    |
    61     =             93     ]             125    }
    62     >             94     ^             126    ~
    63     ?             95     _

ASCII Control Characters

    Value  Character              Value  Character                  Value  Character
    0      Null character         11     Vertical tab               22     Synchronous idle
    1      Start of header        12     Form feed                  23     End of transmission block
    2      Start of text          13     Carriage return            24     Cancel
    3      End of text            14     Shift out                  25     End of medium
    4      End of transmission    15     Shift in                   26     Substitute
    5      Enquiry                16     Data link escape           27     Escape
    6      Acknowledgment         17     Device control 1           28     File separator
    7      Bell                   18     Device control 2           29     Group separator
    8      Backspace              19     Device control 3           30     Record separator
    9      Horizontal tab         20     Device control 4           31     Unit separator
    10     Line feed              21     Negative acknowledgment    127    Delete
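The split between printable and control characters in this appendix can be checked with a few lines of code. The sketch below is in Python rather than the Expression Language; the built-ins ord and chr play roughly the role that asc and chr play in Appendix B, and the helper name classify is hypothetical.

```python
def classify(code):
    """Return 'control', 'printable', or 'non-ascii' for a decimal code.

    The ranges follow the Appendix A tables: 0-31 and 127 are control
    characters, 32-126 are printable characters.
    """
    if 0 <= code <= 31 or code == 127:
        return "control"
    if 32 <= code <= 126:
        return "printable"
    return "non-ascii"


if __name__ == "__main__":
    for ch in ("A", "a", "~", " "):
        # ord() gives the decimal value listed in the tables
        print(repr(ch), ord(ch), classify(ord(ch)))
    # chr() goes the other way, from a decimal value to a character
    print(chr(64), classify(10), classify(127))
```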

Appendix B: List of Functions

The following table lists functions, their parameter types, and their return types. Return types are listed before each function name.

Note: There is a new keyword, bytes, for string size. The 255 in the declaration string(255) refers to characters, while in string(255 bytes) the number refers to 255 bytes. Specifically, string(255 bytes) is used for multibyte languages.

Function
    Syntax
    Description

abs
    real abs(real)
    Returns the absolute value of a number.

aparse
    integer aparse(string, string, object/array)
    Parses a string into a string array. The number of elements is returned.

asc
    integer asc(string)
    Returns an ASCII code for a character.

ceil
    real ceil(real)
    Returns the number rounded up to the nearest whole number (ceiling).

chr
    string chr(integer)
    Returns a character for an ASCII code.

close
    fileName.close()
    Closes the open file.

compare
    integer compare(string, string, boolean)
    Compares two strings. Returns -1 if first < second, 0 if they are equal, and 1 if first is > second. If the third parameter is true, they are compared case-insensitively.

copyfile
    copyfile(string, string)
    Copies a file.

dbconnect
    object/array dbconnect(string)
    Connect to a data source name (DSN).

dbdatasources
    object/array dbdatasources()
    Returns a list of data sources as a cursor.

decode
    integer decode(string, string, string)
    EXPERIMENTAL: Transcodes the string contents to the specified encoding.

deletefile
    boolean deletefile(string)
    Deletes a file from disk.

dim
    arrayName.dim(integer)
    Used for declaring an array of the specified size.

edit_distance
    integer edit_distance(string, string)
    Returns the number of letters that are different between two strings, as well as the number of corrections necessary to turn one string into the other. Corrections can include adding, removing, or changing a letter.

encode
    integer encode(string, string, string)
    EXPERIMENTAL: Transcodes the source buffer to the string encoding.

execute
    integer execute(string)
    Executes a program with zero or more arguments. The exit status of the program, which is a positive integer, is returned. If an error occurs, such as the program not being found, then -1 will be returned.

fieldcount
    integer fieldcount()
    A way of dynamically accessing values from the parent node without knowing the names of the incoming fields. Fieldcount returns the number of incoming fields.

fieldname
    string fieldname(integer)
    Fieldname returns the name of a specific field output from the parent node.

fieldtype
    integer fieldtype(integer, integer)
    Determine the type and, optionally, the maximum length in chars (for string fields) based upon its index in the list of fields coming from the parent node.

fieldvalue
    string fieldvalue(integer)
    Fieldvalue returns the string value of a specific field output from the parent node.

filedate
    date filedate(string, boolean)
    Returns the date a file was created. If second parameter is true, returns the modified date.

fileexists
    boolean fileexists(string)
    Returns true if the specified file exists.

floor
    real floor(real)
    Returns the number rounded down to the nearest whole number (floor).

formatdate
    string formatdate(date, string)
    Returns a date formatted as a string. The format parameter can include any string, but the following strings are replaced with the specified values:
    • YYYY: four-digit year
    • YY: two-digit year
    • MMMM: full month in proper case
    • MMM: abbreviated three-letter month
    • MM: two-digit month
    • DD: two-digit day
    • hh: hour
    • mm: minute
    • ss: second

formatib
    integer formatib(real, string, string)
    EXPERIMENTAL: Returns a number formatted in SAS IB.

formatpd
    integer formatpd(real, string, string)
    EXPERIMENTAL: Returns a number formatted in SAS PD.

formatpiccomp
    integer formatpiccomp(real, string, string)
    EXPERIMENTAL: Returns a number formatted in COMP.

formatpiccomp3
    integer formatpiccomp3(real, string, string)
    EXPERIMENTAL: Returns a number formatted in COMP-3.

formatpiccomp5
    integer formatpiccomp5(real, string, string)
    EXPERIMENTAL: Returns a number formatted in COMP-5.

formatpicsigndec
    integer formatpicsigndec(real, string, string, boolean, boolean)
    EXPERIMENTAL: Returns a number formatted in COBOL signed decimal.

formats370fib
    integer formats370fib(real, string, string)
    EXPERIMENTAL: Returns a number formatted in z/OS integer.

formats370fpd
    integer formats370fpd(real, string, string)
    EXPERIMENTAL: Returns a number formatted in z/OS packed decimal.

get
    arrayName.get(integer)
    Retrieves the value of the specified item within an array.

getvar
    string getvar(string, string)
    Returns dfPower Architect runtime variables. These are variables that are passed into dfPower Architect on the command line using -VAR or -VARFILE. Variable is the name of the variable (case insensitive). The second parameter is a value returned if the variable does not exist.

has_control_chars
    boolean has_control_chars(string)
    Determine if the string contains control characters.

ib
    real ib(string, string)
    EXPERIMENTAL: Returns a number from a SAS IB value.

inlist
    boolean inlist(any)
    Returns true if any of the check parameters match the value parameter. Null is propagated for the value parameter, but not for the check parameters.

instr
    integer instr(string, string, integer)
    Returns location of one string within another, stating the occurrence of the string. Returns 0 if not found.

isalpha
    boolean isalpha(any)
    Returns true if the expression is a string made up entirely of alphabetic characters.

isblank
    boolean isblank(any)
    This function returns true if a string is empty, null, or contains only spacelike characters.

isnull
    boolean isnull(any)
    Returns true if the expression is null.

isnumber
    boolean isnumber(any)
    Returns true if expression is a number.

left
    string left(string, integer)
    Returns the leftmost characters of a string, where the second parameter specifies the number of characters from the left.

len
    integer len(string)
    Returns the length of a string.

locale
    string locale(string)
    Sets the locale which affects certain operations such as uppercasing and date operations. Returns the old locale. If no parameter is passed, the current locale is returned. This locale setting is a global setting.

lower
    string lower(string)
    Returns the string in lowercase.

match_string
    boolean match_string(string, string)
    Determine if the first string matches the second string, which may contain wildcards.

max
    real max(real)
    Returns the maximum value of a series of values, or null if all values are null.

mid
    string mid(string, integer p, integer n)
    Returns a sub-string of a string starting at position p for n characters.

min
    real min(real)
    Returns the minimum value of a series of values, or null if all values are null.

movefile
    movefile(string, string)
    Moves or renames a file.

normsinv
    real normsinv(real)
    EXPERIMENTAL: Return the inverse of the cumulative standardized normal distribution.

open
    fileName.open(filePath, rw)
    Opens the specified file and sets read, write, or read/write properties.

parameter
    string parameter(integer)
    Returns the value of a parameter, or null if that parameter does not exist.

parametercount
    integer parametercount()
    Returns the number of parameters available.

parse
    integer parse(string, string, any)
    Parses the input string. Delimiter is one or more characters that delimit each token (for example, a comma). The parsed tokens are stored in the output variables. The output variables may be of any type. The function returns the total number of tokens found. This number is accurate even if you pass fewer than that total number of output variables. Example:
        integer cnt
        string a
        string b
        cnt=parse("DataFlux:dfPower:Architect", ":", a, b)
    After execution, the value of cnt is 3 because three tokens were found. The variable "a" holds DataFlux and the variable "b" holds dfPower.

pattern
    string pattern(string)
    Generates a pattern from the input string.

pd
    real pd(string, string)
    EXPERIMENTAL: Returns a number from a SAS PD value.

piccomp
    real piccomp(string, string)
    EXPERIMENTAL: Returns a number from a COMP value.

piccomp3
    real piccomp3(any, string)
    EXPERIMENTAL: Returns a number from a COMP-3 value.

piccomp5
    real piccomp5(string, string)
    EXPERIMENTAL: Returns a number from a COMP-5 value.

picsigndec
    real picsigndec(string, string, boolean, boolean)
    EXPERIMENTAL: Returns a number from a COBOL signed decimal value.

position
    fileName.position()
    Returns the current position in the file.

pow
    real pow(real n, real e)
    Returns n to the power of e.

print
    boolean print(string, boolean)
    Prints the string to the log.

pushrow
    boolean pushrow(boolean)
    This function pushes the current values of all symbols to a stack (including both field values for the current row and defined symbols in the code). When the next row is requested, it is given from the top of the stack instead of being read from the step above. When a row is given from the stack, it is not processed through the expression again. When the stack is empty, the rows are read from above as usual. This function always returns true.

raiseerror
    boolean raiseerror(string)
    Raises a user-defined error. Users can define a condition, and then use raiseerror to stop the job and return a user-defined error message when that condition occurs. Useful for evaluating problems unique to an installation. The user can then search for the error message to see if the associated condition was responsible for stopping the job.

readbytes
    readbytes(integer, string)
    Reads a set number of bytes from an open file. String holds the bytes that are read.

readline
    fileName.readline()
    Reads the next line from a text file.

readrow
    boolean readrow()
    This is a dfPower Architect-only function. It reads the next row of data from the step above and fills the variables that represent the incoming step's data with the new values. It returns false if there are no more rows to read. For example: Assume that this step is below a step with a name field and the step outputs four rows, "John," "Paul," "George," and "Igor":
        string oldname
        oldname=name
        // oldname now contains "John," and name contains "John"
        readrow()
        // oldname still contains "John," but name now contains "Paul"

replace
    string replace(string, string, string, integer)
    Replaces the first occurrence of one string with another string. Returns the string with the replacement made. If the fourth parameter is omitted or set to 0, all occurrences are replaced in the string. If set to another number, that many replacements are made.

right
    string right(string, integer)
    Returns the rightmost characters of a string, where the second parameter is the number of characters from the right.

round
    real round(real, integer)
    Rounds a number to specified decimal places.

rowestimate
    boolean rowestimate(integer)
    Sets the total number of rows that this step reports. This is only an estimate.

s370fib
    real s370fib(string, string)
    EXPERIMENTAL: Returns a number from a z/OS integer.

s370fpd
    real s370fpd(string, string)
    EXPERIMENTAL: Returns a number from a z/OS packed decimal.

seekbegin
    seekbegin()
    Sets the file pointer to a position starting at the beginning of the file. Optional parameter specifies the position. Returns true on success, false otherwise.

seekcurrent
    seekcurrent()
    Sets the file pointer to a position starting at the current position. Optional parameter specifies the number of bytes from the current position. Returns true on success, false otherwise.

seekend
    seekend()
    Sets the file pointer to a position starting at the end of the file. Optional parameter specifies the position from the end of the file. Returns true on success, false otherwise.

set
    arrayName.set(integer, any)
    Sets values for items within an array.

seteof
    boolean seteof(boolean)
    Sets status to end of file (EOF), preventing further rows from being read from the step. If parameter is true, pushed rows are still returned.

setfieldvalue
    boolean setfieldvalue(index, any)
    Set the value of a field based upon its index in the list of fields coming from the parent node. This is useful for enumerating through the fields and setting values.

setvar
    boolean setvar(string, string)
    Sets the Architect macro variable value specified by the first parameter. Returns true.

sleep
    boolean sleep(integer)
    Sleeps the specified number of milliseconds and invokes the interrupt handler.

sort
    string sort(string, boolean, boolean)
    Return the string with its contents sorted alphabetically.

sort_words
    string sort_words(String [, Boolean [, Boolean] ])
    Returns a string that consists of the words in the input string sorted alphabetically. The first Boolean parameter causes the words within the string to be sorted in ascending order. Default is TRUE (ascending). FALSE results in descending order. Second Boolean parameter determines if duplicate words within the string should be discarded. TRUE results in duplicate strings being discarded. Default is FALSE (duplicates are not discarded).

today
    date today()
    Returns the current date/time.

trim
    string trim(string)
    Returns the string with leading and trailing white space removed.

typeof
    string typeof(any)
    Returns the type of data that an expression evaluates to.

upper
    string upper(string)
    Returns the string in uppercase.

vareval
    vareval(string)
    This function evaluates a string as if it were a variable.

varset
    boolean varset(string, any)
    Set the variable with the given name to a value.

writebytes
    writebytes(integer, string)
    Writes a set number of bytes to an open file. String contains the data to be written.

writeline
    fileName.writeline(string)
    Writes string to the current cursor location in the file. This function overwrites existing content.
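Two of the descriptions in Appendix B, parse and edit_distance, are easier to follow with a worked model. The Python sketch below is not the DataFlux implementation: it assumes that the parse delimiter is a single literal string and that edit_distance is the standard Levenshtein distance (one unit per added, removed, or changed letter), which is what the description suggests but does not state outright. The max_outputs parameter is a hypothetical stand-in for the number of output variables passed to parse.

```python
def parse(text, delimiter, max_outputs):
    """Model of parse: split text on delimiter.

    Returns the total number of tokens found plus the tokens that would
    be stored in the first max_outputs output variables. The count stays
    accurate even when fewer output variables are supplied.
    """
    tokens = text.split(delimiter)
    return len(tokens), tokens[:max_outputs]


def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-letter additions,
    removals, or changes needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # remove a letter
                            curr[j - 1] + 1,      # add a letter
                            prev[j - 1] + cost))  # change (or keep) a letter
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    cnt, outputs = parse("DataFlux:dfPower:Architect", ":", 2)
    print(cnt, outputs)
    print(edit_distance("kitten", "sitting"))
```

With two output variables, the model reproduces the parse example in the table: three tokens are counted, and only the first two are stored.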

Glossary

ASCII
    American Standard Code for Information Interchange

Comments
    Comments are text within a code segment that are not executed. Comments can be either C-style (starts with /* and ends with */) or C++ style (starts with // and continues to the end of a line).

EEL
    Expression Engine Language

EOF
    end of file

IEEE 754
    The IEEE Standard for Binary Floating-Point Arithmetic. This standard applies to how floating point numbers are represented and the operations related to them.