
Chandra to Learns



If you meet a girl for the first time, the word “complex” is the best definition. As you spend more time and, by the grace of God, you see compatibility, you start liking her. Life goes on and she makes your life a lot easier. A huge number of ifs and buts work around it. The moral of the story is that you need to spend time to make it happen. ACL is no different. This is what I fell in love with, and it probably gave me the drive to write this book.




Let’s understand the need for a data analytics tool.

The world would probably not be what it is today without Excel or a similar tool. If you need to work on some data and analyze it, the obvious choice is MS Excel. It is a great tool with lots of flexibility and options. But one feature of this tool will eat your head away and may tempt you to bang your nice laptop against the wall behind you: whenever MS Excel sees data in the range of 100,000 data lines, it happily starts showing “Not Responding” in the top left corner. In this situation I would probably switch to ACL. ACL works fine up to 4-5 million data lines. If the data is much bigger, SAS would be my next choice of tool. Compared to SAS, ACL is much easier to handle and you can work faster with it. No doubt ACL, “Audit Command Language”, is one of the most preferred CAAT (Computer Assisted Audit Techniques) tools. The other advantage of ACL over Excel, and the most genuine reason why audit firms like it, is log generation. Auditors always seek evidence and, if needed, re-performance; the log serves the purpose. Scripting is another feature with which you can do magic in ACL. You can take Excel macros as an analogy for understanding ACL scripts. But yes! An ACL script is not as complex as VB Script.

I would also like to raise an important point here, which is a common mistake by beginners. Although I took the analogy of a girlfriend, never trust an ACL output. As a best practice, always cross-check the output at each level. Ask the question: what control do you have to check the accuracy? If you don’t have any, you are not prepared to work on it. I would probably call it myopic vision on the part of the ACL manufacturer: ACL, as the name attached to it suggests, is placed as specialized data analytics software used by audit firms, whereas SAS has taken a place in almost every industry, including audit. The game theory of John Nash may give ACL Services Ltd. a future strategy for placing its products. Of late I have found that ACL Inc. has tried to dissociate itself from the term “audit” to give the tool a broader perspective. So they call it just ACL and not Audit Command Language. Consumer behavior is driven by three factors: 1. Experience, 2. Perception and 3. Imagination. Perception often builds around how the product is placed, and some part of it is carved by imagination. However, the product experience lays the foundation of a customer relationship.


What do you see?

How do you see them?
Are they animals, four-legged mammals, a dog, a rabbit and a cow, or just some living beings? There can be more and better ways to define them, but the most perfect one should be contextually relevant. Perspective should be based on how the information is used. While reading, cleaning and formatting data, it is probably a combination of numeric, character and date fields. While analyzing, it may be a combination of fields like account, amount, description etc. To an auditor, it may be a general sales or purchase ledger.


Understanding the data
The frequent visitors to my cubicle, with their ACL worries and sweat on their foreheads, were stuck and came to me for one reason: they could not understand the data. I would call understanding data an art rather than a technique. If your life touches lots of data and this is something you love, each day you can find a new color of learning. If I am working with a data analytics tool, the first thing I like to do is understand the data. I check the number of records (rows) the data has. I also segregate the data fields (columns) on the basis of data type. Although there can be many data types, at this point, for the purpose of simplification, I try to understand the data in terms of Date, Character and Numeric fields. A Date field is easy to understand. The thin line between Character and Numeric is the question “which field would I like to add up?” For example, an account number is a number, but I would take it as a Character field for the simple reason that I would never add up account numbers. The corresponding amounts in those accounts, though, will be Numeric, as I wish to add them up. The way SAS works or stores data sometimes amuses and also surprises me. Take a date. SAS first interprets the date while reading it and then counts the number of days since January 1, 1960. I don’t know why January 1, 1960 in particular. Maybe it is the birthday of one of their co-founders, Jim Goodnight, Jim Barr, John Sall or Jane Helwig. This number is referred to as a SAS date value. In ACL, on the other hand, the input date format specifies how dates are stored in the source data. It does not affect the way ACL displays dates in views and reports. The output format is defined through the Options feature, where the format can be customized. It is very important to understand how ACL reads the data. I will cover this in a separate volume of this series of books.
For the time being I am just skipping this topic.
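The SAS date value described above is simple arithmetic, and a minimal sketch may make it concrete. The example is written in Python rather than SAS or ACL, purely as an illustration; the function names to_sas_date_value and from_sas_date_value are hypothetical and not part of either tool.

```python
from datetime import date, timedelta

# SAS counts dates as whole days elapsed since this day, which is day 0.
SAS_EPOCH = date(1960, 1, 1)

def to_sas_date_value(d: date) -> int:
    """Return the number of days between 1 January 1960 and d."""
    return (d - SAS_EPOCH).days

def from_sas_date_value(n: int) -> date:
    """Return the calendar date that lies n days after 1 January 1960."""
    return SAS_EPOCH + timedelta(days=n)

print(to_sas_date_value(date(1960, 1, 1)))   # 0
print(to_sas_date_value(date(1961, 1, 1)))   # 366, because 1960 was a leap year
print(from_sas_date_value(-1))               # 1959-12-31
```

Dates before 1960 simply become negative numbers, which is why a stored date can be added to and subtracted from like any other numeric field.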


Steps to create a new project in ACL.
1. Create a project folder. Copy the path of the folder's location.
2. Invoke ACL.
3. You can either create a new project or resume an existing project (see image).
4. To create a new project, ACL asks for the location where you wish to save your project. Here we can paste the path of the above-mentioned folder.
5. To start a new project, ACL asks for the data on which the procedure is to be performed. ACL will pop up the “Data Definition Wizard”.
6. The Data Definition Wizard leads through the steps to define a data file.
7. The next window asks for the source of data. If the data resides on your own machine, “Disk” would be our choice. Frankly speaking, I have never used the other two options.
8. Now, with the pop-up browser, you can locate the data file.
9. In this example, I have the data in text file format.
10. Now it is time to define the data. Since I had a tab-delimited text file to read, I need to define the field characteristics.
11. If the source of data is Excel, Access or any other DB source, defining data probably won’t be your headache. Most of the time it takes the correct format, but not always.


12. If reading a delimited text file, the data specification can be entered through the window on the left side. It asks for the delimiter and the text qualifier, if any.
13. The next window helps to define the fields. These two steps are the most important in defining data.

14. Once the data is read, the table is ready for performing procedures. In the background ACL creates the following system files, which together make up the project.
· .ACL & .AC: ACL project file. This file contains the scripts, workspaces and table layouts of the source files used in the project (it must be backed up after each major analysis and finally before deleting it from the hard drive).
· .LOG & .LIX: ACL log files. These files contain the history of all commands run in ACL by the user (they must be backed up before deleting them from the hard drive).
· .INX: ACL data index files. These files can be re-created; however, for a ready-to-use backup it is desirable to back them up as well before they are deleted from the system.
· .FIL: ACL output (result) data files. These files can be re-created by executing the same commands that were used to create them, provided the source data has not changed; however, for a ready-to-use backup, they must be backed up after each major analysis and finally before deleting them from the hard drive.
· IFD_WIZ.ERR: error log.


Transferring ACL dump: To transfer ACL files from one machine to another, it is advisable to transfer the whole ACL project folder. When the ACL project is opened on another computer, the absolute paths of the files no longer remain the same, so ACL asks for the path of the files and pops up a browser. As all the data files can be recognised by the extension “.FIL”, we can point the respective ACL data files to their tables. Once the tables are assigned, the project is ready for use. A similar pop-up may appear on clicking a table if its “.FIL” has somehow been deleted. We cannot retrieve the table in this case, and the only option left is to regenerate it.


Main View


The main view consists of a side window called “Project Navigator”, the main window, the command line and the menu bar. The menu bar consists of quick-use command icons and drop-down menus. The main window is meant for output. Output can be obtained in the form of Print, which sends a print command to the printer; File, which creates a table in the ACL project; Graph, which draws a graph; or Screen, which shows the output in the main window. If you perform a procedure that generates output on screen, this screen gets replaced by the next on-screen output unless you pin it down. For example, the output of Classify remains alongside the output of Statistics when it is pinned; otherwise it would have disappeared. The Project Navigator has two tabs: the Overview tab and the Log tab. The Overview tab displays the various tables, folders, scripts and workspaces. Double-clicking on any table makes it active; any command in the command line runs on the active table. The Log tab shows all previously run commands. Audit always seeks evidence, and the log is evidence that a procedure was performed. The log is one of the features ACL most needs to be an audit tool for data analytics. The main menu can be customized and more command icons can be added, similar to Excel. The same command can be run using a command icon, a drop-down menu or a script. We will have a separate volume on scripting. Total, Classify, Stratify, Export, Age, Edit table layout and Add columns are some of the most frequently used command icons.

These four icons feature most prominently on the Filter line. The first icon moves your selection to the next unfiltered record. The second icon removes any active filter. The third icon sets a filter and the fourth icon edits a filter. Whenever I am not sure what filter to put for a particular condition, I generally select the fourth icon, which pops up another window that helps to define the filter. Otherwise, if I know the expression for the filter, I can type it directly into the Filter line. For example, ABS(AMOUNT)>10000 will filter the table to display all records whose “Amount” field has an absolute value of more than 10000. Alternatively, I can type a command in the command line to the same effect. Example: Set Filter to ABS(AMOUNT)>10000. This filter will run on the active table. If the ACL project has many tables, the table with a green dot represents the active table.
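The filter ABS(AMOUNT)>10000 is a test applied to each record in turn. As an illustration outside ACL, here is a minimal Python sketch of the same logic; the records list and the function name abs_amount_over are hypothetical, made up for this example.

```python
# Hypothetical records standing in for the active table.
records = [
    {"ACCOUNT": "2303", "AMOUNT": 12500.00},
    {"ACCOUNT": "2304", "AMOUNT": -15000.00},
    {"ACCOUNT": "2305", "AMOUNT": 9800.50},
    {"ACCOUNT": "2306", "AMOUNT": -10000.00},
]

def abs_amount_over(rows, threshold):
    """Keep only the rows whose AMOUNT has an absolute value above threshold."""
    return [row for row in rows if abs(row["AMOUNT"]) > threshold]

filtered = abs_amount_over(records, 10000)
for row in filtered:
    print(row["ACCOUNT"], row["AMOUNT"])
```

Note that the large negative amount passes the test because of the absolute value, while an amount of exactly -10000 does not, since the comparison is strictly "greater than".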


Data Reading
If you wish to travel from Hyderabad to Bangalore, you would like to know the length of the journey and how good the route is. There are many other variables, but you will know them only when you face them. Before the journey, though, it is very important to decide upon the mode of transportation. A data analytics tool may be taken as analogous to a mode of transportation. I see data in terms of two parameters: complexity and size.

If I have to compare SAS with ACL across various characteristics, SAS would score higher. SAS is more versatile. It has wider applicability. It is more robust. Its capability to handle Herculean-sized data is just incredible. If I have data in the range of a billion or, say, a trillion data lines, I probably don’t have any option other than SAS. But value comes with a price. ACL is delicate but easy to use. If you have a complex kind of data, reading it through SAS may bug your head, and ACL may just prove a sweetheart. Suppose you have data in which not all the information to be extracted is presented neatly in rows and columns. For example, in the picture all the arrow-marked data is to be read as separate fields. So for Investor “339” there is a set of entries to be read. The investor may change after a couple of entries. This kind of data can be read through SAS, but it may eat up your life. ACL has some special features for reading this kind of complex data. Reading data is altogether a big section to cover; we can probably have that in a separate volume. For the time being we can focus on reading simple data and performing procedures.


Manipulating data
If you wish to create a field from part of the information in another field, or from a combination of information from other fields, the SUBSTR( string , start , length ) function is the handiest to use. The parameters to pass are the source field, the position to start from and the length to be extracted.

Example 1: Making CD-25-000-2303 into an account 2303. Here, I start at position 11 and the length is 4.

Example 2: Making CD-25-000-2303 into an account 2303-CD by using the expression “SUBSTR( Account_Number, 11, 4 ) + "-" + SUBSTR( Account_Number, 1, 2 )”.

Example 3: Making CD-25-000-2303 into an account 2303-GL by adding extra text, using the expression “SUBSTR( Account_Number, 11, 4 ) + "-GL"”.

Example 4: Applying a condition to a field value
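The first three SUBSTR examples can be checked with a small Python sketch. This is only an illustration and not ACL code: the helper substr is a hypothetical stand-in that copies the 1-based start position described above, and it ignores anything ACL may do when the requested length runs past the end of the string.

```python
def substr(text: str, start: int, length: int) -> str:
    """Mimic SUBSTR(string, start, length), where start is counted from 1."""
    begin = start - 1          # Python slices count from 0
    return text[begin:begin + length]

account_number = "CD-25-000-2303"

example_1 = substr(account_number, 11, 4)
example_2 = substr(account_number, 11, 4) + "-" + substr(account_number, 1, 2)
example_3 = substr(account_number, 11, 4) + "-GL"

print(example_1)  # 2303
print(example_2)  # 2303-CD
print(example_3)  # 2303-GL
```

Counting the characters of CD-25-000-2303 by hand confirms that the final "2303" begins at position 11, which is why the start parameter is 11 and not 10.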


Knowing the work means knowing the tools. Let’s talk about ACL functions to understand their usage. Any CAAT (Computer Assisted Audit Techniques) tool generally follows three steps to achieve an audit objective:
1. Extract and analyze data.
2. Trace out trends within the data and identify exceptions.
3. Test the sample for potential fraud within it.

Testing a sample can provide reasonable assurance, but manually testing all the data is not feasible either. The most used tab of the ACL toolbar is Data, which is oriented towards data handling. Analyze, as the name suggests, diagnoses the data, while Tools has other supporting utilities.


When dealing with a big set of data to find fraud, an auditor always looks for a basic overview. An overview is visible from a summary of the data. ACL has various commands according to the type of summarizing operation required. Stratify summarizes the table according to numeric ranges.
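To show what summarizing by numeric ranges means, here is a minimal Python sketch. It is not ACL's implementation of Stratify: the function stratify, the equal-width ranges and the sample amounts are all assumptions made for illustration, and values outside the chosen minimum and maximum are simply skipped.

```python
def stratify(amounts, minimum, maximum, intervals):
    """Count and total the amounts falling into equal-width numeric ranges."""
    width = (maximum - minimum) / intervals
    strata = [
        {"low": minimum + i * width, "high": minimum + (i + 1) * width,
         "count": 0, "total": 0.0}
        for i in range(intervals)
    ]
    for amount in amounts:
        if amount < minimum or amount > maximum:
            continue  # out-of-range values are ignored in this sketch
        # The maximum itself is placed in the last range.
        index = min(int((amount - minimum) / width), intervals - 1)
        strata[index]["count"] += 1
        strata[index]["total"] += amount
    return strata

# Hypothetical transaction amounts.
amounts = [120.0, 950.0, 2400.0, 2600.0, 4999.0, 7300.0, 10000.0]
result = stratify(amounts, 0, 10000, 4)
for stratum in result:
    print(stratum["low"], stratum["high"], stratum["count"], stratum["total"])
```

A summary like this gives the quick overview an auditor looks for: how many transactions sit in each band and how much value each band carries.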


One of the most frequently used methods of misstating financial statements is to overstate the value of assets. Among the many ways of inflating assets, accounts receivable is one balance sheet item that can be played with. Revenue recognition occurs independently of cash movement. The fundamental principle of accrual accounting is that revenue is recognised when earned, and an A/R, an asset, is created against it. When cash towards the A/R is realized, the counter transaction reduces A/R and increases cash. From an investor’s point of view, receivables turnover (credit sales / average accounts receivable) and payables turnover are the most sought-after ratios for judging the cash management of a company. To boost sales, firms extend credit sales and so indirectly extend interest-free loans to their clients. A high receivables turnover ratio indicates efficient collection of A/R, or a firm that operates largely on a cash basis. An auditor’s job is a bit tougher than that of an investment banker who looks at the quality of accounts receivable. Revenue recognition should be appropriate, applied consistently and based on the firm’s accounting policy, which must be consistent with significant accounting policies and comply with applicable accounting pronouncements on receivables balances. I feel I am missing a lot of points here on A/R… I will try touching on them later.
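The receivables turnover ratio mentioned above (credit sales divided by average accounts receivable) can be worked through in a few lines of Python. This is a sketch under stated assumptions: the average is taken as the simple mean of the opening and closing balances, and all the figures are hypothetical.

```python
def receivables_turnover(credit_sales, opening_receivables, closing_receivables):
    """Credit sales divided by the average of opening and closing A/R."""
    average_receivables = (opening_receivables + closing_receivables) / 2
    return credit_sales / average_receivables

# Hypothetical figures for one year.
turnover = receivables_turnover(
    credit_sales=1_200_000,
    opening_receivables=180_000,
    closing_receivables=220_000,
)
print(turnover)  # 6.0
```

With these figures the average A/R is 200,000, so the receivables were turned over six times in the year.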


The pivot table is one of the greatest features Excel has. The basic purpose of this feature is to summarise data based on one or more fields, with the additional option of keeping filters on some other fields. The equivalent of the pivot table in ACL is Summarize. This function can be troublesome for beginners because of the very frequent encounter with the warning shown in the image. The simple solution is to take the output to a file rather than to the screen.
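The idea shared by a pivot table and Summarize, one output row per key value with a count and a subtotal, can be sketched in Python as follows. This illustrates the concept only and is not how ACL implements it; the transactions list and the function summarise are hypothetical.

```python
from collections import defaultdict

# Hypothetical (account, amount) transactions.
transactions = [
    ("2303", 500.0),
    ("2304", 125.0),
    ("2303", 250.0),
    ("2305", 75.0),
    ("2304", 25.0),
]

def summarise(rows):
    """Return one entry per account with its record count and total amount."""
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for account, amount in rows:
        summary[account]["count"] += 1
        summary[account]["total"] += amount
    return dict(sorted(summary.items()))  # ordered by account for readability

summary = summarise(transactions)
for account, figures in summary.items():
    print(account, figures["count"], figures["total"])
```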


Writing a Macro In ACL

“Macro” is probably one of the most fascinating buzzwords in almost any coding language. If we go by the definition, it goes over the head. Let me pick it up directly from Wikipedia: “A macro (from the Greek "μάκρο" for big or far) in computer science is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to an output sequence (also often a sequence of characters) according to a defined procedure. The mapping process that instantiates a macro into a specific output sequence is known as macro expansion. The term originated with macro-assemblers, where the idea is to make available to the programmer a sequence of computing instructions as a single program statement, making the programming task less tedious.” I just got up; it is 2:33 AM by my watch, which runs 2 minutes slow, and I am in the middle of my sleep. The buzzword “Macro” is pounding in my head. I just had the idea of writing a macro in ACL, although writing an ACL macro needs some scripting background. Let me explain macros with an example of SAS coding. I cannot use ACL scripting to explain it because, unfortunately, macro as a function does not exist in ACL.


It has been a year since I last touched ACL; maybe I am feeling divorced. I don’t use ACL anymore because I am more in the auditing space these days than in analytics, and audit don’t


Benford Analysis
This is my 8th attempt to write something about Benford Analysis. Whenever I tried to convince someone that it is simply nature that follows Benford's law or, vice versa, that Benford just reveals nature, I could never convince anyone in the first instance. So I thought I would give some background before talking more about Benford. I searched for this topic on the internet, and in a few instances I could not find better words to describe the topic, so to keep the taste of the literature intact I am putting down parts of that literature verbatim. An article, “I’ve Got Your Number”, published in 1999 in the Journal of Accountancy, captures this topic very well. There was a physicist at the GE Research Laboratories in Schenectady, New York, in the 1920s named Frank Benford. He noticed that the first few pages of his logarithm table books were more worn than the last few, and from this he surmised that he was consulting the first pages more often than the later ones. Benford concluded that he was looking up the logs of numbers with low first digits more frequently because there were more numbers with low first digits in the world. Thus, if we take a naturally occurring dataset, numbers with a smaller leading digit (say 10, 11…) will occur much more often than numbers with a higher leading digit (say 90, 91…). To confirm his observation he took 20 lists of numbers with a total of 20,229 observations. His lists came from varied sources, such as geographic, scientific and demographic data. One list contained all the numbers in an issue of Reader's Digest. He found that about 31% of the numbers had 1 as the first digit, 19% had 2, and only 5% had 9 as a first digit. Benford then made some physics-related assumptions about the distribution of naturally occurring data and, using integral calculus, he computed the expected frequencies of the digits and digit combinations.
At this point it is worth understanding that this is about nature, so the data should be naturally occurring. Manually assigned numbers, such as Social Security numbers, zip codes or bank account numbers, will not conform to Benford's law. The law holds for data that evolves or occurs naturally under certain conditions.

Benford as Fraud Analytics: In 1993, in State of Arizona v. Wayne James Nelson (CV92-18841), the accused was found guilty of trying to defraud the state of nearly $2 million. Nelson, a manager in the office of the Arizona State Treasurer, disclosed that he had diverted funds to a bogus vendor to demonstrate the absence of safeguards in a new computer system. Because human choices are not random, invented numbers are unlikely to follow Benford's law. Here are some divergent signs that Benford's law would have drawn attention to:
§ As is often the case in fraud, the embezzler started small and then increased dollar amounts. Most of the amounts were just below $100,000. It's possible that higher dollar amounts received additional scrutiny or that checks above that amount required human signatures instead of automated check writing. By keeping the amounts just below an additional control threshold, the manager tried to conceal the fraud.

§ The digit patterns of the check amounts are almost opposite to those of Benford's law. Over 90% have 7, 8 or 9 as a first digit. Had each vendor been tested against Benford's law, this set of numbers also would have had a low conformity, signaling an irregularity.

In more statistical terms, Benford's law can be stated as follows: a leading digit d (d ∈ {1, …, b − 1}) in base b (b ≥ 2) occurs with probability $P(d) = \log_b(d + 1) - \log_b(d) = \log_b\left(1 + \frac{1}{d}\right)$. This quantity is exactly the space between d and d + 1 on a logarithmic scale. A well-planned fraud is probably not an easy nut to crack, but Benford can be the right tool to highlight the problem area if other logic fails to detect the irregularities. Let me work out a Benford test using ACL. I ran Benford on the transaction amount for the first two leading digits. ACL quickly took the first two digits and plotted the frequency distribution for those numbers. The Benford graph in ACL looks as below. Here one can point out that transaction amounts with leading digits 66 have been posted more times than expected. In this way we look for high deviations from the expected count on either side.
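A first-two-digits Benford test of the kind described above can be sketched in Python. This is not ACL's Benford command: the functions benford_expected, leading_digits and benford_test and the sample amounts are hypothetical, and the expected proportion uses the standard form of Benford's law, log10(1 + 1/d), applied to the two-digit values 10 to 99.

```python
import math
from collections import Counter

def benford_expected(leading: int, base: int = 10) -> float:
    """Expected proportion of numbers that start with the digits `leading`."""
    return math.log(1 + 1 / leading, base)

def leading_digits(amount: float, n: int = 2) -> int:
    """Return the first n significant digits of a non-zero amount."""
    # Scientific notation puts the significant digits up front: 6.612340e+03
    mantissa = f"{abs(amount):e}".split("e")[0].replace(".", "")
    return int(mantissa[:n])

def benford_test(amounts, n: int = 2):
    """Return (leading digits, observed count, expected count) for each value."""
    observed = Counter(leading_digits(a, n) for a in amounts if a != 0)
    total = sum(observed.values())
    rows = []
    for lead in range(10 ** (n - 1), 10 ** n):
        rows.append((lead, observed.get(lead, 0), benford_expected(lead) * total))
    return rows

# Hypothetical transaction amounts; zero amounts have no leading digit.
sample_amounts = [6600.0, 6612.34, 66.10, 1023.0, 1500.0, 0.0]
rows = benford_test(sample_amounts)
for lead, seen, expected in rows:
    if seen:
        print(lead, seen, round(expected, 3))
```

On real data the rows where the observed count sits far above or below the expected count are the ones to review, which is what the spike at 66 in the graph represents.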



Sometimes things react in a silly way; you feel like banging your head.

This is a very common technique, be it in SAS, ACL or even while writing a VBA macro: you run a procedure using the GUI, record the code and reuse it for your purpose. I had a great number of files to be read and a common procedure to be performed. The logic says: read one file, perform the whole procedure and record the code/script. Once you have the script in hand you can run it for all the files… Very easy… I had probably done that many times in SAS using the “run last submit” command tab. One fine day I had a similar task. I read one file, recorded the script and re-ran it. But the output of the re-run script and the original GUI output were different. Ideally both should have given the same result, as they were the same code. Soon my fine day became ugly… I banged my head… neither head nor wall broke… and so I left for coffee (maybe that would help). When I came back, I thought I would analyse the code. I copied the code from the log to Word, and there was nothing that I found unusual. So I copied the code from the Word file to a new script and ran it again. This time the output was as expected…

What magic a coffee can do!!!

So again I copied the command from the log to a script, and again it did not work… What the hell… Then I realized that copying code to Word and from Word to a script is different from copying code from the log directly to a script. In the latter case it removes all the formatting and even removes some of the defined spaces from the code, and the code reacts differently. What I learnt is: use Word in between copying an ACL command and pasting the same command into a script… It will save your head.


I was a very happy user until I learnt the statistical techniques used in auditing. In the Big 4 firm where I work, there is an in-house developed tool called STAR, based on the book “Statistical Techniques for Analytical Review in Auditing” by Kenneth Stringer and Trevor Stewart. I wondered why we don’t use ACL, rather than an Excel macro based tool, to perform these regressions for analytical review; then I realized that ACL does not have any statistical capability at all. This was the first time I understood that ACL is not a statistical data analytics tool but just an analytical tool. So, as far as statistics is concerned, I would rather call it incompetent. This is another reason why SAS is superior. Then I started searching for the nearest relative of ACL that has statistical capabilities, and my search ended at IBM’s SPSS. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others. SPSS has almost everything that ACL does not. But as SPSS is very generic software and audit is not a target segment, its makers probably never thought of equipping it with functions that can perform auditing procedures with ease. This is why SPSS is almost unknown to auditors. ACL has another serious deficiency, in plotting graphs. ACL can present its output in graphs but cannot plot a graph of given data. If I have data on sales and cost of sales and I wish to spot the outliers in a quick review, ACL is certainly not my answer. And sometimes I even feel like saying “ACL, I don’t love you any more”. Statistical innovation in auditing is not mature. I have never heard of anything more than regression in the auditing space, and even that came from none other than the book by Kenneth Stringer and Trevor Stewart. Maybe in time statistics will have a better position in the auditing space, and ACL Inc. may have to re-evaluate the capabilities of its software to survive.


This book is not complete. Keep watching this space for updates and revisions.

This book is written with no monetary purpose. Print, transfer, copy, download, upload… anything you wish; please feel free. If you have any comment, suggestion, criticism, grievance, query or request, do write to me at … The purpose of this book is not to be a textbook for any curriculum; I just wrote it for fun. I may have missed or totally ignored some topics. However, if you feel something should have been here, drop me a line at the email ID above and I will try to accommodate it.
