
Anomaly Detection Capability of Existing Models of Continuity Equations in Continuous Assurance

E.J.F. VAN KEMPEN ANR: 201386 MSc Accounting

Supervisor : dr. H.J.R. Litjens RA

2015 - 2016


Master thesis, Department Accountancy, Tilburg School of Economics and Management, Tilburg University


Second reader : Prof. dr. J.P.M. Suijs

Date of completion: January 8, 2016


Preface

In my previous experience as an engineering student I was constantly trained to optimize operations in terms of efficiency, while maintaining full effectiveness. It did not matter whether it was optimizing power efficiency in a specifically designed multicore processor or optimizing everyday tasks such as making coffee. Tasks or activities of a highly repetitive nature should never be done manually if automating the process could be done in less time. A few years later I set foot in an audit firm for the first time. Within a day I found it hard to believe that my co-workers and I were reconciling bank account statements with G/L account statements manually, by drawing tick marks if two figures were the same.

As a former engineering student I immediately cringed. Why would I ever manually check only a sample of figures if the whole population is already digitally available and it could be checked almost instantly? Unfortunately, the auditing profession is only slowly adopting more innovative ways of testing. However, I am certain that this process could be catalyzed by clearly presenting the benefits and drawbacks and by making innovative testing procedures easier to understand and implement.

This thesis should provide a better understanding of one of these innovative testing procedures: continuity equations. Even though I am not able to end the tyranny of the status quo on my own, this thesis is the first of a series of baby steps in that direction. I think it could be beneficial for the audit profession to learn about this tool, even if it proves not to be extremely powerful.

First, I would like to thank my supervisor Robin Litjens for his guidance and support during this epic adventure. Second, I would like to thank my friend Niels Weterings for his encouragement to keep writing and running the testing scripts while I was struggling with time management, which proved essential to finalizing this thesis. Third, I would like to thank all my friends who helped in any way or form during this period. Last, I am very grateful for receiving a grant from the Ministry of Education, Culture and Science (Ministerie van Onderwijs, Cultuur en Wetenschap) to pursue my master's degree in accountancy.

E.J.F. van Kempen Eindhoven, 2016

Abstract

Continuous assurance is a methodology to provide assurance on financial data on a near real-time basis. One of the fundamental elements of continuous assurance is continuous data auditing, in which the integrity of the data provided by the client is tested. Continuity equations can be used to evidence assertions regarding data integrity. In order to do so, data is tested by predicting subsequent values based on a fitted model. In total there are five models: the linear regression model (LRM), the simultaneous equations model (SEM), the vector autoregressive model (VAR), the restricted vector autoregressive model (RVAR) and the autoregressive integrated moving average model (ARIMA). All models are compared to each other by their performance in terms of Type I and Type II errors.

The standalone VAR model performs best with regards to Type I errors, while the standalone RVAR model performs best with regards to Type II errors. A cascaded combination model consisting of both the VAR and RVAR model performs best with regards to both error measures.

Table of contents

I. Introduction
II. Literature review and research question
    Continuous assurance
    Continuity equations
    Simultaneous Equations Model
    Linear Regression Model
    Basic Vector Autoregressive model
    Restricted Vector Autoregressive model
    Autoregressive Integrated Moving Average
    Combination of models and prioritization
    Research question
III. Research Design
    Data
    Implementation of the models
    Testing of the models
IV. Results
    Models
    Model tests
    Model combination
V. Conclusion
    Type I errors
    Type II errors
    Overall performance
VI. Discussion
    Predictability of the sales cycle
    Practical application
Appendix A. Data
Appendix B. Implementation in R
    Simulation test set generator
    Model implementation
    Test procedure
    Report generator
    Test automation script

I. Introduction

For the last three decades, auditors and financial professionals have taken an interest in the subject of continuous assurance. However, significant research in this field was initiated only after a conceptual framework for continuous assurance was proposed by Vasarhelyi et al. (2004). In the following years more aspects of continuous assurance were studied, but most of these studies resulted in the development of new and innovative analysis methods and further refinement of the theoretical framework. Comparison of existing analysis models was not yet in scope. This thesis reports on the comparison of the anomaly detection capability of existing models of continuity equations.

Conventional audit procedures focus on time-consuming manual testing of a fixed number of randomly selected supporting documents, like invoices or inventory counts. By introducing superior audit procedures from the continuous assurance domain, like continuity equations, substantive testing can, in theory, be performed more efficiently and effectively. The level of assurance can improve, while time consumption is reduced at the same time.

However, all these audit procedures from the continuous assurance domain are fairly new and remain mostly untested in the real world. This research investigates one of these procedures, continuity equations, on a more detailed level. With continuity equations, business processes can be tested by detecting anomalies in one or more of the steps within these processes. The audit procedures or manual testing can then be narrowed down to the detected anomalies.

Efficient performance of anomaly detection could lead to a paradigm shift in the field of auditing. Instead of sampling evidence randomly from the population, the level of assurance can be improved by inspecting exceptions only: audit by exception.

The remainder of this thesis consists of five sections. First, in Section II prior literature is explored and reviewed. Second, Section III covers the research design including a description of the data used, the mathematical representations of the models and the testing procedures. Third, in Section IV the results are presented leading to the conclusions in Section V. Section VI focuses on possible improvements of the research design and interesting subjects for further research.

II. Literature review and research question

As one of the most important developments in business, IT has been adopted in the business environment to a large extent. Accounting information systems, ERP and other forms of digitization of business processes have become fairly ubiquitous in the field of accounting. As a result of these developments, IT is better able to support the growing complexity of businesses and their transactions. On the other hand, these developments also have implications for internal and external auditing. The growth of information generation and availability requires audits to become more effective and efficient (Bedard, Deis, Curtis, & Jenkins, 2008; Bachlechner, Thalmann, & Manhart, 2014). Gathering audit evidence has become overly tedious over time and too complex to be done manually. Bachlechner et al. (2014) argue that conventional audit procedures are reaching their limits. They propose a more substantial role for IT and a software-based approach to gathering information and testing in internal and external auditing as a key solution to the challenges imposed by the developments in businesses.

Studies have found that involvement of IT and a software-based approach in audits leads to significant improvements in productivity and efficiency (Banker, Chang, & Kao, 2002), while other studies argue that the use of software-based audit automation and decision support systems leads to higher audit quality (Dowling & Leech, 2007; Manson, McCartney, Sherer, & Wallace, 1998). These studies all show that audit automation could be beneficial for the audit process and its quality. A natural implementation of an audit automation program is continuous assurance (Alles, Kogan, & Vasarhelyi, 2008).

Continuous assurance

The Canadian Institute of Chartered Accountants (1999) provides a definition of continuous assurance: "Continuous auditing [or continuous assurance] is a methodology that enables independent auditors to provide written assurance on a subject matter using a series of auditor's reports issued simultaneously with, or a short period of time after, the occurrence of events underlying the subject matter." The emphasis of continuous assurance is on reducing the lag between preparing a report and subsequently providing assurance on the matters reported. The timeliness of audit results is key.

In order to be able to provide assurance on a near real-time basis, auditors have to rely heavily on automated testing. Alles et al. (2006; 2008) and Vasarhelyi et al. (2004; 2010) have defined three elements of continuous assurance and continuous monitoring: Continuous Control Monitoring (CCM), Continuous Data Auditing (CDA) and Continuous Risk Monitoring and Assessment (CRMA). Testing of procedures in the conventional audit framework and final testing focusing more on data than procedures can be mapped to the elements CCM and CDA respectively. These two elements combined can be used to provide sufficient assurance. CRMA can be used as an additional part of the control framework, but is not essential for providing assurance. The main focus of CDA is to verify the integrity of data, such as data flowing through the information system of the audited entity. The data provided by the client is the basis for all testing procedures, so data assurance forms an essential part of continuous assurance.

Alles et al. (2002; 2008) and Rezaee et al. (2002) argue that continuous assurance can potentially provide a number of benefits: costs may be reduced due to automated testing; audit quality might increase due to the testing of a larger sample and by allowing the auditor to spend less time on manual testing and more on understanding the audited entity; and assurance can be provided in a more timely manner. Furthermore, Malaescu & Sutton (2015) have found that, as an additional benefit, external auditors are willing to rely more on internal audit work when continuous auditing is implemented. Their results also show that a traditional internal audit approach actually results in an increase of budgeted hours and an increase of the hourly rate of external auditors.

However, prior research also identifies possible challenges that could negatively influence the adoption and utilization of continuous assurance. Vasarhelyi & Romero (2014) find that the adoption of continuous assurance is largely determined by the audit team composition. If the team is considered more IT proficient, it is more likely to adopt tools from the continuous assurance domain. The availability of technology support teams further improves adoption and usability of these tools. Furthermore, Alles et al. (2002) argue that the high start-up costs to implement continuous assurance, due to the lack of solid market penetration of tools from this domain, have a negative effect on the adoption of these tools. Another challenge, identified by Hardy (2014) and Cao et al. (2015), is the volume of false positives among the detected anomalies. Some implementations of analytical methods result in information overload: sometimes more false positives are generated than can be manually investigated by an audit team.

In order to implement continuous assurance, internal or external auditors need to rely in large part on predictive models (Krahel & Vasarhelyi, 2014; Kuenkaikaew & Vasarhelyi, 2013; Vasarhelyi, Warren, Teeter, & Titera, 2014). The resulting predicted values from these models are compared with actual values in near real-time to detect anomalies.

As part of the CDA element of continuous assurance, continuity equations can be used as a framework of predictive models to evidence management assertions focusing on data integrity (Chiu, Liu, & Vasarhelyi, 2014).

Continuity equations

Continuity equations have been a fundamental part of classical physics since the eighteenth century. These equations describe the transport of a quantity, while simultaneously ensuring conservation of this quantity (like mass and/or energy). Accordingly, similar relations can be defined for the transport of quantities within a system in the financial domain. The movement of reported quantities, e.g. ordered kilograms or invoiced units, between steps in the key business processes can be described with continuity equations.

The term 'continuity equations', as a tool in the field of auditing, was coined in 1991, when Vasarhelyi and Halper (1991) modeled the flow of billing data at AT&T. Years later, Alles et al. (2008) properly defined continuity equations in the field of continuous assurance. Although Vasarhelyi and Halper proposed continuity equations more than 20 years ago, little research has been performed on their application in practice and the implementation of a decent continuity equations model.

In most businesses the flow of goods is the most important basis for revenue recognition. As such, the flow of goods can be used to provide evidence for the completeness, timeliness and accuracy of the reported revenue. If the continuity equations hold for a specific business process, one can assert that there are no 'leakages' from the transaction flow, i.e. the integrity of the flow of goods can be asserted. Therefore, continuity equations provide a method to evidence the integrity of the basis for revenue recognition, which makes them a valuable tool in continuous assurance.

Continuity equations are based on historical data of quantities in the separate steps of business processes. For example, the sales cycle can be modeled as three separate steps: receiving the order from the customer, shipping goods to the customer and invoicing for the ordered and shipped goods. The quantity of goods ordered today will, of course, show up in the invoicing step a certain number of days later. The daily flow of goods between these steps can thus be described with a certain quantity and a lag between the steps. This research will focus on the sales cycle consisting of the three previously defined process steps.
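To make the idea concrete, the three-step flow with a fixed lag can be sketched in a few lines. The thesis implements its models in R; the sketch below is illustrative Python with made-up quantities, and the two-day shipping lag is an assumption for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)
days = 30
ordered = rng.integers(50, 150, size=days).astype(float)  # daily ordered quantities

# Assumption for this sketch: goods ship exactly two days after ordering,
# and invoices are sent on the same day as shipping.
lag = 2
shipped = np.zeros(days)
shipped[lag:] = ordered[:-lag]
invoiced = shipped.copy()

# Continuity: quantities are conserved across the steps, up to the goods
# still "in transit" at the end of the observation window.
in_transit = ordered[-lag:].sum()
assert ordered.sum() == shipped.sum() + in_transit
assert shipped.sum() == invoiced.sum()
```

A detected violation of these identities would indicate a 'leakage' somewhere in the transaction flow.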

Previous research by Leitch and Chen (2003), Dzeng (1994), Kogan et al. (2010) and Alles et al. (2005) has resulted in four theoretical models of continuity equations: the linear regression model (LRM), the simultaneous equations model (SEM), the vector autoregressive model (VAR) and the restricted vector autoregressive model (RVAR). Prior research did not include an in-depth review of any other time series analysis models in terms of anomaly detection capability, but the ARIMA model could provide value for flows of goods which are not optimally modeled in autoregressive terms only.

Simultaneous Equations Model

Leitch and Chen (2003) proposed a first model of continuity equations in the field of assurance: the Simultaneous Equations Model (SEM). When applied to the sales cycle, this model can be represented as Equation (1). Each step in the sales cycle is simultaneously dependent on historic quantities from the previous step. These historic quantities are represented with a lag in each step. This model simplifies the sales cycle by assuming that there is only a single fixed lag between each step.

GS_t = β_1 · SO_{t−l_1} + ε_{1,t}
IS_t = β_2 · GS_{t−l_2} + ε_{2,t}    (1)

The coefficients of this model are estimated by OLS linear regression, optimizing for the overall R² of the model.

Leitch and Chen tested the application of SEM on monthly data of financial statements. They found that SEM outperformed other, more conventional models of analytical procedures.
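The thesis estimates SEM with R's systemfit package; since the coefficients are obtained by OLS, the estimation can be sketched equation by equation. The sketch below is illustrative Python on simulated quantities, with hypothetical fixed lags of one day (order to shipment) and two days (shipment to invoice).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
so = rng.normal(100, 10, n)                        # ordered quantities (simulated)
gs = np.empty(n)
gs[0] = so[0]
gs[1:] = 0.95 * so[:-1] + rng.normal(0, 1, n - 1)  # shipped: lag 1 on orders
is_ = np.empty(n)
is_[:2] = gs[:2]
is_[2:] = 0.98 * gs[:-2] + rng.normal(0, 1, n - 2) # invoiced: lag 2 on shipments

# Equation-by-equation OLS with the single fixed lags the model assumes.
b1 = np.linalg.lstsq(so[:-1].reshape(-1, 1), gs[1:], rcond=None)[0][0]
b2 = np.linalg.lstsq(gs[:-2].reshape(-1, 1), is_[2:], rcond=None)[0][0]
```

On this simulated data, b1 and b2 recover the generating coefficients (0.95 and 0.98) almost exactly.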

Linear Regression Model

The second model is based on a simple linear regression of the invoiced quantities on the ordered and shipped quantities, as represented in Equation (2).

IS_t = β_1 · SO_{t−l_1} + β_2 · GS_{t−l_2} + ε_t    (2)

Again, these historic quantities are represented with a lag in each step. This model simplifies the sales cycle by assuming that there is only a single fixed lag between each step. The coefficients of this model are estimated by OLS linear regression, optimizing for the overall R² of the model.

Basic Vector Autoregressive model

Alles et al. (2005) introduced another model: the basic Vector Autoregressive (VAR) model. This model for the sales cycle can be represented as Equation (3). In this model SO_t, GS_t and IS_t are respectively the quantities ordered, shipped and invoiced at time t; the coefficient terms (α, β, γ) are p × 1 transition vectors for a multivariate linear model; the Λ(·) terms are p × 1 vectors containing lagged daily aggregates of quantities for the given dimension; and p is the number of time periods covered in the model.

SO_t = α_1ᵀ · Λ(SO) + α_2ᵀ · Λ(GS) + α_3ᵀ · Λ(IS)
GS_t = β_1ᵀ · Λ(SO) + β_2ᵀ · Λ(GS) + β_3ᵀ · Λ(IS)
IS_t = γ_1ᵀ · Λ(SO) + γ_2ᵀ · Λ(GS) + γ_3ᵀ · Λ(IS)    (3)

Each of these sub-equations models a predictor for the reported quantities in a specific step in the business process. As previously defined, the quantities are related to quantities in the other process steps by a time delay (lag). For example, if orders are shipped in exactly one day, without exception, and invoicing is performed simultaneously with shipping, the resulting predictors can be defined as Equation (4).

SO_t = α_1 · GS_{t+1} + α_2 · IS_{t+1}
GS_t = β_1 · SO_{t−1} + β_2 · IS_t
IS_t = γ_1 · SO_{t−1} + γ_2 · GS_t    (4)

The VAR model is estimated by OLS linear regression, optimizing for the overall R² by trying different lags for the process steps. Only the maximum expected lag is provided to the algorithm, which then tries to find the best fitting model by iterating through all lag possibilities up to the maximum expected lag. The exact lags do not have to be known prior to modeling, as the best fitting lags are determined while modeling.

One can easily understand that it is not always trivial to determine lags prior to the modeling process, e.g. lags in the purchasing cycle are highly dependent on the policies and processes at third parties. Therefore, the VAR model can be a powerful tool for modeling continuity equations when exact lags cannot be predefined easily.

Contrary to the SEM model, the VAR model does not assume that there is a single fixed lag between steps. All lags up to a maximum are considered in the model. This can result in an extensive estimated model. Therefore, most VAR models are represented using matrix notation.

Restricted Vector Autoregressive model

Kogan et al. (2010) have shown in their studies that the VAR model achieves outstanding accuracy. More importantly, they showed that the Restricted VAR (RVAR) model resulted in even better accuracy. With a MAPE (mean absolute percentage error) of 0.3374 on the test set it outperformed several other models, i.e. SEM- and VAR-type models. Only the Bayesian VAR model performed better when taking only the MAPE into account, but it also resulted in a larger standard deviation of the absolute percentage error. Therefore, the Bayesian VAR model is not considered viable for auditing purposes. The RVAR model was found to be one of the best models for continuity equations.

The RVAR model roughly translates to optimizing for the R² of the predictor by removing insignificant coefficients from the VAR model. For example, if the mean lag between ordering and shipping is less than a month, a shipment term a full year after ordering is obviously not significant and thus excluded from the model. This method iterates the modeling process per equation by removing all coefficients with |t|-statistics below a predefined threshold, as explained in Figure 1. Kogan et al. (2010) find that a threshold of α = 0.15 and its corresponding t > 1.036 yields the model with the best prediction accuracy.


Figure 1. RVAR modeling process. The initial VAR model is restricted by excluding parameters with a t-statistic below a predefined threshold. The model is re-estimated, followed by the next exclusion iteration, until all parameters satisfy the t-statistic requirement.

The RVAR model usually results in less extensive and more accurate estimated models due to the restriction to significant terms only.

Autoregressive Integrated Moving Average

The autoregressive integrated moving average (ARIMA) model differs from the previous models by also including non-autoregressive terms. The ARIMA model accounts for both autoregressive and moving average terms. As in the VAR and RVAR models, the autoregressive terms account for the possibility that a value at time t is related to its prior values or lagged terms. The moving average terms account for the possibility that a value at time t is related to its residuals from prior periods. These terms seem plausible to include in a model, since the actual residuals are also part of the flow of goods and are not accounted for in the prior estimated lagged values. The ARIMA model combines both the autoregressive and moving average terms in one model, as specified in the generic model definition in Equation (5).

SO_t = c_SO + Σ_{i=1..p} φ_{SO,i} · SO_{t−i} + Σ_{j=1..q} θ_{SO,j} · ε_{SO,t−j} + ε_{SO,t}
GS_t = c_GS + Σ_{i=1..p} φ_{GS,i} · GS_{t−i} + Σ_{j=1..q} θ_{GS,j} · ε_{GS,t−j} + ε_{GS,t}
IS_t = c_IS + Σ_{i=1..p} φ_{IS,i} · IS_{t−i} + Σ_{j=1..q} θ_{IS,j} · ε_{IS,t−j} + ε_{IS,t}    (5)

The model requires the data to be stationary, i.e. its mean and variance do not vary in time. However, our data probably incorporates some sort of trend. Therefore, we need to use a differenced variables approach to model the variables, as generically defined in Equation (6).

X′_t = X_t − X_{t−1}    (6)

Combination of models and prioritization


To counter the challenging volume of false positives identified by Hardy (2014), it is essential to choose the best possible models and refine these further if possible. Previous studies also propose combining multiple models (Kogan, Alles, Vasarhelyi, & Wu, 2014) and prioritization (Cao, Chychyla, & Stewart, 2015) in order to reduce the noise in the detected anomalies.

Research question

In total, four different models of continuity equations are used in the field of continuous assurance. Auditors rely on the accuracy and anomaly detection capability of these models to provide assurance on the data. Kogan et al. (2014) have performed a first performance comparison of the RVAR, LRM and SEM models on actual data from the procurement cycle of a large medical supplier. They found that the models overall performed equally well. The SEM model appeared to be superior in terms of false negative error rates, while the RVAR model appeared to be superior in terms of false positive error rates. Therefore, Kogan et al. also proposed combining the models to achieve even better anomaly detection. The somewhat equal performance of the individual models might be caused by the unpredictability of the procurement cycle, because not all lag terms are controlled by the firm. However, sales cycles might be more predictable, because all the lag terms are controlled by the firm. Comparison of the models in this cycle might yield different results, because oversimplification issues in the LRM and SEM models might not be problematic in a more predictable cycle. This leads to my research question:

Which of the existing models of continuity equations in continuous auditing has the best anomaly detection capability?

III. Research Design

Data

The proposed base model for the sales cycle is based on three different quantities: the ordered quantity, the quantity of goods shipped and the quantity invoiced. These three variables can be provided by most ERP systems on a daily basis.

Data is provided by a wholesaler in technical supplies. This company uses an off-the-shelf solution of Microsoft Dynamics AX 2009. The data was extracted from separately generated reports containing transaction quantities for each of the process steps by merging the columns by date, as presented in Figure 2.


Figure 2. Data model consisting of daily aggregates for three different stages in the sales cycle: ordered quantity (SO), quantity of goods shipped to customer (GS) and quantity invoiced (IS), combined by date via a SQL join clause. The date serves as the primary and foreign key of the data sources involved.
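The date-keyed combination shown in Figure 2 amounts to an inner join of the three reports. The sketch below illustrates it with pandas and two made-up rows; the actual extraction used a SQL join on the Dynamics AX reports.

```python
import pandas as pd

# Hypothetical daily aggregates from the three separately generated reports.
orders = pd.DataFrame({'Date': ['2007-02-01', '2007-02-02'], 'SO': [120, 95]})
shipments = pd.DataFrame({'Date': ['2007-02-01', '2007-02-02'], 'GS': [110, 90]})
invoices = pd.DataFrame({'Date': ['2007-02-01', '2007-02-02'], 'IS': [108, 92]})

# The join of Figure 2: Date acts as the shared primary/foreign key.
sales_data = orders.merge(shipments, on='Date').merge(invoices, on='Date')
print(sales_data)
```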

The data reflects actual day-to-day transaction quantities of February 2007 up to November 2007, excluding Sundays and holidays during which the company was closed for business. Saturdays are still included because, sometimes, high priority orders are shipped on Saturdays. Two data sets are provided: data from a Dutch subsidiary and data from a German subsidiary.

The resulting data is exported as a CSV file to be imported by the model implementations in R. The CSV file consists of four data fields, i.e. the date, the quantities ordered, the quantities shipped and the quantities invoiced. More detailed information about the data can be found in Appendix A.

Panel A

Variable             n     Mean     Std.Dev.   25th Pct.   Median   75th Pct.
Sales orders (SO)    264   66,845   60,676     38,384      62,548   83,122
Goods shipped (GS)   264   62,068   46,099     42,295      63,326   40,865
Invoices sent (IS)   264   60,211   47,237     78,393      60,745   81,303

Panel B

Pearson correlations
      SO       GS       IS
SO    1.000    0.600*   0.588*
GS             1.000    0.960*
IS                      1.000

*: values significant at the 1% level.

Table 1. Panel A: sample characteristics of the data set consisting of 264 observations of actual day-to-day transaction quantities in sales orders, goods shipped and invoices sent. Panel B: Pearson correlations between the quantity variables.


Figure 3. Plot of daily aggregates for three different stages in the sales cycle: ordered quantity (SO), quantity of goods shipped to customer (GS) and quantity invoiced (IS) as provided in the data set.

Table 1 and Figure 3 present descriptive statistics for the three quantity fields in the data set of the Dutch subsidiary. The Pearson correlations show that the GS and IS variables are strongly related. This is fully in line with the notion that invoices are generated at the same time as the goods are shipped most of the time. Furthermore, the charts clearly show less activity on Saturdays compared to weekdays. On Saturdays only priority orders and over-the-counter sales are handled.

The data is split into two separate parts, which account for roughly two thirds and one third of the observations included in the data set respectively. The first part will be used as a training set to estimate the model parameters for all models. The second part is used as a test set. After estimation, the models will be tested by generating predictions for the test set.

Implementation of the models

The models will be implemented in R, a widely accepted language for statistical processing and data analytics. A rudimentary implementation of these models is already available in the form of R packages. All models are implemented in four stages: data collection, pre-processing, modeling and prediction.

The LRM and SEM model implementations are based on the built-in lm function and the systemfit package, which has been developed and published by Arne Henningsen and Jeff D. Hamann and is available via CRAN (Henningsen & Hamann, 2007).

The VAR and RVAR model implementation code is centered around the vars package, which has been developed and published by Bernhard Pfaff and Matthieu Stigler and is available via CRAN (Pfaff, 2007; Pfaff, 2008). The package includes several functions for modeling VARs, testing the VARs and presenting the results.

The ARIMA model is implemented by using the auto.arima function as provided by the forecast package, as developed by Rob J. Hyndman. The package is made available via CRAN and GitHub. (Hyndman, 2015; Hyndman & Khandakar, 2008) It includes functions for modeling and analyzing univariate time series model forecasts. The auto.arima function is used to automatically select the optimal parameters for number of autoregressive terms, number of moving average terms and differencing order to model the best fitting ARIMA model.

The modeling implementation in R can be found in Appendix B.

Testing of the models

After the model parameters are estimated based on the training set, the resulting models are tested. Anomaly detection capability is tested by counting Type I and Type II errors in the model predictions based on a slightly modified test set.

The test set is altered by decreasing the quantities in five randomly selected observations by 100%. These altered observations serve as injected anomalies in the test set. The test set, including the seeded anomalies, is then processed by the model implementation and anomalies are reported.

In order to improve randomness and reduce the apparent selection bias, the testing is repeated 100,000 times, while randomly selecting ten observations to be altered by 100% in the original test set for every repetition. Subsequently each observation y is compared to the amount ŷ calculated with the model by using a two-sided t-test, as specified in Equation (7). It is assumed that the predictor ŷ ~ N(μ, σ).

ŷ − t_{α/2, n−1} · σ/√n ≤ y ≤ ŷ + t_{α/2, n−1} · σ/√n    (7)

The mean number of Type I and Type II errors found serves as the test statistic for comparison purposes. Robustness of the results is further tested by using a computer-simulated test set with preset levels of noise and randomly injected anomalies.
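One repetition of this procedure can be sketched as follows. Everything here is illustrative Python on simulated quantities: the predictions, the assumed prediction standard error and the significance level are stand-ins, and the seeded anomalies are ten observations set to zero, mirroring the 100% reduction described above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 103
actual = rng.normal(1000, 50, n)            # hypothetical daily quantities
predicted = actual + rng.normal(0, 50, n)   # stand-in for a model's predictions
se, alpha = 50.0, 0.05                      # assumed standard error and level

# Seed anomalies: set ten randomly selected observations to zero
# (a 100% reduction of the reported quantity).
seeded = rng.choice(n, size=10, replace=False)
test_set = actual.copy()
test_set[seeded] = 0.0

# Flag every observation outside the two-sided t-interval around its
# prediction (the role Equation (7) plays in the test procedure).
half_width = stats.t.ppf(1 - alpha / 2, n - 1) * se
flagged = np.abs(test_set - predicted) > half_width

not_seeded = np.ones(n, dtype=bool)
not_seeded[seeded] = False
type_i = int(flagged[not_seeded].sum())     # flagged, but no anomaly was seeded
type_ii = int((~flagged)[seeded].sum())     # seeded, but not flagged
print("Type I:", type_i, "Type II:", type_ii)
```

Averaging the two counts over many such repetitions yields the test statistics compared across models.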

The test procedure, as implemented in R, can be found in Appendix B.

IV. Results

Models

The five models were trained with the training subset of the full data sets. The two data sets with realistic data from the Dutch and German subsidiaries were split into a training set and a validation set prior to the testing procedures. The simulated data set was split using the same method. The training set is used to train the five models and obtain a definitive model definition with fixed coefficients.

Panel A: adjusted R²

                     LRM      VAR      RVAR
Dutch subsidiary     0.9337   0.7671   0.7648
German subsidiary    0.7989   0.8117   0.8112
Simulated test set   0.0308   0.8361   0.8346

Panel B: for each subsidiary and each process step (SO, GS, IS), the lags between 0 and 25 days whose coefficients remain in the RVAR model; the Dutch models retain coefficients at more, and longer, lags than the German models.

Table 2. Panel A: adjusted R² model characteristics, based on the training subset; Panel B: lagged coefficients for the three steps in the sales cycle which remain in the RVAR model after the exclusion of insufficiently significant lagged terms.

Training resulted in the model characteristics shown in Table 2. The adjusted R² of the models provides an initial indication of how well each model fits the training data. The LRM model fits the training subset of the Dutch subsidiary very well, with an adjusted R² of 0.9337, while the VAR and RVAR models resulted in lower adjusted R² values of 0.7671 and 0.7648 respectively. When the training subset of the German subsidiary was used to train the models, a different pattern emerged. The LRM model fit the data least well, with an adjusted R² of 0.7989, while the VAR and RVAR models fit the data slightly better, with adjusted R² values of 0.8117 and 0.8112 respectively. The simulated test set showed yet another set of results. The LRM model fit the data worst, with an adjusted R² of 0.0308, while the VAR and RVAR models both fit significantly better, with adjusted R² values of 0.8361 and 0.8346 respectively. This is likely caused by the intentional non-linearity of the simulated data set.
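The adjusted R² used here penalizes a model for the number of predictors it retains, which is why models with different numbers of terms can be compared on it. A minimal sketch of the statistic (in Python, for illustration; the thesis models themselves are fitted in R):

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted on n observations.
    The (n - 1) / (n - p - 1) factor penalizes additional terms, so a model
    with more lagged coefficients needs a genuinely better raw R^2 to win."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```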

The adjusted R² of each RVAR model was slightly lower than that of its VAR base model, even though non-significant coefficients were eliminated from the final RVAR model definition. This might be the result of overfitting in the VAR model. As shown in Table 2, most coefficients were eliminated in the final RVAR model definition. The coefficients around lags of 4, 7 and 14 days could indicate intuitive default delays between ordering and shipping.

Model tests

As shown in Table 3, all five models were tested on the two data sets with realistic data. The validation subset of data from the Dutch subsidiary consisted of 103 samples, with 10 randomly injected anomalies per repetition. Prior to testing and injection of anomalies, the LRM and SEM models identified 40 and 43 anomalies in the validation subset, the VAR model identified 18 anomalies, and the RVAR model found 60 samples to be erroneous. The ARIMA model found just one anomaly prior to injection. These pre-test anomalies might be the result of the non-smooth characteristics of the flow of goods on these sample days. Furthermore, the realistic data was not audited prior to testing, so the data might contain actual anomalies, which are identified by this test. Only the first explanation is confirmed to be present in the data set, as can be seen in Figure 3. These pre-test anomalies have to be taken into consideration when interpreting the results. The expected average number of Type I errors, as potentially identified by the testing procedure, can be defined as in Equation (8). This equation describes the expected average of Type I errors as the number of injected anomalies that coincide with pre-test anomalies and are correctly identified as such, subtracted from the number of pre-test anomalies.

$$\bar{e}_{\mathrm{I}} = |A_{\text{pre}}| - |A_{\text{pre}} \cap D_{\text{inj}}| \tag{8}$$

where $A_{\text{pre}}$ denotes the set of pre-test anomalies and $D_{\text{inj}}$ the set of injected anomalies that are correctly identified as such.
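Equation (8) can be read as a set operation on the flagged observations. A small illustrative sketch (in Python; the set names are assumptions for illustration, not thesis code):

```python
def expected_type1(pretest_anomalies, injected, flagged):
    """Expected average number of Type I errors: the pre-test anomalies,
    minus those injected anomalies that coincide with pre-test anomalies
    and are correctly flagged (and hence are not false positives)."""
    pretest = set(pretest_anomalies)
    correctly_flagged_injections = pretest & set(injected) & set(flagged)
    return len(pretest) - len(correctly_flagged_injections)
```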

All five models caused both Type I and Type II errors. The LRM and SEM models performed similarly, with around 36 to 39 Type I errors on average, while the VAR model performed best in this regard with only 16.25 Type I errors on average. The RVAR model performed worse, with triple the number of Type I errors of its base VAR model. The ARIMA model test resulted in 0.90 Type I errors on average. With regard to Type II errors, the LRM and SEM models also performed similarly, with 2.72 false negatives identified on average. The RVAR model performed best with only 1.27 false negatives identified. In total, the VAR model performed best in terms of specificity and positive predictive value, while the RVAR model was superior in terms of sensitivity and negative predictive value. The overall positive predictive value of the models turns out to be fairly low, with a maximum of 0.38 for the VAR model and 0.92 for the ARIMA model.
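The performance measures reported here follow directly from the confusion matrix of flagged versus truly anomalous observations. A minimal sketch (illustrative Python, not the thesis's R code):

```python
def detection_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, positive and negative predictive value
    from counts of true/false positives and true/false negatives."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true anomalies flagged
        "specificity": tn / (tn + fp),   # share of clean observations passed
        "ppv": tp / (tp + fp),           # share of flags that are real anomalies
        "npv": tn / (tn + fn),           # share of passes that are truly clean
    }
```

A low PPV, as observed for most models, means most flags are false alarms even when sensitivity is decent.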

Testing based on data from the German subsidiary showed similar, but slightly worse, results. The LRM and SEM models performed similarly, with 46.09 and 52.43 Type I errors respectively. The ARIMA model test did not result in any Type I errors. The RVAR model performed worst with 62.36 Type I errors, while its base VAR model performed best with 33.44 false positives identified. However, the RVAR model performed best regarding Type II errors, with 1.35 on average. The other models performed worse, with around 2.7 false negatives identified during testing; the ARIMA model performed worst, with almost all injections remaining falsely unidentified. In terms of sensitivity the RVAR model performed best at 0.8654, while the other models all scored around 0.74. As with the Dutch subsidiary, specificity was best for the VAR model, at 0.7506. The large number of Type I errors resulted in a fairly low positive predictive value of 0.2302 for the VAR model, versus 1.00 for the ARIMA model. The negative predictive value was similarly high as with the data from the Dutch subsidiary.

The ARIMA model test results appear to be valuable in terms of Type I errors: the model misidentifies almost no true negatives. However, in terms of Type II errors this model performs worst, with almost all true positives misidentified as negatives. These results might be caused by the excessively wide confidence intervals used to identify anomalies. The wide confidence intervals accept a large range of possible values as valid actual values.

Model combination

The results show it might be interesting to combine the positive detection characteristics of the VAR and RVAR models, and possibly the ARIMA and RVAR models. Therefore, a combined test procedure was additionally implemented to exploit the positive detection characteristics of both models in one procedure. The procedure was implemented as shown in Figure 4. In the first combination test procedure the ARIMA and RVAR models are cascaded to determine whether an observation should be flagged as an anomaly. In the second combination test procedure the VAR and RVAR models are cascaded in exactly the same way. If the RVAR model suspects an observation to be an anomaly, the observation is subsequently tested with the ARIMA model or the VAR model. If this subsequent model confirms the suspected observation to be erroneous, the observation is definitively flagged as an anomaly. If the RVAR model initially does not flag the observation, it is definitively not flagged.

(a) Start → RVAR condition test → [FALSE] observation is NOT FLAGGED; [TRUE] → ARIMA condition test → [FALSE] observation is NOT FLAGGED; [TRUE] → observation is FLAGGED

(b) Start → RVAR condition test → [FALSE] observation is NOT FLAGGED; [TRUE] → VAR condition test → [FALSE] observation is NOT FLAGGED; [TRUE] → observation is FLAGGED

Figure 4. (a) Combined test procedure in which the ARIMA and RVAR models are cascaded. First the RVAR model is used to test the observation; if it is suspected to be an anomaly, the ARIMA model is used to confirm the anomaly. If the anomaly is confirmed by the ARIMA model, the observation is flagged; otherwise it is not flagged. (b) Combined test procedure in which the VAR and RVAR models are cascaded. First the RVAR model is used to test the observation; if it is suspected to be an anomaly, the VAR model is used to confirm the anomaly. If the anomaly is confirmed by the VAR model, the observation is flagged; otherwise it is not flagged.
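The cascade in Figure 4 reduces to a logical AND of the two models' flags. A minimal sketch, assuming per-observation boolean flags are already available (illustrative Python, not thesis code):

```python
def cascaded_flag(rvar_flags, confirm_flags):
    """Cascade two anomaly tests: an observation is flagged only if the
    sensitive first-stage model (RVAR) flags it AND the second-stage model
    (VAR or ARIMA) confirms it. Unflagged first-stage observations pass
    through without a second-stage test."""
    return [r and c for r, c in zip(rvar_flags, confirm_flags)]
```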

The combined model results are also shown in Table 3. The combination of the ARIMA and RVAR models theoretically performs best, but might not be usable because its confidence intervals are too wide, as stated previously. The combination of the VAR and RVAR models incorporates the positive characteristics of both, resulting in a low Type I error rate combined with a low Type II error rate: the combined tests inherit the low number of Type II errors of the RVAR model and the low number of Type I errors of the VAR model. The combined test procedure of the RVAR and VAR models results in the highest sensitivity, specificity, and positive and negative predictive values for both the Dutch and the German data.

Dutch subsidiary

                      LRM      SEM      VAR      RVAR     ARIMA    Combi ARIMA i  Combi VAR ii
Test characteristics
Repetitions           100,000  100,000  100,000  100,000  100,000  100,000        100,000
N                     103      103      103      103      103      103            103
Injected anomalies    10       10       10       10       10       10             10
Pre-test anomalies    40       43       18       60       1        1              18
Type I errors
  Average             36.12    38.83    16.25    54.19    0.90     0.90           16.25
  Minimum             31       33       10       50       -        -              10
  Maximum             40       43       18       60       1        1              18
  Standard deviation  1.4728   1.4917   1.1477   1.4892   0.2960   0.2960         1.1477
Type II errors
  Average             2.72     2.72     2.81     1.27     9.80     1.27           1.27
  Minimum             -        -        -        -        8.0      -              -
  Maximum             9        9        9        7        10       7              7
  Standard deviation  1.3444   1.3444   1.3615   1.0063   0.4151   1.0063         1.0029
Performance measures
  Sensitivity         0.7283   0.7283   0.7186   0.8735   0.0195   0.8735         0.8735
  Specificity         0.7192   0.6900   0.9328   0.5249   1.0978   1.0978         0.9328
  PPV iii             0.2168   0.2048   0.3809   0.1558   0.9172   0.9172         0.3809
  NPV iv              0.9716   0.9716   0.9706   0.9866   0.9046   0.9866         0.9866

German subsidiary

                      LRM      SEM      VAR      RVAR     ARIMA    Combi ARIMA i  Combi VAR ii
Test characteristics
Repetitions           100,000  100,000  100,000  100,000  100,000  100,000        100,000
N                     104      104      104      104      104      104            104
Injected anomalies    10       10       10       10       10       10             10
Pre-test anomalies    51       58       37       69       -        -              35
Type I errors
  Average             46.09    52.43    33.44    62.36    -        -              31.64
  Minimum             41       48       28       59       -        -              26
  Maximum             51       58       37       68       -        -              35
  Standard deviation  1.5048   1.5019   1.4482   1.4321   -        -              1.4293
Type II errors
  Average             2.79     2.79     2.41     1.35     8.75     1.35           1.35
  Minimum             -        -        -        -        3.0      -              -
  Maximum             9        9        8        7        10       7              7
  Standard deviation  1.3514   1.3514   1.2877   1.0284   0.9985   1.0284         1.0284
Performance measures
  Sensitivity         0.7210   0.7210   0.7592   0.8654   0.1250   0.8654         0.8654
  Specificity         0.6161   0.5486   0.7506   0.4430   1.1064   1.1064         0.7698
  PPV iii             0.1783   0.1602   0.2302   0.1382   1.0000   1.0000         0.2402
  NPV iv              0.9712   0.9712   0.9750   0.9859   0.9148   0.9859         0.9859

i: cascaded test procedure using the RVAR and ARIMA models; ii: cascaded test procedure using the RVAR and VAR models; iii: positive predictive value; iv: negative predictive value

Table 3. Test results of the five models and the two cascaded combinations on the two data sets containing realistic data from the Dutch subsidiary and the German subsidiary.

Simulated test set

                      LRM      SEM      VAR      RVAR     ARIMA    Combi ARIMA i  Combi VAR ii
Test characteristics
Repetitions           100,000  100,000  100,000  100,000  100,000  100,000        100,000
N                     801      801      801      801      801      801            801
Injected anomalies    10       10       10       10       10       10             10
Pre-test anomalies    615      636      375      425      33       33             364
Type I errors
  Average             607.33   628.06   370.32   419.69   32.59    32.59          359.45
  Minimum             605      626      365      415      28       28             354
  Maximum             613      634      375      425      33       33             364
  Standard deviation  1.3293   1.2725   1.5717   1.5696   0.6264   0.6264         1.5689
Type II errors
  Average             0.01     -        1.19     0.96     7.38     0.96           0.96
  Minimum             -        -        -        -        1.0      -              -
  Maximum             1        -        7        6        10       6              6
  Standard deviation  0.1112   -        1.0207   0.9281   1.3841   0.9281         0.9281
Performance measures
  Sensitivity         0.9987   1.0000   0.8811   0.9037   0.2624   0.9037         0.9037
  Specificity         0.2448   0.2186   0.5445   0.4821   0.9714   0.9714         0.5582
  PPV iii             0.0162   0.0157   0.0263   0.0233   0.2348   0.2348         0.0271
  NPV iv              1.0000   1.0000   0.9985   0.9988   0.9908   0.9988         0.9988

i: cascaded test procedure using the RVAR and ARIMA models; ii: cascaded test procedure using the RVAR and VAR models; iii: positive predictive value; iv: negative predictive value

Table 4. Test results of the five models and the two cascaded combinations on the computer-generated simulation test set.

V. Conclusion

As shown in the previous section all five models performed quite similarly in terms of order of magnitude of the two error types. However, differences do exist between model performance. As stated in the research design the research question to be answered is: which of the existing models of continuity equations in continuous auditing has the best anomaly detection capability? To fully answer this question, both Type I and Type II error performances have to be taken into account.

Since the ARIMA model and the ARIMA-RVAR cascaded model appear to be weak indicators of true anomalies due to their extremely wide testing intervals, i.e. almost any observation will fall within the testing interval and thus remain unflagged, these models are not considered to be viable models for detecting anomalies.

Type I errors

Type I errors, or false positives, are an important aspect in assurance with respect to audit efficiency. Identified anomalies need to be investigated further and could bring up the need for additional assurance activities on the cycle or account under investigation. Falsely identified anomalies lead to loss of resources if further investigations prove to be unnecessary afterwards.

In terms of performance with regard to Type I errors, the VAR model performed best, with the expected average number of Type I errors as described by Equation (8).

Type II errors

Type II errors, or false negatives, are an important aspect in assurance with respect to audit effectiveness and less with respect to audit efficiency. Anomalies that are not detected during the testing procedures could lead to a false sense of certainty with regards to audited object. These kinds of risks are by definition part of the audit risk model as the detection risk element. The detection risk element is the risk that an auditor fails to detect a material misstatement. Auditors use the audit risk model to manage the overall risk of an audit engagement. If one of the risk elements imposes an impermissible risk level, additional assurance activities have to be performed. Minimization of Type II errors can be considered to be indispensable for the audit risk.

In terms of Type II errors, the RVAR model performs best, with on average fewer than 1.4 false negatives identified in the worst performing test set (German subsidiary). The number of Type II errors during this test amounts to approximately 13.7% of injected anomalies in this set, while it amounts to approximately 12.4% of injected anomalies in the data of the Dutch subsidiary.

Overall performance

The model with the overall best performance is the cascaded combination of the VAR and RVAR models. The VAR model performs best with regards to Type I errors and worst with regards to Type II errors, while the RVAR model performs best with regards to Type II errors but worst with regards to Type I errors. Choosing one solitary test model from these two models would result in optimal performance with regards to one error type, while making major and possibly unacceptable concessions with regards to the other error type. In order to optimize performance and eliminate the need to choose a prevalent error type a combination model was tested. The cascaded combination model takes the best properties of both models and combines these into a new model, which performs best with regards to both Type I and Type II errors. Therefore, the VAR-RVAR combination model has the best anomaly detection capability of existing continuity equations.

VI. Discussion

Predictability of the sales cycle

The data used in this study might be more predictable and therefore easier to capture in a mathematical model than other data of interest for an auditor. The data used in this study was provided by a Dutch provider of technical supplies and consists of quantities of three different steps in the sales cycle. The sales cycle might be more predictable due to the fact that from the initial step in the sales cycle, the audited entity has almost full control over any lags or delays incurred in between steps. For example, company policy might state that goods are invoiced at the same time as the goods are shipped to the customer. The lag is then fixed to zero days. Deviations might occur, but only exceptionally. In the purchasing cycle the audited entity only has limited control over the lags incurred in the cycle, since these are primarily determined by the supplying parties.

The dependency on the predictability of the cycle is also visible in the results. The adjusted R² of the LRM model for the simulated test data is close to 0, while the adjusted R² of the other models is above 0.83. This was expected, since the simulated test set was based on a non-linear source, an offset sine function, and a simple linear regression model is thus unable to fit the data as well as a more complex model such as the VAR or RVAR model.
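A generator along these lines can reproduce the shape of such a simulated set. This is an illustrative Python sketch; the base level, amplitude, period and noise level are assumptions, not the parameters actually used in the thesis:

```python
import math
import random

def simulated_sales_series(n_days, base=1000.0, amplitude=200.0,
                           period=365.0, noise_sd=25.0, seed=1):
    """Offset sine function with additive Gaussian noise, mimicking a
    non-linear simulated sales-quantity series. All parameter values
    here are illustrative assumptions."""
    rng = random.Random(seed)
    return [base + amplitude * math.sin(2 * math.pi * t / period)
            + rng.gauss(0, noise_sd)
            for t in range(n_days)]
```

An LRM regressing one such series on another fits poorly because the relationship is periodic rather than linear, matching the near-zero adjusted R² observed.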

Confidence interval vs. prediction interval

In the test procedure a two-sided confidence interval is, strictly speaking, improperly used to calculate the lower and upper boundaries of acceptable deviations from the predicted value. This might cause issues if the statistical implications of this choice are interpreted wrongly. One might interpret a 95% confidence interval to imply 95% certainty that the measurements at this point in time fall within the interval, while statistically this is a false assumption in this test procedure. In reality the interval only reflects plausible values of the mean of the measurements at this point in time. The interpretation would be correct if a prediction interval were used to calculate the boundaries. The confidence interval is only used as a fairly simple yet powerful tool to generate an interval of plausible outcomes.

However, prediction intervals would in this case be too wide to be successfully applied in an audit. The prediction intervals would be almost as wide as the total range of the data set and would thus result in close to 100% Type II errors, since nearly all reported or measured quantities would fall within the prediction interval.

If the auditor refrains from this interpretation fallacy, the use of the narrower confidence interval, instead of the statistically more correct prediction interval, yields a viable set of boundaries.
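The width difference can be made concrete by comparing the two half-widths around a prediction. A minimal sketch (illustrative Python; s denotes the residual standard deviation and the t critical value is supplied by the caller):

```python
def interval_half_widths(s, n, t_crit):
    """Half-width of the confidence interval for the MEAN versus the
    prediction interval for a single NEW observation. Their ratio is
    sqrt(n + 1), so the PI barely narrows as n grows."""
    ci = t_crit * s / n ** 0.5          # confidence interval for the mean
    pi = t_crit * s * (1 + 1 / n) ** 0.5  # prediction interval for one observation
    return ci, pi
```

Because the prediction interval retains the full residual spread s, it approaches the range of the data, which is why it would leave nearly every observation unflagged.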

Practical application

The use of innovative techniques by auditors should ultimately result in improvements regarding audit efficiency and/or effectiveness. With regards to effectiveness, the level of assurance could increase, while efficiency improves when conventional audit procedures can be executed faster or the required level of assurance is reached with less effort. Without improvement in one or both of these aspects, innovative techniques should and will not be used by auditors.

Type I errors

Type I errors, or falsely identified anomalies, potentially cause a decrease in audit efficiency. The detection of anomalies implies that the auditor should perform additional testing on the detected anomalies, under the assumption that the anomalies were correctly identified. Any innovative audit technique should aim to decrease the likelihood of falsely identifying anomalies.

The results in Table 3 show that the average number of Type I errors using the best possible model accounts for over 15% of the total tested sample. If this implies that auditors should perform further testing on 15% of the total population, even the best possible model would be unacceptable. The sheer number of false positives is probably not acceptable to auditors, even if the initial testing with the proposed model were to incur zero cost.

However, based on these results no valid conclusions can be drawn regarding the practicality of this model. The number of Type I errors is highly influenced by the number of identified pre-test anomalies and reflects the number of Type I errors that are to be expected, based on Equation (8). It would be naive to assume that all the pre-test anomalies are in fact Type I errors: the data was not audited integrally and in detail to ensure that no errors are present in the data set.

For the purpose of this study practicality was not in scope. However, it should be noted that practicality is an important aspect with respect to adoption of innovative audit techniques.

Type II errors

The number of Type II errors identified in this study is relatively high. In the Dutch data set, using the best possible model, 12.4% of injected anomalies remained unidentified. In the simulated test set this figure drops to 4.1% using the best possible model. For any model to be used in practice, the detection risk should be as low as possible, and at least as low as the desired detection risk set by the auditor prior to the engagement. The results on the realistic data indicate that the detection risk is possibly not decreased to a level below the desired preset risk level when the current models of continuity equations are used. This might impact the adoption of this tool by auditors.

Possible improvements in the proposed model

The models in this study are based on roughly 65% of a year's worth of data. The remaining 35% is used for testing. In practice auditors might use the full previous year's worth of data to train the models and then test a single sample at a time. When this sample is not flagged as an anomaly, the model is retrained on the historical data, now including the single tested sample. That model is then used to test the next single sample. This method of retraining might increase model fitness and thus improve performance. Further research should be performed to test this hypothesis.
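The retraining scheme described above can be sketched as a rolling loop. This is an illustrative Python sketch with placeholder `train` and `flag` callables; it is not the thesis's R implementation:

```python
def rolling_retrain_test(history, stream, train, flag):
    """Retrain-per-observation scheme: fit a model on all accepted history,
    test the next observation, and append it to the history only if it is
    not flagged. `train(data)` fits a model on a list of observations and
    `flag(model, x)` returns True for anomalies; both are assumed callables."""
    accepted = list(history)
    flags = []
    for x in stream:
        model = train(accepted)
        is_anomaly = flag(model, x)
        flags.append(is_anomaly)
        if not is_anomaly:
            accepted.append(x)  # flagged observations never enter the training data
    return flags
```

Keeping flagged observations out of the training data prevents confirmed anomalies from widening the model's notion of normal behavior.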

REFERENCES

Canadian Institute of Chartered Accountants (CICA). (1999). Continuous Auditing. Toronto, ON, Canada.

Alles, M. G., Brennan, G., Kogan, A., & Vasarhelyi, M. A. (2006). Continuous monitoring of business process controls: A pilot implementation of a continuous auditing system at Siemens. International Journal of Accounting Information Systems, 7(2), 137-161.

Alles, M. G., Kogan, A., & Vasarhelyi, M. A. (2002). Feasibility and economics of continuous assurance. Auditing: A Journal of Practice & Theory, 21(1), 125-138.

Alles, M. G., Kogan, A., & Vasarhelyi, M. A. (2008). Putting continuous auditing theory into practice: Lessons from two pilot implementations. Journal of Information Systems, 22(2), 195-214.

Alles, M., Kogan, A., & Vasarhelyi, M. (2008). Audit automation for implementing continuous auditing: Principles and problems. Ninth International Research Symposium on Accounting Information Systems, 1-24.

Alles, M., Kogan, A., Vasarhelyi, M., & Wu, J. (2005). Continuity equations in continuous auditing: Detecting anomalies in business processes.

Bachlechner, D., Thalmann, S., & Manhart, M. (2014). Auditing service providers: Supporting auditors in cross-organizational settings. Managerial Auditing Journal, 29(4), 286-303.

Banker, R. D., Chang, H., & Kao, Y.-C. (2002). Impact of information technology on public accounting firm productivity. Journal of Information Systems, 16(2), 209-222.

Bedard, J. C., Deis, D. R., Curtis, M. B., & Jenkins, J. G. (2008). Risk monitoring and control in audit firms: A research synthesis. Auditing: A Journal of Practice & Theory, 27(1), 187-218.

Cao, M., Chychyla, R., & Stewart, T. (2015). Big data analytics in financial statement audits. Accounting Horizons, 29(2), 1-10.

Chiu, V., Liu, Q., & Vasarhelyi, M. A. (2014). The development and intellectual structure of continuous auditing research. Journal of Accounting Literature, 33(1-2), 37-57.

Dowling, C., & Leech, S. (2007). Audit support systems and decision aids: Current practice and opportunities for future research. International Journal of Accounting Information Systems, 8(2), 92-116.

Dzeng, S. (1994). A comparison of analytical procedures expectation models using both aggregate and disaggregate data. Auditing: A Journal of Practice & Theory, 13(3), 1-24.

Hardy, C. A. (2014). The messy matters of continuous assurance: Preliminary findings from six Australian case studies. Journal of Information Systems, 28(2).

Henningsen, A., & Hamann, J. D. (2007). systemfit: A package for estimating systems of simultaneous equations in R. Journal of Statistical Software, 23(4), 1-40. Retrieved from http://www.jstatsoft.org/v23/i04/

Hyndman, R. J. (2015). forecast: Forecasting functions for time series and linear models. Retrieved from http://github.com/robjhyndman/forecast

Hyndman, R. J., & Khandakar, Y. (2008). Automatic time series forecasting: The forecast package for R. Journal of Statistical Software, 27(3).

Kogan, A., Alles, M. G., Vasarhelyi, M. A., & Wu, J. (2010). Analytical procedures for continuous data level auditing: Continuity equations.

Kogan, A., Alles, M. G., Vasarhelyi, M. A., & Wu, J. (2014). Design and evaluation of a continuous data level auditing system. Auditing: A Journal of Practice & Theory, 33(4), 221-245. doi:10.2308/ajpt-50844

Krahel, J. P., & Vasarhelyi, M. A. (2014). AIS as a facilitator of accounting change: Technology, practice, and education. Journal of Information Systems, 28(2), 1-15.

Kuenkaikaew, S., & Vasarhelyi, M. A. (2013). The predictive audit framework. International Journal of Digital Accounting Research, 13, 37-71.

Leitch, R. A., & Chen, Y. (2003). The effectiveness of expectation models in recognizing error patterns and generating and eliminating hypotheses while conducting analytical procedures. Auditing: A Journal of Practice & Theory, 22(2), 147-170.

Malaescu, I., & Sutton, S. G. (2015). The reliance of external auditors on internal audit's use of continuous audit. Journal of Information Systems, 29(1), 1148-1159.

Manson, S., McCartney, S., Sherer, M., & Wallace, W. A. (1998). Audit automation in the UK and the US: A comparative study. International Journal of Auditing, 2(3), 233-246.

Pfaff, B. (2008). VAR, SVAR and SVEC models: Implementation within R package vars. Journal of Statistical Software, 27(4), 1-32.

Pfaff, B. (2008). vars: VAR modelling. R package.

Pfaff, B. (2007). Using the vars package. Kronberg im Taunus.

Rezaee, Z., & Sharbatoghlie, A. (2002). Continuous auditing: Building automated auditing capability. Auditing: A Journal of Practice & Theory, 21(1).

Vasarhelyi, M. A., & Halper, F. B. (1991). The continuous audit of online systems. Auditing: A Journal of Practice & Theory, 10(1), 110-125.

Vasarhelyi, M. A., & Romero, S. (2014). Technology in audit engagements: A case study. Managerial Auditing Journal, 29(4), 350-365.

Vasarhelyi, M. A., Alles, M. G., & Kogan, A. (2004). Principles of analytic monitoring for continuous assurance. Journal of Emerging Technologies in Accounting, 1(1), 1-21.

Vasarhelyi, M. A., Alles, M., & Williams, K. T. (2010). Continuous assurance for the now economy. Institute of Chartered Accountants in Australia, Sydney, Australia.

Vasarhelyi, M. A., Warren, D., Teeter, R. A., & Titera, W. R. (2014). Embracing the automated audit. Journal of Accountancy, April.

Appendix A. Data

The data is provided by a Dutch wholesaler in technical supplies and contains daily aggregates of the three separate steps in the sales cycle.

SalesOrders (PK: Date; Quantity)   Shipments (PK: Date; Quantity)   Invoices (PK: Date; Quantity)
                         → joined on Date →
SalesData (PK, FK1, FK2, FK3: Date; SO; GS; IS)

Figure 2. Data model consisting of daily aggregates for three different stages in the sales cycle: ordered quantity (SO), quantity of goods shipped to the customer (GS) and quantity invoiced (IS), combined by date via a SQL join clause. The date serves as the primary and foreign key of the data sources involved.
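The join in Figure 2 can be sketched as follows (illustrative Python, assuming the three sources are available as date-to-quantity mappings; the thesis imports the data in R):

```python
def join_sales_data(sales_orders, shipments, invoices):
    """Combine the three daily aggregates by date, mirroring the SQL join
    in Figure 2. Only dates present in all three sources are kept (an
    inner join). Inputs are {date: quantity} mappings."""
    common_dates = set(sales_orders) & set(shipments) & set(invoices)
    return {d: {"SO": sales_orders[d], "GS": shipments[d], "IS": invoices[d]}
            for d in sorted(common_dates)}
```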

The data is imported by using the following R code:


Appendix B. Implementation in R

The code used to generate the simulation test set, the modelling, the testing and the reporting is presented in this appendix. It is also available via GitHub 1 and contained on an accompanying CD-ROM.

Simulation test set generator


Model implementation

1 GitHub repository continuityequations-thesis: https://github.com/erikvankempen/continuityequations-thesis/releases/tag/0.1

Test procedure

Report generator

Test automation script