
International Journal of Accounting Information Systems 46 (2022) 100573


Full population testing: Applying multidimensional audit data sampling (MADS) to general ledger data auditing☆

Jamie W. Freiman a,*, Yongbum Kim b, Miklos A. Vasarhelyi c

a Marist College, United States
b Ramapo College, United States
c Rutgers University, United States

A R T I C L E  I N F O

Keywords: Audit, Big data, General ledger, Audit analytics, Full population testing, Suspicion scoring

A B S T R A C T

Changes to the General Ledger (GL) represent a link between transactional business events from Journal Entries and prepared financial statements. Errors in these very large datasets can result in material misstatements or account misbalance. Unfortunately, a plethora of conditions renders traditional statistical and non-statistical sampling less effective. As a full-population examination procedure, Multidimensional Audit Data Sampling (MADS) mitigates these issues. In conjunction with top practitioners, we utilize a design science approach in applying the full-population MADS methodology to a real dataset of GL account balance changes. Issues such as the effectiveness of internal controls, detection of low-frequency high-risk errors, and earnings management concerns are addressed. This paper demonstrates how vital insights can be gained using MADS. More importantly, this approach also highlights the exact portion of the population that is error-free with respect to the auditors’ tests.

1. Introduction

The General Ledger (GL) is a unique element in accounting. It serves as a vital link between aggregated individual business events
and financial statements. Business transactions are consolidated into account balance updates in the GL, which are then used to
compile financial statements. Such updates to the GL form an auditable dataset that poses both significant risk and a great opportunity
for auditors depending on the effectiveness of audit testing. Errors in the account balances represented in the GL may carry through to
the final financial statement, possibly resulting in a material misstatement. Therefore, it is in an auditor’s interest to ensure that these

☆ The authors of this paper would like to thank the following for their help and contribution with respect to the research, methodology, and
review of this academic study. Dr. Won No for his guidance on the Multidimensional Audit Data Selection methodology. The Rutgers AICPA Data
Analytics Research Initiative for providing data, insight, input and funding for this research. Dorothy McQuilkin, Amy Pawliki, Ami Beers, and the
members of CPA Canada and the AICPA who provided insight and review on the paper and process. Brian Miller, Jason Guthrie, Brian Collins, and
the auditors from Ernst and Young, Deloitte, Baker Tilly, KPMG, PricewaterhouseCoopers, Crowe, BDO, RSM, and Grant Thornton who provided
valuable feedback and data for this study. We would also like to thank the members of the Rutgers Continuous Audit Reporting lab who reviewed
and helped with this paper. Specifically Dr. Alexander Kogan for his help in refining our methodology and reviewing our paper. Mrs. Barbara Jensen
for her tireless work in organizing the various moving parts that were necessary to complete our study. Dr. Christina Freiman for her dedicated
review and edits over the course of this paper’s many iterations.
* Corresponding author.
E-mail address: jwf77@scarletmail.rutgers.edu (J.W. Freiman).

https://doi.org/10.1016/j.accinf.2022.100573

Available online 17 August 2022


1467-0895/© 2022 Elsevier Inc. All rights reserved.

Fig. 1. MADS Model Building Process.

balances are free from material errors to thereby assure the assembled financial statements for use by outside parties and potential
investors, a sentiment echoed by those audit practitioners involved in this project (Fig. 1).
To ensure that no material errors are contained in the reported accounting data, auditors examining the GL should review
account balances and all postings related to changes in those balances. The volume of individual account changes reflected in these
ledgers can make spotting potential misstatements difficult. However, if auditors can accurately review changes to
all accounts represented in the GL, they can gain a higher level of confidence in the integrity of the GL.
To deal with this problem, auditors have traditionally relied upon statistical and non-statistical sampling techniques to select
account changes for verification. These techniques are not without fault. Statistical techniques are poor at detecting low-frequency,
high-risk events, especially in the large populations that exist today (Neter and Loebbecke, 1975; Hall et al., 2001), and non-statistical
techniques suffer from targeting biases due to the selective nature of judgmental sampling methodology (Deming, 1954; Arkin,
1957; Blocher and Bylinski, 1985; Elder and Allen, 1998; Hall et al., 2000). In the era of big data, such issues are further exacerbated.
When testing millions, if not billions, of records, a sample size of 300 or even 3,000 may not provide sufficient evidence for
conclusions. Simply increasing the sample size without testing the entire population will not solve the problem, as indicated by Neter and
Loebbecke (1975).
In an attempt to modernize the audit approach to address these issues, five senior audit partners, each from a different one of the
world’s largest audit firms, partnered with our research team. These audit partners provided us with data in the form of a questionnaire


(Table 4), freeform responses as a proxy for audit brainstorming (Table A-1), and general guidance throughout the study. This provided
us with a real-world proxy for experienced auditor judgment when necessary. The insights and approaches utilized here were
subsequently used by those firms to provide instruction on how to apply alternative sampling techniques such as the one demonstrated in this
paper.
Some suggested modern methodologies use full population examination techniques such as suspicion scoring and exception
weighting to resolve the aforementioned sampling issues (Kim, 2011; Issa, 2013; Kim and Kogan, 2014; No et al., 2019). These
approaches rely on the concept of scoring each record in a population. As examination procedures are applied to the whole population,
each record is given a score that represents the risk of misstatement. Higher suspicion scores indicate riskier records. While different
studies utilize different scoring functions, a summation of the risk scores from all the tests applied to each record is used to determine
the suspicion score assigned to it. This paper describes a methodology for using this technique that we have named Multidimensional
Audit Data Sampling (MADS). MADS is a process of full population examination involving data processing, filtering, suspicion scoring,
and finally, record prioritization, the process of ranking records by risk. This method has been employed in several different
accounting contexts such as revenue recognition or payroll cycles (Lee et al., 2019; No et al., 2019; Yoon et al., 2019), supporting its
potential use as an audit data analytic technique. This paper extends the literature on this approach by applying MADS to the GL, which
represents the core of accounting data and has not been studied before.
Our consortium of practitioners and researchers agreed that the advantage of full population testing in combination with suspicion
scoring is the depth of insight that is gained. Such methodologies may detect typical issues that auditors wish to identify, but they may
also uncover low-frequency high-risk records that may otherwise go undetected. Furthermore, they provide accurate audit evidence
concerning portions of the population that are not flagged by other audit tests and are “error-free” with respect to tests completed (Lee
et al., 2019; No et al., 2019; Yoon et al., 2019). While statistical samples extrapolate information on the error-free portion of the
population with a degree of mathematical uncertainty, full population testing methodologies such as MADS enable auditors to detect
records that are error-free with respect to their test criteria (No et al., 2019). Such a finding can be used as a form of audit evidence and
give auditors a greater level of confidence in their audit opinion.
This paper takes a design science approach to address an accounting issue (Kogan et al., 2019). In this case, the problem of ensuring
the fair representation of a financial statement is viewed through the lens of GL account updates. The artifact used to tackle this issue is
the MADS methodology. This methodology was developed to surmount sampling issues in large populations (No et al. 2019) and has
been applied successfully in other audit-related contexts (Lee et al., 2019; No et al., 2019; Yoon et al., 2019). This application instance
is unique because changes to accounts in the GL represent a link between business events and financial statements. The effectiveness of
this approach is evaluated through assessments of the importance of findings and risk evaluations provided through consultations with
auditors. If the MADS methodology can detect issues or risks rated by auditors as high concern or interest, this application may be
considered successful.
The goal of this paper is twofold. Primarily, the goal is to illustrate the effectiveness of MADS as an artifact in detecting issues within
a GL dataset. Secondly, the paper contributes by demonstrating how examining individual changes to accounts within the GL can lead
to important audit-related discoveries. Several insights are gained concerning earnings management, low-frequency high-risk
problems, and the effectiveness of internal controls.
To achieve these goals, we applied the MADS methodology to postings related to changes of accounts within the GL of a large multi-
national manufacturing company. The audit objectives are the traditional audit links between financial statements and granular level
data, such as GL postings. To this end, great care was taken to ensure that traditional audit objectives were being met throughout this
study. Filters were designed to test and examine the GL at the posting level to ensure the aggregated statements were correct. These
filters target a specific audit objective as rooted in the GLARE (General Ledger Audit Risk Evaluation) framework (Freiman, 2020). This
framework highlights specific audit test categories based on the types of data available to auditors and their audit objectives.
Throughout this study, completeness, accuracy, cutoff, classification, rights and obligations, occurrence, and valuation objectives are
tested, representing the traditional management assertions. Internal controls were also tested since failures involving internal controls
over the GL may result in material misstatements in the financial statements. Additionally, auditors are required to perform such tests
of internal controls in the post-SOX era. By applying such an approach to the breadth of business data found within individual changes
to accounts within the GL, insights can be gained over the representational faithfulness of the financial statements.
The remainder of the paper is organized as follows. Section two covers the literature motivation. The following section discusses the
MADS methodology employed in this paper. The subsequent sections cover how MADS was integrated with the data used in this study.
The penultimate section analyzes the results gathered from this study. The final section concludes.

2. Literature motivation

This section will discuss the literature that motivates the necessity for this study. Firstly, the literature about statistical and non-
statistical methodologies is examined. This discussion primarily focuses on the issues that pertain to these sampling approaches.
The literature illustrates how such methodologies may be inadequate in a modern paradigm. These issues provide the stimulus for
developing the MADS methodology applied in this paper. Secondly, the contemporary literature surrounding the development of the
applied MADS approach is reviewed. This literature establishes an avenue for using this alternative approach to examining a GL
dataset, as this paper does.


2.1. Low-Frequency error detection

A key issue that exists with sampling techniques is their effectiveness (or lack thereof) in detecting low-frequency errors in a
population. This problem has been well documented (Neter and Loebbecke, 1975; Hall et al., 2001). Neter and Loebbecke’s
comprehensive analysis applied various sampling techniques to four distinct accounting populations that ranged from accounts
receivable to inventory. They found that none of the applied methodologies were particularly well suited for detecting low-frequency
errors in a population. This issue was more pronounced when dealing with large populations that might have low-frequency, high-
risk areas of concern. This problem was of particular concern to the active audit practitioners in our research cooperative as per our
questionnaire (Table 4).
The logical solution to this problem may be to increase the size of the sample. Sample size plays a vital role in an audit because
errors can only be detected if they are included in a sample. Neter and Loebbecke tested sample sizes of 100 and 200. They
recommended in several cases that a larger sample size be used. It is important to note that this conclusion was based on comparatively
small population sizes (no population had more than 10,000 observations). While the auditors working with our research team
recommended a sample of between 200 and 300, the population in comparison was far more extensive (millions of records). Combined
with the information provided by Neter and Loebbecke, the sizable population in our study indicated a need for either a larger sample
or a new approach. However, Hall et al. (2001) found that the benefits of increasing sample size were limited, if statistically significant
at all. This further indicated a need for a new methodology in large population examination.

2.2. Non-Statistical method biases

Apprehensions about non-statistical methods of sampling date back to the 1950s (Deming, 1954; Arkin, 1957).
Despite concerns and the development of newer methodologies, a survey conducted by Hall et al. (2000) found that non-statistical
sampling was used for 85 percent of audit sampling tasks. It was believed that this might be driven by an emphasis on a client’s
inherent risk and auditor knowledge (Elder et al., 2013). However, several studies have suggested that auditor expectations had
adverse effects on how judgmental sampling methodologies were applied, impacting the resulting sample and interpretation of that
sample (Blocher and Bylinski, 1985; Elder and Allen, 1998; Elder and Allen, 2003). Elder et al. (2013) and Akresh et al. (1988)
provided detailed examinations of the statistical and non-statistical sampling literature.

2.3. Modern methodologies

Due to the aforementioned issues with traditional sampling techniques, several streams of research have been aimed at
methodologies designed to develop samples and rank exceptions. These methodologies are aimed primarily at full population examination and
exception prioritization for investigation. The common problem with applying full population examination is the existence of a large
number of exceptions. The most common solutions to this problem are weighting and suspicion scoring (Issa, 2013) and multi-tiered
filtering (Kim and Kogan, 2014; Kim, 2011). While weighting and suspicion scoring rank records based on various factors, multi-
tiered filtering approaches are designed to sift through large data populations. When used in combination under a system such as
MADS, these approaches can also satisfy the desire to improve audit performance within the established standards. MADS is a
methodology that extends this literature and attempts to provide a new approach to sampling that resolves some of the aforementioned
concerns (No et al., 2019).
A modern approach is to apply a computer assisted audit technique (CAAT). CAATs are defined as the use of any kind of information
technology or software in the audit process to assist the auditor in planning and control and thus simplify the audit process (Sayana,
2003; Hamdan and Al-olimat, 2021). These techniques may prove helpful but are often applied in a piecemeal and patchwork way
that can limit performance (Sayana, 2003). Paukowits and Paukowits (2000) document the struggles internal auditors face
due to the lack of an application framework for CAATs. As a result, generalized audit software (GAS) has been developed to take a holistic
approach to applying CAATs. Even these GAS approaches are not always consistently applied, as they may be developed with specific goals
in mind, or at least perceived to be so by auditors (Omoneye, 2013). The application of MADS seeks to extend this literature. MADS is a
CAAT with a generalized audit approach that is adaptable to each audit. By applying the broader MADS methodology to each audit and
making customizable tweaks, auditors will be able to gain the efficiencies of CAATs through a structured and adaptable framework.
Unfortunately for modern auditing, there is a lack of competing methodologies to deal with some of these concerns. This, in
combination with the interest of practitioners in this project, further highlighted the need for additional research. As a result, this study
applies the MADS methodology to a large GL-related dataset to fill the gap in both the GL and MADS application literature and provide
insight into resolving some of the issues with traditional sampling techniques. In the future, it is hoped that this line of research will
engage academics in developing new methodologies, which include applying MADS to new accounting data sets or improving MADS to
mitigate the limitations mentioned later in this study.

2.4. Utilization of General Ledger datasets

This study also examines the effectiveness of GL-related datasets for drawing conclusions about audit assertions. To mimic a live
audit scenario, we incorporate additional data sources such as a chart of accounts and employee data to supplement the GL-related
dataset. Such datasets may provide additional information useful in auditing GL data. While there is a dearth of literature on this
subject, there has been discussion of the necessity for such studies. Debreceny et al. (2005), for example, outlined how Embedded Audit


Fig. 2. MADS Visual Example.

Modules could be utilized with ERP systems to facilitate audits. They mentioned that metadata, similar to that used in this study, could
be leveraged to better understand audit data and draw more insightful audit conclusions. They also detailed the advantages of taking a
holistic organizational approach to auditing data such as using multiple data sources in conjunction. This sentiment was echoed by
Gray and Debreceny (2014), who asserted the need to audit the linkage between journal entries and the GL. Our study aims to provide a
method for auditing this linkage by examining the incremental increases and decreases in account values in the GL.
Several studies outline the benefits of combining data sources. These arguments were best summarized by Earley (2015), who
surmised that to leverage analytics properly, multiple data sources should be combined. This is consistent with how an auditor with
access to many datasets may approach audit tests. While our access to the entire company was limited, we were provided with three
data sources. One data source used in this study is data on employees. While this data may be available to auditors, it is separate from
the GL portion of an audit.
To further contribute to the literature, this study 1) utilizes data on the increases and decreases of accounts in the GL that represent
a link between journal entries and the GL postings, 2) combines multiple, largely unrelated data sources to enhance the understanding
of the target data, and 3) illustrates the advantages and effectiveness of full population testing.

3. Methodology

For financial statements to be considered a fair representation of a company’s financial position, the information utilized in these
documents must be complete and accurate. The most granular level of this data, transactional level information, is represented as
account balances in the GL. As transactions occur, they are reflected in updates to account balances in the GL. The MADS methodology
utilizes filters to highlight updates that auditors would consider indicative of increased audit risk. Based on the evaluated importance of these
filters and which filters a particular update failed, the update is assigned a suspicion score. These scores are then used to rank account
updates from most to least risky.1
The first step in applying MADS is to obtain a data set of all the changes to accounts within the GL. As individual
business events occur, they are recorded as journal entries that are posted to the related ledger. The GL balances at the end of the
accounting period are placed in the financial statements.
The approach described in this paper targets the individual changes in each account as they occur. Today such data is often
recorded using company-wide ERP systems. While not all business data is reflected in these individual changes to accounts, enough
information is captured to examine several audit objectives, as will be outlined later in this paper. Because such changes to accounts
happen in large volumes, a methodology must be applied to conduct testing over the population. It is imperative that such a

1
The overarching methodology applied in this study was the MADS approach to sampling developed by No et al. (2019). This methodology has
been replicated in other studies (No and Huang 2019, Yoon et al. 2019, Lee et al. 2019) successfully, and the original outlined framework is
displayed in Fig. 1 (No et al. 2019).


Fig. 3. Implemented Process Diagram.

methodology properly examines the changes to account balances.


The MADS approach begins with determining potential problems that may occur in the existing dataset. Based on the determination
of these problems and risks, filters are designed to detect each problem. This process of risk evaluation relies
predominantly on auditor judgment and is in line with established current standards as part of the audit planning process (AS 2101). In
this application of MADS, we incorporate the GLARE framework (Freiman, 2020), which incorporates recommended tests based on
identified risks. These tests are rooted in academic work, standards, and best practice guidelines. In addition to this, we worked in
conjunction with auditors from audit firms. Their advice and feedback acted as a proxy for auditor judgment. These auditors were from
five of the largest accounting firms. Their input was in the form of quantifiable written feedback and oral free form discussion.
As filters are applied to the entire dataset, the output is evaluated by the research team and audit partners, and the filters are refined
when needed. This iterative process is necessary to ensure that filters flag only risky postings and not legitimate business practices.
Audit Standard 14 emphasizes collecting evidence related to “unusual or unexpected transactions, events, amounts, or relationships”
(PCAOB, 2010). In some cases, a secondary filter is applied to a sub-population of items that are flagged by a primary filter. This
secondary filtering is conducted to provide further insight into risky patterns within the data and is to be used at the auditor’s discretion
as per the MADS methodology. This procedure was designed based on results from primary filters and consultations with auditors
(auditor judgment). For example, we discover backdated records (primary filter) and further filter those to identify those records
backdated across a financial quarter (secondary filter).
Weighting the records is conducted in line with the guidance provided by Audit Standards 11, 12, and 14 (PCAOB, 2010), which
pertain to evaluating analytic results and material impacts of findings. In practice, auditors would assign weights based on their
judgments regarding the importance of each filter, risks associated with that filter, and the results of applying the filter. While auditor
judgment is subjective and can never be fully removed or summarily evaluated, repetitious use of our approach over many clients may
result in audit firms issuing quality assurance guidance for suggested weights and testing. Each general ledger posting is then assigned
a suspicion score based on the filters it has violated. This score is based on the weights of each filter that a particular posting may have
failed. These scores are used to rank the records for final prioritization. A visual representation of this process, including example
filters, is displayed in Fig. 2.
Additional analytic techniques may be applied to provide a further meta-analysis of the filter results at any point during this process.
An additional materiality analysis was conducted to ensure that the final prioritized sample contained postings of a material nature.
This will be discussed in detail in the following sections.


Table 1
Variable Descriptions.

Variable | Description | Data Type | GLARE Class
Account Number | The identifier of each account being credited or debited. | Alphanumeric | Account
Amount (local currency) | The value of the credit or debit. | Numeric | Adjustment Value
Description | A brief description of what the line item relates to. This usually includes a numeric code along with a few identifying words. | Alphanumeric | Other
Effective Date | The date on which the journal entry became effective. | Date | Timing
Entry Date | The date that the journal entry was entered into the system. | Date | Timing
Preparer ID | Identifies who prepared and entered an entry into the system. | Alphanumeric | Entrant Information
Source | A two-letter code that classifies the type of entry being made. | Alphabetic | Other
Document Number | The identifier of each journal entry. All line items from the same entry have the same document number. | Numeric | Other
Account Type | The type of an account, such as asset, liability, equity, revenue, or expense. | Alphabetic | Account/Other
Account Class | Name of the account. | Alphabetic | Account
Employee Status | Status of employment based on the employee data as of March 2019. | Numeric | Entrant Information/Other

Table 2
Omitted Variables.

Variable | Description | Data Type | Reason for Omission
Amount (group currency) | The value of the line item in group currency. | Numeric | All records have zero values.
Business Unit | Denotes the business unit that the line item pertains to. | Numeric | We examined one business unit, so all records have the same value.
Fiscal Period | Indicates the fiscal month in which the event transpired. | Numeric | This information can be derived from another variable (effective date) in the data.
Document Type Description | A brief description of the event which generated the entry. | Alphabetic | This was a 1:1 match with another variable in the data.
Preparer ID Description | A brief description of the preparer. | Alphanumeric | All records have missing values.

4. Overview of instance implementation

The overall implementation of our approach is detailed in Fig. 3. MADS is designed as a flexible process but has never been applied
to GL-related data. As a result, the original MADS framework (No et al., 2019) was adapted to deliver the process illustrated in Fig. 3.
The changes fit within the spirit and overall guidance that the MADS framework provides. Primarily, our approach incorporates data
analytic techniques into the initial risk-based filtering. The following few sections will outline the implementation of each stage and
provide greater detail.

5. Data

The data used in this study pertains to one operating segment of a large multi-national manufacturing firm. The external auditor of
the firm provided the data. As a result, there was very little ability to follow up with the firm itself if any issues, questions, or findings
occurred during this study. While the external auditor of this particular firm was able to provide insight into some of the concerns
raised during the application of the MADS process, this feedback was limited. Despite this, there is sufficient evidence to suggest that
this methodology did detect potentially material issues related to account changes reflected within the final balances found in the GL.
Issues such as potential failures of controls, earnings management, and system error are identified and discussed in detail in the results
section.
Three separate datasets were used: a log of account change postings affecting GL balances, the chart of accounts, and the list of
employees authorized to make those changes. These were the only available data sources provided by the auditor and were in line with
data utilized by auditors in practice.

5.1. The General Ledger

The general ledger was for one operating segment of a large multi-national manufacturer that functions as an independent business.
As such, the full gamut of business activities that result in account balance changes are represented. In total, there are 3,398,356 data
points, which reflect changes to GL accounts. These changes stem from 546,104 unique journal entries. Each of the changes to account
balances was effective during the 2014 fiscal year. The average absolute value of an individual account change was $2,322, and the
total absolute value of all of the changes within the dataset is $2,434,788,451.56. Table 1 outlines all 13 of the variables in this study,
including the eight present in the primary dataset. This includes all variables in the available data.
Five of the 13 variables were excluded from examination for the following reasons: (1) all or most values associated with that


Table 3
GLARE Framework Overview.

GLARE Category | Potential Data Variable(s) | Audit Objective(s)
Value | Adjustment Value | Valuation; Accuracy
Frequency | Adjustment Value; Entrant Data; Timing Information; Account Information | Existence; Accuracy; Valuation
HR Related | Entrant Data; Adjustment Value; Account Information | Existence; Accuracy
Temporal | Timing Information | Cutoff; Completeness
Accounting Estimations | Account Information; Other | Valuation; Rights and Obligations; Accuracy
Controls | Adjustment Value; Entrant Data; Timing Information; Account Information; Other | Completeness; Existence; Accuracy
Predictive | Adjustment Value; Account Information; Entrant Data; Timing Information; Other | Completeness; Existence; Accuracy; Cutoff

variable were missing, (2) all values for that variable were the same, (3) a variable was derived from another in the same data set (e.g.,
group currency converted to local currency), or (4) a variable had a 1-to-1 relationship with another variable in the same data set. Each
of these excluded variables, along with a description of their omission, is listed in Table 2.
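The screening described above can be automated. The sketch below, in Python with pandas, shows one way to flag columns that meet the omission criteria; the function names and the use of pandas are illustrative assumptions rather than the tooling actually used in the study.

```python
import pandas as pd

def screen_columns(df: pd.DataFrame) -> dict:
    """Flag columns meeting omission criteria (1) and (2): all missing or constant."""
    report = {}
    for col in df.columns:
        non_null = df[col].dropna()
        if non_null.empty:
            report[col] = "all values missing"
        elif non_null.nunique() == 1:
            report[col] = "all values identical"
    return report

def is_one_to_one(df: pd.DataFrame, a: str, b: str) -> bool:
    """Check criterion (4): a strict 1-to-1 relationship between two columns."""
    pairs = df[[a, b]].dropna().drop_duplicates()
    return pairs[a].is_unique and pairs[b].is_unique
```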

5.2. Alternative data sources

Auditors have access to more than just the GL and account changes in a live audit environment. Additionally, to run some filters,
information beyond what is provided in the primary dataset may be necessary. For example, to determine if an employee was
authorized to make an entry we need employee data, or to determine if an account is a revenue or liability account, we need the Chart
of Accounts. To satisfy these criteria, two additional sources of data were utilized. The first of these was a chart of accounts (COA). The
COA was used to decipher the encoded account numbers found in the primary dataset. This was also used to determine the account
types and classes and establish expected account behavior (contra vs normal account) when analyzing account level aspects of various
changes.
The second additional source was a list of employees authorized to make account changes. The list included employee names and
their corresponding ID number. This information was used to examine employee patterns of behavior and generate the final variable,
employee status, in order to conduct some of our tests. This binary variable, “Employee Status,” denoted whether the employee was still
authorized to make entries at the beginning of 2019.
Although the changes to accounts pertain to 2014 while the COA was from 2018 and the employee list from early 2019, this had
minimal impact on the integrity of the study. Few, if any, changes relate to accounts not on the COA. Additionally, a filter was designed
to account for discrepancies between those preparing postings in 2014 and those authorized to do so in 2019. The auditors
acknowledged that this was common practice. Firstly, this approach highlights how they utilize data from multiple sources during an audit
task in practice. Secondly, it highlights their ability to audit a company using data from prior-year audits.
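As a rough sketch of this data enrichment step, the snippet below joins the chart of accounts onto the GL postings and derives the binary Employee Status variable from the 2019 roster. File names and column names are hypothetical assumptions; the real extracts came from the external auditor.

```python
import pandas as pd

# Hypothetical file and column names; the actual extracts were supplied by the auditor.
gl = pd.read_csv("gl_postings_2014.csv")          # one row per account change
coa = pd.read_csv("chart_of_accounts_2018.csv")   # account_number -> account_type, account_class
roster = pd.read_csv("employee_roster_2019.csv")  # employee_id, employee_name

# Attach account type and class from the chart of accounts.
gl = gl.merge(coa[["account_number", "account_type", "account_class"]],
              on="account_number", how="left")

# Employee Status: 1 if the preparer is still authorized in early 2019, 0 otherwise.
gl["employee_status"] = gl["preparer_id"].isin(roster["employee_id"]).astype(int)
```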

6. Filter generation

After reviewing the data, the next step was to determine potential risks and associated filters. The procedure implemented for this
involves five steps: 1) examine available data, 2) determine potential problems, 3) identify audit objectives, 4) create filters for those
problems, and 5) prioritize filters using auditor judgment. Based on a fundamental analysis of the GL data and a common
understanding of both the industry and audit-based principles, a list of risks with proposed filters was generated. These risks were validated
with our senior audit partners to ensure we identified those that were most appropriate. One key portion of the audit planning process
is auditor judgment. Filters do not necessarily prove a problem but rather indicate that one may exist. The purpose of substantive
testing, conducted after a sample is generated, is to follow up on such issues and determine acceptability.
To eliminate as much subjectivity and need for expertise as possible, the GLARE framework (Freiman, 2020) is used to satisfy the
filter generation process’ middle three steps of identifying risks and selecting tests. The GLARE framework utilizes a multi-step process
for suggesting filters. First, the dataset is analyzed. Each variable is assigned to one of five variable types: entrant information,
adjustment value, timing information, account, and other. These are each linked with audit objectives. Based on the audit objectives


Table 4
Auditor Assessments.

What could go wrong | Potential Tests | Audit Objectives (Occurrence / Completeness / Accuracy) | Level of Importance: High (H), Medium (M), Low (L)

Entries made multiple times | Search for duplicate entries | x x | Firm 1: High; Firm 2: Low; Firm 3: High; Firm 4: High; Firm 5: Medium
Incorrect account credited/debited | Examine entries into an account that fall outside 1.5 times the Inner Quartile Range | x | Firm 1: High; Firm 2: Medium; Firm 3: Medium; Firm 4: *BLANK*; Firm 5: *TBD*
Incorrectly entered amounts | Credit Debit Balancing | x | Firm 1: Medium; Firm 2: Low; Firm 3: Medium; Firm 4: *BLANK*; Firm 5: High

Timing Problems:
Entries outside business hours (potential fraud) | Date on weekend or holiday | x | Firm 1: Medium; Firm 2: *Depends*; Firm 3: Low; Firm 4: Medium; Firm 5: Low
Backdating entries | Effective date and entry date separated by greater than 35 days | x x | Firm 1: High; Firm 2: High; Firm 3: Medium; Firm 4: High; Firm 5: Medium
Aggressive earnings management | Effective date or entry date within 7 days of quarter end/start and separated by 14 days or more | x x | Firm 1: High; Firm 2: High; Firm 3: High; Firm 4: High; Firm 5: Medium

Fraud and Internal Controls:
Employee makes entries outside of their normal value range | Examine values outside 1.5 times the Inner Quartile Range for employee entries | x | Firm 1: Medium; Firm 2: Medium; Firm 3: Medium; Firm 4: Low; Firm 5: *TBD*
Systematic error occurs within a department | Examine values outside 1.5 times the Inner Quartile Range based on the department or source | x x | Firm 1: Medium; Firm 2: Medium; Firm 3: *BLANK*; Firm 4: High; Firm 5: *TBD*
Repeating payment error/fraud | Entries from one account with the exact same value made within the same day or the following day | x x | Firm 1: Low; Firm 2: Low; Firm 3: High; Firm 4: High; Firm 5: Medium
Entries disguised as “normal” based on their descriptions | Examine entries based on description to determine entry type; flag entries outside 1.5 times IQR | x | Firm 1: Low; Firm 2: Low; Firm 3: *BLANK*; Firm 4: Medium; Firm 5: *TBD*

Table 5
Tests and Target Objectives.

Test Category | Test | Audit Objective | GLARE Risk Category
Data Entry | Duplicate Entries | Occurrence | Control/Frequency
Data Entry | Zero Sum | Completeness | Control/Value
Data Entry | Same account debited and credited in one entry | Accuracy | Control/Predictive
Data Entry | Description | Test of Controls | Control
Data Entry | Records per entry | Occurrence | Frequency
Timing | Backdated Records: Regular | Cutoff | Temporal
Timing | Backdated Records: Earnings Management Risk | Cutoff | Temporal
Timing | Advanced Entries: Regular | Cutoff | Temporal
Timing | Advanced Entries: Earnings Management Risk | Cutoff | Temporal
Timing | Weekend Entries | Test of Controls | Temporal/Control
Preparer Related | Multiple Names | Test of Controls | Control
Preparer Related | Inactive Employees | Test of Controls | HR Related/Control
Other | Intracompany transactions | Rights and Obligations | Control
Other | Unusual Source | Classification | Predictive
Other | Cash or Inventory Outflow | Rights and Obligations | Estimation/Value
Other | Unusually high value | Valuation | Value

and available data, each of the seven GLARE categories suggests potential tests. These categories and their links to data variables and
audit objectives are outlined in Table 3.
The dataset in question contained variables that covered all four of the major necessary GLARE variable classes (adjustment
amount, entrant information, account data, and timing data) and provided several “other” category variables. Based on this
information, ten initial filters were generated. These represented an exhaustive use of our available variables and targeted six of the seven
GLARE risk categories (the exception being accounting estimations).
The auditors were consulted in order to achieve a consensus decision on our approach. To formalize this process these auditors
ranked the level of importance and concern associated with each potential problem as high, medium, low, or irrelevant. These filters


Table 6
Filter Results.

Test Category | Test | Weight | Number of Records | Portion of Population
Data Entry | Duplicate Entries | 3 for duplicate records. | 151,114 | 4.45 %
Data Entry | Duplicate Entries | 1 for records associated with a duplicate. | 759,191 | 22.34 %
Data Entry | Zero Sum | 2 for all records within the entries. | 0 | 0.00 %
Data Entry | Same account debited and credited in one entry | 3 for records that fail. | 52,373 | 1.54 %
Data Entry | Same account debited and credited in one entry | 1 for records in the same entry. | 204,674 | 6.02 %
Data Entry | Description | 3 for records with missing description. | 2,304,049 | 67.80 %
Data Entry | Description | 1 for all other records in the entry that contains two or more records with missing descriptions. | 35,630 | 1.05 %
Data Entry | Records per entry | 1 to all records in a journal entry that contains more than 100 records. | 696,423 | 20.49 %
Timing | Backdated Records: Regular | 1 for all records of an entry backdated by more than 28 days. | 2,408 | 0.07 %
Timing | Backdated Records: Earnings Management Risk | 2 for all records of an entry backdated across a quarter and related to revenue or expense accounts. | 17,395 | 0.51 %
Timing | Advanced Entries: Regular | 1 for all records entered 28 days or more before becoming effective. | 5,273 | 0.16 %
Timing | Advanced Entries: Earnings Management Risk | 3 for all records of an early entry made within the last week before the end of a quarter, with an effective date in the following quarter, and related to revenue or expense accounts. | 941 | 0.03 %
Timing | Weekend Entries | 1 for all records entered on a weekend. | 149,881 | 4.41 %
Preparer Related | Multiple Names | 1 for all records entered by an Employee ID associated with multiple different employee names. | 0 | 0.00 %
Preparer Related | Inactive Employees | 2 for all records entered by an employee that was terminated after the accounting period. | 464,468 | 13.67 %
Other | Intracompany transactions | 1 for all intracompany transaction records. | 128,223 | 3.77 %
Other | Unusual Source | 1 to all records of an entry if the preparer is using a source for the first time. | 5,107 | 0.15 %
Other | Cash or Inventory Outflow | 1 for all records pertaining to cash or inventory outflows. | 637,933 | 18.77 %
Other | Unusually high value | 1 for all records of journal entries in the top 1 % of value. | 65,848 | 1.94 %

were generated to target specific audit objectives. Additionally, the auditors were given the opportunity to provide further feedback in
the form of unidentified risks or tests that they would suggest. These results can be found in Table 4. The additional feedback is
included in the Appendix.
Based on the auditors’ feedback, additional filters were necessary to target those risks they felt were essential or ones they felt were
missing. As a result, the GLARE framework was used again, and additional filters (such as targeting for earnings management) were
generated. The final list of filters based on this feedback, and additional data sources, can be found in Table 5. In this final list of filters,
each GLARE risk category is represented. These filters are designed to test a host of traditional audit objectives. The link between the
filters, their audit objectives, and the designated GLARE risk category is outlined.

7. Original filters

Of the 14 filters used (see Table 6), five came directly from the list of filters proposed to the auditors (Duplicate records, Credit and
Debit balancing, and three timing filters). These filters were straightforward and targeted issues identified as significant risk areas by
those auditors. The timing filters consisted of one filter that examined entries made on weekends, one filter designed to look for
backdated alterations, and another filter that looked for changes in balance made in advance. A majority of these filters were designed
to ensure that adjustments to accounts were made in the appropriate time frame and examine management assertions involving cutoff.
Systemic issues involving backdated or advance changes may impact financial statements and result in a material misstatement of
under- or over-stated revenues. Because of the relation to earnings management, the latter two filters were supplemented with
sub-filters. While failing any of these filters did not confirm a problem, it indicated a need for follow-up substantive testing.
A backdated change to an account was defined as a change whose effective date occurred more than 28 days (one month) prior to its
entry date. As mentioned, an additional earnings management sub-filter was applied. This sub-filter further narrowed down all
backdated alterations that occurred over a change of quarter. The 28-day time window was maintained to avoid detecting standard
practices at the end of the quarter. These typically occur immediately surrounding the quarter change. This same process was applied
to adjustments made in advance. Advance changes were defined as having an entry date that preceded an effective date by more than
28 days. A pattern of earnings management may be detected by examining for systematic preferences to backdate specific types of
account changes and enter others in advance. While some of these changes may be immaterial, the practice will cause material
skewness in the fair representation of financial data at the end of each quarter.
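A minimal sketch of the two timing filters and their earnings-management sub-filters is shown below, assuming the column names from the earlier snippets and treating any posting whose effective and entry dates fall in different calendar quarters as crossing a quarter (a simplification of the advanced-entry sub-filter, which additionally requires entry within the last week of a quarter).

```python
import pandas as pd

gl["entry_date"] = pd.to_datetime(gl["entry_date"])
gl["effective_date"] = pd.to_datetime(gl["effective_date"])

lag_days = (gl["entry_date"] - gl["effective_date"]).dt.days

# Primary filters: more than 28 days between entry and effective dates.
gl["backdated"] = lag_days > 28          # effective date more than 28 days before entry
gl["advance_entry"] = lag_days < -28     # entered more than 28 days before becoming effective

# Earnings-management sub-filters: the gap also spans a quarter boundary and the
# posting touches a revenue or expense account.
crosses_quarter = (gl["entry_date"].dt.to_period("Q")
                   != gl["effective_date"].dt.to_period("Q"))
rev_exp = gl["account_type"].isin(["Revenue", "Expense"])
gl["backdated_em_risk"] = gl["backdated"] & crosses_quarter & rev_exp
gl["advance_em_risk"] = gl["advance_entry"] & crosses_quarter & rev_exp
```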

8. Auditor suggested filters

Five filters were developed to target issues that were identified during our consultation with the auditors, much like would be done
during the brainstorming phase of live audit planning. These filters include the two employee filters (multiple preparers and inactive
employees), two filters pertaining to business units (unusual source and intracompany transactions), and the filter designed to detect


missing descriptions. The following subsections detail each of the broad categories and the five filters themselves.

8.1. Employee filters

The employee-related filters utilized employee data to detect potential risks or irregularities. The first filter looking for multiple
preparers is a test of an internal control requiring that only one individual prepare an alteration to an account. A failure to meet this
control may not directly indicate a problem with the control, but it will indicate an irregular business event that may warrant further
investigation. This would be a direct example of how this approach can also evaluate the effectiveness of internal control pursuant to
Sarbanes-Oxley Act requirements.
The filter for inactive employees was developed in conjunction with one of the auditors consulted in this study. This filter takes
advantage of the fact that the GL data source predates the provided employee roster by five years. Some preparers who made postings
in the 2014 GL were no longer on the 2019 employee roster. While some of these employees might have left the firm to pursue other
opportunities, some might have been fired due to substandard performance. As a result, the records prepared by these individuals
should come under increased scrutiny. Since we cannot tell which ones were fired and which left voluntarily, we apply this to all
employees no longer with the company. While temporally disjointed data such as this will likely not be utilized in a contemporary
audit, it is a substitute for filters that would otherwise examine information on employees hired or fired over the previous fiscal year(s).
It also highlights the possibility and benefits of combining distinct data sources.
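The two preparer-related filters could be expressed as in the sketch below, continuing the assumed roster and gl objects from the earlier snippets; this is an illustration, not the study's actual implementation.

```python
# Multiple-names control: a preparer ID should map to exactly one employee name.
names_per_id = roster.groupby("employee_id")["employee_name"].nunique()
shared_ids = names_per_id[names_per_id > 1].index
gl["multiple_names"] = gl["preparer_id"].isin(shared_ids)

# Inactive-employee filter: postings prepared by someone no longer on the 2019 roster.
gl["inactive_preparer"] = gl["employee_status"] == 0
```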

8.2. Business unit filters

Two filters were developed to detect abnormal entry practices within business units. The first, unusual source, was used in
conjunction with employee testing. This filter targets unusual behavior within a business source segment. Employees within
departments typically utilize the same source as each department is assigned a specific source. Therefore, it would be unusual for an
employee to utilize different sources from outside their normal operations. As a result, a filter was developed to search for instances
where employees used a source that was abnormal based on their standard pattern of posting behavior. This filter aims to ensure that
these particular account changes are classified correctly.
The second of these filters was engineered to look for intracompany transactions. While this is a normal business practice, the
presence of a large volume of these transactions may indicate an issue. This filter aims to ensure that the rights and obligations of a
particular asset or intracompany transfer are accounted for correctly. Such a filter also aims to ensure that a change has been entered
into the correct account. Additionally, these types of transactions should be subject to further scrutiny, as evidenced in the failure of
auditors to find intracompany transference of debt in the case of Enron.
The final filter examines the descriptions field. Initially, the aim was to analyze these descriptions for abnormal text patterns. This
concept was discussed with the auditors and the feedback was very positive. However, upon further examination of the posting
descriptions, it was found that many changes lacked a description. The lack of data to establish a “normal” behavior pattern prevented
this test from being conducted. However, since the lack of description represented an internal control deficiency for this company, a
filter was developed to detect this anomaly.
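Below are sketches of the unusual-source and missing-description filters under the same assumed column names; the unusual-source rule here flags the first time a preparer uses a given source code, which is one reading of the filter described in Table 6.

```python
# Missing-description control: the description field is mandatory per client policy.
gl["missing_description"] = gl["description"].isna() | (gl["description"].str.strip() == "")

# Unusual source: flag postings where a preparer uses a source code for the first time.
gl = gl.sort_values("entry_date")
gl["unusual_source"] = ~gl.duplicated(subset=["preparer_id", "source"])
```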

8.3. Additional filters

The remaining four filters do not specifically relate to suggestions made by the auditors; however, they are designed to tackle risks
or issues similar to those evaluated as being important. Two of these relate to data entry procedures and accuracy. They are in line with
the risk that preparers may not make changes to accounts correctly.
The zero-sum filter makes sure that the preparer accurately transfers all the details of a particular journal entry into account
balance changes. This filter is designed to test for completeness and accuracy. In an ideal scenario, credits and debits in an individual
journal entry balance. Failure in this regard would constitute both a failure of internal controls and some level of misstatement.
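A sketch of the zero-sum check, assuming amounts are signed so that debits and credits within one document number should net to zero:

```python
# Flag every record of a journal entry whose signed amounts do not net to zero.
entry_totals = gl.groupby("document_number")["amount"].sum().round(2)
unbalanced_docs = entry_totals[entry_totals != 0].index
gl["zero_sum_fail"] = gl["document_number"].isin(unbalanced_docs)
```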
The next filter, records per entry, is designed to highlight account alterations that are part of an extensive journal entry. After
relating account changes to their originating journal entries, there were many entries with 100+ account alterations. Transferring this
volume of data increases the likelihood of user error. As a result, a filter was applied to examine these large batches of alterations. This
filter targets management assertions of accuracy and occurrence, ensuring that no additional changes are being recorded that did not
happen.
The final two filters also highlight high-risk areas. These were designed based on the feedback from the auditors. The first filter
examines records that directly impact revenues and the income statement. In addition, it targets the ownership of rights and
obligations that involve accounts related to the income statement. No filters up until this point directly target this aspect of financial
reporting. To this end, a filter was designed to highlight all changes in accounts related to a cash inflow or outflow. By adding emphasis
to these changes, a direct line of audit data can be drawn to aspects of the income statement.
The final additional filter examines account changes in the top one percent by value. While materiality is covered later in
this methodology, this filter is applied to ensure that high-value changes, which may contain issues, make it into our final sample. This filter
aims to ensure that such high-magnitude individual changes to account balances are accurate.
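Sketches of the records-per-entry and unusually-high-value filters follow, again using the assumed gl columns; the high-value rule here ranks journal entries by total absolute value, one plausible interpretation of the "top 1 % of value" wording in Table 6.

```python
# Records per entry: flag all records of journal entries with more than 100 line items.
lines_per_entry = gl.groupby("document_number")["document_number"].transform("size")
gl["large_entry"] = lines_per_entry > 100

# Unusually high value: flag all records of entries in the top 1 % by absolute value.
entry_value = gl.groupby("document_number")["amount"].transform(lambda s: s.abs().sum())
gl["high_value"] = entry_value >= entry_value.quantile(0.99)
```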


Table 7
Suspicion Score Results.
Total records

Suspicion Score No amount filtering ≥ $3,000

13 25 N/A
12 524 N/A
11 372 2
10 1,330 3
9 5,470 198
8 40,090 526
7 120,730 2,665
6 280,384 8,429
5 545,121 12,973
4 642,944 32,763
3 819,279 43,155
2 127,738 7,334
1 247,685 23,526
0 566,664 25,738

9. Suspicion scoring

A suspicion score can be defined as a number representing the total perceived riskiness of a record as a function of the filters that it
failed. These functions can take many forms; however, we decided to keep ours simple for a variety of reasons: 1) ease of use and
understanding by practitioners, 2) simplification of the necessary guidance in applying our approach in practice, and 3) a stated desire in
consultation with auditors to avoid unnecessary complication. The function that resulted is denoted in Equation 1. The
equation sums the weights of each individual filter that a particular change violates. The sum of these weights becomes the suspicion
score for that change to the GL. These scores are then used to rank records before the final stage, sampling, is conducted.
Equation 1: Weighted Suspicion Scoring.

$$ s_i = \sum_{j=1}^{n} x_{ij} w_j $$

where $x_{ij} = 1$ if record $i$ violates filter $j$ and 0 otherwise, and $w_j$ is the weight of filter $j$.

The feedback from auditors was used to assign the various weights used in Equation 1. Initially, those filters associated with the
highest level of perceived importance were assigned the highest weights, as per Table 4. Based on initial results, along with subsequent
input and feedback from the participating auditors, these weights were later fine-tuned. This emulates the risk assessment processes used in a
live audit. As auditors review initial results, they adjust risk expectations to fine-tune their substantive tests. In addition to the primary
weights, secondary weights were also applied. These were applied when a filter flagging one suspicious record increased suspicion of
other related adjustments. For example, if enough records of a particular account are suspicious, the entire account may be labeled as
suspicious. In this case, a small secondary weight would be applied to the entire account in order to represent this small amount of
increased risk. The complete list of the primary and secondary weights for each filter can be found in the results section in Table 6.
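Equation 1 reduces to a weighted sum of boolean filter flags. The sketch below illustrates it with a subset of the weights from Table 6; the filter column names follow the earlier snippets and are assumptions.

```python
# Illustrative primary weights keyed to boolean filter columns (see Table 6).
weights = {
    "missing_description": 3,
    "zero_sum_fail": 2,
    "backdated_em_risk": 2,
    "advance_em_risk": 3,
    "large_entry": 1,
    "high_value": 1,
    "inactive_preparer": 2,
    # ... remaining filters and any secondary weights would be added here
}

# Equation 1: suspicion score = sum of weights of the filters each record violates.
gl["suspicion_score"] = sum(gl[f].astype(int) * w for f, w in weights.items())
```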

10. Prioritization and sampling

Once each adjustment was assigned a suspicion score, the population was ranked from highest to lowest by suspicion score. The
suspicion scores resulting from our application ranged from thirteen to zero. The next stage would be to select a sample by determining
a sample size (n) and selecting the top n prioritized records to conduct tests of details. The senior audit partners recommended a sample
size of 300 postings. This was based on their industry experience utilizing similar data sets. An additional concern was that the majority
of the top 300 scoring records might be made up of account alterations with immaterially low amounts.
To address this concern, a stage-two filter was applied to the final ranking of records pursuant to the original MADS framework.
This filter removed from the list any record whose value was below a chosen materiality threshold. The senior audit partners suggested
this process, as a materiality threshold is commonly used on an audit. A threshold of $3,000 was selected. This threshold was chosen
after applying various higher and lower threshold values and analyzing the impact on the final list of suspicious changes. $3,000
was above the average adjustment value of $2,322. Additionally, this threshold still provided a sample of changes that scored as part of
the riskiest score range (≥5). The auditors working with the team evaluated this value as being decidedly lower than what they would
use in practice but agreed that a very conservative auditor may apply this threshold, a factor that should be considered in our study.
The results of prioritized sampling, both with and without the filter, are shown in Table 7. Additional results using different thresholds
can be found in the appendix.
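The prioritization and stage-two materiality filter can be sketched as follows, with the $3,000 threshold and 300-record sample size taken from the text and the column names assumed from the earlier snippets:

```python
SAMPLE_SIZE = 300       # sample size recommended by the senior audit partners
MATERIALITY = 3_000     # illustrative stage-two materiality threshold (absolute value)

ranked = gl.sort_values("suspicion_score", ascending=False)
material = ranked[ranked["amount"].abs() >= MATERIALITY]
sample = material.head(SAMPLE_SIZE)   # prioritized records for tests of details
```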


Table 8
Auditor Evaluation of Risks.
What Could Go Wrong Avg. Score

Aggressive earnings management 2.8


Backdated postings 2.6
Search for duplicate postings 2.4
Incorrect Account Usage 2.33
Systematic Department errors 2.33
Credit debit balancing 2
Repeating payment (extreme values based on account) 2
Abnormal Patterns of Employee Behavior 1.75
Entry outside of Business Hours 1.5
Description testing 1.3

11. Evaluation of results

The results were evaluated in a variety of ways. Firstly, the overall results of the various filters are examined and
discussed. Secondly, the results of particular filters are compared to the evaluated importance of corresponding risk factors as reviewed
by auditors. Finally, the results of the final scored population are discussed with respect to materiality, account type, and risk.

11.1. Results by filter

The results of each filter can be found in Table 6. There are several useful insights that can be gleaned from evaluating this
information.
The first and most important feature of note on this table is that the full population of records passed two of the filters. Each of these
two filters (Zero Sum (row three) and Multiple Names (row 12)) is designed to check basic control efficacy. These examples highlight
how this methodology can affirmatively show that a control is effectively operating as designed. Auditors should not discount a
situation in which zero failures occur when applying this methodology, especially to a dataset such as the GL. While functionality and
effectiveness of the filters should be checked, this result illustrates that these controls operate as intended. However, such results are
not shared by all control-related filters. Two other filters were designed to detect the functionality of basic controls. They each failed to
various degrees. Firstly, a duplication filter was applied that matched instances where records of changes were identical across all
fields. This occurred in over 4 percent of the population. Of these, there were many instances where 100 or more identical matches
were found for one individual account alteration. While this may be explainable as a systematic error, this could imply the existence of
a more serious issue such as fraud. This concern would warrant additional auditor follow-up.
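As an illustration of the duplication check described above, the following sketch flags change records that are identical across every field and counts the copies of each; the use of pandas and the column-agnostic matching are assumptions made for illustration, not the study's exact implementation.

```python
import pandas as pd

def flag_exact_duplicates(changes: pd.DataFrame) -> pd.DataFrame:
    """Flag GL change records that are identical across all fields."""
    # Mark every row that has at least one exact copy elsewhere in the data.
    dupes = changes[changes.duplicated(keep=False)]
    # Count how many identical copies exist for each distinct record,
    # so cases with 100 or more matches surface at the top of the list.
    return (dupes.groupby(list(dupes.columns), dropna=False)
                 .size()
                 .reset_index(name="copies")
                 .sort_values("copies", ascending=False))
```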
The other control violation was detected by a filter looking for missing description fields. The firm’s auditor explained that the
description was a mandatory field under the client’s internal controls. As a result, a variety of text mining-based filters were
planned. However, a preliminary analysis found that more than two-thirds of the account changes were missing a
description. This constituted a substantive breach of an internal rule. Potential explanations range from improper employee training to
system-based recording errors. Regardless of the reasons for the missing descriptions, this constitutes an audit finding that would be
forwarded to the client’s management team for correction.
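A minimal sketch of the missing-description control test follows, assuming a text field named description; the field name is hypothetical, and blank or whitespace-only values are treated as missing.

```python
import pandas as pd

def missing_description_rate(changes: pd.DataFrame,
                             field: str = "description") -> float:
    """Proportion of GL account changes whose mandatory description is blank."""
    desc = changes[field].fillna("").astype(str).str.strip()
    return desc.eq("").mean()  # share of the full population missing the field
```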
These results indicate that the MADS methodology can be applied to detect misstatements or suspicious records for further tests of
details, and that it can be used to evaluate the effectiveness of internal controls. This conclusion should be of particular interest to
auditors who can use this to zero in on the practical effectiveness of various controls and assess the impact that any shortcomings are
having at a financial level. This serves to demonstrate that MADS can operate at least as effectively as traditional methods. The
evidence presented below demonstrates further specific advantages of MADS.

11.2. Results by perceived importance

The second criterion used to evaluate the effectiveness of applying MADS in this context was the ability of the filters to
detect issues that auditors deemed most important. To this end, data from our consultations with auditors were compiled into the meta-
analysis found in Table 8. This analysis aims to formalize practitioners’ subjective perceptions of the importance of each risk, and it
was conducted as follows. First, the risk ratings (low, medium, and high) of
various risks were assigned scores of 1, 2, and 3, with a higher risk receiving a higher score. Then an average importance score was
calculated for each potential problem discussed with the auditors. In this way, the lowest possible concern was 1, and the highest was
3, with the averages falling throughout that range. The results of this calculation can be found in Table 8.
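As a simple illustration of this calculation, the low/medium/high ratings collected from each consulted auditor can be mapped to 1/2/3 and averaged per risk. The ratings shown in the usage comment are placeholders, not the auditors' actual responses.

```python
# Map qualitative ratings to scores and average them per "what could go wrong" item.
RATING_SCORE = {"low": 1, "medium": 2, "high": 3}

def average_importance(ratings_by_risk: dict) -> dict:
    """Average importance score (1 to 3) for each risk."""
    return {risk: sum(RATING_SCORE[r] for r in ratings) / len(ratings)
            for risk, ratings in ratings_by_risk.items()}

# Placeholder example (not the actual survey responses):
# average_importance({"Example risk": ["high", "medium", "high"]})  # -> approximately 2.67
```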
According to Table 8, the auditors considered the two most significant risks to be backdated postings and aggressive earnings
management. Both of these issues were detected using the MADS methodology. While neither occurred at a high rate, additional
factors make them a serious concern. Other techniques might have detected these problems, but their low frequency means they
could easily go undetected when only a sample of the population is examined.
Backdated account changes occur at a combined rate of just over half a percent. The average value for these changes is $18,171.90.
This value is almost eight times the population average, introducing a concern of low-frequency, high-value issues
within the population. An importance score of 2.6 indicates that this is a significant concern for auditors, who may not have detected
this issue using alternative methods.
In addition to backdated alterations to accounts, there was evidence that highlighted potential earnings management. To test
auditors’ number one concern, earnings management, a set of secondary filters was designed. These filters were applied to backdated
and advanced account changes recorded with a lag of more than one week on either side of a quarter change. Backdated earnings
management risk entries occurred in 0.51 % of the population, highlighting how infrequently the issue arises. These were, however,
high-magnitude changes, with 37 of these change-of-quarter backdated postings falling within the largest 1 % of adjustments.
Advanced changes surrounding a change-of-quarter occurred at a rate of 0.03 %. Additional analysis was done in conjunction with
the COA data to determine what accounts were impacted and when they were impacted. Upon conducting this additional analysis, it
was found that advanced changes were predominantly made to expense and liability accounts. Additionally, while this occurred at the
end of every quarter, it was most prevalent in Q4 and Q2. This, combined with the information that changes to revenue accounts were
much more commonly backdated than any other account type, paints a picture of downward earnings management.
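The quarter-boundary screening described in this subsection could be sketched as below; the field names (posting_date for the effective date and entry_date for the recording date) and the one-week window are assumptions used to illustrate flagging backdated and advanced postings around a quarter change.

```python
import pandas as pd

def quarter_change_risk(changes: pd.DataFrame,
                        effective_col: str = "posting_date",
                        recorded_col: str = "entry_date",
                        window_days: int = 7) -> pd.DataFrame:
    """Flag backdated and advanced postings surrounding a quarter boundary."""
    eff = pd.to_datetime(changes[effective_col])
    rec = pd.to_datetime(changes[recorded_col])
    lag = (rec - eff).dt.days  # positive lag: the change was backdated

    out = changes.copy()
    out["lag_days"] = lag
    # Only postings whose effective and recording dates fall in different
    # fiscal quarters, with a lag beyond the window, are treated as risky.
    crosses_quarter = eff.dt.to_period("Q") != rec.dt.to_period("Q")
    out["backdated_em_risk"] = (lag > window_days) & crosses_quarter
    out["advanced_em_risk"] = (lag < -window_days) & crosses_quarter
    return out
```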
In this case, the auditors’ greatest concern, earnings management, seemed to be occurring. While follow-up substantive testing is
needed, it is no doubt a comfort to know that this methodology will detect the issue should it occur. While alternative approaches may
exist to detect these issues, MADS bundles CAAT-style tests into a single process. For example, a low-frequency, high-impact event may
not be detected using traditional sampling; whether it is discovered depends on an auditor’s judgment about applying further testing or
CAATs. Under the MADS methodology, the results above indicate that one all-encompassing process can detect even the most
infrequent problems. The importance of this finding is compounded given the value of the adjustments that were flagged.

11.3. Results by score and materiality

The final evaluation metrics examine the overall scores of the various account changes and the materiality of each of these results.
First, the materiality of various filters will be examined. Next, the scores will be evaluated. Finally, a combination of materiality,
scores, and account types will be examined.

11.3.1. Materiality
With respect to the overall value of the changes that failed filters, three filters stood out. Each of these exceeded a 10 % materiality
threshold, defined here as 10 % of the total value of all transactions. While most of these filters do not directly indicate a problem, they
suggest risks that may warrant further investigation. The first of these pertains to adjustments made by employees who were no longer
active in 2019. These circumstances were deemed risky because while some employees may have voluntarily left the company, others
may have been fired for malfeasance. The average value of an account change made by an inactive employee was $21,096.23. This
value is more than nine times the population average. The total value of these changes amounted to over $1 billion. This may be a case
of explainable employee turnover; however, an auditor may wish to investigate such an irregularity. Especially concerning is that
almost half of the value of flagged account changes occurs within 13 % of the flagged population. These were all entered by individuals
who were no longer with the company five years later.
The second filter exceeding the 10 % materiality threshold was designed to target intracompany transfers. These occur when resources
are moved around the company and are a legitimate part of doing business. Due to the high average value per event of over $10,000, a
small weight was applied to these records to move them up the priority list when additional filters were also violated. The final filter
to exceed the 10 % materiality threshold was the missing description filter. The magnitude of this breach, combined with the fact that
the description is a required field, lends credence to the claim that this was a material failure of internal control.
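A hedged sketch of this materiality screen follows: sum the absolute value of the records that failed each filter and compare it to 10 % of the total transaction value. The convention of storing each filter's result as a boolean flag column is an assumption for illustration.

```python
import pandas as pd

def filters_exceeding_materiality(changes: pd.DataFrame,
                                  filter_flags: list,
                                  amount_col: str = "amount",
                                  pct: float = 0.10) -> dict:
    """Return the filters whose flagged value exceeds pct of total value."""
    threshold = pct * changes[amount_col].abs().sum()
    exceeding = {}
    for flag in filter_flags:  # e.g. ["inactive_employee", "missing_description"]
        flagged_value = changes.loc[changes[flag], amount_col].abs().sum()
        if flagged_value > threshold:
            exceeding[flag] = flagged_value
    return exceeding
```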

11.3.2. Score
A notable finding gleaned from examining the scores is that 566,664 account adjustments did not receive any score. This portion of
the population exhibited no issues with respect to any of the filters and risks of concern to auditors. Such a definitive claim cannot be
made using traditional sampling techniques and should be considered a significant benefit of this approach. While other methods may
produce similar results, they may require additional time or tests specifically designed by auditors to target this outcome.
The adjustments were broken down by account type to further understand how these high-score changes impact the financial
statements. Both liability and asset accounts consistently produce material values for the sum of the account changes that fall into the
highest risk category. The value of asset account alterations that belong to the highest risk score group falls just shy of $800 million.

12. Impact

Low-frequency, high-value problems are of great concern to auditors because they may significantly impact the financial statements.
Such issues may not always be easy to detect, as is evident in our findings related to earnings management behavior.
Managing earnings downward before a quarter change results in the release of financial statements that inaccurately represent a firm.
In addition to earnings management concerns, the frequency of problems occurring within asset and liability accounts could directly
impact the balance sheet. Together, these findings illustrate that examining account changes in the GL can provide a fruitful avenue for
auditors wishing to gather audit evidence. This approach can both support audit objectives and produce valuable audit findings.
The successful detection of problems within this population should not be overlooked when evaluating MADS’s importance and its
impact on the evaluation of financial statements.


Table A1
Audit Partner Suggested Risks.

Firm 1: Segregation of incompatible duties - entries posted to incompatible accounts by the same preparer. Suggested additional tests:
unusual combinations based on an understanding of typical journal entries for each business process; journal entries recorded in one
period and reversed in the next period (non-recurring entries); journal entries created and approved by the same user, or where the
user who created the entry is blank, or the approver is blank.

Firm 2: It is difficult to assign ratings to these without having the understanding of the business process behind the data set. We
recommend for purposes of this proof of concept, the “ratings” are “hypothetical”, based on some assumptions made being the sample
data set used in the analysis, and used to prove out the methodology behind the MADS research. In a live project engagement, the
auditor will gain a sufficient understanding of the business process (including journal entry process, flow of transactions through the
system, etc.) and the “ratings” would be assessed based on that understanding and the identified/assessed risks. The ratings for each
“filter” will vary as such from project to project. Suggested additional tests: like Firm 1.

Firm 3: For accounts that are expected to have regular entries within a given range, it would be interesting to statistically analyze
(based on risk profile) and try to identify outliers that may indicate error/fraud. The auditor would need to risk-profile accounts where
this may be likely and determine if that account has a certain acceptable range of size of entry, etc., but this could be helpful in risk
assessment and in designing further substantive procedures. Suggested additional tests: frequency of entries by certain personnel (for
example, the CFO is not likely to be making entries to the GL) - design a test to analyze entries made by unexpected personnel.

Firm 4: Rounded values or consistent ending entries; entries by User ID; entries on unusual business hours. Suggested additional tests:
entries with missing description; entries with missing User ID; manual entries vs. automated entries; 1-word phrase search.

This study employed this methodology on a GL-related dataset to successfully detect real and impactful issues.
Finally, consideration must be given to the applicability of this approach to practice and standards. We ensured that our approach
meets these requirements by working in conjunction with auditors from several large audit firms. The auditors worked with us to
ensure this application was in line with the current skills and abilities of those performing contemporary audits. Additionally, they
helped ensure an element of realism in our application of this audit data analytics approach and that it fit their vision of the future of
audit from a practice perspective. Generally, practitioners agreed that this approach provided ample benefits.

13. Conclusion

The list of adjustments to accounts within a GL presents a unique problem and set of conditions: it is the crucial link between the
journalizing of business events and the financial statements. However, the size of such datasets makes auditing this vital link a
problematic analytic step due to the number of potential exceptions or anomalies that might be identified. Auditors may employ
traditional sampling techniques to surmount this challenge; however, these come with their own sets of issues. Working with
practitioners, we illustrate the effectiveness of MADS at detecting high-risk, low-frequency events, material failures of internal
controls, and systematic issues such as earnings management, and at identifying the portions of our dataset that appear error-free.
The evolution of computer technology and analytic methods and the challenges posed by auditee adoption of electronic systems
have changed the potential and scope of corporate assurance. The MADS methodology filters databases with algorithms of “what can
go wrong” and obtains a set of “notable items” that could be examined. These, however, may be too numerous for individual
examination, and some prioritization and selection must be performed. Several methods have been attempted for this secondary
filtering, and it seems that the preferable method is contingent on the circumstances and nature of the database.
It is important to understand that these notable items are the residual of the filtering and that the entire population has been examined
for key risks in their selection. The evolution from narrow samples to full population testing requires substantive research and projects
like the one presented in this paper. This need is further emphasized because this approach was specifically called for and favored by
the practitioners who worked with our research group.
There are some limitations to our study. Firstly, this approach was applied to one operating segment of a large manufacturing firm.
As no two datasets are the same, it is possible that the insights gleaned from this application are the exception rather than the rule.
Future research should apply this methodology to other industries or datasets from other companies within the same industry.
Secondly, while the project was conducted in conjunction with auditors to develop our filters, this study was subject to data
constraints that may not exist in a live audit. As a result, the findings and list of filters could vary if access had been given to appropriate
contemporary data.
Overall, the results in this paper are very encouraging. This study successfully tied together multiple sources of data through the
application of the MADS methodology. Additionally, we provided valuable audit insights and findings, primarily through a GL-related
dataset, all of which would have impacted the published financial statements.


Table A2
Various MADS Sample Materiality Thresholds.
Total records

Suspicion Score No amount filtering ≥ $1,000 ≥ $2,000 ≥ $3,000

13 25
12 524
11 372 2 2 2
10 1,330 19 5 3
9 5,470 1,356 370 198
8 40,090 2,361 855 526
7 120,730 7,171 3,734 2,665
6 280,384 22,717 12,141 8,429
5 545,121 34,903 18,434 12,973
4 642,944 104,135 53,369 32,763
3 819,279 153,488 73,156 43,155
2 127,738 14,729 9,197 7,334
1 247,685 48,682 32,455 23,526
0 566,664 97,868 42,706 25,738


Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.

Appendix

