as a data-driven organization.
Prologue
Data governance is one of the foundations for building a data culture.
Leaders play a crucial role, part of which is to pay attention to and
formulate data governance that optimizes data utilization in the
Ministry of Finance.
Bobby A. Nazief, Ph.D.
Special Staff of Information and Technology System
Building A Data Culture in the Ministry of Finance
Published in 2022.
This work is distributed under the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License
(https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en). You
may use part or all of the content of the book provided that you cite
the source. You may use, reproduce, duplicate, share, and disseminate
the book in any form, format, and method for non-commercial purposes.
Using, reproducing, duplicating, sharing, or disseminating the book in
any form, format, or method for commercial purposes is prohibited.
TABLE OF CONTENTS
The Importance of Data Literacy ....................................................................... 27
Supporting Factors for Data Literacy .............................................................. 29
Those Who Have to Master Data Literacy ..................................................... 30
What Needs to be Done to Grow Data Literacy ........................................... 31
DATA ANALYTICS ECOSYSTEM............................................................................... 33
Growing Ecosystem ................................................................................................. 33
Strategic Direction for Data Analytics ............................................................. 36
HUMAN RESOURCES .................................................................................................... 39
Pool of Expertise ....................................................................................................... 39
Required Competencies ........................................................................................ 40
Approaches in Building Resources ................................................................... 47
INTRODUCTION OF DATA ANALYTICS TO THE MINISTRY OF FINANCE ............ 52
Data Analytics Ideas ................................................................................................ 52
Strategic Initiative of Data Analytics ............................................................... 55
Lessons to Learn ....................................................................................................... 59
SUCCESS FACTORS IN BUILDING A DATA CULTURE .................................... 63
Commitment of Leaders ........................................................................................ 63
Management of Changes ....................................................................................... 66
Obstacles for Data Culture .................................................................................... 68
ORGANIZATION STRUCTURE OF DATA ANALYTICS .................................... 71
Centralization Model .............................................................................................. 71
Decentralization Model ......................................................................................... 73
Center of Expertise Model .................................................................................... 74
Functional Model ...................................................................................................... 76
Principles of Data Analytics Unit ....................................................................... 78
DATA ANALYTICS AND DATA CONFIDENTIALITY ........................................ 81
Data De-identification ............................................................................................ 81
Governance of Data Confidentiality and Privacy ........................................ 83
Understanding the Data Owned ......................................................................... 84
Risk Management of Data Analytics ................................................................. 85
MODELING TECHNIQUES FOR DATA ANALYTICS.......................................... 87
Data Mining ................................................................................................................. 87
Text Mining ................................................................................................................. 90
Social Network Analysis ........................................................................................ 93
Data Visualization .................................................................................................... 93
Development Process of Data Analytics ......................................................... 96
DATA ANALYTICS ROADMAP ............................................................................... 101
Governance .............................................................................................................. 104
Human Resources ................................................................................................. 105
Digital Infrastructure ........................................................................................... 105
Data as Assets.......................................................................................................... 106
CLOSING .......................................................................................................................... 108
WELCOME ADDRESS BY THE MINISTER OF FINANCE
WELCOME ADDRESS BY THE SECRETARY GENERAL OF THE
MINISTRY OF FINANCE
Heru Pambudi
WELCOME ADDRESS BY THE ASSISTANT OF MINISTER FOR
ORGANIZATION, BUREAUCRACY, AND INFORMATION
TECHNOLOGY
Sudarto
WELCOME ADDRESS BY THE ASSISTANT OF MINISTER FOR
REVENUES AS CHIEF DATA OFFICER
Oza Olavia
PREFACE
Authors
EXECUTIVE SUMMARY
analyst who works in an organization with an excellent data culture will
create positive impacts even without the latest technology in the field
of data analytics. The illustration makes clear that data culture is the
main factor in successfully transforming the Ministry of Finance into a
data-driven organization.
The biggest problems in transforming into a data-driven
organization lie in culture, people, and business processes. The book
offers recommendations that can be implemented through an ecosystem
approach so that every element can grow together and support the
others. Considering the broad responsibilities of the Ministry of
Finance and its vertical units, adaptation cannot take place instantly
but must proceed gradually. At the individual level, data-based
transformation has to empower employees to be more productive and
more competent.
Decision making and action do not belong to the scope of data
analytics but fall within the domain of public policy. Therefore, the
transformation of the Ministry of Finance into a data-driven
organization is expected to produce concrete decisions and actions that
bring benefits and value to the organization and to public welfare.
These decisions should rest on meaningful understanding, which is based
on relevant information, which in turn is obtained from proper analysis
of accurate and reliable data.
Building a data culture certainly takes time and needs to be
supported by change management. Change management aims to make data
culture mainstream in the Ministry of Finance and to ensure that all
parties are ready to support the change. Support from leaders is the
defining factor for the maturity of the data analytics ecosystem in the
Ministry of Finance. The role of leaders is important not only in seeing
the big picture of organizational transformation, but also in ensuring
that every employee becomes accustomed to data and can use it to
produce added value for the organization and the public.
Strong support from leaders is the enabler of change in
organizational culture. By demonstrating commitment and support,
strong leadership becomes a key instrument in overcoming resistance
from people who oppose change. Once data culture becomes a habit, it
will take hold in every line of the organization. In building a data
culture in the Ministry of Finance, leaders play three crucial roles:
the builder, the sticker, and the grower of data culture.
The success of a data-based transformation lies in human
resources. Basically, all employees in the Ministry of Finance need to
appreciate and understand data analytics in general. Every working
unit needs a data analytics practitioner who specializes in data. To
improve the quality of human resources with data analytics skills, the
organization needs to play its part by providing human resource
development facilities through education, training, or other learning
media suited to the employees' level of ability and the needs of the
organization.
To ensure that data-based transformation proceeds in an organized
manner and succeeds in producing a data-driven organization, selecting
an appropriate organizational structure is one of the crucial factors.
In implementing the analytics organization structure of the Ministry of
Finance, three principles have to be maintained: data analytics is not
monopolized by one unit or understood only by certain functions; the
data analytics unit has to be inclusive and transparent toward other
units at all levels; and the data analytics unit has to establish
relationships and communicate easily with the business process owner
units.
Ultimately, building a data culture is a sustained process. The
process is set out in a data analytics roadmap containing directions and
strategies that guide the implementation of data analytics-based
transformation. The roadmap contains programs placed in four important
dimensions, namely governance, human resources, digital infrastructure,
and data as assets. Each of the four dimensions contains short-term and
medium-term programs. It is expected that the approach and work
procedures will keep improving so as to produce innovation, renewal,
and changes in behavior as an impact of data culture.
INTRODUCTION
Background
been treated as assets that can produce added values for institution and
public as a whole. In addition, data are still considered products of a unit
that manages information technology.
A similar phenomenon occurred in the initial stage of the
introduction of computers to the working units of the Ministry of
Finance. At that time, computer literacy had not yet formed, so the
computer was considered “a foreign thing” and many employees were
reluctant to learn to operate one. With the passing of time and the
change of generations, most tasks in every working unit of the Ministry
of Finance now use computers. Nowadays, the computer has become an
inseparable part of daily life, and operating one is considered a
compulsory skill for every employee.
Rigid rules and bureaucratic obstacles among the working units
of the Ministry of Finance are among the challenges that hamper data
sharing. A working unit often has difficulty obtaining data that are
managed by another working unit. The reluctance to share data is also
caused by concerns about abuse of access rights granted to people
outside the working unit that produces the data.
The opportunity to optimize data utilization is gaining momentum
as awareness and understanding of data start to grow in some parts of
the Ministry of Finance. This is reflected in the emergence of a few
initiatives that use data analytics skills to support the performance
of tasks in some working units. Although the initiatives have not yet
become mainstream, optimizing data utilization is believed to bring
significant impacts. Considering the vital role of the Ministry of
Finance in the national economy, improving fiscal policies and
data-based state financial management is believed to be able to improve
public welfare.
Objectives
Departing from the background, the book is compiled to achieve
seven objectives, namely:
1. To be a guide for leaders and employees of the Ministry of Finance
in building a data culture in the Ministry of Finance;
2. To be a reference for everyone in understanding and implementing
transformation into data-driven organization in respective working
unit;
3. To introduce data analytics as a method to solve problems;
4. To enrich the literature of data analytics in the context of fiscal
authority in Indonesia;
5. To support the realization of good data literacy to improve the
quality of performance of tasks and functions of the Ministry of
Finance;
6. To encourage the readiness of human resource to have
understanding and competencies required to implement data
analytics; and
7. To encourage the utilization of data analytics to formulate policies
and to take decisions.
Scope
The book introduces a cultural approach to developing and using
data analytics skills in the Ministry of Finance. Hopefully, the book
will keep being updated in line with developments in science,
technology, needs, and public discussion. The aim is to keep the content
relevant to the times and to serve as a valid reference for leaders and
everyone who wishes to learn about data-based transformation in the
Ministry of Finance.
WHAT IS DATA-DRIVEN ORGANIZATION?
1 When the book was written, there was no generally used Indonesian
equivalent for the term data-driven organization. The book suggests
“organisasi yang digerakkan berdasarkan data” as an equivalent, based
on the correspondence of words and meaning.
2 Treder, M. (2019). Becoming a Data-driven Organisation. Berlin Heidelberg:
Springer.
Figure 1
Data Value Chain
3 Anderson, C. (2015). Creating A Data-Driven Organization: Practical Advice
from the Trenches. Sebastopol, CA: O’Reilly Media, Inc.
accurate decisions and actions so that the understanding gained from
data-based information can produce value4.
The data value chain can shed light on the definition of a
data-driven organization. However, before drawing conclusions, we
need to see the context and the environment in which the organization
operates. Understanding an organization holistically therefore requires
environmental scanning5. Environmental scanning is done because the
environment of the Ministry of Finance differs from the environment of
other organizations, including private companies. The environment
determines the constraints on how far an initiative to transform into a
data-driven organization can be implemented. Environmental scanning is
also an important part of an organization's strategy to improve its
ability to adapt to the environment6.
One framework for environmental scanning that can be used to
finalize the concept of a data-driven organization for the Ministry of
Finance is Parsons’ Organization Model7. An organization can be viewed
in three layers, namely institutional, managerial, and technical. If
we put them into a hierarchy, the three layers form a pyramid as in
Figure 2.
4 Ibid.
5 Vecchiato, R. (2012). Environmental Uncertainty, Foresight and Strategic
Decision Making: An Integrated Study. Technological Forecasting and Social
Change, 79(3), 436-447.
6 Hambrick, D. C. (1982). Environmental Scanning and Organizational Strategy.
Figure 3
Theoretical Framework of Data-Driven Organization in the
Ministry of Finance
The figure shows that efforts to build the Ministry of Finance
into a data-driven organization need to pay attention to the mutual
relationship between the stages in the theoretical framework. The series
of stages from bottom to top, namely from data to value, constitutes a
process that “guides”, so that every stage at the upper levels has a
solid foundation in the lower levels. For example, valid information
will be obtained only if the data used are accurate. Similarly, quality
decisions can be made only if decision makers have a holistic
understanding, not only of the information held, but also of the context
of that information.
Conversely, the series of stages from top to bottom, namely from
value to data, constitutes a process that “directs”, so that every stage
at the lower levels is aligned with and relevant to the goals to be
achieved at the upper levels. For example, when the public deems it
important and urgent for the government to protect affected communities
during a pandemic, the government needs to take action in the form of
social aid. A few decisions then need to be made, such as allocating
the budget, determining the distribution mechanism, and preparing the
accountability mechanism. The process continues with gaining a holistic
understanding of matters such as the demographics of disadvantaged
people, the level of people’s income, and the success indicators of the
social program. The process keeps going down to the data level, that is,
data collection, processing, and analysis, so that government programs
in the social field are efficient.
The pyramid shape was chosen deliberately to illustrate the
relationships in the construction of a data-driven organization. The
lower parts always have a broader area than the upper parts. This
illustrates that the lower stages require more effort before they
produce quality outputs for the upper stages.
As an illustration, to take effective action on the
recommendation of a data analytics project, an organization leader
has to make more than one quality decision, such as allocating
resources, improving business processes, coordinating with external
parties, and convincing higher management of the potential benefits to
be gained. Similarly, to obtain relevant information about the factors
that most influence energy usage efficiency, a data analytics team will
explore and analyze more data than just the amount of power used and
the service bills.
The theoretical framework also shows that transformation into a
data-driven organization involves the participation and collaboration
of many parties. In addition to determining which unit is the main
player at each stage, collaboration among them is required to ensure
that the data-based transformation obtains support and is relevant to
the related stakeholders. In general, participation is required from
data units, business process units, operational units, and leaders. In
the value creation process, the transformation also involves the
public, which will assess the impacts of the public policies produced by
the Ministry of Finance as a public institution on which many hopes
rest.
In the process, if a decision or policy is highly complex or
tends to be unpopular, the most realistic approach is to find
consensus among the players and the people affected. The approach
empowers stakeholders and broadens the opportunity for successful
transformation. As said by Robin Tye, the Chief Operating Officer of
Ernst and Young, “The important thing is that everyone feels
satisfied to be a part of the process. It is not good to take correct decision,
but no one supports it.”
A healthy ecosystem gives everyone the opportunity to express
their views and interpretations when there is dissent, while ensuring
that everyone remains a constructive part of the team. There is,
however, a risk if too many people are involved. If there are no limits
on the level of involvement, the transformation will face a slow
process and contradictory inputs. Therefore, the transformation of the
Ministry of Finance into a data-driven organization needs to find a
balance between the number of parties involved and the extent of their
participation on the one hand, and the level of control and progress of
the process on the other.
CHALLENGES IN TRANSFORMATION INTO DATA-DRIVEN
ORGANIZATION
9 NewVantage Partners. (2021). Big Data and AI Executive Survey 2021. The
Journey to Becoming Data-Driven: A Progress Report on the State of Corporate
Data Initiatives. Accessed from
https://www.newvantage.com/thoughtleadership on 12th April 2021.
The study found that transformation into a data-driven
organization is not an easy process. What, then, is the biggest problem
faced by companies whose transformation initiatives have not yet
produced results? Almost all (92%) of the companies admitted that
people, business processes, and culture were the obstacles to adopting
data analytics in their businesses. Although the study did not specify
where the business process challenges in implementing data analytics
lay, the finding gives an initial indication of how often the world's
leading companies face the classic problems seen in other
transformation projects. Therefore, implementing data analytics on a
large scale without paying attention to cultural change can end in
inefficiency and failure.
In the context of government, “business” corresponds to tasks
and functions. To implement data analytics in the Ministry of
Finance, we need to understand the tasks and functions of the Ministry
of Finance. In broad outline, the functions of the Ministry of Finance
cover three areas, namely policy, regulation, and transaction10
(Figure 4). Whatever the function of an echelon I unit of the Ministry
of Finance, it can be mapped onto these three areas.
10 Allen R., Hurcan, Y., & Queyranne, M. (2016). The Evolving Functions and
Organization of Finance Ministries. Public Budgeting & Finance, 36(4), 3-25.
Figure 4
Function Areas in the Ministry of Finance
than decisions that have impacts on technical layer. Similarly, if the
decisions and actions are related to external parties that involve conflict
of values and interest, translating information and understanding
obtained from data analytics into public policies is a process that tends
to be more complex.
Studies show that the practical use of information and
understanding obtained from the process of transformation into public
policies is extremely limited. Studies conducted in the states of the
United States of America found that executive institutions rarely
applied information obtained from their transformation programs to
their decisions11. Moreover, it can be said that parliamentary
institutions almost entirely neglect the information they obtain.
In reality, information obtained from data analytics is not the
only consideration used in decision making and in the formulation of
public policies. Public policy itself is produced through a complex
process involving many actors with various perceptions, interests,
values, and policy preferences12. Therefore, decision making and action
do not fall within the scope of data analytics but within the domain of
the public policy process. Using data analytics in the Ministry of
Finance requires various kinds of support, such as a legal basis and
change management, all of which fall beyond the scope of a data
analytics project.
11 Joyce, P. G., & Tompkins, S. S. (2002). Using Performance Information for
Budgeting: Clarifying the Framework and Investigating Recent State Experience.
In Meeting the Challenges of Performance-Oriented Government (pp. 61-96).
Washington, DC: American Society of Public Administration.
12 Weible, C. M., & Sabatier, P. A. (2018). Theories of the Policy Process. Fourth
Edition. New York NY: Routledge. Various leading theories on the process of
formulation of public policies can be learned in the book.
2. Data analytics ecosystem that has not grown
3. Limited resources
analytics, they can be categorized as outliers with rare and different
competencies compared to the competencies generally held by other
employees. In addition, their number is very limited.
Currently, employees of the Ministry of Finance who have an
interest in and aptitude for data analytics organize themselves and
their activities in a community called the Ministry of Finance Data
Analytics Community (MoF-DAC). The core administrators and expert
members of MoF-DAC number only 49 people13. This number is not
proportional to the total number of employees of the Ministry of
Finance, nor is it comparable to the human resources that would be
needed if data analytics were implemented on a large scale in every
function and line of the Ministry of Finance. Certainly, the
implementation of data analytics in the organization cannot be imposed
on them without viewing the ecosystem as a whole.
The last consideration is the incomplete digitalization and
automation of the business processes of the Ministry of Finance. Some
business processes in the Ministry of Finance are done manually. For
example, around 78% of government spending transactions in the
Ministry of Finance as Budget User are still done manually, including
the order, approval, payment, tax payment, recording, reconciliation,
and reporting processes14. The consequence is that it requires a lot of
resources to
DATA CULTURE
16 Díaz, A., Rowshankish, K., & Saleh, T. (2018). Why Data Culture Matters.
McKinsey Quarterly.
17 Satya Nadella (Chief Executive Officer of Microsoft). Accessed from
https://blogs.microsoft.com/blog/2014/04/15/a-data-culture-for-everyone on
22nd June 2021.
culture should be built from within the organization by involving
everyone in building awareness of shared goals. The next chapter
discusses some factors that determine to what extent data culture can
be built in an organization.
Democratizing Bureaucracy
if it is too rigid18. Democratizing bureaucracy will open a space for
data culture to grow through the emergence of fresh ideas from everyone.
There is an anecdote among data analytics practitioners that a
threat to a data-driven organization is the Highest-Paid Person's
Opinion (HiPPO)19, that is, an opinion that comes from the person with
the highest pay. HiPPO is the antithesis of being data-driven: it refers
to the highest officials of an organization, who have extensive
experience but rely only on intuition and subjective truth and
sometimes disregard data. Instead of preserving the pyramid of ideas,
data culture needs a democratization of ideas in which every level can
put forward hypotheses to be tested in order to produce the best
innovations.
The Regional Budget Data Review Competition 2021 showed how the
democratization of ideas opened a space for those who were not
structural officials to have a direct dialogue with the Minister of
Finance and to deliver brilliant ideas for improving the quality of
Regional Budget spending.20 The winning team presented the ALokasi
Outcome aNomAli (ALONA) project, which predicted the Human Development
Index (HDI) and its components from the composition of Regional Budget
spending by function. The model was also able to detect anomalies, such
as regions with large spending but low HDI. Although its implementation
requires time, efforts, and
Data Leadership
21 Anderson (2015).
22 Newman, D. (2016). The Future of Work: Data-Driven Leadership. Futurum
Premium Report.
clear incentives and career paths so that everyone working with data
becomes productive and does great things. Support from stakeholders
is obtained by showing the results of data analytics projects, even if
they constitute only a small win. All of these actions increase the
opportunities to spread data culture in the organization.
Organizations with a data culture tend to have leaders who make
decisions based on data23. For example, before an organization
implements new policies, it conducts a limited trial to observe how
effective the policies are before they are scaled up. Similarly,
meeting leaders can use the first 30 minutes of a meeting to read the
proposal summary and supporting data before making evidence-based
decisions. Such practices signal to every employee that if they want
to be heard and to communicate directly with their superior, they have
to bring data and facts. If this is done consistently at the management
level, data culture can become a norm in the organization.
There are some principal factors that develop data literacy. The
factors also determine the level of data literacy. The first factor is data
understanding. To work with data, one certainly has to understand the
data. To gain understanding of data, the answers to some questions
about the data need to be found. Where do the data come from? What are
the types of data? Which unit produces the data? What business
processes are related to the data? Who uses the data? How are the data
compiled? It is important to answer the questions to gain understanding
of data.
The second factor is the ability to analyze data, which is the
second level of data literacy. At this level, one is able to apply
statistical and analytical methods to produce useful insights.
Furthermore, at this level, one is able to define the relationships
between variables in the data.
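As a minimal sketch of what defining the relationship between two variables can look like in practice, the Python example below computes a Pearson correlation coefficient. The function name, the variable names, and all figures are hypothetical and invented purely for illustration. They are not data from the Ministry of Finance, and the book does not prescribe this method.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical figures, invented for illustration only.
spending_per_capita = [1.0, 2.0, 3.0, 4.0, 5.0]
development_index = [60.0, 62.0, 65.0, 67.0, 71.0]

r = pearson_correlation(spending_per_capita, development_index)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. Correlation alone does not establish causation, which is where the ability to interpret data becomes necessary.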
The last factor is the ability to interpret data. This factor is
the most important and most complex because one must not only
communicate with data but also assess what effects the data have on the
organization. Furthermore, a person with this ability can see the
potential benefits for the organization that follow from understanding
the data.
Those Who Have to Master Data Literacy
enables everyone to have a basis for stronger argumentation than
without any data.
DATA ANALYTICS ECOSYSTEM
Growing Ecosystem
27 Anderson (2015).
entities is crucial to support the effectiveness and sustainability of data
analytics initiative.
In implementing data analytics in the Ministry of Finance, we
need to consider that all parties and components required to develop
data analytics are interrelated. Similarly, converting data into value
requires interaction among human resources, technology, and
organizational structure. These processes are placed in an ecosystem
framework, as a community that grows together, underpinned by
regulations, organizations, and commitments that develop data culture
(Figure 5). Data culture is the component that determines whether
interactions in the data analytics ecosystem succeed. Data culture
empowers everyone and every working unit to produce value and benefits
for the organization and the public.
Figure 5
Ministry of Finance’s Data Analytics Ecosystem
Strategic Direction for Data Analytics
policies that are useful for organization and public. On technical level,
data analytics ecosystem is expected to give early alerts and insights for
operations and services. Data analytics ecosystem in the future will also
enable collaboration among communities, business world, and
government.
Good-quality data, which come from automated and digitalized
business processes and from data standardization, are good input for
every process in the data analytics ecosystem. The level of data
literacy, including understanding of data and user involvement from
both business process units and data processing units, also contributes
good input. For all elements in the ecosystem to interact, some elements
must act as activators and generators; these are called program
induction. Program induction is dynamic, depending on the maturity level
of data analytics in the Ministry of Finance. It includes the
implementation of data analytics initiatives, role modeling by leaders,
data competitions (hackathons), and synergy with other institutions.
HUMAN RESOURCES
Pool of Expertise
In technical implementation, data analysts can get direction
from more experienced data analysts. Based on that direction, they
obtain, process, and summarize data. They are the people who manage
the quality assurance of data scraping, run regular database queries
upon user request, and resolve data problems in a timely
manner. Furthermore, they package data into insights that can be
absorbed in narrative or visual form.
Data engineers tend to focus on software engineering and
database design. They are also responsible for the smooth flow of
data from data sources to data destinations. By utilizing descriptive
statistics and outputs produced by algorithms, data are moved back to
their sources or to other locations.
Data scientists are data analytics practitioners whose role sits
between data engineers and business process owner units. They have
skills in statistics, data mining, machine learning, operations
research, Six Sigma, and automation, as well as knowledge of business
processes. They combine techniques, processes, and methodologies from
various fields to achieve the goals of the organization. They are in charge
of bridging the various components that contribute to the improvement of
business processes, and of eliminating silos that hamper efficiency.
Required Competencies
environment and working world. The two types of competencies are
crucial to master so that a data analytics practitioner will not only have
a good career, but also become an agent of change in his/her organization.
1. Hard skills
opportunities to improve the efficiency and effectiveness of business
process.
c. Has knowledge of statistics
Knowledge of statistics is a crucial skill for a good data
analytics practitioner. Foundational knowledge, such as calculus
and probability, is essential because a raw report alone is not
enough. A data analytics practitioner has to be able to see trends and
fluctuations. With knowledge of statistical theory and applied
statistics, a data analytics practitioner will not only be able to make
statements about data, but also understand why an analysis produces
certain results. The level of statistical knowledge required
will vary depending on the position or role of a practitioner in the team.
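As a minimal illustration of seeing trends and fluctuations rather than
only a raw report, the sketch below smooths a series with a simple moving
average and measures its spread with the sample standard deviation. The
monthly figures and the function name moving_average are hypothetical and
serve only as an example; they are not taken from Ministry of Finance data.

```python
from statistics import mean, stdev


def moving_average(values, window):
    """Return the simple moving average over a sliding window."""
    return [
        mean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly figures, invented for illustration only.
monthly = [100, 104, 98, 110, 115, 109, 120, 126]

# Trend: averaging over three months smooths short-term noise.
trend = moving_average(monthly, window=3)

# Fluctuation: sample standard deviation of the raw series.
fluctuation = stdev(monthly)

print("trend:", [round(t, 1) for t in trend])
print("fluctuation:", round(fluctuation, 1))
```

A rising moving average suggests an upward trend, while a large standard
deviation relative to the mean signals that individual months fluctuate
widely around that trend.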
Alternatives to these two software products are Google Data Studio and
Apache Superset. Both can be used without
a paid license. Currently, Google Data Studio can
be used as a cloud service by anyone who has a Google account, and it can
be connected to Google Drive or third-party software. Meanwhile,
Apache Superset can be installed on premises and made available to a
broad range of users. Apache Superset is able to handle data at
petabyte scale (big data), has obtained significant support from
technology companies such as Lyft and Dropbox, and became
a top-level project of the Apache Software Foundation in 2021.
understand data related to management of state cash, he/she needs
to understand the working unit that implements the business process
related to management of cash in Directorate General of Treasury
and which data are interrelated. Similarly, if he/she is about to
analyze data of an information system, he/she needs to understand
the system and its procedure.
Knowledge of the organization varies from one organization to
another. Therefore, a data analytics practitioner has to be able to
learn quickly wherever and in whatever field data analytics is applied. If
an analyst does not understand the organization and the context analyzed,
he/she will have difficulty performing the tasks effectively.
Therefore, knowledge of the organization and its business processes is the
domain knowledge and a main skill of a data analytics practitioner.
2. Soft skills
a. Communication and presentation
The deeper a report goes, the more concisely it has to be presented. An
analyst has to be able to explain the most important elements of data
analysis to team leader and business process owner unit leader. If no
one understands the report presented by a data analyst, there will be
no strategic decisions taken based on the report. Therefore, after
successfully conducting data analysis, a data analyst has to be able to
tell and show the conclusions of the analysis to information users.
b. Problem solving
Problem solving is one of the most important skills that has
to be mastered by a data analytics practitioner. To solve problems, an
analyst has to think critically and understand which questions are the
right ones to ask. If the questions asked are based on knowledge of the
organization and its business processes, the investigation conducted will
be relevant to the needs of the working unit. Eventually, data analytics
outputs will be relevant and produce the answers needed.
Data analytics relies largely on logical thinking about the problems
faced. Therefore, an analyst has to reason well about business
processes and data. An analyst will reach the right conclusions more
quickly if he/she is accustomed to data variations and challenges.
c. Critical thinking
Problem solving and critical thinking refer to the ability to
utilize knowledge, facts, and data to solve problems effectively. It
does not mean that a data analyst has to have an instant answer.
However, he/she has to be able to think independently, evaluate the
problems, and find solutions. The ability to develop well-thought-out
solutions within a reasonable time frame is a valuable skill for the
organization.
To be a successful data analytics practitioner, someone has to
think like a data analyst. If a data analyst wants to use data to get
answers to questions, he/she will have to know what questions to ask
first. One thing which is also crucial is that a data analyst does not
depend on existing answers. On the contrary, he/she needs to
consider various possible scenarios.
d. Curiosity
An analyst may not have all information he/she needs in his/her
hands. Therefore, an analyst has to have great curiosity to explore
information more deeply if he/she wants to optimize the data he/she
has collected. An analyst also needs to read the results of current
studies so as to follow the development of science and technology in
the field of data analytics. An analyst needs up-to-date understanding and
tools to be able to interpret the most important information in the
data he/she has. Up-to-date knowledge in the field of data analytics is
also useful when an analyst presents his/her findings to the leaders
and persuades them about the steps to be taken by the organization
thereafter.
e. Attention to details
In many things, an analyst’s job is similar to finding a needle in
a haystack. An analyst has to be able to pay attention to small clues
that direct him/her to a greater message hidden behind the data.
Reporting and collecting data can be boring. Therefore, the ability to
draw important conclusions from data is a skill that needs to be
continuously developed.
Attention to details is also useful when an analyst sorts out data
and arranges analytical process. A minor mistake in a line of code can
make the whole workflow wrong. An analyst has to be aware of minor
mistakes that can cause greater problems in the system.
f. Teamwork
Data analytics practitioners need to collaborate with people of
various positions and working units to complete their tasks. They
also need to cooperate with business process owners to determine
what kinds of questions can be answered through data analytics.
They also collaborate with website developers to ensure that the
organization’s website or information system is designed efficiently
to produce the data they need. On a larger scale, data analytics
practitioners collaborate with leaders to determine how the latest
data insights can guide the Ministry of Finance towards its
goals.
cover leadership skills required to prepare future leaders who
understand data culture. Table 1 contains examples of capacity
development programs for data analytics practitioners that can be
implemented28.
There are many approaches that can be implemented in the
management of human resources in the field of data analytics. Firstly, a
working unit can develop data analytics skills by building a team of data
analysts from within the working unit, who
obtain support from external parties when required. Secondly, internal
analysts can cooperate with professional specialists upon request if
the internal analysts’ experience and skills are not adequate to solve
certain problems. Thirdly, the working unit can utilize the after-sales
service of software companies for data analytics techniques or
methods. Lastly, the working unit can assign routine tasks to external
parties to reduce costs (outsourcing) and focus internal analysts on more
important and strategic tasks.
Before determining which approach to implement, we
need to identify the organization’s needs for data analytics as a whole. The
next step is to determine the resources, either internal or external, that
can be used to meet those needs. External resources can be utilized when
there are highly specialized needs. However, external resources should only
be used to meet needs that occur infrequently and are not
determining factors of the organization’s capability. When the needs are
highly significant and occur frequently, the organization needs to
prepare internal resources to avoid dependence on external resources.
28 World Customs Organization. (2018). WCO Capacity Building Framework for
Data Analytics.
INTRODUCTION OF DATA ANALYTICS TO THE MINISTRY OF
FINANCE
29 Marr, B. (2015). Big Data: Using SMART Big Data, Analytics and Metrics to
Make Better Decisions and Improve Performance. John Wiley & Sons.
Nevertheless, the understanding of data needs to be present on every
level, both on management level and on executive level30.
One of the first data analytics initiatives was
carried out by Customs and Excise Main Office Type A Tanjung Priok. Starting
in 2017, Customs and Excise Main Office Tanjung Priok implemented
a data mining approach to determine target entities in the DJP-DJBC Joint
Program activities. In the beginning, the Joint Program activities faced
various challenges. One of them was that the list of companies to be
exchanged between the two echelon I units was not yet clearly defined,
because the related parties were not yet transparent in explaining
the analysis they conducted to produce the suggested list of
companies that became the targets of the program.
To overcome these problems, the team of Customs and Excise
Main Office Tanjung Priok proposed a data mining approach that
combines taxation data and customs data. The result of the project was
a list of scores called Quality Assurance Scores, which
illustrated the level of tax compliance of every entity. In addition, the
data mining approach produced the Antareja Dashboard, which illustrated
potential state revenue and at the same time served as the list of targets
in the Joint Program activities. The approach made the analysis of Joint
Program targets quicker, more effective, and more efficient. A year after
the project was completed in 2018, the Joint Program had generated
state revenue of more than Rp2.7 billion. In addition, preparing the
list of targets based on data mining has moved the Joint Program to a
new stage without being hampered by trust issues and incompatible
lists of targets.
30 World Customs Organization. (2018). Handbook on Data Analysis.
The data analytics project competition on the 74th Banknote Day of
the Republic of Indonesia (HORI) in 2020 sparked a surge of
discussions about data. The competition was
expected to strengthen a culture that values the use of data in
decision making. Unexpectedly, the event attracted 67 data
analytics projects from across all Echelon I units 31. Through the
competition, many parties came to realize that the utilization of data
analytics could give great benefits to the Ministry of Finance, because
the winning project in the competition proved to bring
benefits for state revenue 32. Similar approaches could
also be implemented in other tasks and functions in the working units of
the Ministry of Finance.
The competition has raised awareness that data analytics
can bring a fresh perspective to the Ministry of Finance. It is hoped that
data-based decision making will become a culture embraced by every
working unit. Analysis and decision making use not only each unit’s
internal data, but also data exchanged via the Data Service System
of the Ministry of Finance (SLDK) and data from external parties. The goal
is to make decisions more effective, more comprehensive, and quicker.
In accordance with the mandate of the Minister of Finance, in the
era of information transparency and collaboration, employees of the
Ministry of Finance have to be open to data that can be utilized together
for the national interest33.
Excise consisting of Canrakerta, Dewa Gde Adi Murthi Udayana, Yohanes Bella
Kurniawan, and Yuafanda Kholfi. Their project was entitled Implementation of
Data Mining as Risk Management in Import Documents: A Case Study at
Directorate General of Customs and Excise.
Figure 6
Strategic Initiative of Bureaucracy Reformation and Institutional
Transformation Year 2021
organization. Overall projects are also encouraged to be the enablers of
business process improvement, the drivers of digital acceleration in the
Ministry of Finance, the drivers of organization transformation, the
supporters of organization effectiveness and efficiency, and insights for
formulation of policies for ministries, institutions, and regional
governments.
Implementing the strategic initiative of data analytics
is a specific challenge for the Ministry of Finance. There is a gap
in the understanding of data analytics on both the management level and
the executive level, so strategies are needed to implement the
strategic initiative. The strategies used include building capacity in
data analytics and collaborating with MoF-DAC, which gives conceptual and
technical insights and input to the teams on the data analytics projects
they work on.
Capacity building is facilitated by the BPPK General Financial
Education and Training Center. It is conducted in two stages of
bootcamps: bootcamp 1 discusses data analytics in general, and
bootcamp 2 discusses data analytics at a technical level. The
curriculum has been in preparation since 18th January 2021 36.
The bootcamps involve many parties. The educators are employees of
the Ministry of Finance who have utilized data analytics earlier and
belong to MoF-DAC.
36 Central Transformation Office. (2021b). Transformation Info (INTRA). 2nd
Edition. Ministry of Finance.
The bootcamps are expected to build an understanding
that the utilization of data analytics requires good cooperation between
the management and the executive. In addition, the utilization of data
analytics requires cooperation between business process owner
units and data processing units. The bootcamps also
cover technical matters to help the learning process. It is no
exaggeration that MoF-DAC, as one of the champions of the strategic
initiative, is directly requested to play an active role in explaining the
material at the bootcamps and in assisting every data analytics project in
the Ministry of Finance.
Lessons to Learn
to ensure that the data analytics project they are working on is relevant
and produces solutions that can be implemented.
The second lesson from the data analytics project is the
importance of data sharing among echelon I units or even among
ministries/institutions. It frequently happens that a data analytics
project that becomes the responsibility of an echelon I unit turns out to
require data from another echelon I unit. The situation is caused by the
business processes of echelon I units that are related to each other as
illustrated in the enterprise architecture of the Ministry of Finance.
The data analytics project worked on by DJPK that develops
Ministry/Institution Spending Analysis Model with Physical DAK
(education sector and roads) can be an example. In addition to requiring
the data managed by DJPK, the project also requires data from DJA, DJPb,
and the Ministry of Public Works and Public Housing. The data required
from DJA are the budget for education sector and roads. The data
required from DJPb are the realization of budget for education sector and
roads. Meanwhile, the data required from the Ministry of Public Works and
Public Housing are the location coordinates of the education and road
projects it is working on. The data from the Ministry of Public Works
and Public Housing are used to validate whether there are similar road
projects carried out in the same locations. In addition, the data are also
useful to ensure that coordination between related institutions on
different government levels can be done well.
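The location check described above can be sketched as follows. This is
only an illustration under stated assumptions, not the method used by
DJPK: the project records, field names, source labels, and the 1 km
threshold are all hypothetical. The sketch flags pairs of projects from
different sources whose coordinates lie close together, using the
haversine great-circle distance.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two coordinates."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def possible_overlaps(projects, threshold_km=1.0):
    """Return id pairs of projects from different sources that lie
    within threshold_km of each other (hypothetical threshold)."""
    pairs = []
    for p, q in combinations(projects, 2):
        if p["source"] == q["source"]:
            continue
        distance = haversine_km(p["lat"], p["lon"], q["lat"], q["lon"])
        if distance <= threshold_km:
            pairs.append((p["id"], q["id"]))
    return pairs


# Hypothetical road project records, invented for illustration only.
projects = [
    {"id": "A-1", "source": "physical_dak", "lat": -6.2000, "lon": 106.8000},
    {"id": "B-1", "source": "ministry", "lat": -6.2005, "lon": 106.8003},
    {"id": "B-2", "source": "ministry", "lat": -7.2500, "lon": 112.7500},
]

print(possible_overlaps(projects))
```

Pairs flagged this way are only candidates for review. Whether two nearby
projects are truly duplicates still has to be confirmed by the units that
own the business process.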
To overcome bureaucratic obstacles related to data sharing, the
Data Management Office (DMO) at the Central Transformation Office of the
Ministry of Finance plays a strategic role as a catalyst and a
bridge for communication and coordination among echelon I units,
including related ministries/institutions. DMO also acts as a
resource person and reviewer along with MoF-DAC to discuss the
substances and work procedure of teams that work on the data analytics
projects. Eventually, all data analytics projects can give significant
benefits to the Ministry of Finance as a whole.
SUCCESS FACTORS IN BUILDING A DATA CULTURE
Commitment of Leaders
Previous reviews show that the greatest obstacles in building a
data-driven organization are not technical factors but cultural
challenges. Illustrating and explaining how to use data as the basis for
decision making is easier than making it a habit and the
norm. The problems in building a data culture are the challenges in
building a data-driven organization.
In building a data culture, commitment of leaders is the key
factor. The role of leaders is not only important in viewing the big picture
of organization transformation, but also in producing added values for
organization and public. There are many statements by experts and
practitioners regarding the importance of role of leaders as the
developer of culture (Table 2).
Table 2
Importance of Role of Leaders in Data Culture 37
38
Schein, E. H. (2004). Organizational Culture and Leadership. Third Edition. San
Francisco CA: Jossey Bass.
projects, grant of feasible status to data analytics projects, as well as
career development and grant of awards to data analytics practitioners.
The leaders can also embed data culture in the organization by paying
regular, detailed attention to the data analytics initiatives
carried out in their working units. The actions in Table 3 can also be
taken by the leaders to embed data culture in the Ministry of Finance.
Table 3
Instruments for Leaders to Build A Data Culture
After data culture is embedded, a leader plays the role of
data culture grower. Data culture is not a binary concept in which an
organization is classified into only two categories, namely not yet or
already having a data culture. It is a continuum along which an
organization can keep improving its data culture from one level to the next.
Therefore, as data culture growers, leaders have to identify at which
stage data culture has been planted in their units. Furthermore, they
need to be creative in finding mechanisms that can be implemented to keep
data culture growing and developing (Table 4).
Table 4
Data Culture Stages and Growth Mechanisms
Management of Changes
The review above has analyzed the factors that enable the
development of data culture. In addition, we also need to recognize the
obstacles to data culture. The goal is that the development of data
culture and the implementation of data analytics initiatives in the
Ministry of Finance do not repeat the mistakes made by other organizations.
A Gartner prediction states that, through 2022, only 20% of analytic
insights will deliver business outcomes 39. Many data
analytics projects will fail.
39 White, A. (2019). Our Top Data and Analytics Predicts for 2019. Accessed
from https://blogs.gartner.com/andrew_white/2019/01/03/our-top-data-and-analytics-predicts-for-2019
on 24th June 2021.
Data analytics projects can fail for several
reasons. A survey of 19 data analytics experts
identified over 100 reasons that can cause data analytics
projects to fail, classified into several factors (Table 5) 40. The
Ministry of Finance has to pay attention to these factors when developing
data analytics projects to reduce the risk of failure.
Building a data culture certainly takes time and needs to be
supported by management of changes, which aims to make a data culture the
mainstream in the Ministry of Finance. Management of changes needs to
be implemented to ensure that all involved parties and required facilities
are ready to support the change to a data culture. The parties
targeted by management of changes are not limited to the
information system management unit, but also include business process
units and the leaders of all working units.
The focus of management of changes is a change of mindset and
habit in the Ministry of Finance. In this case, the leaders are the people
who initiate and manage the changes, identify and overcome the
challenges, and monitor and evaluate the results. Building a
data culture starts from the highest management. Leaders with high
expectations of data-based decision making will influence the
leaders below them by encouraging them to participate.
In the practice, many ways can be taken as parts of campaign
and management of changes. The creativity of leaders and agents of
changes is required to determine the most proper strategy to support
the change of culture. No matter which way is chosen, the leaders’ direct
40
Becker, D. K. (2017). Predicting Outcomes for Big Data Projects: Big Data
Project Dynamics (BDPD): Research in Progress. In 2017 IEEE International
Conference on Big Data (Big Data) (pp. 2320-2330). IEEE.
involvement in management of changes is definitely required because
the leaders are the role models for transformation into a data-driven
organization. Some ways to support management of changes that can be
taken are among others:
1. Executive training on data culture and data analytics for leaders of
echelon I, II, and III units;
2. Training on data analytics for those who do not have data analytics
background;
3. Campaign on use of data analytics on every level of working units;
4. Implementation of data analytics hackathon in every echelon I unit;
5. Regular promotion on data analytics via social media;
6. Regular implementation of Financial Data Talk Show (NGOTAK);
7. Utilization of Project Management Office network to deliver the
important messages of data culture in every echelon I unit.
Table 5
Factors Causing Failure of Data Analytics Projects
ORGANIZATION STRUCTURE OF DATA ANALYTICS
Centralization Model
produced by centralization model. Firstly, centralization enables
standardization of expertise, training, and tools. At the same time, the
analysts can share resources and save on license costs. Secondly,
the analysts can work across functions and across Echelon I units,
with easy coordination and idea sharing among team members. The
model also enables us to work on projects with limited access to data. In
addition, the centralization model enables the implementation of long-term
projects because the budget is available centrally.
Figure 8
Centralization Model of Data Analytics Unit in the Ministry of
Finance
However, the model also has a weakness: the analysts will be
isolated from the business process teams and core goals of every working
unit. Therefore, it requires analysts who understand the business
processes, environment, and laws that form the context of the data.
Moreover, the data analytics unit tends to become bureaucratized, so
the analysts may become merely reactive to demands for data analytics
because there will be prioritization and competition for resources within
the data analytics unit. A heavy workload can make the central team less
responsive to the needs of the organization. There is a potential for
burnout if an increase in workload is not accompanied by an increase in
resources. The model also requires quite intensive training to produce
expert staff who properly understand business processes across units.
Decentralization Model
In decentralization model, the analysts are classified into a
number of special teams. The analysts work according to the tasks given
to their respective teams. In the context of the Ministry of Finance, the
data analytics teams will be placed in every echelon I unit to echelon II
unit. Figure 9 is the generic form of decentralization model when it is
implemented in the Ministry of Finance.
Figure 9
Decentralization Model of Data Analytics Unit in the Ministry of
Finance
The strength of the model is that the analysts will understand
more about the functions and goals of every working unit. In addition,
analytical projects are also relevant to the needs and missions of every
working unit. The result is that every working unit obtains direct
benefits from data analytics projects.
However, the model also has some weaknesses.
Decentralization of data analytics can isolate an analyst
from other analysts, which may result in redundant data analytics
projects as well as divergence of expertise, training, and tools. Without
good coordination and communication
among the analysts and core management, data analytics projects
under this model can result in inefficiency instead of efficiency. Data
analytics units in the model will have difficulty accessing the
competencies and experience of data analytics units elsewhere. The
dimension and scope of data analytics projects produced by the model are
also limited because they are not connected to the needs and data analytics
projects of other working units.
among data analytics units in every echelon I unit. Figure 10 is the
generic form of Center of Expertise model when it is implemented in the
Ministry of Finance.
Figure 10
Center of Expertise Model in the Ministry of Finance
distribute its analysts in accordance with the unit’s needs by
coordinating with Center of Expertise.
However, the model also has some weaknesses. The Center of
Expertise may not have adequate control over the effectiveness of the
data analytics unit in an echelon I unit because that unit is not
directly under the Center of Expertise. The leaders of echelon I units
may give higher priority to the work of their own units, which
hampers the achievement of data analytics goals for the Ministry of
Finance as a whole. If the Center of Expertise does not get sufficient
budget support, the implementation of data analytics will be hampered
and fragmented across echelon I units. Therefore, the Center of Expertise
has to be supported by proper facilities to eliminate the divergence of
technology among echelon I units.
Functional Model
Figure 11
Functional Model in the Ministry of Finance
The model has some strengths. Data analytics units will be present in all
functions of the organization. Therefore, data
analytics projects will deliver more value and impact because analyst units
are concentrated in each functional area. The model also enables the
benefits of data analytics projects to expand beyond their initial scope.
The limitations in developing data analytics in a function can be
transferred to increase the benefits of other functions. For example, data
analytics in a supervision unit that aims to increase compliance can be
developed further to translate that compliance into optimized state revenue.
However, the model also has a number of weaknesses.
Considering the limited resources, prioritization of function will occur in
the Ministry of Finance. If it is only focused on some functions, the other
functions will run without data analytics. Additionally, budgeting
procedure becomes more complicated because every group of functions
consists of various units of echelon I. The model is more suitable for
organizations that divide their units based on groups of functions or
organizations that do not have duplication of functions among their
units.
The description above has discussed the four models in
general so that their implementation can be adjusted to the condition of
the respective organization. Moreover, before selecting an organization
structure of data analytics, all of the models above need to be discussed
in broader dimension and more diverse aspects, such as permit for
formation of new unit, range of control, human resource support,
availability of budget, and compatibility with the strategic plan of the
Ministry of Finance. The goal is that the form and organization structure
of data analytics produced can be accepted more widely and can
overcome bureaucratic deadlock, which is one of important goals of data
analytics projects. Selection of organization model of data analytics
depends on internal decision of the organization and is adjusted to
individual needs and condition.
We should not let data analytics units create new partitions in
the organization that then have to be overcome.
2. Accessible enterprise-wide
Data analytics units have to be inclusive and open to all other
units at different levels. For echelon I units that have vertical units
in the regions, problems can occur in the field that need to be solved
using data analytics.
The data analytics function even has to be present in public
services, the front line facing the public. If potential problems on the
operational level cannot easily be conveyed by a vertical unit to the data
analytics unit, the organization’s goals in utilizing data analytics
become suboptimal. Ideally, data analytics units can provide a
cohesive platform that supports collaboration among all units to
realize a supportive environment and ecosystem.
3. Integrated with tasks and functions of working units (integrated with
business)
Data analytics units have to be connected, and easy to
connect, to business process owner units. Data analytics units that
are not connected to business process owner units tend to produce
data analytics projects with minor impacts. A survey conducted by
McKinsey in 2018 gives the example of an insurance company that
recruited a large number of data scientists and worked on over 50
pilot projects. Because the insurance company kept its data
analytics unit separate and isolated from its business process owner
unit, the data analytics unit eventually did not produce anything for
the company.
DATA ANALYTICS AND DATA CONFIDENTIALITY
Data De-identification
regulations. This process is also needed when applying data
analytics in the Ministry of Finance because some of the data it
currently holds include personal data of individuals and business
entities. Further information on data de-identification can be found in
the Guide to Data Analytics and the Australian Privacy Principles
published by the Office of the Australian Information Commissioner (2018).
Broadly speaking, the process of data de-identification is as
follows:
1. Identifying data that can be direct identifiers of individuals or legal
entities, namely unique attributes such as name,
National Identification Number, Taxpayer Identification Number (TIN),
Employee ID Number, and Soldier Registration Number. With the
growing use of unstructured data, facial
photographs and voice recordings are also included in the direct
identifiers group.
2. Identifying data that can become indirect identifiers, namely non-
unique attributes such as height, age, skin color, and hair color.
3. Performing pseudonymization, better known as data masking.
4. After the identity data are masked, further de-identifying the data
using methods such as k-anonymization or randomization to prevent re-
identification by unauthorized persons.
5. The unit that performs data masking and de-identification must be
able to restore the processed data to their initial
condition.
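The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Ministry's procedure: the field names, the sample records, the `SECRET_KEY` value, and the choice of keyed hashing (HMAC-SHA256) for masking and of age banding for k-anonymization are all assumptions made for the example.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key; in practice it would be held only by the masking unit.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(records, id_field, key=SECRET_KEY):
    """Step 3: replace a direct identifier with a keyed pseudonym.

    Returns the masked records and a lookup table so that the masking
    unit can restore the original identifiers (step 5).
    """
    lookup = {}
    masked = []
    for record in records:
        original = record[id_field]
        digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256)
        token = digest.hexdigest()[:16]
        lookup[token] = original
        masked.append({**record, id_field: token})
    return masked, lookup

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group sharing the same indirect identifiers (the k)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def generalize_age(record, width=10):
    """Step 4: coarsen an exact age into a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Made-up records: 'tin' is a direct identifier, 'age' and 'region' are indirect.
records = [
    {"tin": "TIN-0001", "age": 34, "region": "Jakarta"},
    {"tin": "TIN-0002", "age": 37, "region": "Jakarta"},
    {"tin": "TIN-0003", "age": 52, "region": "Bandung"},
    {"tin": "TIN-0004", "age": 58, "region": "Bandung"},
]

masked, lookup = pseudonymize(records, "tin")
k_before = smallest_group(masked, ["age", "region"])
generalized = [generalize_age(r) for r in masked]
k_after = smallest_group(generalized, ["age", "region"])
restored_tin = lookup[generalized[0]["tin"]]
```

In this sketch every masked record is still unique on age and region, so `k_before` is 1; after banding the ages, each combination is shared by two records. The lookup table must be protected as strictly as the original data, because it reverses the masking.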
De-identification can be applied in the analysis of inter-entity
relationships in connection with tax payments. The data used may contain
information such as TIN, name of entity, address of business entity,
company deed number, telephone number, or company electronic mail.
If the parties who will analyze the data are not
entitled to access the tax data, they will process only data
that have been de-identified.
Data analytics projects within the Ministry of Finance need to
consider whether de-identified individual and business entity data
can be used, shared, or published
without violating the provisions on data confidentiality and privacy. Data
de-identification can be carried out at various stages of a data analytics
project, such as:
1. As soon as business entity data, personal data, or other confidential
data are obtained,
2. Prior to analysis of the data, and/or
3. When data are used, shared, or published, either inside or
outside the Ministry of Finance.
1. Proactively managing data confidentiality and privacy before issues
related to data confidentiality and privacy are revealed,
2. Understanding that effective and innovative use of data along with
maintaining data confidentiality and privacy is possible, and
3. Ensuring that all confidential data and personal information are
always stored securely throughout data analytics projects.
entity, but this was not followed by a clear definition of problems and
research objectives. This happened because the participants thought
that a data analytics project could be carried out by collecting as much
data as possible. A request for data without understanding business
processes will result in failure to achieve the goals of data analytics and
potentially violate data confidentiality.
MODELING TECHNIQUES FOR DATA ANALYTICS
Data Mining
45 Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of Data Mining (Adaptive
Computation and Machine Learning). MIT Press.
Table 6
General Methods Used in Data Mining

Method: Association rules
Explanation: Used to observe the relations that occur based on
association and correlation among data.
Example of Use:
- At DJBC: To observe the correlation between import and export
commodities
- At BPPK: To observe the relevance of education and training attended
by employees
Example of Algorithm: Apriori, FP-Growth, and Eclat

Method: Classification/Prediction
Explanation: Used to predict future events by studying past events.
Example of Use:
- In Procurement Bureau/Department: To predict the success of a
procurement
- At DJBC: To perform risk management in import services and to predict
the price of imported goods
- At Inspectorate General/internal compliance unit: To detect fraud
Example of Algorithm: Logistic regression, naive Bayes, support vector
machine, decision tree, neural network, and ensemble method

Method: Cluster analysis
Explanation: Used to group the data owned by similarity.
Example of Use:
- At DJBC: To perform segmentation of customs service users
- At DJP: To perform segmentation of taxpayers
- At DJKN: To classify BMN
- At BPPK/HR Division/Bureau: To classify employees based on certain
criteria
Example of Algorithm: K-means, DBSCAN, agglomerative clustering, and
BIRCH

Method: Outlier analysis
Explanation: Used to observe anomalies that occur in the datasets owned.
Example of Use:
- At DJPb: To detect misuse of government credit cards
- At Financial Technology and Information System Center: To detect
network intrusions and to check for anomalies that occur in computer
networks
- At DJPb: To detect unusual transactions
- At DJBC: To detect unusual ship movements
Example of Algorithm: Distance-based, density-based, local outlier
factor, connectivity-based outlier factor, and isolation forest

Method: Time series analysis
Explanation: Used to observe trends that occur by utilizing time
components in data.
Example of Use:
- At DJBC: To predict the number of imports and exports that will occur
in the future
- At DJP: To predict tax revenue
- At DJPb: To predict state spending
- At HR Department/Bureau: To predict the number of employees needed in
the future
Example of Algorithm: Autoregressive, moving average, autoregressive
moving average, autoregressive integrated moving average, and
exponential smoothing
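As an illustration of the first method in Table 6, the sketch below computes the two basic measures behind association rules, support and confidence, for a single rule. It is a hand-rolled example, not an implementation of Apriori, FP-Growth, or Eclat; the commodity names and the transactions are made up.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    total = len(transactions)
    # Transactions that contain every item of the antecedent.
    with_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    # Transactions that contain both the antecedent and the consequent.
    with_both = sum(
        1 for t in transactions if (antecedent | consequent) <= set(t)
    )
    support = with_both / total
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence

# Hypothetical commodity sets, one per import declaration.
transactions = [
    {"steel", "machinery"},
    {"steel", "machinery", "textiles"},
    {"steel", "textiles"},
    {"machinery"},
]

support, confidence = rule_metrics(transactions, {"steel"}, {"machinery"})
```

Here the rule "steel -> machinery" holds in 2 of 4 declarations (support 0.5) and in 2 of the 3 declarations that contain steel (confidence of about 0.67). A high confidence shows that items tend to occur together; it does not by itself show that one causes the other.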
46 Kotu, V., & Deshpande, B. (2015). Data mining process. In Predictive Analytics
and Data Mining (p. 26). Elsevier.
Meanwhile, the unsupervised learning method is used to find
hidden patterns in unlabeled data. Referring to Table 6, an
example of supervised learning is solving a problem with a
classification/prediction method, while an example of unsupervised
learning is solving a problem with a clustering method.
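The difference can be seen on a small example. In the sketch below, the same six values are used twice: once with labels, to train a simple nearest-centroid classifier (supervised), and once without labels, to split them into two clusters with a one-dimensional k-means (unsupervised). The values, the label names, and both simplified algorithms are assumptions chosen for illustration; the clustering function assumes at least two distinct values.

```python
def fit_centroids(values, labels):
    """Supervised: average the values of each known label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (k-means, k=2)."""
    c_low, c_high = min(values), max(values)
    low, high = [], []
    for _ in range(iterations):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        c_low, c_high = sum(low) / len(low), sum(high) / len(high)
    return low, high

values = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
labels = ["low_risk", "low_risk", "low_risk",
          "high_risk", "high_risk", "high_risk"]

# Supervised: labels are available, so a new value can be given a label.
centroids = fit_centroids(values, labels)
predicted = predict(centroids, 4.0)

# Unsupervised: only the values are used; the groups have no names.
low_cluster, high_cluster = two_means(values)
```

The classifier returns a named label for a new value because it learned from labeled examples, while the clustering step only reports which values belong together and leaves the interpretation of each group to the analyst.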
Text Mining
In general, text mining and data mining have the same goal,
which is to obtain interesting information or patterns from data. Unlike
data mining, which uses structured data, text mining tends to use
unstructured data, for example text on social media, news in
conventional media, and regulatory documents. Text data
require special treatment to turn them into a structured form. The
process, from the source of the required data until the
data can be analyzed, is presented in Figure 12.
Figure 12
General Framework for Text Mining
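One common way to turn text into a structured form is a document-term matrix, in which each row is a document and each column counts one word. The sketch below is a minimal version under simplifying assumptions: the sample sentences are made up, and the tokenizer keeps only unaccented Latin letters, with no stop-word removal or stemming.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def document_term_matrix(documents):
    """Return (vocabulary, rows); each row holds the term counts of one document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[count[term] for term in vocabulary] for count in counts]
    return vocabulary, rows

documents = [
    "Tax revenue rises as tax reporting improves",
    "Customs revenue falls",
]

vocabulary, rows = document_term_matrix(documents)
```

Once the text is in this tabular form, the data mining methods in Table 6 can be applied to it in the same way as to any other structured dataset.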
Table 7
Approaches in Text Mining

Method: Text classification
Explanation: Used to classify or predict information from a text.
Example of Use:
- KLI Bureau/Public Relations Division: To analyze public sentiment
through social media and online media upon a policy taken
- Financial Technology and Information System Center: To predict
incoming spam in official e-mails
Examples of Algorithm: Logistic regression, naive Bayes, support vector
machine, decision tree, neural network, and ensemble method

Method: Text clustering
Explanation: Used to group text data that do not have labels yet.
Example of Use:
- KLI Bureau/Public Relations Division: To classify news written in
media about the Ministry of Finance
- Legal Bureau: To classify types of regulations existing in the
Ministry of Finance
Examples of Algorithm: K-means, DBSCAN, agglomerative clustering, and
BIRCH

Method: Information extraction
Explanation: Used to extract interesting information from text data.
Example of Use:
- Organization and Management Bureau/OTL Division: To extract
information from presentations delivered by leaders
- Financial Technology and Information System Center: To extract
information from official e-mails
Examples of Algorithm: Rule-learning-based, classification-based, and
sequential-labeling-based methods

Method: Topic modeling
Explanation: Used to extract topics from text data.
Example of Use:
- KLI Bureau/Public Relations Division: To determine the topics written
in news related to the Ministry of Finance
Examples of Algorithm: Latent Dirichlet allocation (LDA), non-negative
matrix factorization (NMF), latent semantic analysis (LSA), parallel
latent Dirichlet allocation (PLDA), and pachinko allocation model (PAM)

Method: Text summarization
Explanation: Used to determine important information from text data.
Example of Use:
- KLI Bureau/Public Relations Division: To summarize news related to
the Ministry of Finance
- Secretariat/General Bureau: To summarize instructions from leaders in
the Ministry of Finance
Examples of Algorithm: TextRank, sentence-scoring-based methods, and
k-means clustering

Method: Information retrieval
Explanation: In contrast to the other methods, the information
retrieval approach is a process that runs from collection and indexing
to filtering. Used to form text datasets that make it easier to find
information quickly.
Example of Use:
- Legal Bureau: To create a corpus related to the Ministry of Finance
and link it with applicable regulations
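The information retrieval row of Table 7 describes a process of collection, indexing, and filtering. A minimal sketch of that process is an inverted index, which maps each word to the documents that contain it so that a query can be answered without rereading every text. The document identifiers and titles below are hypothetical, and the tokenizer keeps only unaccented Latin letters.

```python
import re
from collections import defaultdict

def build_index(corpus):
    """Indexing: map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Filtering: return ids of documents that contain every query word."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches)

# Collection: a hypothetical corpus of regulation titles.
corpus = {
    "reg-001": "Regulation on state asset management",
    "reg-002": "Regulation on tax reporting",
    "reg-003": "Guideline on state budget reporting",
}

index = build_index(corpus)
hits = search(index, "state reporting")
```

In this example only `reg-003` contains both "state" and "reporting", so it is the only document returned.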
Social Network Analysis
Data Visualization
certainly requires the right data warehouse design to ensure that the
data presented are of good quality.
Put simply, a data warehouse can be defined as a gathering
place for data that are produced to support decision making. A data
warehouse can also be defined as a collection of current or past data that
are of interest for decision making. The data warehouse itself
consists of transactional data that have passed through the process of
extraction, transformation, and loading (ETL). Theoretically, the
creation of a data warehouse can be described through the data
warehouse architecture outlined in Figure 13.
Figure 13
Data Warehouse Architecture47
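The ETL process mentioned above can be sketched with Python's built-in `sqlite3` module standing in for the warehouse. Everything specific here is an assumption for illustration: the raw records, the `fact_transactions` table and its columns, and the cleaning rules are not taken from the Ministry's systems.

```python
import sqlite3

def extract():
    """Extract: hypothetical raw transactional records from a source system."""
    return [
        {"unit": " djp ", "amount": "1500.50", "tx_date": "2022-01-15"},
        {"unit": "DJBC", "amount": "200", "tx_date": "2022-01-16"},
        {"unit": "djp", "amount": "n/a", "tx_date": "2022-01-17"},
    ]

def transform(records):
    """Transform: normalize unit codes, parse amounts, drop unparseable rows."""
    cleaned = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        unit = record["unit"].strip().upper()
        cleaned.append((unit, amount, record["tx_date"]))
    return cleaned

def load(conn, rows):
    """Load: write the cleaned rows into a warehouse fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_transactions "
        "(unit TEXT, amount REAL, tx_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_transactions VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# A decision-support query against the loaded data.
totals = dict(
    conn.execute(
        "SELECT unit, SUM(amount) FROM fact_transactions GROUP BY unit"
    ).fetchall()
)
```

The third source row is dropped during transformation because its amount cannot be parsed, so the totals cover only clean data. In a real warehouse, rejected rows would usually be logged for follow-up rather than silently skipped.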
Development Process of Data Analytics
48 Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., &
Wirth, R. (2000). CRISP-DM 1.0: Step-By-Step Data Mining Guide. SPSS Inc., 9, 13.
Figure 14
Stages of CRISP-DM
have been collected, exploring data, verifying the quality of data,
understanding the condition of anomalous data or outliers,
anticipating the correlations between variables, and visualizing to
understand data.
c. Data preparation
This stage includes the selection of attributes that can be compiled
into a dataset to be used in modeling. The activities at this
stage are selecting the data to be used, cleaning out data that will not be
used, forming new variables if required, and integrating data.
d. Modeling
At this stage, data mining techniques or methods are
applied to obtain a model in accordance with the research
objectives. A number of activities need to be carried out at this stage,
namely determining the model to be used, compiling or sorting the data
to be used as samples, building the model, evaluating the model
produced to identify the important variables, and consulting with the
business process owner.
e. Evaluation
This stage evaluates whether the model that has been
built achieves the research objectives. The
activities that need to be carried out include evaluating whether the overall
results fulfill the project objectives, evaluating
the technical and practical aspects of the model such as performance
and speed, reviewing the processes carried out, and determining
whether the model produced will proceed to the deployment stage.
f. Deployment
At this stage, the model has been found to be in
accordance with the research objectives, so it is ready to be used.
If the evaluation results show that the model is ready for production,
the steps that need to be prepared include determining the
deployment plan, monitoring and maintenance, and preparing the
final project documents.
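The data preparation, modeling, evaluation, and deployment stages can be traced in a small worked sketch. The scenario is hypothetical: each record holds a declared value, a reference value, and whether the declaration was flagged in the past; the "model" is a single cutoff on the declared-to-reference ratio; and the 0.9 accuracy bar for deployment is an assumed target standing in for the business process owner's objective.

```python
MIN_ACCURACY = 0.9  # assumed target agreed with the business process owner

def prepare(records):
    """Data preparation: drop incomplete rows and derive a new variable."""
    rows = []
    for r in records:
        if r["declared"] is None or r["reference"] is None:
            continue  # remove rows that cannot be used
        rows.append((r["declared"] / r["reference"], r["flagged"]))
    return rows

def accuracy(cutoff, rows):
    """Share of rows where 'ratio below cutoff' matches the known flag."""
    hits = sum((ratio < cutoff) == flagged for ratio, flagged in rows)
    return hits / len(rows)

def fit_cutoff(train_rows):
    """Modeling: choose the cutoff that best separates flagged rows."""
    candidates = sorted({ratio for ratio, _ in train_rows})
    return max(candidates, key=lambda c: accuracy(c, train_rows))

train_records = [
    {"declared": 50, "reference": 100, "flagged": True},
    {"declared": 60, "reference": 100, "flagged": True},
    {"declared": 95, "reference": 100, "flagged": False},
    {"declared": 100, "reference": 100, "flagged": False},
    {"declared": None, "reference": 100, "flagged": False},
]
test_records = [
    {"declared": 40, "reference": 80, "flagged": True},
    {"declared": 90, "reference": 90, "flagged": False},
]

cutoff = fit_cutoff(prepare(train_records))             # modeling
test_accuracy = accuracy(cutoff, prepare(test_records))  # evaluation
ready_for_deployment = test_accuracy >= MIN_ACCURACY     # deployment decision
```

The evaluation uses records that were held back from modeling, so the deployment decision rests on how the model performs on data it has not seen. With so few records this is only a demonstration of the flow between stages, not a meaningful estimate of model quality.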
The series of processes above is not a one-time process. Once the
model produced is running in production, it
must still be evaluated periodically to see whether the
previous model needs to change. The model needs to be changed when it
is rated ineffective based on measures that can
be assessed against the objectives of the business process owner.
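A periodic evaluation of this kind can be reduced to a simple rule. The sketch below assumes the model's score is measured once per period and that the business process owner has set a minimum acceptable score; the function name, the monthly scores, and the rule of two consecutive periods below the minimum are all hypothetical choices.

```python
def needs_revision(period_scores, minimum_score, consecutive=2):
    """Return True when the score stays below the minimum for N periods in a row."""
    run = 0
    for score in period_scores:
        run = run + 1 if score < minimum_score else 0
        if run >= consecutive:
            return True
    return False

# Hypothetical monthly accuracy of a model running in production.
monthly_accuracy = [0.92, 0.91, 0.88, 0.84, 0.83]

revise = needs_revision(monthly_accuracy, minimum_score=0.85)
```

Requiring several consecutive low periods avoids reworking the model after a single unusual month, at the cost of reacting slightly later to a real decline.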
DATA ANALYTICS ROADMAP
Table 8
Data Analytics Roadmap
Short Term Medium Term
Governance
Human Resources
Digital Infrastructure
the center, but is also open to working units in the regions. To formulate this,
it is necessary to reach a consensus on the data analytics environment
and tools that will be used by the Ministry of Finance. Then, in the medium
term, if the investments have been carried out according to plans
and needs, it is necessary to develop standard operating procedures
for the use of the data analytics environment and tools by all
working units, including the mechanisms for data utilization and
data analytics projects between echelon I units.
Data as Assets
that it can develop along with the maturity of data analytics in
the Ministry of Finance. Likewise, the estimated costs, sources of
funding, and the persons in charge of the programs have not yet been
determined, to leave room for discussion and improvement.
Some programs still need to be conceptualized and require further
analysis from related units, including identification of resources for their
implementation. For example, human resource management strategies
for data analytics to support future workspaces require input from
the human resource management unit, policy makers in the field of organization
and governance, and technical and operational units, so that there
is a link and match between organizational needs, the supply of data
analytics talent, and strategic direction from leaders. In the end, the
ideas in this roadmap are expected to trigger discussion and
consensus so that everyone can use data to transform the Ministry of
Finance into a more reliable public institution.
CLOSING
out. Certainly, the process is not an easy one. It takes determination,
consistency, and the right ecosystem to grow a data culture.
In closing, being an organization with a data culture does not
guarantee that the entire transformation program will be easy to
implement. Firstly, the Ministry of Finance is a public institution that
operates in a very dynamic state environment. Changes can occur quickly.
Values can change. Strategic themes can also change. Secondly, the
programs and strategies of transformation are led and implemented by
humans. The human factor remains the final determinant of whether the
goals of the transformation journey are reached. Mistakes in setting goals and
strategies will lead to failure even though every decision is supported
by data.
In the end, efforts to build a data culture will pay off. With a
data culture, the Ministry of Finance can formulate policies more
effectively. Decisions and actions are also produced faster, better, and
more innovatively. A data culture also makes the Ministry of Finance an
inclusive public organization where everyone can participate in and
contribute to the success of the Ministry of Finance as a reliable fiscal
authority and State General Treasurer that brings prosperity to all
Indonesian people.
Epilogue
Change is a necessity
That hits various sectors including the Ministry of Finance
Leader's job is to create trust
Leaders who drive innovation and pave the way