
The Application Study of ERP Data Quality Assessment and Improvement Methodology
Zhao Xiaosong He Zhen Zhang Meng Yu Dainuan Zhang Ting
Department of Industry Engineering, Tianjin University, Tianjin 300072 China

Abstract-The problem of ERP data quality is studied, and a model for assessing and improving ERP data quality is established. By measuring ERP data quality, quality problems can be detected and corrected effectively. Finally, the assessment and improvement methodology is verified by a case study.

Keywords-ERP, data quality, assessment and improvement, fuzzy assessment

I. INTRODUCTION

As researchers and practitioners pay increasing attention to ERP (Enterprise Resource Planning) data quality, many scholars continue to develop and improve ERP management concepts and structures in light of the actual conditions of various industries and regions. How to guarantee the quality of the data in an ERP system, which carries massive amounts of data, has become an important issue for many researchers and enterprises.

Hongjiang Xu, Jeretta Horn Nord et al. [1] studied the importance of data quality when ERP is implemented; Yu Jinlong [2] established a method based on the principles of ERP data management; Chen Yuan et al. [3] studied data quality in information systems and analyzed the causes of data quality problems; Liu Xia et al. [4] studied the data in PDM systems and proposed a planning method for data acquisition quality and a plan for data quality assurance.

Based on this accumulated research, some problems remain to be solved: there is a lack of systematic research on ERP data quality, of a system of data quality evaluation methods tailored to ERP systems, and of specific guidance for improving the quality of ERP data as a whole.

In this paper, a method for resolving data quality problems specifically in ERP systems is proposed on the basis of IP-MAP, and an integrated ERP data quality management system covering quality assessment, quality improvement and quality assurance is established.

II. MODEL BUILDING

To reach a higher level of ERP data quality, data quality should be evaluated both before and after the system goes online. Only in this way can the consistency of data quality between the data source and the transport process be assured.

A. Assessment Model of Data Quality before ERP Online

Data before online are prepared for implementing the ERP system, and their attributes are determined by the needs of each ERP module. Therefore, static data and initial data can be described by the following major elements: accuracy, completeness, uniqueness and consistency.

The following assessment model is built according to the attributes of data quality before ERP online:

1. Suppose a tetrad $H = \langle B, D, Q, G \rangle$ [5], in which

(1) $B$ is the set of all functional modules the system will implement, denoted by $B = \{B_1, B_2, \ldots, B_n\}$, in which $B_i$ $(i = 1, 2, \ldots, n)$ is the corresponding module.

(2) $D$ is the set of the datasets of every department before online, denoted by $D = \{D_1, D_2, \ldots, D_m\}$, in which $D_k$ $(k = 1, 2, \ldots, m)$ is the corresponding dataset, $D_t \cap D_s = \emptyset$ for $t \neq s$, and $D_1 \cup D_2 \cup \cdots \cup D_m = D$.

All departments collect the data before online with the assistance of tool software such as MS Excel, Access and SQL Server. We can confirm that no two sub-datasets of $D$ intersect, and identify from these sets the sub-datasets that support each module.

(3) $Q$ is the set of quality elements of ERP data, denoted by $Q = \{Q_1, Q_2, \ldots, Q_j\}$, $j = 4$. $Q_1$ is accuracy, $Q_2$ is completeness, $Q_3$ is uniqueness and $Q_4$ is consistency.

(4) $G$ is the set of rules made for the quality situation of every quality element, denoted by $G = \{G_{i,k,j,g}\}$. $G_{i,k,j,g}$ represents rule $G_g$, which is made for quality element $Q_j$ of dataset $D_k$ supporting module $B_i$, and is denoted by $G_{i,k,j,g} = G_g(B_i, D_k, Q_j)$, $g = 1, 2, \ldots$

Suppose that an enterprise is collecting and arranging data in preparation for implementing ERP, and that $D_1$ is a sub-dataset prepared for module $B_1$. Two rules are made for consistency $Q_4$: the length of the value $x$ must be greater than 3 and at most 20; the value $z$ must lie within the value range $Z$. The following rule set is therefore obtained:

$G = \{G_1(B_1, D_1, Q_4), G_2(B_1, D_1, Q_4)\}$.

$WG(G_{i,k,j,g})$ is the weight assigned to rule $G_{i,k,j,g}$. The importance of a rule differs with the module and dataset it applies to. For example, a rule made for the field "name" carries more weight than one made for the field "sex", because "name" characterizes a record better than "sex" does. The weights are given by experts or project members.

978-1-4244-1718-6/08/$25.00 ©2008 IEEE Pg 1036
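The two consistency rules of the example and the weighted combination of rule results (Equation (2)) can be sketched in Python with the standard-library `sqlite3` module. This is a minimal illustration, not the authors' implementation: the table name `d1`, the sample rows, the value range `ALLOWED_Z` and the weights are all hypothetical. The sketch also assumes that the rule result M is the share of records that pass a rule, that is, one minus the violating share counted by the SQL statement. SQLite spells the length function `length()` rather than `len()`.

```python
import sqlite3

# Hypothetical sample of dataset D1; x and z are the fields named in the two rules.
ROWS = [("abcd", "A"), ("ab", "B"), ("abcdefgh", "Q"), ("wxyz", "R")]
ALLOWED_Z = ("A", "B", "C")  # assumed value range Z


def rule_pass_rates(rows, allowed_z):
    """Return the share of records satisfying G1 (length of x) and G2 (z in Z)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE d1 (x TEXT, z TEXT)")
    con.executemany("INSERT INTO d1 VALUES (?, ?)", rows)
    total = con.execute("SELECT COUNT(*) FROM d1").fetchone()[0]
    # Records violating rule G1: length of x not in (3, 20].
    bad_x = con.execute(
        "SELECT COUNT(x) FROM d1 WHERE length(x) <= 3 OR length(x) > 20"
    ).fetchone()[0]
    # Records violating rule G2: z outside the allowed range.
    marks = ",".join("?" * len(allowed_z))
    bad_z = con.execute(
        f"SELECT COUNT(z) FROM d1 WHERE z NOT IN ({marks})", allowed_z
    ).fetchone()[0]
    con.close()
    return 1 - bad_x / total, 1 - bad_z / total


def current_level(measures, weights):
    """Equation (2): mean of rule results M weighted by rule weights WG."""
    return sum(w * m for w, m in zip(weights, measures)) / sum(weights)


m1, m2 = rule_pass_rates(ROWS, ALLOWED_Z)
cqb = current_level([m1, m2], weights=[0.6, 0.4])  # illustrative weights
print(m1, m2, round(cqb, 3))
```

Keeping the rule checks in SQL mirrors the paper's remark that every rule can be realized by a programming method such as an SQL statement, while the weighting stays in ordinary code where experts' weights are easy to change.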

(5) $RQB(B_i, D_k, Q_j)$ is the DQ quantification level that module $B_i$ must achieve on quality element $Q_j$ in dataset $D_k$. $RQB(B_i, D_k, Q_j)$ can be obtained by experts through demand analysis of the data users.

(6) $WQ(B_i, D_k, Q_j)$ is the weight assigned to $RQB(B_i, D_k, Q_j)$ (it is also the weight assigned to $CQB(B_i, D_k, Q_j)$); that is, the influence that quality element $Q_j$ in dataset $D_k$ has on module $B_i$. It can be given by experts or project members.

(7) $CQB(B_i, D_k, Q_j)$ is the DQ quantification level that module $B_i$ currently achieves on quality element $Q_j$ in dataset $D_k$.

2. Calculate $RQ(D_k, Q_j)$ and $CQ(D_k, Q_j)$ to judge whether the current DQ is qualified.

$RQ(D_k, Q_j)$ is the quantification level that dataset $D_k$ must achieve on quality element $Q_j$, and the assessment is made in terms of it. A quality aspect of one dataset may correspond to many function modules, so $RQ(D_k, Q_j)$ can be calculated only when the levels required of the pair $(D_k, Q_j)$ by all modules are known. $RQ(D_k, Q_j)$ can be calculated with $WQ(B_i, D_k, Q_j)$ and $RQB(B_i, D_k, Q_j)$:

$$RQ(D_k, Q_j) = \frac{\sum_{B} WQ(B_i, D_k, Q_j) \cdot RQB(B_i, D_k, Q_j)}{\sum_{B} WQ(B_i, D_k, Q_j)} \qquad (1)$$

$CQ(D_k, Q_j)$ is the quantification level that dataset $D_k$ currently achieves on quality element $Q_j$. A method is then needed to calculate the current quality level.

$M(G_{i,k,j,g})$ is the assessment result of rule $G_{i,k,j,g}$. It is obtained by programming, calculation or experts' judgment, and it is the precondition for calculating the current quality level. Every rule in rule set $G$ can be realized by programming methods (such as an SQL statement or a VB or VC program segment).

For the rule set $G = \{G_1(B_1, D_1, Q_4), G_2(B_1, D_1, Q_4)\}$, the calculation formulae of the two rules are as follows:

$$M(G_{1,1,4,1}) = \frac{\text{select count}(x) \text{ from } D_k \text{ where } len(x) \le 3 \text{ or } len(x) > 20}{|D_k|}$$

$$M(G_{1,1,4,2}) = \frac{\text{select count}(z) \text{ from } D_k \text{ where } (z \text{ not in } Z)}{|D_k|}$$

Weights $WG(G_{1,1,4,1})$ and $WG(G_{1,1,4,2})$ are assigned to the two rules respectively, and

$$CQB(B_i, D_k, Q_j) = \frac{\sum_{G} WG(G_{i,k,j,g}) \cdot M(G_{i,k,j,g})}{\sum_{G} WG(G_{i,k,j,g})} \qquad (2)$$

$CQ(D_k, Q_j)$ can be calculated once $CQB(B_i, D_k, Q_j)$ is confirmed:

$$CQ(D_k, Q_j) = \frac{\sum_{B} WQ(B_i, D_k, Q_j) \cdot CQB(B_i, D_k, Q_j)}{\sum_{B} WQ(B_i, D_k, Q_j)} \qquad (3)$$

Datasets and quality aspects that need to be optimized can be identified by comparing $RQ(D_k, Q_j)$ and $CQ(D_k, Q_j)$.

Elementary data with high quality before online are the foundation of data quality after ERP implementation. The enterprise should evaluate the quality of the initial data with this assessment model before ERP implementation in order to make sure of the initial data quality level.

B. Assessment Model of Data Quality after ERP Online

There are more influencing factors, both qualitative and quantitative, when the ERP system is running. The factors are fuzzy in nature, so this paper uses a fuzzy assessment method to evaluate data quality after ERP online.

1. Build order increased hierarchy structural system

Build the order increased hierarchy structural system by using the analytic hierarchy process according to the major factors that influence ERP data quality, as shown in Fig. 1.

Fig. 1 ERP data quality level of each function module

C: ERP data quality grade of each function module
C1: accuracy
  C11: data accurate degree
  C12: data qualified degree
  C13: data reliable degree
C2: timeliness
  C21: system module complex degree
  C22: system operation reaction rate
C3: consistency
  C31: coding rule conformable degree
  C32: code update degree
  C33: operation authority reasonable degree
C4: security
  C41: system maintenance degree
  C42: system inherent vulnerability
  C43: system design and detect capability
C5: completeness
  C51: data missing degree
  C52: data cover degree

2. Build comparison judgment matrix

Use the scale method to compare. The judgment matrix represents the importance of the relevant elements in one level with respect to a certain element in the upper level. Suppose element $Q_k$ in level $Q$ is related to $A_1, A_2, \ldots, A_n$ in the lower level; the judgment matrix is built as follows:

Qk    A1    A2    ……    An
A1    a11   a12   ……    a1n
A2    a21   a22   ……    a2n
……
An    an1   an2   ……    ann
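Equations (1) and (3) share one form, a weighted mean over the modules that use a dataset, so the comparison of required and current levels can be sketched with a single helper. The following Python sketch is only an illustration under stated assumptions: the function names and all figures are hypothetical, and the example considers one dataset and one quality element used by two modules.

```python
def weighted_level(weights, levels):
    """Weighted mean over the modules that use one dataset/element pair.

    With RQB values it gives RQ (Equation (1)); with CQB values it gives
    CQ (Equation (3)).
    """
    if len(weights) != len(levels) or not weights:
        raise ValueError("weights and levels must be non-empty and equally long")
    return sum(w * v for w, v in zip(weights, levels)) / sum(weights)


def quality_gap(weights, required, current):
    """Return (RQ, CQ, RQ - CQ) for one dataset and one quality element."""
    rq = weighted_level(weights, required)
    cq = weighted_level(weights, current)
    return rq, cq, rq - cq


# Hypothetical figures: dataset D1 on completeness Q2, used by two modules.
wq = [0.7, 0.3]      # WQ(B1, D1, Q2), WQ(B2, D1, Q2)
rqb = [0.95, 0.90]   # required levels RQB
cqb = [0.80, 0.90]   # current levels CQB
rq, cq, gap = quality_gap(wq, rqb, cqb)
print(f"RQ={rq:.3f} CQ={cq:.3f} gap={gap:.3f}")
```

A positive gap marks a dataset and quality element whose current level falls short of the required one; how large a gap triggers improvement is a project decision and is left outside the helper.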

$a_{ij}$ is the numerical value of the importance of $A_i$ relative to $A_j$. The judgment matrix has the following attributes: $a_{ij} > 0$; $a_{ij} = 1/a_{ji}$, in which $i \neq j$; $a_{ii} = 1$.

3. Calculate eigenvalue and eigenvector

$$AW = \lambda_{max} W \qquad (4)$$

$\lambda_{max}$ is the maximum eigenvalue of $A$, and $W$ is the normalized eigenvector corresponding to $\lambda_{max}$. After normalization, the weight distribution is calculated according to the importance of the elements. To check the consistency of the judgment matrix, it is necessary to calculate the consistency index C.I.:

$$C.I. = \frac{\lambda_{max} - n}{n - 1} \qquad (5)$$

4. Build fuzzy synthetical assessment model

First, divide the index set into $p$ sub-factor sets denoted $C_1, C_2, \ldots, C_p$, which satisfy

$$\bigcup_{i=1}^{p} C_i = C, \qquad C_i \cap C_j = \emptyset, \; i \neq j.$$

Each $C_i$ is a constituent index of $C$: $C_i = (C_{i1}, C_{i2}, \ldots, C_{in_i})$, $i = 1, 2, \ldots, p$, where $n_i$ is the number of composition elements.

Suppose the assessment set is $V = (v_1, v_2, \ldots, v_m)$ and the corresponding assessment scale set is $E = \{e_1, e_2, \ldots, e_m\}$. Assign weight $W_i = [w_{i1} \; w_{i2} \; \cdots \; w_{in_i}]$ to $C_i$ and require that $w_{ij}$ satisfies

$$\sum_{j=1}^{n_i} w_{ij} = 1 \qquad (6)$$

Make a single fuzzy assessment of $C_i$ and determine the fuzzy relation matrix $R_i$ from $C_i$ to $V$:

$$R_i = (r_{ijk})_{n_i \times m}, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, n_i; \; k = 1, 2, \ldots, m \qquad (7)$$

$r_{ijk}$ denotes the subordination of $C_{ij}$ to $v_k$.

Then calculate the fuzzy synthesis $W_i \times R_i$, so the first-level synthetical appraisal is $b_i = W_i \times R_i = [b_{i1} \; b_{i2} \; \cdots \; b_{im}]$. $b_i$ is the single factor judgment, and its score is

$$S_i = E b_i^T \qquad (8)$$

Make a synthetical assessment of each factor set $C_i$. The judgment matrix is

$$R = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{p1} & b_{p2} & \cdots & b_{pm} \end{bmatrix}$$

$R$ is the single factor judgment matrix of $(C_1, C_2, \ldots, C_p)$, and $W = [w_1 \; w_2 \; \cdots \; w_p]$ is the weight of the $C_i$. Then the second-level synthetical judgment is

$$B = W \times R, \quad B = [B_1 \; B_2 \; \cdots \; B_m] \qquad (9)$$

$B_j$ is the subordination of the evaluated ERP data quality to $v_j$. Its score is

$$S = E B^T \qquad (10)$$

The level of the appraisal object is determined according to the biggest subordination.

III. NUMERICAL EXAMPLE

A. Background and problem description

An enterprise has decided to implement an ERP system across the whole company. To run the system normally and effectively after online and to maintain ERP data quality both before and after online, we carry out data collection, arrangement, assessment, analysis and improvement [6]. While preparing and arranging the data, it was found that data management in the enterprise is currently done by hand.

B. Assessment of ERP data quality before online

Based on IP-MAP, the ERP system modules that the company will implement include: marketing function module $B_1$, finance function module $B_2$, production function module $B_3$ and purchase function module $B_4$. The data collected from the various operating departments include: the marketing department's data $D_1$, the finance department's data $D_2$ and the production department's data $D_3$. Datasets $D_1$, $D_2$ and $D_3$ support $B_1$, $B_2$, $B_3$ and $B_4$. The data group arranges and classifies the data before online.

According to the definition of the assessment model of data quality stated above, the quality elements consist of accuracy $Q_1$, completeness $Q_2$, uniqueness $Q_3$ and consistency $Q_4$.

Take $Q_2$ as the example; the quality level of $D_1$ on $Q_2$ is

$$CQ(D_1, Q_2) = \frac{\sum_{B} WQ(B_i, D_1, Q_2) \cdot CQB(B_i, D_1, Q_2)}{\sum_{B} WQ(B_i, D_1, Q_2)} = 0.85$$

Similarly, $CQ(D_2, Q_2) = 0.92$ and $CQ(D_3, Q_2) = 0.89$.

Then the quality level differences of $D_1$, $D_2$ and $D_3$ on $Q_2$ are

$RQ(D_1, Q_2) - CQ(D_1, Q_2) = 0.12$
$RQ(D_2, Q_2) - CQ(D_2, Q_2) = 0.06$
$RQ(D_3, Q_2) - CQ(D_3, Q_2) = 0.03$

Through analysis based on the model of this paper, a dataset whose data quality difference is more than 0.10 must be improved. So $D_1$ should be improved on $Q_2$.

C. Assessment of data quality after ERP online

Take the data quality level of the marketing module as the example to be evaluated. First, calculate the weight of each element. The calculation results are shown in TABLE I.
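The weight calculation of Equations (4) and (5) and the two-level fuzzy synthesis of Equations (8) to (10) can be sketched in plain Python. This is a hedged illustration only: power iteration is one common way to approximate the principal eigenvector and is an assumption here, as is reading $W \times R$ as the ordinary weighted-sum product. The pairwise matrix is the one listed in TABLE I of the paper; the membership rows, the sub-factor weights and the numeric scale `E` are hypothetical.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the normalised principal eigenvector and lambda_max
    of a pairwise judgment matrix by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(row[j] * w[j] for j in range(n)) for row in matrix]
        total = sum(v)
        w = [x / total for x in v]
    aw = [sum(row[j] * w[j] for j in range(n)) for row in matrix]
    lam = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lam


def consistency_index(lam, n):
    """Equation (5): C.I. = (lambda_max - n) / (n - 1)."""
    return (lam - n) / (n - 1)


def synthesize(weights, relation):
    """Weighted-sum synthesis b = W x R over the rows of a relation matrix."""
    m = len(relation[0])
    return [sum(w * row[k] for w, row in zip(weights, relation)) for k in range(m)]


def score(scale, membership):
    """Equations (8) and (10): S = E . b^T."""
    return sum(e * b for e, b in zip(scale, membership))


# Pairwise judgment matrix for C1..C5 as listed in TABLE I.
A = [
    [1, 6, 5, 7, 4],
    [1 / 6, 1, 1 / 2, 1, 1 / 3],
    [1 / 5, 2, 1, 2, 1 / 2],
    [1 / 7, 1, 1 / 2, 1, 1 / 3],
    [1 / 4, 3, 2, 3, 1],
]
weights, lam = ahp_weights(A)
ci = consistency_index(lam, len(A))

# Hypothetical two-level example with grades (superior, good, general, poor).
GRADES = ["superior", "good", "general", "poor"]
E = [4, 3, 2, 1]  # assumed numeric scale for the grades
b1 = synthesize([0.5, 0.5], [[0.2, 0.6, 0.2, 0.0], [0.0, 0.4, 0.6, 0.0]])
b2 = synthesize([1.0], [[0.0, 0.8, 0.2, 0.0]])
B = synthesize([0.6, 0.4], [b1, b2])
grade = GRADES[B.index(max(B))]  # maximum-subordination principle
print([round(x, 3) for x in weights], round(ci, 4), grade, round(score(E, B), 2))
```

The first-level vectors `b1` and `b2` become the rows of the second-level matrix, which is the step that Equation (9) describes; swapping in a different fuzzy operator would only change `synthesize`.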

TABLE I
JUDGMENT MATRIX C1-C5

C     C1    C2    C3    C4    C5
C1    1     6     5     7     4
C2    1/6   1     1/2   1     1/3
C3    1/5   2     1     2     1/2
C4    1/7   1     1/2   1     1/3
C5    1/4   3     2     3     1

[The W column of TABLE I is not recoverable from the source text.]

C.R. = 0.018 < 0.1, so the judgment matrix passes the consistency check.

Then the analytic hierarchy process is applied to calculate the weights of the factors under each quality element ($w_1$ to $w_5$), and every judgment matrix is confirmed to pass the consistency check.

Here $m = 4$ and the assessment set is V = (superior, good, general, poor). After the experts' judgment, the judgment results of the factors are as follows:

TABLE II
FUZZY ASSESSMENT RESULT OF MARKETING MODULE DATA QUALITY

[Table columns: level C element and its single-level weight; factor Cij and its composite weight ci; subordination to each assessment grade (superior, good, general, poor). The cell values are not recoverable from the source text.]

First, the first-level synthesis judgment: $b_1 = W_1 \times R_1$, and likewise $b_2$, $b_3$, $b_4$ and $b_5$.

Second, the second-level fuzzy synthesis judgment: $B = W \times R$.

[The numeric vectors $w_1$ to $w_5$, $b_1$ to $b_5$ and $B$ are not recoverable from the source text.]

Third, by the principle of maximum subordination, the level of the module is "general".

It can be seen from the computing result that the data quality of the marketing module is evaluated as "general". Accuracy C1 and completeness C5 are also evaluated as "general", and the other three quality elements are evaluated as "good". Therefore, the data quality of the marketing module should be improved, especially in the aspects of accuracy and completeness.

According to the module analysis, the enterprise made a modified plan for ERP data quality improvement and established the corresponding assurance system. We keep on supervising and controlling data quality after the improvement plan is implemented. Through random sampling, the data accuracy C11 and the non-missing rate C51 are 0.717 and 0.756. In a check of 200 sales plans and 200 invoices, the error rate is 0.

ACKNOWLEDGMENT

The paper is supported by Tianjin Natural Science Foundation (No. 06YFGZGGX06100). Thank all the members of our group. Thank all the authors of the references.

REFERENCES

[1] Hongjiang Xu, Jeretta Horn Nord, Noel Brown et al. "Data quality issues in implementing an ERP." Industrial Management & Data Systems, 2002, 102(1): 47-58
[2] Yu Jinlong. "Data Management Research of ERP." Sichuan: Southwest Petroleum University, 2005
[3] Chen Yuan, Luo Lin, Shen Xiangxing. "A Study of Data Quality in Information System." The Journal of the Library Science in China, 2004, 30(149): 48-50
[4] Liu Xia, Liu Feng, Zhang Ping. "The research on the data quality of PDM." Machinery Design & Manufacture, 2006, 6: 166-167
[5] Yang Qingyun, Zhao Peiying, Yang Dongqing. "Research on Data Quality Assessment Methodology." Computer Engineering and Applications, 2004, 40(9): 3-4
[6] Zhang Ting. "The Application Study of ERP Data Quality Assessment and Improvement Methodology." Tianjin: Tianjin University, 2007