By implementing a suitable master data solution, organizations can become more operationally efficient through automation and communicate through a single channel. Because quality data is a strategic asset for any organization, starting out with a thorough understanding of data quality for each contributing source allows for the highest-quality data in an MDM implementation. One common data quality issue is data duplication; deterministic and probabilistic approaches are used to identify and remove these duplicates.

Deterministic matching protocol: it all starts with the data source. Deterministic matching looks for an exact match between two pieces of data. It therefore requires the data quality to be at a 100% level, with the data clean and standardized in the same way 100% of the time. Since data is never 100% clean, some up-front work is required for deterministic matching to succeed.

Probabilistic matching uses a statistical approach to measure the probability that two customer records represent the same individual. It is fundamental to properly analyze the data elements, as well as the combinations of those data elements, that are needed for searching and matching. This information feeds the process of defining an algorithm in which the searching and matching rules are specified.

Probabilistic matching takes into account the frequency with which a particular data value occurs among all values of that data element across the entire population. Search buckets are divided based on the combinations of data elements in the algorithm.

Thresholds are set to determine when two records should be automatically linked because they are the same, sent for manual review because they may be the same, or left unlinked because they are not the same.

Probabilistic matching is designed to work with a wider set of data elements for matching. It uses weights to calculate a score and thresholds to determine a match, a non-match, or a possible match.