189867663.doc
Objective
The purpose of this document is to outline the course of action to cleanse data in the legacy systems or in the corresponding staging area before it is loaded into SAP. It defines general guidelines, which may be customized for each conversion object when detailed cleansing instructions are rolled out. This is a living document that will be updated as Blue Print and Data Conversion decisions are made in the following weeks.
Versions
The following table documents the revision history of this document:
VERSION    VERSION DATE    DESCRIPTION    UPDATED BY
1.0        2/?/2007                       BFord
1.1        2/13/2007
Data Cleansing
Data Cleansing is the process of reviewing and maintaining legacy application data so that it can be converted into the SCEIS SAP solution without intervention at final conversion time. Data cleansing is one of the most important processes for data conversion. Cleansing of the data must occur prior to loading it into the Production SAP environment. Loading poor quality data into SAP could result in incorrect business decisions and may be more difficult to correct later. As part of the SCEIS Deployment Strategy, legacy data must be cleansed before loading it into the SAP solution. State Agencies will cleanse their own data per scope indicated in the Data Cleansing Scope charts below. Resources will be needed from the Agencies who are currently using the legacy data. The Deployment team will coordinate this process.
A work plan and metrics will be used by the Deployment SCEIS team to track progress over the course of the implementation.
Assets Management: Fixed Assets Master & Balances. Also include Capital and Operational Leases
Accounts Receivable, Customer Master: Active agency Customer list
Cash Management, Bank/Bank Accounts: Bank files. Current Bank Accounts
Cost Control/Controlling, Cost Centers: New SAP Cost Centers based on agency org structure
Cost Control/Controlling, Internal Orders: New SAP Internal Orders based on SPIRS non-capital and capital projects
Grants Management, Sponsor: Agency active Sponsor lists combined with CFDA information
Grants Management, Sponsored Programs: New SAP Sponsored Programs
Source: Manual/Excel Spreadsheet
Grants Management, Open Grant: Active agency Grants list
Vendor Master: Active Vendors in the last 24 months
General Ledger, GL Balances: Ending balances of last fiscal period before go-live date. Source: STARS/Extract Programs or Excel Spreadsheet. Responsible: Agency Finance Department
Outstanding vendor invoices. Source: Manual/Excel Spreadsheet. Responsible: Agency Finance Department
Outstanding customer invoices. Source: Manual/Excel Spreadsheet. Responsible: Agency Finance Department
Contract Balances by go-live date. Source: APS/Extract Program or Excel Spreadsheet. Responsible: Agency Procurement Department
ISSUE: Duplicates
EXPLANATION: The same data entity (fixed asset, vendor, customer, etc.) is named two or more times in the same system.
RESOLUTION: Data cleansing is required. Flag one or more of the data elements so that it is not included in the "to be" extract file.

ISSUE: Obsolete Data
EXPLANATION: Data that is not up to date or no longer active. Obsolete data should remain in the legacy system since it is not needed in SAP. Example: vendors no longer purchased from.
RESOLUTION: Data cleansing is required. The rules to declare a record obsolete are as follows:
- Vendors: no activity in the last two years
- Fixed Assets: retired or scrapped assets after X years
- Customers: TBD
- Bank Accounts: TBD
- Projects: TBD
- Grants: TBD
Cleansing involves using a field in the legacy system to identify the record and using it to filter out these records when extracting data.

ISSUE: Incorrect Data
EXPLANATION: Inconsistencies that are related to typing or data entry errors. Typical problems include spelling errors (e.g., Bank of America vs. Banc of America) and reference inconsistencies (e.g., 2nd Street vs. Second Street, or Inc vs. Corporation).
RESOLUTION: Data cleansing is required. Review file and correct manually. If the error is present in multiple records, there may be a way to correct this automatically. Consult with Agency Technical support.

ISSUE: Incomplete Records
EXPLANATION: Missing data in current legacy system.
RESOLUTION: Data cleansing is required. Correct incomplete records since some of this data may be required by SAP.
Cleansing Process
- Run the corresponding Legacy System report and download it to an Excel spreadsheet.
- Depending on the size and/or complexity of the data file, determine, either programmatically or manually, which records are duplicate, obsolete, incorrect or incomplete.
- Correct records per the suggested solutions in the previous chart. If necessary, consult with your Agency Technical support and/or corresponding SCEIS Team member.
- Report status to the Deployment team per project plan and metrics sheet.
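The "determine programmatically" step can be sketched in Python for the duplicate and obsolete cases. This is a minimal illustration, not a prescribed tool: the field names (`name`, `last_activity`), the alias list and the sample dates are assumptions made for the example, and the 730-day window only approximates the two-year vendor rule from the chart.

```python
from datetime import date

# Hypothetical spelling variants based on the examples in this document;
# a real list would be agreed with Agency Technical support.
ALIASES = {"banc": "bank", "2nd": "second", "corporation": "inc"}


def normalize(name):
    """Lower-case a name, drop punctuation, and unify known variants."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(ALIASES.get(word, word) for word in cleaned.split())


def flag_records(records, as_of, max_idle_days=730):
    """Pair each record with a list of flags ("duplicate", "obsolete").

    Assumes each record is a dict with a "name" string and a
    "last_activity" date (hypothetical field names).
    """
    seen = set()
    flagged = []
    for record in records:
        flags = []
        key = normalize(record["name"])
        if key in seen:
            flags.append("duplicate")  # the first occurrence stays unflagged
        seen.add(key)
        if (as_of - record["last_activity"]).days > max_idle_days:
            flags.append("obsolete")  # idle for more than roughly two years
        flagged.append((record, flags))
    return flagged


# Illustrative data only.
vendors = [
    {"name": "Bank of America", "last_activity": date(2006, 11, 1)},
    {"name": "Banc of America", "last_activity": date(2006, 12, 5)},
    {"name": "Acme Inc.", "last_activity": date(2003, 5, 20)},
]
flagged = flag_records(vendors, as_of=date(2007, 2, 13))
# Only unflagged records go into the "to be" extract.
to_be = [record for record, flags in flagged if not flags]
```

Which copy of a duplicate to keep is a business decision; the sketch simply keeps the first one it encounters and leaves the flagged records for review.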
- Agencies will be given the corresponding support from the SCEIS team to understand SAP requirements and complete mapping.
- The following guidelines may be revised and customized for each conversion object.

EXPLANATION: The current system does not require a certain field, so it has been left blank, or a given field should be filled per up-to-date procedure but is skipped when the information is not known at the time of data entry. This field is required in SAP per defined business process.
RESOLUTION: Cleansing required. It might be possible to automatically populate the field (a) by plugging in a constant value, or (b) by referencing some other file to "look up" the information. If not, manual data cleansing will be needed. Consult with Agency technical support for assistance.

EXPLANATION: Two organizations use the same field to store different elements of information.
RESOLUTION: Cleansing required in one database or the other, or both, based on what the field will be used for in SAP.

EXPLANATION: The current system does not provide a separate field for some desired piece of information. That piece of information is being stored along with another one in its designated field. Example: the current system includes a field named "Contact" which would typically contain the name of the appropriate contact individual. Because the system does not include a separate field for the contact's telephone number, both the name and phone number are being stored in the "Contact" field.
RESOLUTION: It may not be possible to reliably separate the two values. Manual cleansing may be required.

EXPLANATION: Similar data entered into separate or independent systems. Example: consider two departments defining projects in their systems. The same type of data (project related) is entered into different systems, but since it is not validated against each other or a central system, the data format is different. Free-form text fields may have data that varies in meaning based on the user who entered the data into the system.
RESOLUTION: Cleansing required in one database or the other, or both, based on what the field will be used for in SAP.

EXPLANATION: Inconsistencies due to different data structures used in different source systems. Typical problems include using different data values to represent the same thing (e.g., System A uses 1 for "yes", System B uses Y for "yes" and System C uses a flag for "yes").
RESOLUTION: Cleansing required in one database or the other, or all, based on what the field will be used for in SAP.

EXPLANATION: Various positions of the data field imply additional information. SAP typically provides a separate field for the implied additional information. Example: consider a system which includes a 7-character field named "Invoice Number". A value of "G" in the first position indicates a sale to the US Government; a value of "D" in the first position indicates a sale to a non-government US customer. The remaining characters in the field contain a unique serial number. Thus, it is possible to determine some additional information from the invoice number: the customer type. Is the customer type US Government or domestic?
RESOLUTION: If there is a regular pattern to the coding, the separation can probably be done programmatically. If not, manual conversion may be required. SCEIS functional team will determine the solution.

EXPLANATION: The data field in the current system contains a code to represent a full value. SAP requires the full value, or SAP uses a different code to represent the same full value. Example: consider a system which includes a 1-character field named "Name Prefix", where a code of "1" indicates "Mr.", a code of "2" indicates "Miss", and a code of "3" indicates "Mrs.". SAP wants the full value (that is, "Mr.", "Mrs.", or "Miss"), not the code.
RESOLUTION: The full value can be programmatically generated from a look-up table. SCEIS Functional Team will propose solution.

ISSUE: Formatting
EXPLANATION: A data field in the current system contains a value not allowed by the corresponding SAP field. Example: consider a field where the current system allows alpha-numeric values, but the SAP field is only numeric.

ISSUE: Field lengths
EXPLANATION: The length of the data field in the current system is longer than the corresponding field in SAP. Example: consider a current system with a description field of length 30. Suppose SAP provides a description field of length 24.
RESOLUTION: Should the field be unilaterally truncated? Or should each description be evaluated by a human and abbreviated to retain maximum readability? Per proposed solution, manual data cleansing may be required.

EXPLANATION: A valid field entry in legacy is not valid in SAP.
RESOLUTION: Establish the need for a translation table in the data cleansing procedures and describe its fields and valid entries.
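Several of the resolutions above (embedded codes, look-up tables, differing "yes" values) lend themselves to small translation routines. The following Python sketch uses the code values from the examples in this chart; the function names are illustrative, and the assumption that System C's flag is the letter "X" is hypothetical.

```python
# Code tables taken from the examples in the chart; real values would be
# confirmed by the SCEIS functional team.
CUSTOMER_TYPES = {"G": "US Government", "D": "Non-government US customer"}
NAME_PREFIXES = {"1": "Mr.", "2": "Miss", "3": "Mrs."}
# Hypothetical: System C's "flag" is assumed here to be the letter "X".
YES_VALUES = {"1", "Y", "X"}


def split_invoice_number(invoice_number):
    """Separate the implied customer type from the serial number.

    Raises ValueError for an unknown leading code so the record can be
    routed to manual conversion instead of being guessed at.
    """
    code, serial = invoice_number[:1], invoice_number[1:]
    if code not in CUSTOMER_TYPES:
        raise ValueError(f"unknown customer-type code in {invoice_number!r}")
    return CUSTOMER_TYPES[code], serial


def expand_name_prefix(code):
    """Return the full name prefix for a legacy code, or None if unmapped."""
    return NAME_PREFIXES.get(code)


def is_yes(value):
    """Treat the different source-system encodings of "yes" as one value."""
    return str(value).strip().upper() in YES_VALUES
```

Returning `None` or raising an error for unmapped values keeps irregular records visible, which matches the guidance that manual conversion may be required when the coding has no regular pattern.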
Cleansing Process
- Attend meeting to gain understanding of SAP field requirements.
- Team up with a SCEIS functional team member to develop the legacy system vs. SAP fields mapping. An Excel spreadsheet tool will be used to create the "to be" file.
- Run the corresponding Legacy System report and download data to an Excel spreadsheet per the previously defined data file.
- Depending on the size and/or complexity of the data file, determine, either programmatically or manually, the data to be cleansed per the guidelines indicated earlier in this document.
- Correct records per the suggested solutions in the previous chart. If necessary, consult with your Agency Technical support and/or corresponding SCEIS Team member.
- Report status to the Deployment team per project plan and metrics sheet.
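The mapping and "to be" file steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's tool: the legacy column names (`VEND_NAME`, `VEND_CITY`), the target names (`SAP_NAME`, `SAP_CITY`) and the maximum lengths are all hypothetical placeholders for whatever the agreed mapping spreadsheet defines. CSV is used because it opens directly in Excel.

```python
import csv
import io

# Hypothetical mapping: legacy column -> (target field name, maximum length).
FIELD_MAP = {
    "VEND_NAME": ("SAP_NAME", 35),
    "VEND_CITY": ("SAP_CITY", 35),
}


def build_to_be(legacy_rows, field_map=FIELD_MAP):
    """Map legacy rows to target fields.

    Rows with a value longer than the target field go to a review list
    rather than being truncated automatically. Assumes string values.
    """
    to_be, review = [], []
    for row in legacy_rows:
        mapped = {
            target: row.get(source, "").strip()
            for source, (target, _) in field_map.items()
        }
        too_long = [
            target
            for target, max_len in field_map.values()
            if len(mapped[target]) > max_len
        ]
        (review if too_long else to_be).append(mapped)
    return to_be, review


def write_csv(rows, field_map=FIELD_MAP):
    """Serialize mapped rows as CSV text that can be opened in Excel."""
    out = io.StringIO()
    fieldnames = [target for target, _ in field_map.values()]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative data only.
legacy = [
    {"VEND_NAME": " Acme Supply ", "VEND_CITY": "Columbia"},
    {"VEND_NAME": "X" * 40, "VEND_CITY": "Aiken"},
]
to_be, review = build_to_be(legacy)
to_be_file = write_csv(to_be)
```

Keeping over-length rows in a separate review list leaves open the question raised in the chart of whether a value should be truncated or abbreviated by a person.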