
Manjunath T.N et al. / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 2 (1), 2011, 477-485

Analysis of Data Quality Aspects in Data Warehouse Systems
Manjunath T.N¹, Ravindra S Hegadi², Ravikumar G.K³

¹Bharathiar University, Coimbatore, Tamil Nadu, INDIA
²Karnatak University, Dharwad, Karnataka, INDIA
³Dr. MGR University, Chennai, Tamil Nadu, INDIA

Abstract: Data quality is a critical factor for the success of data warehousing projects. If data is of inadequate quality, the knowledge workers who query the data warehouse and the decision makers who receive the information cannot trust the results. To obtain clean and reliable data, it is imperative to focus on data quality. While many data warehouse projects do take data quality into consideration, it is often treated as an afterthought. Even quality assurance after ETL is not good enough; the quality process needs to be incorporated into the ETL process itself. Data quality has to be maintained for individual records, and even for small pieces of information, to ensure the accuracy of the complete database. Data quality is an increasingly serious issue for organizations large and small, and it is central to all data integration initiatives. Before data can be used effectively in a data warehouse, or in customer relationship management, enterprise resource planning, or business analytics applications, it needs to be analyzed and cleansed. To sustain high-quality data, organizations need to apply ongoing data cleansing processes and procedures, and to monitor and track data quality levels over time. Otherwise, poor data quality leads to increased costs, breakdowns in the supply chain, and inferior customer relationship management. Defective data also hampers business decision making and efforts to meet regulatory compliance responsibilities. The key to successfully addressing data quality is to get business professionals centrally involved in the process. We have analyzed a possible set of causes of data quality issues through an exhaustive survey and through discussions with data warehouse groups working in distinguished organizations in India and abroad. We expect this paper will help modelers and designers of warehouses to analyze and implement quality warehouse and business intelligence applications.

Keywords: Data warehouse (DWH), Data Quality, ETL, Data Staging, Multidimensional Modeling (MDM).

1. Introduction
The presence of data alone does not ensure that all data management functions and decisions can be made. Data quality is fundamentally about bad data: data which is missing, incorrect, or invalid in some respect. More broadly, data quality is attained when the business uses data that is comprehensive, understandable, and consistent. Understanding the main data quality dimensions is the first step toward data quality improvement: to be usable in an effective and efficient manner, data has to satisfy a set of quality criteria, and data satisfying those criteria is said to be of high quality. Ample attempts have been made to classify data quality and to identify its dimensions. Dimensions of data quality typically include accuracy, reliability, importance, consistency, precision, timeliness, fineness, understandability, conciseness, and usefulness. For our research work we have adopted a quality criterion of nine key factors, as mentioned below. English [3] defines the following inherent information quality characteristics and measures:
1. Definition Conformance
2. Completeness (of values)


3. Validity, or business rule conformance
4. Accuracy (to the source)
5. Precision
6. Non-duplication (of occurrences)
7. Derivation Integrity
8. Accessibility
9. Timeliness

These characteristics are defined as follows:
1. Definition Conformance: the chosen object is of central importance, and its definition should carry the complete details and meaning of the real-world object.
2. Completeness (of values): the characteristic of having all required values for the data fields.
3. Validity (business rule conformance): a measure of the degree of conformance of data values to their domain and business rules. This includes domain values, ranges, referential integrity, primary key uniqueness, and reasonability tests.
4. Accuracy (to the source): a measure of the degree to which data agrees with the data contained in an original source.
5. Precision: the domain values which the business specifies should have the correct precision, as per specifications.
6. Non-duplication (of occurrences): the degree to which there is a one-to-one correlation between records and the real-world objects or events being represented.
7. Derivation Integrity: the correctness with which two or more pieces of data are combined to create new data.
8. Accessibility: the characteristic of being able to access data on demand.
9. Timeliness: the relative availability of data to support a given process within the timetable required to perform the process.

1.1 Data Warehouse System
The term Data Warehouse was coined by Bill Inmon in 1990, who defined it in the following way: "A warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data in support of management's decision making process." Ralph Kimball provided a much simpler definition: a data warehouse is "a copy of transaction data specifically structured for query and analysis".

Fig 1: Data warehouse Architecture

1.2 Layers of the Data Warehouse Liable to Data Quality Issues
This paper addresses the analysis of data quality aspects in all the layers of data warehouse systems. The stages are:
• Data Sources
• Data Staging
• Multidimensional modeling design

Data quality can be compromised depending on how data is received, entered, integrated, maintained, processed (extracted, transformed and cleansed) and loaded. Data gets into the DWH environment through many courses of action, the majority of which will affect data quality. The most common include:
• Poor data handling procedures and processes.
• Failure to stick to data entry procedures.
• Errors in the migration process from one system to another.

In spite of all efforts, there still exists a certain percentage of dirty data. The bad data must be reported, and the reasons for it confirmed, while doing data cleaning. The hypothesis assumed here is that data quality factors can arise at any layer of the data warehouse, viz. in the data sources, the staging area, and database modeling (DM). The following framework lays out the stages of a data warehouse which are exposed to data quality factors.

Fig 2: Stages of data warehouse liable to data quality factors

2. Methodology
Data quality assurance is a complex problem that requires a systematic approach. English [3] proposes a comprehensive Total Quality data Management Methodology, which consists of steps for measuring and improving information quality, together with an umbrella process for bringing about the cultural and environmental changes needed to sustain information quality improvement as a management tool and a habit:

Step 1: Assess Data Definition & Information Architecture Quality
Step 2: Assess Information Quality
Step 3: Measure Non-quality Information Costs
Step 4: Reengineer and Cleanse Data
Step 5: Improve Information Process Quality
Step 6: Establish the Information Quality Environment

2.1 Data Quality Steps
The primary goal of a data quality solution is to assemble data from one or more data sources. However, the process of bringing data together usually results in a broad range of data quality issues that need to be addressed. For instance, incomplete or missing customer profile information may be uncovered, such as blank phone numbers or addresses. Unstructured data may not look like the required data, or certain data may simply be incorrect, such as a record of a customer indicating that he/she lives in the city of Wisconsin, in the state of Green Bay. Fig-3 describes the six tasks of data quality in a data warehouse, applied between the source systems/ETL process, the staging area, and the target system (DWH) [2].

Fig 3: Data Quality Steps in Data warehouse System (Profiling, Cleansing, Standardization, Matching, Enrichment, Monitoring)

1. Profiling
As the first line of defense for your data integration solution, profiling data helps you examine whether your existing data sources meet the quality standards of your solution. Data profiling becomes even more critical when working with raw data sources that do not have referential integrity or quality controls. Properly profiling your data saves execution time because you identify issues that require immediate attention from the start and avoid the unnecessary processing of unacceptable data sources. There are several data profiling tasks: column statistics, value distribution, and pattern distribution.
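These profiling tasks can be illustrated with a small sketch. The rows, field names, and the expected phone pattern below are invented for illustration and are not taken from the paper's tooling; real profiling tools operate on whole tables rather than in-memory lists.

```python
# Minimal data-profiling sketch: column statistics, value distribution,
# and pattern distribution over a list of row dictionaries (stdlib only).
import re
from collections import Counter

rows = [
    {"customer_id": "101", "state": "WI", "phone": "414-555-0101"},
    {"customer_id": "102", "state": "WI", "phone": "41x-555-0102"},  # bad pattern
    {"customer_id": "103", "state": "",   "phone": "608-555-0103"},  # missing value
]

def column_statistics(rows, column):
    """Report row count, missing values, and distinct values for one column."""
    values = [r[column] for r in rows]
    non_empty = [v for v in values if v != ""]
    return {
        "rows": len(values),
        "missing": len(values) - len(non_empty),
        "distinct": len(set(non_empty)),
    }

def value_distribution(rows, column):
    """Count occurrences of each value; rare values may be outliers."""
    return Counter(r[column] for r in rows if r[column] != "")

def pattern_distribution(rows, column, pattern):
    """Return the values that do not match the expected regular expression."""
    return [r[column] for r in rows if not re.fullmatch(pattern, r[column])]

print(column_statistics(rows, "state"))   # {'rows': 3, 'missing': 1, 'distinct': 1}
print(value_distribution(rows, "state"))  # Counter({'WI': 2})
print(pattern_distribution(rows, "phone", r"\d{3}-\d{3}-\d{4}"))  # ['41x-555-0102']
```

Profiling reports like these are what drive the decision, later in the pipeline, of which cleansing rules are worth writing.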

• Column Statistics: identifies problems in your data, such as invalid dates, and reports average, minimum, and maximum statistics for numeric columns.
• Value Distribution: identifies all the values in each selected column and reports normal and outlier values in a column.
• Pattern Distribution: identifies invalid strings or irregular expressions in your data.

These tasks analyze individual and multiple columns to determine the relationships between columns and tables. The purpose of these data profiling tasks is to develop a clearer picture of the content of your data [2].

2. Cleansing
After a data set successfully meets profiling standards, it still requires data cleansing and deduplication to ensure that all business rules are properly met. Successful data cleansing requires the use of flexible, efficient techniques capable of handling the complex quality issues hidden in the depths of large data sets. This phase is designed to identify, correct, and standardize patterns of data across various data sets, including tables, columns, and rows.

3. Standardization
This technique parses and restructures data into a common format to help build more consistent data. For instance, the process can standardize addresses to a desired format, or to USPS® specifications. Typical standardization tasks include:
• Address Verification: verify U.S. and Canadian addresses to the highest level of accuracy, the physical delivery point, using DPV® and LACSLink®, which are now mandatory for CASS Certified™ processing and postal discounts. Also append lat/long, time zone, and county, which are needed to enable CASS Certified processing.
• Email Validation: validate, correct, and clean up email addresses using three levels of verification: syntax, local database, and MX lookup. Check for general format syntax errors and improper email formats for common domains (i.e. AOL, Hotmail, Yahoo), validate the domain against a database of good and bad addresses, verify that the domain name exists through a MaileXchange (MX) lookup, and parse email addresses into their various components.
• Phone Validation: fill in missing area codes, and update and correct the area code/prefix.
• Name Parsing and Gendering: parse full names into components and determine the gender of the first name.
• Geocoding: add latitude/longitude coordinates to the postal codes of an address.
• Residential Business Delivery Indicator: identify the delivery type as residential or business.

4. Matching
Data matching consolidates data records into identifiable groups and links/merges related records within or across data sets. This process locates matches in any combination of over 35 different components, from common ones like address, name, city, state, ZIP, and phone, to other not-so-common elements like email address, gender, and social security number. The process provides a better understanding of your customer data because it reveals buyer behavior and loyalty potential [2].

5. Enrichment
Data enrichment enhances the value of customer data by attaching additional pieces of data from other sources, including geocoding, demographic data, full-name parsing and genderizing, phone number verification, and email validation.

6. Monitoring
This real-time monitoring phase puts automated processes into place to detect when data exceeds pre-set limits. Data monitoring is designed to help organizations immediately recognize and correct issues before the quality of data declines. This approach also empowers businesses to enforce data governance and compliance measures [2].
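The standardization and matching steps can be sketched as follows. The records and normalization rules are invented for illustration; real tools rely on USPS/CASS reference databases and far richer match keys than this.

```python
# Hedged sketch of standardization (parse into a common format) followed by
# matching (group records that share a standardized match key).
import re
from collections import defaultdict

records = [
    {"id": 1, "name": "  John  SMITH ", "phone": "(414) 555-0101"},
    {"id": 2, "name": "john smith",     "phone": "414.555.0101"},
    {"id": 3, "name": "Jane Doe",       "phone": "608-555-0199"},
]

def standardize(record):
    """Restructure fields into one common format."""
    name = " ".join(record["name"].split()).title()  # collapse spaces, fix case
    phone = re.sub(r"\D", "", record["phone"])       # keep digits only
    return {"id": record["id"], "name": name, "phone": phone}

def match_groups(records):
    """Group record ids sharing a match key (standardized name + phone)."""
    groups = defaultdict(list)
    for r in map(standardize, records):
        groups[(r["name"], r["phone"])].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

print(match_groups(records))  # [[1, 2]]  -- records 1 and 2 are duplicates
```

Standardizing before matching is the essential ordering here: without a common format, records 1 and 2 would never produce the same key.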

2.2 A Case of Data Quality Issues at Source
An approach to resolving source data quality issues will be adopted in the design phase, which will decide whether data cleansing rules are required or whether the data will be migrated unmodified. The following types of data quality issues can be considered: issues related to source system data such as customer details, addresses, etc. Data anomalies will be presented, and Table-1 outlines the identification of data quality issues and the approach to managing them during the data analysis phase.

A step-by-step reconciliation, carried out for all data conversions, will be maintained in the form of a balance sheet, ensuring the authenticity of the quality and quantity of the data migrated at each stage. Verification programs will be introduced at the extract and upload levels to capture the count of records. This keeps migration activities such as cleansing, extraction, and loading under control, and is also instrumental in keeping the migration on track.

3. Analysis of Data Quality Factors
To help the quality analyst find the root causes of data quality factors, we need to design tools which address those factors. It is highly recommended to understand the common data quality factors first; this analysis will help the data warehouse and data quality practices in organizations.

3.1 Data Quality Factors at Data Sources
A foremost cause of data warehouse and business intelligence project failures is wrong or poor-quality data. Ultimately, the data in a data warehouse is loaded from various sources. The probable sources shown in Fig-4 (flat files, OLTP systems, legacy data, XML documents, database objects, ERP, CRM, CSV files, ODS, and other databases) each have their own data formats for storage; some are compatible and some are not. Part of the data comes from text files, part from MS Excel files, and some of the data comes over a direct ODBC connection to the source database [16]. The multiple source systems will have different kinds of quality issues: a source system with bad data contains typo errors driven by humans, and wrong data updates lead to malicious data. Because the combination of different files from different source systems can introduce data quality factors at any stage, Table-1 describes the feasible causes of data quality factors originating at the source systems of a data warehouse.

Fig 4: Probable Data Sources for a Data warehouse

1. Combination of different files from heterogeneous source systems leads to data quality factors.
2. Sudden changes in source systems cause data quality problems.
3. Usage of unmanaged applications and databases as data sources for the data warehouse.
4. Lack of knowledge of the differences between data sources leads to data quality factors.
5. Deficient validation programs at the source systems.
6. Inability to manage aged data [4].
7. Changeable timeliness of data sources [6][7].
8. Multiple data sources generate semantic heterogeneity, which leads to data quality issues.
9. As time and proximity from the source increase, the chances of getting correct data decrease [4].
10. Missing values in data sources [2][11][12].
11. Misspelled data [11][12].
12. Measurement errors [11].
13. Presence of outliers.
14. Ambiguous data present in source systems [7].
15. Redundant data in different source systems [7][11].
16. No meaningful data stored in source systems as required by the business [7].
17. Important entities, attributes, and relationships hidden and floating in text fields [6][7].
18. Multi-purpose fields present in data sources.
19. System fields designed to allow free-form entry (fields without adequate length).
20. Missing columns (e.g. the middle name of a person is needed but a column for it does not exist) [6][7].
21. Additional columns [6][11].
22. Different data types for similar columns (a customer ID stored as a number in one table and as a string in another).
23. Use of the same column name in different source system tables [6].
24. Selecting the wrong columns as primary keys in the source system.
25. Improper relationships between tables in database designs.
26. Lack of physical design structure while planning the entire database system [4].
27. Wrong domain values for attributes [6].
28. Deficient domain-level validation of source data.
29. Lack of data quality assurance on individual data.
30. Different data formats in source systems (e.g. the month of the year stored as Jan, Ja., or January in separate columns) [10].
31. Incompatible data formatting [11].
32. Use of different representation formats in data sources.
33. Different encoding formats (ASCII, EBCDIC, ...) [6][7].
34. Special characters which conflict with the data domains [11].
35. Incompatible use of special characters in various sources [6][7].
36. Not handling null characters properly in CSV source files results in wrong data.
37. Incorrect number of field splitters in source files.
38. Wrong data mapping.
39. Data and metadata mismatch.
40. Failure to update all replicas of data causes data quality problems.
41. Failure to have consistent data updates at regular intervals.
42. Non-compliance of data in data sources with standards.

Table 1: Data Quality Factors at Source

All the factors presented in Table 1 relate to the source systems, which are the input to the data warehouse; one should concentrate on the source systems before moving to the target systems. English [3] specifies how the data in source systems needs to be populated, and one of the primary causes of data migration projects overrunning on time is a deficient understanding of the data sources prior to moving data into the data warehouse [17]. According to the Gartner Research Group, one of the main barriers in existing data warehouse systems concerns the existence of inconsistent data. Our analysis has presented a much larger number of data population factors.

The data sources considered are based on the survey conducted by Larry P [15] in the report titled "Evaluating ETL and Data Integration Platforms". Fig-5 shows the types of source systems used for extraction, with the percentage each is used for populating data warehouses [10]. For the analysis done here, we mainly focus on data quality factors in the following types of source systems: a) legacy systems, b) OLTP/operational systems, c) flat/delimited files, and d) mainframe machines. The analysis is limited to non-multimedia data only (no images, video, or audio).
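Many of the Table 1 factors, such as missing values, wrong data types for similar columns, and wrong domain values, can be caught by simple validation programs at the source. The rules and rows in this sketch are invented for illustration; a real validation program would load its rules from the business metadata.

```python
# Hedged sketch of a source-system validation program: each rule is a
# predicate over a row, and violations are collected per row.

def validate_row(row, rules):
    """Return the names of the rules this row violates."""
    return [name for name, check in rules.items() if not check(row)]

rules = {
    "customer_id_is_numeric": lambda r: str(r["customer_id"]).isdigit(),
    "month_in_domain": lambda r: r["month"] in {"Jan", "Feb", "Mar", "Apr"},
    "name_present": lambda r: bool(r["name"].strip()),
}

rows = [
    {"customer_id": "101", "month": "Jan", "name": "A. Rao"},
    {"customer_id": "A12", "month": "Ja.", "name": ""},   # three violations
]

for row in rows:
    print(row["customer_id"], validate_row(row, rules))
# 101 []
# A12 ['customer_id_is_numeric', 'month_in_domain', 'name_present']
```

Running such checks at extraction time, rather than after loading, is what the paper's argument about incorporating quality into the ETL process itself amounts to in practice.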

Fig 5: Types of Data Sources Extracted

3.2 Data Quality Factors at the Staging Layer
One concern is where the data needs to be cleansed, i.e. in the source systems, in the staging area, or in the data warehouse [18]. In the staging area we carry out most of the activities, such as data profiling, data cleansing, data matching, and data flushing with reference to the source systems, and this is where the maximum effort is spent. Data cleansing is usually carried out in the staging layer in order to keep the dimension and fact tables consistent, so that the accuracy of the data warehouse is improved. There are several reasons for data quality factors in this layer; we have recognized the following from our analysis, as shown in Table-2.

1. The architecture of the data warehouse influences data quality.
2. The database used as the staging layer also contributes to data quality.
3. The loading strategy selected (delta/incremental, bulk, refresh) leads to data quality issues.
4. No proper restoring of the staging area leads to data quality factors.
5. Mishandling null values while transforming data results in data quality issues.
6. The various parsing/business rules of the source system contribute to data quality.
7. Data cleansing without established rules leads to poor data quality.
8. No proper data conversion logic, migration, and reconciliation [24].
9. Incapability to resume the ETL process from breakpoints without losing data [14].
10. Reconciliation problems may occur due to data staging area clean-up.
11. Improper referential integrity constraints in the staging layer lead to bad data and bad relationships [11].
12. The tools used for Extract, Transform and Load (ETL) do not generate a consolidated metadata log.
13. Wrong use of audit columns (e.g. create date, generated date, updated date) while performing ETL.
14. Inefficient use of SCD logic in the ETL process.
15. Failure to capture only the changes in the source files [24].
16. Lack of business rule formation leads to data quality problems.
17. No proper extracts from the data sources within the required response time leads to data quality factors.
18. Failure to have a centralized metadata repository leads to poor data quality.
19. Faulty execution of the slowly changing dimension / change data capture (CDC) plan in the ETL stage leads to huge data quality issues.
20. Unacceptable data mapping causes data quality issues.
21. No standard naming conventions followed while creating mappings and workflows/tasks.
22. Misinterpretation of change requests at the ETL stage.

Table 2: Data Quality Factors at Staging

3.3 Data Quality Factors at the Dimensional Modeling (Target) Stage
The quality of the information depends on three things: (1) the quality of the data itself, (2) the quality of the application programs, and (3) the quality of the data model [19]. The data warehouse design affects data quality, so we should put extra effort into designing the model; a defective model design forces errors on data quality.
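Several of the staging-layer factors concern slowly changing dimension (SCD) and change data capture logic. A minimal Type 2 SCD update, with hypothetical column names and not drawn from the paper, can be sketched as: close the current row for a key and append a new versioned row whenever the attributes change.

```python
# Hedged sketch of a Type 2 slowly-changing-dimension update: history is
# preserved by expiring the current row and inserting the changed version.

def scd2_apply(dimension, key, new_attrs, load_date):
    """Expire the current row for `key` and insert the changed version."""
    for row in dimension:
        if row["key"] == key and row["current"]:
            if row["attrs"] == new_attrs:
                return dimension                 # no change to capture
            row["current"] = False               # expire the old version
            row["end_date"] = load_date
    dimension.append({"key": key, "attrs": new_attrs,
                      "start_date": load_date, "end_date": None, "current": True})
    return dimension

dim = [{"key": 7, "attrs": {"city": "Dharwad"},
        "start_date": "2010-01-01", "end_date": None, "current": True}]
scd2_apply(dim, 7, {"city": "Chennai"}, "2011-01-15")
print(len(dim), dim[0]["current"], dim[1]["attrs"])
# 2 False {'city': 'Chennai'}
```

Faulty versions of exactly this logic (forgetting to expire the old row, or expiring it without inserting the new one) are the kind of CDC errors that factor 19 in Table 2 warns about.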

Table 3 shows the origin of data quality factors at the dimensional model (target system):
1. Misinterpretations of requirements.
2. Bad data model design contributes to data quality factors.
3. The choice of dimensional modeling schema (STAR, SNOWFLAKE, FACT CONSTELLATION) contributes to data quality.
4. Appending dimensions leads to data quality issues.
5. Delay in identifying SCDs leads to data quality issues.
6. Improper usage of multidimensional objects and their relationships leads to data quality factors.
7. Lack of database design support contributes to data quality factors.
8. Hierarchies lead to data quality issues.

Table 3: Data Quality Problems at Target (DM)

4. Conclusions
We provided an analysis of the data quality factors in each stage of a data warehouse system, i.e. the data sources, the staging area, and the target system (multidimensional schema), as emphasized in the discussions above and in Tables 1, 2 and 3: that is, through the entire life cycle of the data warehouse. We further outlined the major steps for data quality in data warehouse systems. We expect this paper will help modellers and designers of warehouses to analyze and implement quality warehouse and business intelligence applications.

5. Future Work
So far only a little research has appeared on data quality problems, and we see several topics deserving further research. First of all, more work is needed on the design and implementation of the best tool for finding data quality subjects in data warehouse systems; we can then implement a standardized tool to handle all the data quality factors at each stage of the data warehouse system.

6. Acknowledgements
This paper was prepared through exhaustive discussions and T-cons with Subject Matter Experts (SMEs), data warehouse groups, and data quality experts of various organizations in India and abroad. The authors gratefully acknowledge the time spent in these discussions by Mr. Parswanath, Project Manager (Data Warehouse Wing), Wipro Technologies, India; Mr. Govardhan, Architect, IBM India Pvt Ltd; Mr. Arun Kumar, Data Architect, KPIT Cummins India; and Mr. A. Shahzad, SME, CSC USA.

References
[1] Symbiotic Cycles of Data Profiling, Integration, and Quality (poster). TDWI, 25-26 Oct. 2006.
[2] Six Steps to Managing Data Quality with SQL Server Integration Services. MelissaData, Inc.: www.MelissaData.com, p. 10.
[3] English, L. P. (1999). Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits. John Wiley and Sons, New York.
[4] Fields, K. T., Sami, H., and Sumners, G. E. (1986). Quantification of the auditor's evaluation of internal control in data base systems. The Journal of Information Systems, 1(1), 24-77.
[5] Firth, C. (1996). Data quality in practice: experience from the frontline. Paper presented to the Conference of Information Quality.
[6] Gupta, Vivek R. (Senior Consultant). "An Introduction to Data Warehousing". White paper, System Services Corporation.
[7] Potts, William J. (1997). From Data to Business Advantage: Data Mining, the SEMMA Methodology and the SAS® System. Cary, North Carolina: SAS Institute Inc.
[8] SAS Institute Inc. (1998). Data Mining Using SAS Enterprise Miner Software. SAS Institute White Paper. Cary, NC: SAS Institute Inc.
[9] Manek, Parul (Program Manager). Microsoft® CRM Data Migration Framework. White paper, published April 2003.
[10] Han, Jaiwei, and Kamber, Michelinne. "Data Mining: Concepts and Techniques."
[11] Data Migration Best Practices. NetApp Global Services, January 2006.
[12] Leitheiser, Robert L. Data Quality in Health Care Data Warehouse Environments. University of Wisconsin, Whitepaper.
[13] Badri, M. A., Davis, Donald, and Davis, Donna (1995). A study of measuring the critical factors of quality management. International Journal of Quality and Reliability Management, 12(2), 36-53.

[14] Birkett, W. P. (1986). Professional specialization in accounting IV: management accounting. Australian Accountant, p. 78.
[15] English, Larry P. (1999). Improving Data Warehouse and Business Information Quality. Wiley & Sons, New York.
[16] Miles, M. B., & Huberman, A. M. (1994). Qualitative Data Analysis: A Source Book of New Methods. Sage Publications, Thousand Oaks, California.
[17] Hipp, Jochen, Güntzer, Ulrich, and Grimmer, Udo (2001). Data Quality Mining: Making a Virtue of Necessity. In Proc. of the 6th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2001), Santa Barbara, California, pp. 52-57.
[18] Jeusfeld, M. A., Quix, C., and Jarke, M. (1998). Design and Analysis of Quality Information for Data Warehouses. In Proc. of the 17th International Conference on the Entity Relationship Approach (ER'98), Singapore, pp. 217-232.
[19] Wang, R. Y. (1998). A Product Perspective on Total Data Quality Management. Communications of the ACM, 41(2), pp. 58-65.
[20] Grimmer, Udo, and Hinrichs, Holger (2001). A Methodological Approach to Data Quality Management Supported by Data Mining. In Proceedings of the 6th International Conference on Information Quality (IQ 2001).
[21] Wang, R. Y. (1998). A product perspective on total data quality management. Communications of the ACM, 41(2).
[22] Chaudhuri, S., and Dayal, U. (1997). An Overview of Data Warehousing and OLAP Technology. SIGMOD Record, 26(1), pp. 65-74.
[23] Vandermay, John. Considerations for Building a Real-time Data Warehouse. DataMirror Corporation.
[24] Schroeck, Michael J. "E-Analytics: The Next Generation of Data Warehousing." DM Review, August 2000. <http://www.dmreview.com/…cfm?NavID=55&EdID=2551>
[25] White paper on Data Quality Strategy: A Step-by-Step Approach. SAP Labs.
[26] Inmon, W. H. "What is a Data Warehouse?" Prism, Volume 1, Number 1, 1995.

Authors Profile

Manjunath T.N received his Bachelor's degree from Siddaganga Institute of Technology, Tumkur (Bangalore University) in 1996 and his M.Tech in Systems Analysis and Computer Applications from Karnataka Regional Engineering College, Surathkal (NITK) in 2000. He is currently pursuing a Ph.D. at Bharathiar University, Coimbatore. He has around 14 years of professional experience, spanning the software industry and teaching, and has visited various universities overseas as an SME. His areas of interest are Data Warehouse & Business Intelligence, multimedia, and databases. He has published and presented papers in journals and at international and national conferences.

Dr. Ravindra S Hegadi received his Master of Computer Applications (MCA) and M.Phil degrees, and his Ph.D. in computer science from Gulbarga University in 2007. He has 15 years of experience. His areas of interest are Image Mining, Image Processing, databases, and business intelligence. He has published and presented papers in journals and at international and national conferences.

Ravikumar G.K received his Bachelor's degree in Computer Science and Engineering from SJC Institute of Technology, Chickballapur, Karnataka, in 2001 and his M.Tech in Computer Science and Engineering from Jawaharlal Nehru National College of Engineering, Shimoga, Karnataka, in 2004. He is currently pursuing a Ph.D. at Dr. MGR University, Chennai. He has a total of 10 years of industry and teaching experience. His areas of interest are Data Warehouse & Business Intelligence, multimedia, and databases. He has published and presented papers in journals and at international and national conferences.