Question 1:
(i) What do you understand by Business Intelligence System? What are the different steps in order to deliver the Business Value through a BI System?
(ii) Discuss the characteristics of a Data warehouse and analyze how the development of a Data warehouse helps you in managing various functions in your organization.

Answer:
(i) Business Intelligence (BI) is a generic term for leveraging an organization's internal and external data and information to make the best possible business decisions. The field of Business Intelligence is very diverse and comprises the tools and technologies used to access and analyze various types of business information. These tools gather and store the data and allow the user to view and analyze the information along a wide variety of dimensions, and thereby help decision-makers make better business decisions.

In simple terms, Business Intelligence is an environment in which business users receive reliable, consistent, meaningful and timely information. This information enables business users to conduct analyses that yield an overall understanding of how the business has been, how it is now and how it will be in the near future. BI tools also monitor the financial and operational health of the organization by generating various types of reports, alerts, alarms, key performance indicators and dashboards. Some examples of BI systems include Decision Support Systems (DSS), Executive Information Systems (EIS), Multidimensional Analysis Software or OLAP (On-Line Analytical Processing) tools, and Data Mining tools.

The manager of a BI system has to take care of the following steps in order to deliver the intended business value:

Step 1: Ensuring strong business partnership
Developing strong business sponsorship is the first step in starting a BI project.
The business sponsors take a lead role in determining the purpose, content and priorities of the system, so they are expected to be visionary, resourceful and reasonable.
Step 2: Defining organizational-level business requirements
The long-term goal of a BI system is to build an organization-wide information infrastructure. This cannot be done unless the BI system development team understands the business requirements at an organizational level. The process of understanding the organizational-level business requirements includes the following steps:
Establishing the initial project scope
Interviewing the BI system stakeholders
Gathering the organizational-level business requirements
Preparing an overall requirements document
Step 3: Prioritizing the business requirements
Prioritization takes place in a planning meeting that involves the BI system development team, the business sponsors and other key senior managers across the organization. A prioritization grid can be developed that plots each business process identified in the previous step by its feasibility against the business value it is likely to generate.

Step 4: Planning the Business Intelligence project
Once it fully understands the business priorities, the BI system development team revisits the project plan. The plan is now based on the priority of the business processes established in the previous step.

Step 5: Defining the project-level business requirements
Based on the previous steps, the BI system development team defines and documents the project-level business requirements. These requirements act as guidelines while developing the BI system.

(ii) Characteristics of a Data Warehouse: According to Bill Inmon, who is considered to be the father of data warehousing, the data in a data warehouse has the following characteristics:
Subject Oriented: The first feature of a DW is its orientation towards the major subjects of the organisation instead of applications. The subjects are categorized in such a way that the subject-wise collection of information helps in decision-making.
Integrated: The data contained within the boundaries of the warehouse is integrated. This means that all inconsistencies in naming conventions and value representations need to be removed in a data warehouse.
Time Variant: The data stored in a data warehouse is not just the current data. It is time-series data, because the data warehouse is a place where data is accumulated periodically. This is in contrast to an operational system, where the data in the databases is accurate as of the moment of access.
Non-Volatility of the Data: The data in the data warehouse is non-volatile, which means the data is stored in a read-only format and does not change over time. This is why the data in a data warehouse serves as a single source for all decision support processing.

Building a data warehouse helps managers make timely decisions in various functional areas, as described below.

Marketing:
To determine real-time product sales in order to make strategic pricing and distribution decisions.
To analyze the history of products to conclude the success or failure of a product’s attributes.
To determine the successful products and evaluate the key success factors.
To understand the revenue impact of a specific decision item.
To identify the right customer segments based on past records.
To understand the performance of an individual salesperson.
Finance:
To compare the budget allocations and actual costs for a specific area on a weekly, monthly or annual basis.
To prepare future estimates of profitability.
To review past cash flow trends and forecast them for future periods.
To monitor a set of key financial indicators and ratios.

Human Resources:
To evaluate the trends in a specific employee benefit program.
To monitor the performance of an individual or a set of individuals.
To calculate the Return on Investment for specific resources.
To review the compliance levels for regulated activities.

Question 2:
(i) What do you understand by Data Warehouse Meta Data? What is the use of Metadata? How can you manage Metadata?
(ii) What do you understand by ETL? What are the significances of ETL processes? What are the ETL requirements and steps?

Answer:
(i) In simple terms, ‘metadata’ refers to “data about data”. It is the information that describes or supplements the main data. In general, there are two distinct classes of metadata, structural metadata and guide metadata:
Structural metadata is used to describe the structure of computer systems, such as tables, columns and indexes.
Guide metadata is used to help people find specific items and is usually expressed as a set of keywords in a natural language.
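The difference between the two classes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the `customer` table, its columns, the keyword list and the `find_tables` helper are all hypothetical. The structural metadata is read from SQLite with `PRAGMA table_info`, while the guide metadata is a hand-written set of keywords.

```python
import sqlite3

# Hypothetical table, created only so that there is something to describe.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "city TEXT)"
)

# Structural metadata: the database itself reports each column's
# name, declared type and constraints.
structural = [
    {
        "column": name,
        "type": col_type,
        "not_null": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(customer)"
    )
]

# Guide metadata: natural-language keywords (assumed here) that help
# people find the table without knowing its technical name.
guide = {"customer": ["client", "buyer", "account holder"]}


def find_tables(keyword):
    """Return the tables whose guide keywords include the search term."""
    return [table for table, words in guide.items() if keyword.lower() in words]
```

A call such as `find_tables("buyer")` looks a table up by its guide keywords, whereas `structural` describes how the same table is built.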
Data Warehouse Metadata: In general, data warehouse metadata systems are divided into two sections:
Back room metadata, which is used by the Extract, Transform and Load functions to get OLTP data into a data warehouse.
Front room metadata, which is used to label screens and create reports.
Kimball, a renowned author in the area of business intelligence and data warehousing, lists the following types of metadata in a typical data warehouse:
Source system metadata: source specifications, such as repositories and source schemas; source descriptive information, such as ownership descriptions, update frequencies, legal limitations and access methods; and process information, such as job schedules and extraction code.
Data staging metadata: data acquisition information, such as data transmission scheduling and results, and file usage; dimension table management, such as definitions of dimensions and surrogate key assignments; transformation and aggregation, such as data enhancement and mapping, DBMS load scripts and aggregate definitions; and audit, job logs and documentation, such as data lineage records and data transform logs.
DBMS metadata, such as DBMS system table contents.
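A few of these metadata types can be pictured as simple records. The Python sketch below is an assumption-laden illustration, not Kimball's design: the class names, fields and sample values (`billing_db`, `CUST-0042` and so on) are hypothetical. It shows source descriptive information, surrogate key assignments for a dimension, and a job log entry.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSystem:
    """Source system metadata: descriptive information about one source."""
    name: str
    owner: str
    update_frequency: str
    access_method: str


@dataclass
class DimensionKeys:
    """Dimension table management: surrogate key assignments for one dimension."""
    dimension: str
    assignments: dict = field(default_factory=dict)

    def key_for(self, natural_key):
        # Assign the next integer the first time a natural key is seen;
        # afterwards always return the same surrogate key.
        if natural_key not in self.assignments:
            self.assignments[natural_key] = len(self.assignments) + 1
        return self.assignments[natural_key]


@dataclass
class JobLogEntry:
    """Audit metadata: one entry of a load job log."""
    job: str
    source: str
    rows_read: int
    rows_loaded: int

    @property
    def rows_rejected(self):
        return self.rows_read - self.rows_loaded


# Hypothetical sample values.
billing = SourceSystem("billing_db", "Finance IT", "daily", "ODBC")
customer_keys = DimensionKeys("customer")
first = customer_keys.key_for("CUST-0042")
second = customer_keys.key_for("CUST-0107")
repeat = customer_keys.key_for("CUST-0042")
log = JobLogEntry("load_customer_dim", billing.name, rows_read=250, rows_loaded=248)
```

Keeping the key assignments and the job log as data, rather than burying them in load scripts, is what makes it possible to answer lineage questions later.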
Use of Metadata: The applications of metadata are discussed below:
Metadata provides additional information to users of the data it describes, and the information can be either descriptive or algorithmic.
Metadata speeds up and enriches searching for resources. Search queries that use metadata save users from performing more complex filter operations manually. Also, web browsers, P2P applications and media management software automatically download and locally cache metadata to improve the speed at which files can be accessed and searched.
Metadata plays a vital role on the WWW in finding useful information among the large amount of information available.
Metadata is an important part of electronic discovery. Application and file system metadata derived from electronic documents and files acts as important evidence.
Some metadata is intended to enable variable content presentation. For example, if a picture has metadata that indicates its most important region, the user can narrow the picture to that region and thus obtain the details required.
Metadata can also be used to automate workflows. For example, if a software tool knows the content and structure of data, it can convert the data automatically and pass it to another tool as input, so that users need not perform copy-and-paste operations.
Metadata helps to bridge the semantic gap by explaining how computer data items are related and how these relations can be evaluated automatically.
Managing the Metadata: To successfully develop and use metadata, we need to understand the following important issues that should be treated with care:
We need to keep track of all the metadata created, even in the early phases of planning and designing. It is not economical to start attaching metadata once the production process has been completed.
Metadata must adapt if the resource it describes changes, and it should be merged when two resources are merged. It can also be useful to keep metadata after the resource it describes has been removed.
Metadata can be stored either internally or externally. Internal storage allows metadata to be transferred together with the data it describes, but this method creates high redundancy and does not allow metadata to be held together in one place. External storage allows metadata to be bundled, for example in a database, for more efficient searching. There is no redundancy, and metadata can be transferred simultaneously when using streaming.
Storing metadata in a human-readable format can be useful because users can understand and edit it without specialized tools, but these formats are not optimized for storage capacity. It may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.

(ii) ETL: Data extraction is the first step in the execution of the ETL (Extraction, Transformation and Loading) functions that build a data warehouse. Data can be extracted from an OLTP database and from non-OLTP systems, such as text files, legacy systems and spreadsheets. The data extraction process is complex because of the tremendous diversity that exists among source systems in practice. It is the ETL functions that reshape the relevant data from the source systems into useful information to be stored in the data warehouse. Without these functions there would be no strategic information in a data warehouse.

Significance of ETL Processes: The ETL functions act as the back-end processes that cover the extraction of the data from the source systems. They also include all the functions and procedures for changing the source data into the exact formats and structures appropriate for storage in the data warehouse database. After the transformation of the data, they include all the processes that physically move the data into the data warehouse repository. The ETL functions can take as much as 50-70% of the total effort of building a data warehouse. To extract the data, we have to know the time window during each day in which data can be extracted from a specific source system without impacting the usage of that system. We also need to determine the mechanism for capturing the changes in the data in each of the relevant systems. Apart from the ETL functions, building a data warehouse includes functions like data integration, data summarization and metadata updating.

ETL Requirements and Steps: Ideally, the execution of the ETL functions goes through the following steps:
1. To determine the target data of the data warehouse
2. To identify the internal and external data sources
3. To map the sources to the target data elements
4. To establish comprehensive data extraction rules
5. To prepare data transformation and cleansing
6. To plan for aggregate tables
7. To organise the data staging area and test tools
8. To write the procedures for all data loads
9. To execute ETL functions for dimension tables
10. To execute ETL functions for fact tables
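The steps above can be sketched as a very small ETL run in Python. This is a minimal sketch under stated assumptions, not a production design: the CSV extract, the cleansing rule, the table names `customer_dim` and `sales_fact`, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Assumed source extract: a CSV export from an operational system.
SOURCE_CSV = """order_id,customer,amount,order_date
1001, alice ,120.50,2024-01-05
1002,BOB,80.00,2024-01-05
1003,alice,,2024-01-06
"""


def extract(text):
    """Extraction: read the raw rows from the source file."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation and cleansing: standardize names, convert types,
    and reject rows that have no amount."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed cleansing rule: skip incomplete rows
        clean.append(
            (
                int(row["order_id"]),
                row["customer"].strip().title(),
                float(row["amount"]),
                row["order_date"],
            )
        )
    return clean


def load(conn, rows):
    """Loading: fill the dimension table first, then the fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_dim ("
        "customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        "order_id INTEGER PRIMARY KEY, customer_key INTEGER, "
        "amount REAL, order_date TEXT)"
    )
    for order_id, name, amount, order_date in rows:
        # Dimension row: the surrogate key is generated by the database.
        conn.execute("INSERT OR IGNORE INTO customer_dim (name) VALUES (?)", (name,))
        (customer_key,) = conn.execute(
            "SELECT customer_key FROM customer_dim WHERE name = ?", (name,)
        ).fetchone()
        # Fact row: refers to the dimension through the surrogate key.
        conn.execute(
            "INSERT OR REPLACE INTO sales_fact VALUES (?, ?, ?, ?)",
            (order_id, customer_key, amount, order_date),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
```

Keeping extraction, transformation and loading in separate functions mirrors the separation of the steps listed above and lets each stage be tested on its own.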
Question 3:
(i) Describe briefly the Data Transformation process. What are the major types of transformations? Describe them briefly.
(ii) What do you understand by EIS? What are the significances of EIS? Briefly describe the benefits of EIS.

Answer:
(i) Data Transformation Process: The extracted data is raw data and cannot be loaded directly into a data warehouse. The underlying principle of the data warehouse is to provide useful information for strategic decision-making, and the data in the operational source systems cannot fulfil this purpose as it stands. So the transformation and loading functions play a key role in the preparation of the data. Because the data comes from various source systems, it has to be transformed to common standards, and we also need to ensure that the combined data does not violate the business rules.

Irrespective of the complexity of the source systems and regardless of the extent of the data warehouse, some of the basic functions performed in data transformation are as follows:
Selection and Splitting/Joining: This is the basic task performed at the beginning of the entire data transformation process. In this task, we may select either whole records or parts of several records from the source systems. The splitting/joining task covers the data manipulation we need to perform on the selected records of the source systems. We can either split the selected parts further or join the parts selected from many source systems.
Summing Up: This task is used when it is not necessary to keep data at the lowest level of detail in the data warehouse.
Conversion: This task involves a large variety of rudimentary conversions of single fields. It is done for two reasons: to standardize the data extracted from disparate source systems, and to make the fields usable and understandable to the users.
Enrichment: This task involves the rearrangement and simplification of individual fields to make them useful for the data warehouse environment.

The major transformation types are listed below:
Format Revisions: Format revisions include changes to the data types and lengths of individual fields.
Decoding of Fields: With multiple source systems, the same data items are bound to be described by a plethora of field values. This type of transformation decodes them into consistent values.
Calculated and Derived Values: In this type of transformation, both calculated and derived data values are maintained in the data warehouse.
Splitting of Single Fields: In this type of transformation, larger single fields are split into their components for improved understanding and better analysis.
Merging of Information: This type of transformation merges information available in various source systems into a single entity.
Summing Up: In this type of transformation, summaries are created and loaded into the data warehouse instead of the most granular level of data.
Character Set Conversion: In this type of transformation, character sets are converted into an agreed standard character set for textual data in the data warehouse.
Conversion of Units of Measurements: In this type of transformation, the metrics are converted so that the numbers are all in one standard unit of measurement.
Key Restructuring: In this type of transformation, keys with built-in meanings are avoided when choosing the keys for data warehouse database tables, and such keys are transformed into generic keys that are generated by the system itself.
Deduplication: In this type of transformation, all duplicates in the source systems are linked to a single record in the data warehouse.

(ii) Definition of an EIS (Executive Information System): In simple terms, an EIS can be defined as a computer-based system intended to facilitate and support the information and decision-making needs of the senior executives of an enterprise by providing easy access to both internal and external information relevant to meeting the strategic goals of the organization. These systems act as organization-wide Decision Support Systems that help top-level executives analyze, compare and highlight the trends and patterns of important variables.

Significance of EIS: An EIS provides summarized or detailed strategic information at the convenience of the senior executives of an organization. It does so by constantly monitoring internal and external trends. The tools offered by an EIS are programmed to provide canned reports or briefing books to top-level executives.
Today these tools allow querying against a multi-dimensional database, and most offer analytical applications along functional lines such as sales or financial analysis. The application of an EIS is no longer limited to typical corporate hierarchies; it also extends to personal computers on a local area network. These systems now cross computer hardware platforms and integrate information stored on mainframes, personal computer systems and minicomputers. This arrangement enables all users to customise their access to the appropriate company data and provides relevant information to both upper and lower levels in companies.

Benefits of an EIS: The advantages that an EIS brings to the organisation are:
Provides tools to select, extract, filter and track the critical information of the organisation in an organized manner.
Enables top-level executives to use the system with ease.
Provides timely delivery of an organization-wide summary of information, highlighting the major deviations wherever they arise.
Provides a wide range of reports, including status reports, trend analyses, drill-down investigations and ad-hoc queries.
Presents the information in graphical, tabular and/or text formats.