Amazon Redshift is an Internet hosting service and data warehouse product that forms part of the larger cloud-computing platform Amazon Web Services. It is built on technology from the massively parallel processing (MPP) data warehouse company ParAccel (later acquired by Actian) to handle large-scale data sets and database migrations. Redshift differs from Amazon's other hosted database offering, Amazon RDS, in its ability to handle analytic workloads on big data sets stored on a column-oriented DBMS principle. According to the Cloud Data Warehouse report published by Forrester in Q4 2018, Amazon Redshift has the largest number of cloud data warehouse deployments, with more than 6,500.
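
To illustrate why a column-oriented layout suits analytic workloads, the sketch below stores the same small table row-wise and column-wise and computes one aggregate over each. It is a toy model in plain Python, not Redshift's storage engine: the table contents and the names `rows`, `columns`, `total_amount_row_store` and `total_amount_column_store` are made up for illustration, and "values touched" is only a rough stand-in for I/O cost.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# It only models which values a scan has to touch; it is NOT how
# Redshift stores data internally.

# Hypothetical sales table: (order_id, region, product, amount)
rows = [
    (1, "EU", "widget", 20.0),
    (2, "US", "gadget", 35.5),
    (3, "EU", "gadget", 12.5),
    (4, "APAC", "widget", 40.0),
]

# Column-oriented layout: one list per column.
columns = {
    "order_id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "product": [r[2] for r in rows],
    "amount": [r[3] for r in rows],
}


def total_amount_row_store(table):
    """Sum `amount` from a row store; every field of every row is read."""
    total = 0.0
    touched = 0
    for row in table:
        touched += len(row)  # the whole row is fetched to reach one field
        total += row[3]
    return total, touched


def total_amount_column_store(cols):
    """Sum `amount` from a column store; only the `amount` column is read."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_touched = total_amount_row_store(rows)
col_total, col_touched = total_amount_column_store(columns)
print("row store:   ", row_total, "values touched:", row_touched)
print("column store:", col_total, "values touched:", col_touched)
```

Both layouts give the same total, but the column layout reads a quarter of the values here, and the gap widens as the table gains columns that the query does not use.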
Amazon has listed a number of business intelligence software vendors as partners and tested tools in its "APN Partner" program, including Actian, Actuate Corporation, Alteryx, Dundas Data Visualization, IBM Cognos, InetSoft, Infor, Logi Analytics, Looker, MicroStrategy, Pentaho,[8][9] Qlik, SiSense, Tableau Software, and Yellowfin. Partner companies providing data integration tools include Informatica and SnapLogic. System integration and consulting partners include Accenture, Deloitte, Capgemini and DXC Technology.

Amazon Redshift is based on an older version of PostgreSQL, 8.0.2, and Redshift has made changes to that version.[3][4] An initial preview beta was released in November 2012[5] and a full release was made available on February 15, 2013. The service can handle connections from most other applications using ODBC and JDBC.

Comparing Amazon Redshift to Traditional Data Warehouses

Traditional data warehousing techniques are designed to support programmed functionalities such as:
• Roll-up: Data is generalized by summarizing it
• Pivot: Cross tabulation (rotation) is performed
• Slice and Dice: Performing projection operations on the dimensions
• Drill-down: Revealing more details
• Selection: Information is available by value and range
• Sorting: Data is sorted by ordinal value

The core benefits of data warehousing are as follows:
• A collection of information for competitive and comparative analysis.
• High-quality information, enhancing completeness.
• Disaster recovery plans with any other data backup source.

• Optimized for Data Warehousing
Amazon Redshift uses efficient techniques and a variety of innovations to achieve very high query performance on large datasets, ranging from a hundred gigabytes to a petabyte or more. Processing an optimized query over this much data is not feasible with traditional data warehousing techniques.

• Scalable
Amazon Redshift can be scaled with a few clicks in the AWS Management Console or with a simple API call. If your organization's requirements change, you can add or remove nodes in your cloud data warehouse. Scaling a traditional data warehouse is far more complex, because it means changing the structure of the warehouse itself. DS (Dense Storage) nodes let you build very large data warehouses using HDDs (hard disk drives).

• Fully Managed
Monitoring, scaling and managing a traditional data warehouse can be challenging compared to Amazon Redshift. Redshift provides automatic data backups, upgrades, and patches, which helps you focus on what’s important: your data and your business.

• Get started in minutes
Using simple API calls or the AWS Management Console, you can create a cluster and define its size, security profile and underlying node type. The complete data warehouse for your organization can be up and running within minutes.

Data warehouses are important tools for gaining insight into your business operations and analyzing them. Traditional data warehousing approaches are available to any organization, but they can be challenging to implement, manage and analyze. Amazon Redshift offers a way to create a cloud cluster, manage and scale it, and access insights from it.
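
The cluster creation and scaling described above can be sketched with the AWS SDK for Python (boto3), whose Redshift client exposes `create_cluster` and `resize_cluster`. The helper names, the cluster identifier, the credentials and the `ds2.xlarge` node type below are placeholders chosen for illustration, and the node types actually offered change over time. The helpers only build the request arguments, so they run without AWS access; the real call is shown in a function that is not invoked here.

```python
# Sketch: building arguments for the Redshift CreateCluster and
# ResizeCluster API calls. All identifiers and values are placeholders.


def create_cluster_params(cluster_id, node_type, number_of_nodes,
                          master_username, master_password,
                          db_name="dev", security_group_ids=None):
    """Return keyword arguments for boto3's redshift create_cluster()."""
    params = {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,                # underlying node type
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "DBName": db_name,
    }
    if number_of_nodes > 1:
        params["ClusterType"] = "multi-node"
        params["NumberOfNodes"] = number_of_nodes  # cluster size
    else:
        params["ClusterType"] = "single-node"
    if security_group_ids:
        params["VpcSecurityGroupIds"] = list(security_group_ids)
    return params


def resize_cluster_params(cluster_id, number_of_nodes):
    """Return keyword arguments for boto3's redshift resize_cluster()."""
    return {"ClusterIdentifier": cluster_id, "NumberOfNodes": number_of_nodes}


def example_real_call():
    """Not called here: needs the boto3 package and AWS credentials."""
    import boto3

    client = boto3.client("redshift", region_name="us-east-1")
    client.create_cluster(**create_cluster_params(
        "demo-warehouse", "ds2.xlarge", 2, "admin_user", "ChangeMe123"))
    # Later, scale out by adding nodes:
    client.resize_cluster(**resize_cluster_params("demo-warehouse", 4))


print(create_cluster_params("demo-warehouse", "ds2.xlarge", 2,
                            "admin_user", "ChangeMe123"))
```

Keeping the argument construction separate from the network call makes the size and node-type choices easy to review before anything is provisioned or billed.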
• Transfer your data into Redshift
Setting up a pipeline that loads your data into Redshift smoothly can be quite a project, costing your organization valuable time and resources. This is especially true if you want your data replicated in near real-time, which is usually the case when tracking important business metrics. This is where FlyData comes in. FlyData provides continuous, near real-time replication into Redshift from transactional databases such as MySQL, PostgreSQL, Amazon Aurora, and more. According to FlyData, after an easy, one-time setup its system ensures 100% accuracy with each load, so your data is always up to date.
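
As a rough illustration of what near real-time replication involves, the sketch below copies only newly inserted rows from a source database to a target in small batches, tracking a high-water mark between runs. This is a generic incremental-load technique, not FlyData's actual mechanism. Both databases are in-memory SQLite stand-ins for a transactional source and a warehouse, and the `events` table and `replicate_new_rows` function are hypothetical. The sketch handles inserts only, not updates or deletes.

```python
import sqlite3


def replicate_new_rows(source, target, last_seen_id):
    """Copy rows with id > last_seen_id; return the new high-water mark."""
    rows = source.execute(
        "SELECT id, metric, value FROM events WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    if rows:
        target.executemany(
            "INSERT INTO events (id, metric, value) VALUES (?, ?, ?)", rows)
        target.commit()
        last_seen_id = rows[-1][0]  # highest id copied so far
    return last_seen_id


# In-memory stand-ins for the transactional source and the warehouse.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, metric TEXT, value REAL)")

source.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "signups", 10.0), (2, "orders", 3.0)],
)
source.commit()
mark = replicate_new_rows(source, target, 0)     # first batch: rows 1 and 2

source.execute("INSERT INTO events VALUES (?, ?, ?)", (3, "orders", 5.0))
source.commit()
mark = replicate_new_rows(source, target, mark)  # second batch: row 3 only

copied = target.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print("high-water mark:", mark, "rows in target:", copied)
```

Running the function on a short interval approximates continuous replication. The shorter the interval, the fresher the target, at the cost of more frequent queries against the source.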
Assignment No. 01
Subject: Data Warehouse
Topic: Comparison of DWH Tools
Group Members: Muhammad Haseeb Khan Hashim Shoukat Mir Abdul Wahab
Submitted To: Professor Anwar Ali