CDH3 Installation Guide u1

  • About this Guide
  • What's New in CDH3
  • What's New in CDH3 Update 1
  • CDH-wide changes for Update 1
  • Updated Components
  • What's New in CDH3 Update 0
  • CDH-wide changes
  • New Components
  • About CDH Package Manifests
  • Before You Install CDH3 on a Cluster
  • Supported Operating Systems for CDH3
  • Install the Java Development Kit
  • CDH3 Installation
  • Ways To Install CDH3
  • Before You Begin Installing CDH Manually
  • Installing CDH3 on Red Hat Systems
  • Step 1: Download the CDH3 Repository or Package
  • Download and install the CDH3 Package, or
  • To add the CDH3 repository:
  • Step 2: Install CDH3
  • Installing CDH3 on Ubuntu Systems
  • Add the CDH3 U1 repository, or
  • To download and install the CDH3 package:
  • Installing CDH3 on SUSE Systems
  • Step 2: Install CDH
  • Installing CDH3 Components
  • Additional Configuration is Required for These CDH Components
  • Viewing the Apache Hadoop Documentation
  • Upgrading to CDH3
  • Before You Begin
  • Upgrading Hive and Hue in CDH3
  • Upgrading Flume in CDH3 Update 1
  • Changes in User Accounts and Groups in CDH3 Due to Security
  • Directory Ownership in Local File System
  • Directory Ownership on HDFS
  • Upgrading to CDH3 Update 1
  • Upgrading from CDH3 Beta 3, Beta 4, or Update 0 to CDH3 Update 1
  • Upgrading from a CDH release before CDH3 Beta 3 to CDH3 Update 1
  • CDH3 Deployment in Standalone Mode
  • To deploy CDH3 in standalone mode:
  • CDH3 Deployment in Pseudo-Distributed Mode
  • To deploy CDH3 in pseudo-distributed mode:
  • CDH3 Deployment on a Cluster
  • Configuration
  • Customizing the Configuration without Using a Configuration Package
  • Configuration Files and Properties
  • Configuring Local Storage Directories for Use by HDFS and MapReduce
  • Creating and Configuring the mapred.system.dir Directory in HDFS
  • Deploying your Custom Configuration to your Entire Cluster
  • To deploy your custom configuration to your entire cluster:
  • Initial Setup
  • Format the NameNode
  • Running Services
  • Starting HDFS
  • Starting MapReduce
  • Configuring the Hadoop Daemons to Start at Boot Time
  • When SSH is and is not Used
  • Flume Installation
  • Upgrading Flume in CDH3
  • Flume Packaging
  • Flume Prerequisites
  • Installing the Flume Tarball (.tgz)
  • Installing the Flume RPM or Debian Packages
  • Starting Flume Nodes on Boot Up Automatically
  • Starting the Flume Master on Boot Up Automatically
  • Running Flume
  • Files Installed by the Flume RPM and Debian Packages
  • Viewing the Flume Documentation
  • Sqoop Installation
  • Sqoop Packaging
  • Sqoop Prerequisites
  • Installing the Sqoop RPM or Debian Packages
  • Installing the Sqoop Tarball
  • Viewing the Sqoop Documentation
  • Hue Installation
  • Upgrading Hue in CDH3
  • Upgrading Hue from CDH3 Beta 4 to CDH3 Update 1
  • Installing, Configuring, and Starting Hue on One Machine
  • Specifying the Secret Key
  • Starting Hue on One Machine
  • Installing and Configuring Hue on a Cluster
  • Installing Hue on a Cluster
  • Configuring the Hadoop Plugins for Hue
  • Restarting the Hadoop Daemons
  • Pointing Hue to Your CDH NameNode and JobTracker
  • To point Hue to your CDH NameNode and JobTracker:
  • Web Server Configuration
  • Authentication
  • Listing all Configuration Options
  • Viewing Current Configuration Settings
  • Using Multiple Files to Store Your Configuration
  • Starting and Stopping Hue
  • Hue Process Hierarchy
  • Hue Logging
  • Viewing Recent Log Messages through your Web Browser
  • The Hue Database
  • Inspecting the Hue Database
  • Backing up the Hue Database
  • Configuring Hue to Access Another Database
  • Configuring Hue to Store Data in MySQL
  • To configure Hue to store data in MySQL:
  • Installing and Configuring Hue Shell
  • Verifying Hue Shell Installation
  • Modifying the Hue Shell Configuration File
  • Unix User Accounts
  • Running the Appropriate Web Server
  • Viewing the Hue and Hue Shell Documentation
  • Pig Installation
  • Incompatible Changes as of the Pig 0.7.0 Release
  • Using Pig with HBase
  • Viewing the Pig Documentation
  • Oozie Installation
  • Oozie Packaging
  • Oozie Prerequisites
  • Installing Oozie Tarball
  • Installing Oozie RPM or Debian Packages
  • Configuring Oozie
  • Hadoop Configuration
  • Database Configuration
  • Enabling Oozie Web Console
  • Configuring Oozie with Kerberos Security
  • Installing Oozie ShareLib in Hadoop HDFS
  • Starting, Stopping, and Accessing the Oozie Server
  • Starting the Oozie Server
  • Stopping the Oozie Server
  • Accessing the Oozie Server with the Oozie Client
  • Accessing the Oozie Server with a Browser
  • Viewing the Oozie Documentation
  • Hive Installation
  • Upgrading Hive in CDH3
  • Installing Hive
  • Configuring a MySQL-based Hive Metastore
  • Hive Configuration
  • Using Hive with HBase
  • Viewing the Hive Documentation
  • HBase Installation
  • Installing HBase
  • Host Configuration Settings for HBase
  • Using DNS with HBase
  • Using the Network Time Protocol (NTP) with HBase
  • Setting User Limits for HBase
  • Using dfs.datanode.max.xcievers with HBase
  • Starting HBase in Standalone Mode
  • Installing the HBase Master for Standalone Operation
  • Starting the HBase Master
  • Accessing HBase by using the HBase Shell
  • Using MapReduce with HBase
  • Configuring HBase in Pseudo-distributed Mode
  • Modifying the HBase Configuration
  • Creating the /hbase Directory in HDFS
  • Enabling Servers for Pseudo-distributed Operation
  • Installing the HBase Thrift Server
  • Deploying HBase in a Distributed Cluster
  • Choosing where to Deploy the Processes
  • Configuring for Distributed Operation
  • Troubleshooting
  • Viewing the HBase Documentation
  • ZooKeeper Installation
  • Installing the ZooKeeper Packages
  • Installing the ZooKeeper Base Package
  • Installing the ZooKeeper Server Package on a Single Server
  • Installing ZooKeeper in a Production Environment
  • Maintaining a ZooKeeper Server
  • Viewing the ZooKeeper Documentation
  • Whirr Installation
  • Generating an SSH Key Pair
  • Defining a Whirr Cluster
  • Launching a Cluster
  • Running a Whirr Proxy
  • Running a MapReduce job
  • Destroying a cluster
  • To destroy a cluster:
  • Viewing the Whirr Documentation
  • Snappy Installation
  • Using Snappy for MapReduce Compression
  • Using Snappy for Pig Compression
  • Using Snappy for Hive Compression
  • Configuring Flume to use Snappy Compression
  • Using Snappy compression in Flume Sinks
  • Using Snappy compression in Sqoop Imports
  • Configuring HBase to use Snappy Compression
  • Viewing the Snappy Documentation
  • Mountable HDFS
  • Java Development Kit Installation
  • JDK Installation on Red Hat 5 and 6, CentOS 5, or SLES 11 Systems
  • Creating a Local Yum Repository
  • JDK Installation on Ubuntu Systems
  • Using the CDH3 Maven Repository
  • Building RPMs from CDH Source RPMs
  • Prerequisites
  • Setting up an environment for building RPMs
  • Red Hat or CentOS systems
  • SUSE systems
  • Getting Support
  • Building an RPM
  • Cloudera Support
  • Community Support
  • Apache License
  • Third-Party Licenses

For additional documentation, see the Whirr Documentation.

Snappy Installation

Snappy is a compression/decompression library. It aims for very high speeds and
reasonable compression, rather than maximum compression or compatibility with other
compression libraries.
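
As a quick illustration of where Snappy typically fits in a CDH3 deployment, the sketch below shows a mapred-site.xml fragment that compresses intermediate map output with Snappy. This is a minimal example assuming a CDH3-era Hadoop 0.20 configuration; the property names and codec class shown are the standard Hadoop ones, but verify them against the "Using Snappy for MapReduce Compression" section before applying them to a cluster.

  <!-- mapred-site.xml (sketch): compress intermediate map output with Snappy.
       Verify property names against your Hadoop version before deploying. -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>

Similar codec settings control final job output, and the subsections that follow cover the corresponding Snappy configuration for MapReduce, Pig, Hive, Flume, Sqoop, and HBase.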
