Cloudera cloud platform: Hadoop support with fully automated, web-based installation on clusters of up to 50 nodes.
[ 2011/7/1 21:40:00 | By: 梦翔儿 ]
 

The Cloudera CDH3 cloud service and management platform supports Hadoop and now offers fully automated, web-based installation on clusters of up to 50 nodes.

Installing CDH3 with Service and Configuration Manager Express Edition

Contents

  • Introduction
    • About the SCM Express Installation Program
    • About the SCM Express Wizard
  • Supported Operating Systems for SCM Express
    • Other Requirements
  • Downloading and Running the SCM Express Installer
  • Starting the SCM Admin Console
  • Using SCM Express for Initial CDH3 Installation and Configuration
  • Testing the Cluster
      • Running a Sample Job
  • Generating a Client Configuration
  • Other SCM Express Documentation
  • Getting Support
    • Cloudera Support
    • Community Support
  • Uninstalling SCM Express

Introduction

Service and Configuration Manager Express Edition (SCM Express) automates the installation and configuration of CDH3 on an entire cluster (up to 50 nodes), requiring only that you have root SSH access to your cluster's machines. SCM Express consists of:

  • A small self-executing SCM Express installation program to install the SCM Server and other packages in preparation for cluster host installation
  • SCM Express wizard for automating CDH3 installation and configuration on the cluster hosts
  • SCM Admin Console for configuring the cluster after installation is completed

About the SCM Express Installation Program

The SCM Express installation program, which you run on the host where you want the SCM Server to run, automatically:

  • Installs the package repositories for SCM and the Java Runtime Environment (JRE)
  • Installs the JRE if it's not already installed
  • Installs the SCM Server
  • Installs and configures an embedded PostgreSQL database

About the SCM Express Wizard

After you have installed the SCM Server, the first time you run it you can use the SCM Express wizard to automatically do the following on the cluster hosts:

  • Using SSH, discover the cluster hosts you specify via IP address ranges or hostnames
  • Configure the package repositories for SCM, CDH3 and the JRE
  • Install the SCM Agent and CDH3 (including Hue) on the cluster hosts
  • Install the JRE if it's not already installed on the cluster hosts
  • Determine mapping of services to host
  • Suggest a Hadoop configuration and start the Hadoop services

You can abort the SCM Agent and CDH3 installation process at any time. The SCM Express wizard then automatically rolls back the installation on any hosts where it has not yet completed. Components that are already installed are not uninstalled during an abort.

Note
The SCM Express license allows you to manage CDH3 on a maximum of 50 hosts. SCM Express also does not support using SCM to configure and manage Hadoop security with Kerberos.
Important
To use SCM Express, the SCM Server must have SSH access to the cluster hosts and you must log in using a root account or an account that has password-less sudo permission. For authentication during the following procedure, you will need to either enter the password or upload a public and private key pair for the root or sudo user account. If you want to use a public and private key pair, the public key must be installed on the cluster hosts before you use SCM Express. Authentication is not supported for accounts that have password-protected sudo permission.

For more overview, architecture, and instructional information, see the User Guide for SCM Express.

Supported Operating Systems for SCM Express

SCM Express supports the following operating systems:

  • For Red Hat systems, Cloudera provides 64-bit packages for Red Hat Enterprise Linux 5 and CentOS 5, and 64-bit packages for Red Hat Enterprise Linux 6. We recommend using update 5 or later of Red Hat Enterprise Linux 5.
    Note
    For Red Hat Enterprise Linux 6, make sure the RHEL Server Supplementary channel is activated. For instructions, see JDK Installation on Red Hat 6 systems.
  • For SUSE systems, Cloudera provides 64-bit packages for SUSE Linux Enterprise Server 11. Service Pack 1 or later is required. Also, the SUSE Linux Enterprise Software Development Kit 11 SP1 is required on cluster hosts running the SCM Agents (not required on the SCM Server host); you can download the SDK here.
Note
For Hadoop to work properly, the same version of the same operating system must be installed on all cluster hosts.

Other Requirements

  • SSH access to cluster hosts on port 22 from SCM Server host.
  • All hosts must have access to standard distro RPM repositories.
  • All hosts must have access to the archive.cloudera.com site on the Internet to allow SCM to download Cloudera packages and repositories.
  • Cluster hosts must have DNS and reverse DNS properly configured.
  • No blocking by firewalls; make sure port 7180 is open because it is the port used to access the SCM Admin Console after installation.
  • No blocking by Security-Enhanced Linux (SELinux)
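
The network and SELinux requirements above can be spot-checked from the SCM Server host before running the installer. The following is only a sketch and is not part of SCM Express: the function names are made up for illustration, and it assumes that nc (a build that supports -z and -w), getent, and awk are available on the host.

```shell
# Succeeds when SELinux will not block the installer.
# $1: output of the getenforce command (Enforcing, Permissive, or Disabled)
selinux_mode_ok() {
    case $1 in
        Permissive|Disabled) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds when TCP port $2 on host $1 accepts a connection within 5 seconds.
# Assumption: the installed nc supports the -z (scan only) and -w (timeout) options.
port_open() {
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# Succeeds when name $1 resolves to an address and that address resolves back.
dns_ok() {
    addr=$(getent hosts "$1" | awk 'NR == 1 { print $1 }')
    [ -n "$addr" ] && getent hosts "$addr" >/dev/null
}

# Print one line per failed requirement for cluster host $1.
preflight_host() {
    host=$1
    port_open "$host" 22 || echo "$host: SSH port 22 not reachable"
    dns_ok "$host" || echo "$host: forward or reverse DNS lookup failed"
}
```

For example, running preflight_host host1.company.com prints a line for each failed check and prints nothing when both checks pass. Calling selinux_mode_ok "$(getenforce)" on a host succeeds when SELinux is not enforcing.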

Downloading and Running the SCM Express Installer

Important
If SCM Enterprise Edition has been previously installed on your cluster, you must completely uninstall all components of it before using SCM Express.

To download and run the SCM Express installer:

  1. Download scm-installer.bin from the Cloudera Downloads page to the host where you want to install the SCM Server. This host must be on your cluster or accessible to your cluster over your network.

  2. After downloading scm-installer.bin, make it executable.
    $ chmod u+x scm-installer.bin 
    
  3. Run scm-installer.bin.
    $ sudo ./scm-installer.bin 
    
  4. Read the SCM Express Readme and then click Next.

  5. Read the Cloudera SCM Express License and then click Next. Click Yes to confirm you accept the license.

  6. Read the Oracle Binary Code License Agreement and then click Next. Click Yes to confirm you accept the Oracle Binary Code License Agreement.

    The SCM Express installer begins installing the CDH3 and SCM repo files and then installs the packages. The installer also installs the SCM Server.

    Note
    If an error message "Failed to start server" appears while running scm-installer.bin, exit the installation program. If the SCM Server log file /var/log/cloudera-scm-server/cloudera-scm-server.log contains the following message, then it's likely you have SELinux enabled:
    Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
            at java.net.URLClassLoader$1.run(Unknown Source)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(Unknown Source)
            at java.lang.ClassLoader.loadClass(Unknown Source)
            ...
    

    You can disable SELinux by running the following command on the SCM Server host:

    $ sudo setenforce 0
    

    To disable it permanently, edit /etc/selinux/config.

  7. Click Close.

  8. Click Finish to quit the installer program.

  9. The next step is to use the SCM Admin Console to install and configure CDH3 on your cluster hosts. For instructions, start with the next section Starting the SCM Admin Console.
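
The note in step 6 says to edit /etc/selinux/config to make the SELinux change permanent. As a hedged sketch of that edit: the file's SELINUX= line accepts enforcing, permissive, or disabled, and the change takes effect at the next boot. The function name below is hypothetical, and the script rewrites only that one line.

```shell
# Rewrite the SELINUX= line of an SELinux config file.
# $1: path to the config file (normally /etc/selinux/config; needs root to write)
# $2: desired mode: enforcing, permissive, or disabled
set_selinux_mode() {
    config=$1
    mode=$2
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # ^SELINUX= does not match the separate SELINUXTYPE= line.
    # Writing back with cat keeps the original file's owner and permissions.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp" && cat "$tmp" > "$config"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Run as root, set_selinux_mode /etc/selinux/config permissive leaves SELinux logging violations without enforcing them after the next reboot, which matches what setenforce 0 does for the current session.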

Starting the SCM Admin Console

The SCM Admin Console enables you to use SCM to configure Hadoop on your cluster. In this release, the SCM Admin Console supports the following browsers:

  • Internet Explorer 8 and 9
  • Google Chrome 10
  • Safari 5
  • Firefox 3.6 and 4

To start the SCM Admin Console:

  1. In a web browser, type the following URL:
    http(s)://<Server host>:<port>
    

    where:
    <Server host> is the name or IP address of the host machine where the SCM Server is installed.
    <port> is the port configured for the SCM Server. The default port is 7180.

    For example, if you are on the host where the SCM Server is installed, enter the following URL:

    http://localhost:7180/
    

    The SCM Admin Console appears.



  2. Log into the SCM Admin Console. The default credentials are:

    Username: admin (you cannot change the username)
    Password: admin (you can change the password using the SCM Admin Console after you run the wizard in the next section)
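
Before opening a browser, you may want to confirm that the SCM Server is answering on its port. This is a minimal sketch that assumes curl is installed. The function names are illustrative and not part of SCM, and it only tries plain http on the default port 7180 unless you pass another port.

```shell
# Print the Admin Console URL for host $1 and optional port $2 (default 7180).
scm_console_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-7180}"
}

# Succeeds when the console answers with a 2xx or 3xx HTTP status within 5 seconds.
console_up() {
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$(scm_console_url "$@")")
    case $code in
        2*|3*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, console_up localhost succeeds once the server is up. If it keeps failing, check the firewall rule for port 7180 and the SCM Server log mentioned earlier.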

Using SCM Express for Initial CDH3 Installation and Configuration

The following instructions show you how to use the SCM Express wizard to do an initial installation and configuration. The wizard helps you to install and set up Cloudera packages across your cluster (up to 50 nodes in SCM Express) and will:

  • Find the cluster hosts you specify via hostname and IP-address ranges
  • Connect to each host with SSH to install the SCM Agent and CDH3 (including Hue)
  • Install the Java Runtime Environment (JRE) on the cluster hosts (if not already installed)
  • Configure Hadoop automatically and start the Hadoop services

To use SCM Express:

  1. The first time you start the SCM Admin Console, the following wizard starts up. It is only displayed the first time you start up. Click Continue to get started.



  2. Fill in the registration form, choose whether you want to receive product information from Cloudera and whether you agree to send anonymous SCM usage information to Cloudera, and then click Submit Registration. Alternatively, click Skip Registration to skip this screen.



  3. To enable SCM Express to automatically discover your cluster hosts where you want to install CDH3, enter the cluster hostnames or IP addresses. You can also specify hostname and IP address ranges:

    For example:
    Use this Expansion Range       To Specify these Hosts
    10.1.1.[1-4]                   10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
    host[1-3].company.com          host1.company.com, host2.company.com, host3.company.com
    host[07-10].company.com        host07.company.com, host08.company.com, host09.company.com, host10.company.com


    You can specify multiple addresses and address ranges by separating them by commas, semicolons, tabs, or blank spaces, or by placing them on separate lines. Use this technique to make more specific searches instead of searching overly wide ranges.

    The scan results will include all addresses scanned, but only scans that reach hosts running SSH will be selected for inclusion in your cluster by default.

    Note
    If you don't know the IP addresses of all of the hosts, you can enter an address range that spans over unused addresses and then deselect the hosts that do not exist (and are not discovered) later in this procedure. However, keep in mind that wider ranges will require more time to scan.
  4. Click Find Hosts.

    SCM identifies the hosts on your cluster to allow you to configure them for CDH3. If there are a large number of hosts on your cluster, wait a few moments to allow them to be discovered and shown in the wizard. If the search is taking too long, you can stop the scan by clicking Abort Scan. To find additional hosts, add their host name or IP address and click Find Hosts again.

    Note
    SCM Express scans hosts by checking to see if they have SSH port 22 open. If there are some hosts where you want to install CDH3 that are not shown in the list, make sure you have network connectivity between the SCM Server host and those hosts. Common causes of loss of connectivity are firewalls and interference from SELinux.
  5. Verify that the number of hosts shown matches the number of hosts where you want to install CDH3. Deselect host entries that do not exist and deselect the hosts where you do not want to install CDH3. Click Continue.
    Note
    The SCM Express license limits the maximum selected number of hosts to 50.
  6. To authenticate with the hosts, you must either use a root account that is on all of your cluster hosts, or use an account that has password-less sudo permissions. Select root or enter the user name for an account that has password-less sudo permissions.



  7. You can either use a shared password for the account, or use a public and private key pair.
    Note
    In this release, Cloudera has tested OpenSSH-style key pairs. Other key pairs (such as PuTTY-generated pairs) may not work.

    To enter a password, click All hosts accept same password and enter the account password.

    To use a public and private key pair, click All hosts accept same public key. Specify or browse for the location of the public and private keys. If your keys contain a passphrase, enter it.



  8. To begin installing CDH3 and the SCM Agent on the cluster hosts, click Start Installation.

    The SCM Express wizard uses SSH to access the cluster hosts and follows a sequence of steps to download and install CDH3 and the SCM Agent. The wizard configures package repositories, installs the JRE (if necessary), CDH3, and the SCM Agent, and then starts the SCM Agent. The wizard runs a maximum of 10 installations in parallel to avoid excessive network load. The status of installation on each host is displayed in the following screen. You can also click the Details link for individual hosts to view detailed information about the installation and error messages if installation fails on any hosts.



    Note
    If you click the Abort Installation button while installation is in progress, it will halt any pending or in-progress installations and roll back any in-progress installations to a clean state. The Abort Installation button does not affect host installations that already completed successfully or already failed.

    If installation fails on a host as shown below, you can click the Retry link next to the failed host to try installation on that host again. To retry installation on all failed hosts, click Retry Failed Hosts at the bottom of the screen.



  9. When the Continue button appears at the bottom of the screen, the installation process is completed.



    If the installation completes successfully on some hosts but fails on other hosts as shown below, you can click Continue if you want to skip installation on the failed hosts and continue to the next screen to start configuring CDH3 on the successful hosts.



  10. In the following screen, choose the Hadoop services you want to start. You can choose one of the three combinations; the combinations take into account the dependencies between the Hadoop services.



    After you select the services and click Continue, the wizard automatically evaluates the hardware configurations of the cluster hosts to determine the best machines for each role. For example, the wizard assigns the NameNode role to the machine that best meets the NameNode requirements. The assignments are based on the size of the cluster and the physical characteristics of each machine, such as the number of CPUs, amount of RAM, and disk space.

  11. In the following screen, review the configuration settings that the wizard set automatically based on the hardware of the cluster hosts. The wizard tunes some options to match each host's hardware, such as the number of map and reduce slots for each TaskTracker. You can leave the settings as they are for now and change them later if necessary in the SCM Admin Console.

    You should, however, confirm the settings entered for file system paths, such as the NameNode Data Directory and the DataNode Data Directory.



  12. Click Save and Continue.

    The wizard starts the services on your cluster.

  13. When all of the services are started, click Continue on the following screen.



  14. In the final screen, you can read instructions for generating a client configuration to allow users to work with the HDFS, MapReduce, HBase, or other services you created. The procedure is also documented below in the Generating a Client Configuration section.



  15. Click Continue. The following SCM Admin Console appears.
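
The expansion ranges accepted in step 3 of this procedure can be previewed in a shell before you type them into the wizard. The function below is an illustrative re-implementation based only on the three examples in the table, not SCM's own parser. It handles a single [start-end] range per pattern and keeps the zero padding of the start value.

```shell
# Remove leading zeros so the shell does not read values such as 08 as octal.
strip_zeros() {
    n=$1
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    printf '%s' "$n"
}

# Print one host per line for a pattern such as host[07-10].company.com.
# Patterns without a [start-end] range are printed unchanged.
expand_host_range() {
    pattern=$1
    case $pattern in
        *\[*-*\]*) ;;
        *) printf '%s\n' "$pattern"; return 0 ;;
    esac
    prefix=${pattern%%\[*}      # text before the opening bracket
    rest=${pattern#*\[}         # text after the opening bracket
    range=${rest%%\]*}          # start-end
    suffix=${rest#*\]}          # text after the closing bracket
    start=${range%%-*}
    end=${range#*-}
    width=${#start}             # pad every number to the width of the start value
    i=$(strip_zeros "$start")
    last=$(strip_zeros "$end")
    while [ "$i" -le "$last" ]; do
        printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}
```

For example, expand_host_range 'host[07-10].company.com' prints host07.company.com through host10.company.com, one per line. Quote the pattern so the shell does not treat the brackets as a glob.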

Testing the Cluster

You can test your cluster by starting a Hue session.

  1. Click the Hue Service hue1 link in the navigation list on the left side of the SCM Admin Console.



  2. Click the Instances link at the top of the window.



  3. Click the HUE_SERVER-1 link.



  4. Click the status link for the hue/hue.sh process.



  5. Specify hdfs as the initial super user name and enter any password to log into Hue.
    Note
    By specifying hdfs as the initial Hue user, you can later use the hdfs account to create other user accounts and their home directories, which allows them to run jobs in Hue. Use the hdfs account to test your cluster in the next section and to run jobs until you are satisfied your Hadoop cluster is working correctly. After that, you can set up additional user accounts and their home directories to use the cluster.



Running a Sample Job

You can run a sample job in Hue to test your cluster.

  1. Click the Job Designer icon at the bottom of the Hue window to start the Job Designer application.

  2. Click Install Samples and then click OK.

  3. Click Jar and enter a name for the job.

  4. At the bottom of the Job Designer window, choose Submit upon save, click Save, and then click OK.

    The job should run and when it is completed, the status in the upper right corner of the job window will display finished. You just ran your first job on your cluster.

Generating a Client Configuration

To allow Hadoop users to work with the HDFS, MapReduce, and HBase services you created, you can create a zip file that contains all of the relevant configuration files with the settings from your service. You can then distribute the client configuration files to the users of a service. MapReduce configuration files are the type most people want because they implicitly include the HDFS configuration. The global client configuration also contains files for HBase services.

The zip file contains the following configuration files:

  • core-site.xml
  • hadoop-env.sh
  • hdfs-site.xml
  • log4j.properties
  • ssl-client.xml.example

To generate a client configuration zip file:

  1. In the SCM Admin Console, navigate to the service instance for which you want to generate client configuration files. For example, click the HDFS Service hdfs1 link in the navigation list on the left side of the SCM Admin Console. You can also generate a global configuration zip file for all services by clicking the Services link in the SCM Admin Console navigation tree.

  2. Click the Generate Client Configuration command button.

  3. If you are generating a file for an individual service, click Generate Client Configuration that appears in the next screen to confirm.

  4. Click Download Result Data to download the client configuration zip file.

  5. Save the client configuration zip file locally.

  6. Distribute the client configuration zip file to the users who need to use the HDFS service on your cluster.

  7. Tell the users to unpack the client configuration zip file and set their HADOOP_CONF_DIR environment variable to the path where they unpacked the zip file. Depending on your Hadoop cluster's configuration, users may also need to modify some of the settings in the generated configuration files.

  8. You can save the MapReduce export to /etc/hadoop/conf.scm, and then update alternatives according to the instructions on this page.
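
On the client side, step 7 amounts to unpacking the zip file and pointing HADOOP_CONF_DIR at the result. A small sketch of a sanity check users could run first, assuming the zip contains the five files listed above. The function name, the zip file name, and the target directory are placeholders, and the actual archive may unpack into a subdirectory.

```shell
# Files the guide lists as the contents of the client configuration zip.
CLIENT_CONF_FILES="core-site.xml hadoop-env.sh hdfs-site.xml log4j.properties ssl-client.xml.example"

# Succeeds when directory $1 contains every expected configuration file;
# otherwise reports each missing file on stderr and fails.
check_client_conf() {
    dir=$1
    missing=0
    for f in $CLIENT_CONF_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return $missing
}

# Example (the zip name and target directory are placeholders):
#   unzip client-config.zip -d "$HOME/hadoop-conf"
#   check_client_conf "$HOME/hadoop-conf" && export HADOOP_CONF_DIR="$HOME/hadoop-conf"
```

Exporting HADOOP_CONF_DIR only after the check passes avoids pointing Hadoop clients at a partially unpacked directory.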

You can now use the SCM Admin Console to stop and start the Hadoop services, change configurations, and add new services as necessary. You can also view status and error logs.

Other SCM Express Documentation

For more information about:

  • Using Service and Configuration Manager, see the User Guide for SCM Express.
  • The SCM Express Installation program and wizard, see What the SCM Express Installer and Wizard Do on a Cluster.
  • Known Issues in SCM Express, see SCM Express Edition 3.6 Release Notes.

Getting Support

Cloudera Support

Cloudera can help you install, configure, optimize, tune, and run Hadoop for large scale data processing and analysis. Cloudera supports Hadoop whether you run our distribution on servers in your own data center, or on hosted infrastructure services such as Amazon EC2, Rackspace, SoftLayer, or VMware's vCloud.

For more information, see: http://www.cloudera.com/hadoop-support

Community Support

If you have any questions or comments about installing CDH3 using Service and Configuration Manager Express, you can send a message to the SCM user's list: scm-users@cloudera.org. You can register for the SCM user's group here.

If you have any questions or comments about Cloudera's Distribution including Apache Hadoop (CDH), you can send a message to the CDH user's list: cdh-user@cloudera.org

Uninstalling SCM Express

To uninstall the SCM Server, run this command on the SCM Server host:

$ sudo /usr/share/cmf/uninstall-scm-express.sh

To uninstall SCM Express from the cluster hosts, you must manually remove the RPMs:

  1. To uninstall the SCM Agent, run this command on each host:
    On Red Hat:
    $ sudo yum remove cloudera-scm-agent
    

    On SUSE systems:

    $ sudo zypper remove cloudera-scm-agent
    
  2. To uninstall CDH3, run this command on each host:
    On Red Hat:
    $ sudo yum remove hadoop-0.20 hadoop-0.20-native hadoop-0.20-sbin hadoop-zookeeper hadoop-hive hadoop-hbase hue*
    

    On SUSE systems:

    $ sudo zypper remove hadoop-0.20 hadoop-0.20-native hadoop-0.20-sbin hadoop-zookeeper hadoop-hive hadoop-hbase hue*
    
  3. To uninstall the repository RPMs, run this command on each host:
    On Red Hat:
    $ sudo yum remove cloudera-scm-free cloudera-cdh
    

    On SUSE systems:

    $ sudo zypper remove cloudera-scm-free cloudera-cdh
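
Because the three removal steps differ only in the package manager, they can be generated from a single helper. This is a sketch under stated assumptions: the function name is hypothetical, it only prints the commands from the steps above rather than running them, and the hue* pattern is quoted so that the shell passes it to the package manager unexpanded.

```shell
# Print the three removal commands for a host of the given family.
# $1: redhat (uses yum) or suse (uses zypper)
uninstall_plan() {
    case $1 in
        redhat) tool=yum ;;
        suse)   tool=zypper ;;
        *) echo "unsupported family: $1" >&2; return 1 ;;
    esac
    echo "sudo $tool remove cloudera-scm-agent"
    echo "sudo $tool remove hadoop-0.20 hadoop-0.20-native hadoop-0.20-sbin hadoop-zookeeper hadoop-hive hadoop-hbase 'hue*'"
    echo "sudo $tool remove cloudera-scm-free cloudera-cdh"
}
```

For example, uninstall_plan suse prints the three zypper commands in order: agent first, then CDH3, then the repository RPMs. Review the output before running it on each host.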
    

https://ccp.cloudera.com/display/CDHDOC/Installing+CDH3+with+Service+and+Configuration+Manager+Express+Edition
 
 
梦翔儿's website, where dreams take flight: http://www.dreamflier.net