Use the Cloudera Manager Wizard for Software Installation and Configuration
Use Cloudera Manager to search for the cluster hosts that will run CDH and managed services. Enter the public IP addresses of your EC2 instances.
You can find these IP addresses in the AWS Console, under the EC2 service, in the Instances section.
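If you prefer to collect the addresses from a script rather than the console, the following is a minimal sketch. It assumes the boto3 library is installed and AWS credentials are configured, neither of which this tutorial sets up. The function names are illustrative, and joining the addresses with commas is an assumption about what the host search box accepts.

```python
import ipaddress


def public_ips(describe_instances_response):
    """Collect public IPv4 addresses from an EC2 DescribeInstances response."""
    ips = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            ip = instance.get("PublicIpAddress")
            if ip:  # instances without a public address are skipped
                ips.append(ip)
    return ips


def fetch_running_public_ips(region_name):
    """Query EC2 for running instances (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    # A single call is enough for a small cluster; large accounts may need pagination.
    response = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return public_ips(response)


def host_search_string(ips):
    """Validate each address and join them into one string for the search box."""
    # ipaddress.ip_address raises ValueError on a malformed address.
    validated = [str(ipaddress.ip_address(ip.strip())) for ip in ips]
    return ", ".join(validated)
```

For example, `host_search_string(fetch_running_public_ips("us-east-1"))` would produce a single string to paste into the wizard; the region name here is only a placeholder.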
Once you have entered the IP addresses, click "Search" so that Cloudera Manager can identify the hosts in your cluster that are ready to be configured with CDH services. When the search finishes, you should see a screen similar to the following:
After the hosts have been identified, you can select the installation method. Select "Use Parcels" if it is not already selected and click on "Continue".
On the next screen, make sure you select "Install Oracle Java SE Development Kit (JDK)". You do not need to select "Install Java Unlimited Strength Encryption Policy Files". Click on "Continue".
On the next screen, click on "Continue".
On the next screen, provide Cloudera Manager with SSH credentials. Select "Another user" and type in "ubuntu" (the same login you use to SSH to your EC2 instance). For the Authentication method, select "All hosts accept same private key", and for the Private Key File, select your EC2 private key (Windows users: use the .pem file, not the .ppk one).
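If you are unsure which of your key files is the .pem one, a quick check of the first line of the file can tell them apart. The sketch below is illustrative only: the function name is hypothetical, and it relies on the usual conventions that PEM private keys start with a "-----BEGIN ... PRIVATE KEY-----" line and PuTTY keys start with "PuTTY-User-Key-File-".

```python
def key_file_format(path):
    """Return 'pem', 'ppk', or 'unknown' based on the first line of the file."""
    with open(path, "r", errors="replace") as handle:
        first_line = handle.readline().strip()
    if first_line.startswith("PuTTY-User-Key-File-"):
        return "ppk"  # PuTTY format, not the file to upload here
    if first_line.startswith("-----BEGIN") and "PRIVATE KEY" in first_line:
        return "pem"  # the format to select in the wizard
    return "unknown"
```

Calling `key_file_format("my-key.pem")` (a placeholder file name) should return `"pem"` for the key you downloaded from AWS.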
On the next screen, the Cloudera Manager agents will be downloaded and installed on all hosts. Wait for the installation to succeed on all instances (all green). Once done, click on "Continue".
Then the parcels that you selected earlier will be installed. Once done, click on "Continue".
The Host Inspector will now validate the installation and provide a summary of the results. Click on "Finish".
On the first page of the Add Services wizard, choose the combination of services to install. For example, you can select "Custom Services" and choose the following services: HBase, HDFS, Hive, Hue, Impala, Oozie, Spark, Sqoop2, YARN.
Click on "Continue".
The next screen lets you customize the role assignments for each host in your cluster. In our case, make sure that the master roles (such as "NameNode") and the "Gateway" roles are all assigned to the instance that we launched as t2.large.
If the IP addresses shown in the CDH installer for the master, NameNode, or gateway roles do not belong to the t2.large instance, change them so that they point to the t2.large instance.
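The check described above can be written down as a small checklist helper. This is only a sketch with made-up data: the function name, the role list, and the IP addresses are hypothetical, and you would fill in the assignments by hand from what the wizard displays.

```python
def misplaced_roles(assignments, master_host, master_roles):
    """Return the master roles that are not assigned to the master host.

    assignments maps a role name to the host shown for it in the wizard.
    A role missing from assignments is reported as misplaced too.
    """
    return sorted(
        role for role in master_roles
        if assignments.get(role) != master_host
    )


# Hypothetical example: 203.0.113.10 stands in for the t2.large instance.
assignments = {
    "NameNode": "203.0.113.10",
    "ResourceManager": "203.0.113.11",
    "Gateway": "203.0.113.10",
}
to_fix = misplaced_roles(
    assignments, "203.0.113.10", ["NameNode", "ResourceManager", "Gateway"]
)
print(to_fix)
```

Any role listed in the result is one to reassign to the t2.large instance before continuing.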
The View By Host should look like this:
Click on "Continue".
On the next screen, keep the default setting of "Use Embedded Database" to have Cloudera Manager create and configure the required databases. Record the auto-generated passwords.
Click "Test Connection". When all the tests are successful or skipped, the "Continue" button turns blue. Click "Continue".
On the next screen, you can apply configuration changes for your cluster. You can keep all the default options and click "Continue".
The wizard starts a First Run of the services. When all of the services are started, click "Continue".
You will see a success message indicating that your cluster has been successfully started:
Your Hadoop cluster has now been successfully installed!
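To confirm from a script that every service is running, you could query the Cloudera Manager REST API. The sketch below makes several assumptions that this tutorial does not confirm: that the API is reachable at a base URL you supply (for example `http://<manager-host>:7180`), that you know the API version string your release supports and the cluster name, and that the service list response contains an `items` array with `name` and `serviceState` fields. The function names are illustrative.

```python
import base64
import json
import urllib.parse
import urllib.request


def services_not_started(service_list):
    """Return the names of services whose serviceState is not STARTED."""
    return [
        svc.get("name", "<unnamed>")
        for svc in service_list.get("items", [])
        if svc.get("serviceState") != "STARTED"
    ]


def fetch_services(base_url, api_version, cluster_name, user, password):
    """Fetch the service list for one cluster using HTTP basic authentication."""
    # Cluster names may contain spaces, so the name is URL-encoded.
    cluster = urllib.parse.quote(cluster_name)
    url = f"{base_url}/api/{api_version}/clusters/{cluster}/services"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

An empty list from `services_not_started(fetch_services(...))` would indicate that all services report as started; anything else names the services to inspect in the Cloudera Manager UI.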