Tuesday, 7 July 2015


A distributed Hadoop cluster requires SSH key-based authentication between the master and slave nodes. With key-based authentication, the master node can connect to the slave or secondary nodes and start or stop their daemons without a password. For example, the master node launches the NodeManager and DataNode daemons on all the slave machines over SSH. If password-less SSH is not set up, the user has to type the password for each individual machine when starting the processes.

Here are the steps to set up password-less SSH.

Install the SSH client on the master
>> sudo apt-get install openssh-client

Install the SSH server on all the slave machines
>> sudo apt-get install openssh-server

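Generate the SSH key pair on the master. We can generate DSA or RSA keys. Both are public-key algorithms: DSA can only sign, while RSA can both sign and encrypt. OpenSSH has disabled DSA keys by default since version 7.0, so RSA is the safer choice. When prompted for a passphrase, press Enter to leave it empty; otherwise SSH will ask for the passphrase on every connection.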
>> ssh-keygen -t rsa

DSA stands for Digital Signature Algorithm
RSA stands for Rivest-Shamir-Adleman, the names of its inventors
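
For scripted setups, the key pair can also be generated without any prompts. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed. It writes the key into a throwaway temporary directory purely for illustration; on a real master node the key would normally go to ~/.ssh/id_rsa. The 4096-bit key size is a choice made for this example, not something the steps above require.

```shell
#!/bin/sh
# Sketch: generate an RSA key pair non-interactively.
# KEY_DIR is a throwaway directory used only for this illustration;
# on a real master node the key would normally live in ~/.ssh.
KEY_DIR=$(mktemp -d)
KEY_FILE="$KEY_DIR/id_rsa"

# -t rsa  : key type
# -b 4096 : key size in bits (an assumption for this example)
# -N ""   : empty passphrase, so no prompt appears when the key is used
# -f      : file for the private key (the public key gets a .pub suffix)
# -q      : suppress progress output
ssh-keygen -t rsa -b 4096 -N "" -f "$KEY_FILE" -q

# Lists the generated private and public key files
ls "$KEY_DIR"
```

The empty passphrase (-N "") is what allows the master to start the daemons unattended; the trade-off is that anyone who can read the private key file can log in to the slaves.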

Go to the .ssh directory in your home directory and list the files.
>> cd ~/.ssh
>> ls
Here we can find the id_rsa.pub (public key) and id_rsa (private key) files.

Copy the public key to all the slave machines.
>> ssh-copy-id -i id_rsa.pub username@slave-hostname

On each slave, this appends the public key to the file ~/.ssh/authorized_keys, creating the file if it does not already exist.
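
When the cluster has several slaves, the copy step can be repeated in a loop. This is a rough sketch only: the host names (slave1, slave2, slave3) and the user name (hadoop) are placeholders, not values from this setup, and DRY_RUN is a hypothetical switch added here so the commands can be reviewed before anything is sent over the network.

```shell
#!/bin/sh
# Sketch: copy the master's public key to several slaves.
# Host names and user name are placeholders; replace them with real values.
SLAVES="slave1 slave2 slave3"
REMOTE_USER="hadoop"
PUB_KEY="$HOME/.ssh/id_rsa.pub"

# DRY_RUN=1 (the default) only prints the commands.
# Set DRY_RUN=0 to actually run ssh-copy-id.
DRY_RUN="${DRY_RUN:-1}"

copy_key_to_slaves() {
  for host in $SLAVES; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "ssh-copy-id -i $PUB_KEY $REMOTE_USER@$host"
    else
      # Prompts once for the remote user's password on each slave
      ssh-copy-id -i "$PUB_KEY" "$REMOTE_USER@$host"
    fi
  done
}

copy_key_to_slaves
```

Running the script with DRY_RUN=0 still asks for each slave's password one last time, since the key is not yet installed there.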

If the master node also acts as a slave machine, append the public key to its own authorized_keys file as below:
>> cd ~/.ssh
>> cat id_rsa.pub >> authorized_keys
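
Running the cat command more than once adds the same key repeatedly, and sshd may ignore an authorized_keys file whose permissions are too open. The sketch below handles both points. The helper name add_authorized_key is hypothetical and exists only for this illustration; the 700 and 600 modes are the commonly used values for the .ssh directory and the authorized_keys file.

```shell
#!/bin/sh
# Sketch: add a public key to an authorized_keys file without duplicating it.
# add_authorized_key is a hypothetical helper name used for illustration.
add_authorized_key() {
  pub_key_file=$1
  auth_file=$2
  auth_dir=$(dirname "$auth_file")

  # Create the directory and file if needed and tighten their permissions.
  # With its default StrictModes setting, sshd may refuse keys from files
  # or directories that other users can write to.
  mkdir -p "$auth_dir"
  chmod 700 "$auth_dir"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # Append only if this exact key line is not already present.
  if ! grep -qxF -e "$(cat "$pub_key_file")" "$auth_file"; then
    cat "$pub_key_file" >> "$auth_file"
  fi
}
```

A typical call would be: add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys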

Verify the SSH connection as below. It should connect without prompting for a password.

>> ssh username@slave-hostname
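
The interactive check above can hide a problem, because ssh silently falls back to a password prompt when the key is rejected. A scripted check can rule that out. This is a minimal sketch: check_passwordless is a hypothetical helper name, and SSH_CMD is an assumed override variable that lets the function be tried without a real host; by default it runs the normal ssh client.

```shell
#!/bin/sh
# Sketch: confirm that key-based login works without a password prompt.
# check_passwordless and SSH_CMD are hypothetical names for illustration.
check_passwordless() {
  target=$1
  # BatchMode=yes    : never prompt; fail instead of asking for a password
  # ConnectTimeout=5 : give up after 5 seconds if the host is unreachable
  # "true" is the remote command; it does nothing and exits with status 0
  if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$target" true; then
    echo "OK: $target"
  else
    echo "FAILED: $target"
    return 1
  fi
}
```

A typical call would be: check_passwordless username@slave-hostname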