Let’s configure Hadoop with Ansible

  1. Copy the Hadoop and JDK software to the managed node
  2. Install the software
  3. Create the namenode directory
  4. Configure the hdfs-site.xml and core-site.xml files
  5. Format the namenode directory
  6. Start the Hadoop services

A brief introduction to Hadoop

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.

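Its storage layer, HDFS, is built from two kinds of node:

  • DataNode: stores the actual data blocks
  • NameNode: keeps the filesystem metadata and tracks which DataNodes hold each block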

Let’s begin

Before beginning the tasks, let’s check that we have proper connectivity with the managed nodes.
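A minimal sketch of that check as a playbook, assuming the inventory already lists the managed nodes (the `all` host pattern is used here only for illustration). The same check can be run ad hoc with `ansible all -m ping`.

```yaml
# Connectivity check: every managed node should answer with "pong".
- name: Verify connectivity to the managed nodes
  hosts: all            # assumption: target every host in the inventory
  gather_facts: false   # not needed for a simple ping
  tasks:
    - name: Ping every managed node
      ansible.builtin.ping:
```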

1. Copy the Hadoop and JDK software to the managed nodes
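This step could look roughly like the play below. It is only a sketch: the installer file names and the `/root/` destination are hypothetical placeholders, not taken from the original setup, so substitute the packages you actually downloaded.

```yaml
- name: Copy the Hadoop and JDK installers
  hosts: all
  become: true
  vars:
    # Hypothetical file names; they must exist next to the playbook
    # (or in its files/ directory) on the control node.
    jdk_rpm: jdk.rpm
    hadoop_rpm: hadoop.rpm
  tasks:
    - name: Copy each installer to /root on the managed node
      ansible.builtin.copy:
        src: "{{ item }}"
        dest: /root/
      loop:
        - "{{ jdk_rpm }}"
        - "{{ hadoop_rpm }}"
```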

2. Install the software
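A possible sketch of the installation, assuming both packages are RPM files that were copied to `/root/` (the paths are hypothetical). It calls `rpm` through the `command` module, with `--replacepkgs` so that a second run does not fail on an already installed package. These tasks always report "changed"; a package module may be a better fit if your packages install cleanly through it.

```yaml
- name: Install the JDK and Hadoop
  hosts: all
  become: true
  vars:
    # Hypothetical paths to the installers on the managed node.
    jdk_rpm: /root/jdk.rpm
    hadoop_rpm: /root/hadoop.rpm
  tasks:
    - name: Install the JDK first, since Hadoop needs Java
      ansible.builtin.command: "rpm -i --replacepkgs {{ jdk_rpm }}"

    - name: Install Hadoop
      ansible.builtin.command: "rpm -i --replacepkgs {{ hadoop_rpm }}"
```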

3. Create the namenode directory
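A minimal sketch using the `file` module. The path `/nn` is a hypothetical choice; any directory works as long as the same path is used later in hdfs-site.xml.

```yaml
- name: Create the NameNode directory
  hosts: all
  become: true
  vars:
    namenode_dir: /nn   # hypothetical path for the NameNode metadata
  tasks:
    - name: Ensure the directory exists
      ansible.builtin.file:
        path: "{{ namenode_dir }}"
        state: directory
        mode: "0755"
```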

4. Configure the hdfs-site.xml and core-site.xml files

core-site.xml
hdfs-site.xml
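One way to write both files is the `copy` module with inline content, as sketched below. Everything in `vars` is an assumption for illustration: the configuration directory, the NameNode directory, and the bind address and port. The property names `fs.default.name` and `dfs.name.dir` are the Hadoop 1.x names; newer releases prefer `fs.defaultFS` and `dfs.namenode.name.dir`.

```yaml
- name: Configure HDFS on the NameNode
  hosts: all
  become: true
  vars:
    # All four values are assumptions; adjust them to your installation.
    hadoop_conf_dir: /etc/hadoop
    namenode_dir: /nn
    namenode_host: 0.0.0.0
    namenode_port: 9001
  tasks:
    - name: Write core-site.xml (address the NameNode listens on)
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/core-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_host }}:{{ namenode_port }}</value>
            </property>
          </configuration>

    - name: Write hdfs-site.xml (where the NameNode keeps its metadata)
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/hdfs-site.xml"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>{{ namenode_dir }}</value>
            </property>
          </configuration>
```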

5. Format the namenode directory
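A sketch of the format step, assuming the `hadoop` command is on the PATH and the NameNode directory is `/nn` (a hypothetical path). Formatting creates a `current/VERSION` file inside that directory, so the `creates` argument lets Ansible skip the task on later runs instead of reformatting an existing filesystem. On newer Hadoop releases the equivalent command is `hdfs namenode -format`.

```yaml
- name: Format the NameNode
  hosts: all
  become: true
  vars:
    namenode_dir: /nn   # hypothetical; must match dfs.name.dir
  tasks:
    - name: Format the NameNode directory (skipped once it has been formatted)
      ansible.builtin.command:
        cmd: hadoop namenode -format
        creates: "{{ namenode_dir }}/current/VERSION"
```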

6. Start the Hadoop services
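A sketch of starting the NameNode daemon, assuming `hadoop-daemon.sh` and the JDK's `jps` tool are both on the PATH of the managed node. The `jps` check is there because starting a daemon that is already running can fail, so the start task runs only when no NameNode process is listed.

```yaml
- name: Start the Hadoop services
  hosts: all
  become: true
  tasks:
    - name: List running Java processes
      ansible.builtin.command: jps
      register: jps_result
      changed_when: false   # read-only check

    - name: Start the NameNode daemon if it is not already running
      ansible.builtin.command: hadoop-daemon.sh start namenode
      # The leading space avoids matching "SecondaryNameNode".
      when: "' NameNode' not in jps_result.stdout"
```

Running `jps` on the managed node afterwards should list a `NameNode` process.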

Conclusion

So today we automated the setup of an HDFS cluster. This is an important task in industry, since we may need to configure hundreds of nodes urgently. Doing it manually makes little sense: it would be slow and error-prone. Ansible provides an easier and faster way to get it done.
