Define the different Hadoop configuration files?
Below are the different Hadoop configuration files:
(1)hadoop-env.sh:-It specifies the environment variables that affect the JDK used by Hadoop daemons (bin/hadoop). Since the Hadoop framework is written in Java and uses the JRE, this file can be used to affect some aspects of Hadoop daemon behavior, such as where log files are stored. The only variable we usually need to change in this file is JAVA_HOME, which specifies the path to the Java installation used by Hadoop.
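As a sketch, a typical hadoop-env.sh edit looks like the following; the paths are examples only and depend on your own Java installation and logging layout:

```sh
# hadoop-env.sh -- example values; adjust to your installation
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # path to the Java installation used by Hadoop
export HADOOP_LOG_DIR=/var/log/hadoop                # where daemon log files are stored
```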
(2)mapred-site.xml:-This is one of the important configuration files required for the runtime environment settings of Hadoop. The file contains site-specific settings for the Hadoop Map/Reduce daemons and jobs, overriding the read-only defaults in mapred-default.xml. In this file we specify a framework name for MapReduce by setting the mapreduce.framework.name property. This file is empty by default, and we use it to tailor the behavior of Map/Reduce on our site.
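A minimal mapred-site.xml that selects YARN as the MapReduce framework could look like this:

```xml
<!-- mapred-site.xml: site-specific Map/Reduce settings -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value> <!-- run MapReduce jobs on YARN -->
  </property>
</configuration>
```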
(3)core-site.xml:-This is one of the most important configuration files, required for the runtime environment settings of a Hadoop cluster. It informs the Hadoop daemons where the NameNode runs in the cluster, and it also informs the NameNode which IP address and port it should bind to.
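For example, a minimal core-site.xml pointing the daemons at the NameNode might look like this; the hostname and port are placeholders for your cluster:

```xml
<!-- core-site.xml: tells Hadoop daemons where the NameNode runs -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:9000</value> <!-- example host and port -->
  </property>
</configuration>
```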
(4)hdfs-site.xml:-This is also an important configuration file required for the runtime environment settings of Hadoop. It contains the configuration settings for the NameNode, DataNodes, and Secondary NameNode. It is also used to specify the default block replication factor; the actual number of replications can also be specified when a file is created.
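A sketch of an hdfs-site.xml that sets the default replication factor; the directory shown is an example path, not a required location:

```xml
<!-- hdfs-site.xml: NameNode/DataNode settings -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- default block replication factor -->
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop/namenode</value> <!-- example NameNode metadata directory -->
  </property>
</configuration>
```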
(5)yarn-site.xml:-This file supports an extensible resource model. By default, YARN tracks CPU and memory for all nodes, applications, and queues, but the resource definition can be extended to include arbitrary countable resources. A countable resource is a resource that is consumed while a container is running but is released afterwards. CPU and memory are both countable resources; other examples include GPU resources and software licenses.
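As an illustration, a yarn-site.xml fragment declaring how much memory and CPU a NodeManager offers to YARN might look like this; the values are examples and should be sized per machine:

```xml
<!-- yarn-site.xml: per-node resources tracked by YARN -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value> <!-- example: 8 GB of memory offered per node -->
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value> <!-- example: 4 virtual cores offered per node -->
  </property>
</configuration>
```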
(6)Masters and Slaves:-The masters file is used to determine the master nodes in a Hadoop cluster. It informs the Hadoop daemons about the location of the Secondary NameNode. The masters file on a slave node is blank.
Slaves:-The slaves file is used to determine the slave nodes in a Hadoop cluster. The slaves file on the master node contains a list of hosts, one per line. The slaves file on a slave server contains the IP address of that slave node.
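Both files are plain host lists. As a sketch, on the master node they might contain entries like the following; all hostnames are placeholders:

```text
# masters: host that runs the Secondary NameNode
secondary-nn-host

# slaves: one worker host per line
worker-node-1
worker-node-2
worker-node-3
```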