Building an Ubuntu-based Hadoop image

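For the Docker concepts and commands used in this article, see Docker Basics (Docker基础).

Before starting, make sure a working Docker environment is installed.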

Installing Java

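# Hadoop needs a Java runtime, so start a new container on the host to set up Java and install Hadoop
docker run -it ubuntu
# Refresh the package index first; a fresh ubuntu image ships with empty package lists
apt-get update
apt-get install software-properties-common python-software-properties
add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-java7-installer
java -version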

Installing Hadoop

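# Download and unpack Hadoop under ~/soft/hadoop so that the path matches HADOOP_HOME below
mkdir -p ~/soft/hadoop
cd ~/soft/hadoop
wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.5/hadoop-2.6.5.tar.gz
tar -zxvf hadoop-2.6.5.tar.gz

# Append the following lines to ~/.bashrc
vim ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=/root/soft/hadoop/hadoop-2.6.5
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

# Reload the shell configuration so the variables take effect
source ~/.bashrc

# Set JAVA_HOME in hadoop-env.sh as well
vim $HADOOP_CONFIG_HOME/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-oracle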
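
Before moving on, it can help to confirm that the variables exported above point at real directories. The following is a minimal sketch and not part of the original steps: the function name check_hadoop_env is hypothetical, and it only checks that the directories exist, not that the installations inside them work.

```shell
#!/bin/sh
# Hypothetical helper: report which of the Hadoop-related variables
# are unset or do not point at an existing directory.
# Prints one line per problem and returns non-zero if any is found.
check_hadoop_env() {
    status=0
    for var in JAVA_HOME HADOOP_HOME HADOOP_CONFIG_HOME; do
        # Indirect lookup of the variable whose name is stored in $var.
        eval "dir=\${$var:-}"
        if [ -z "$dir" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$dir" ]; then
            echo "$var points to a missing directory: $dir"
            status=1
        fi
    done
    return $status
}
```

After sourcing ~/.bashrc, a call such as `check_hadoop_env && echo "environment looks fine"` gives a quick sanity check.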

Configuring Hadoop

Configure the three files core-site.xml, hdfs-site.xml, and mapred-site.xml in turn.

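# Create the tmp, namenode, and datanode directories under HADOOP_HOME; the configuration files below refer to these paths
cd $HADOOP_HOME
mkdir tmp
mkdir namenode
mkdir datanode
cd $HADOOP_CONFIG_HOME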

vim core-site.xml    # set the properties inside configuration as follows

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/soft/hadoop/hadoop-2.6.5/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
    <description>The name of the default file system. A URI whose scheme
    and authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the
    host, port, etc. for a filesystem.
    </description>
  </property>
</configuration>

vim hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/soft/hadoop/hadoop-2.6.5/namenode</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/soft/hadoop/hadoop-2.6.5/datanode</value>
    <final>true</final>
  </property>
</configuration>

cp mapred-site.xml.template mapred-site.xml

vim mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>
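
All three files above share the same shape: a configuration element containing property elements with a name and a value. As an illustration of that structure, here is a small sketch that prints such a file from name/value pairs. The function names hadoop_property and hadoop_configuration are hypothetical, the optional final and description elements are left out, and no XML escaping is done.

```shell
#!/bin/sh
# Hypothetical helper: print one Hadoop <property> element.
# Arguments: property name, property value.
# No XML escaping is performed, so values must not contain & or <.
hadoop_property() {
    printf '  <property>\n'
    printf '    <name>%s</name>\n' "$1"
    printf '    <value>%s</value>\n' "$2"
    printf '  </property>\n'
}

# Hypothetical helper: wrap name/value pairs in a <configuration> element.
# Usage: hadoop_configuration name1 value1 [name2 value2 ...]
hadoop_configuration() {
    printf '<?xml version="1.0"?>\n<configuration>\n'
    while [ "$#" -ge 2 ]; do
        hadoop_property "$1" "$2"
        shift 2
    done
    printf '</configuration>\n'
}
```

For example, `hadoop_configuration dfs.replication 2` prints a minimal hdfs-site.xml body to standard output; redirecting it into a file under $HADOOP_CONFIG_HOME would overwrite whatever is already there, so review the output first.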

Configuring SSH trust between nodes

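# Start the SSH service automatically: add the following line to ~/.bashrc
vim ~/.bashrc
/usr/sbin/sshd

# Generate an RSA key pair without a passphrase and authorize its public key
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
# For this experiment all nodes can share the same key pair; alternatively, generate a separate key on each node later and add each node's public key to the other nodes' authorized_keys
# Note: authorized_keys must not be writable by anyone other than its owner

exit
# Commit the container as a Hadoop image for creating new instances later (replace container_id with the actual container ID)
docker commit -m 'hadoop installed' container_id ubuntu:hadoop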
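
The permission note above (no write access on authorized_keys for anyone but the owner) is easy to get wrong, so a quick check can be useful. This is a minimal sketch; the function name keys_file_is_safe is hypothetical, and it only tests the group and other write bits of the file itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists and is not
# writable by group or others.
keys_file_is_safe() {
    file=$1
    [ -f "$file" ] || return 1
    # find prints the path only when the group-write or other-write bit is set.
    unsafe=$(find "$file" -prune \( -perm -020 -o -perm -002 \))
    [ -z "$unsafe" ]
}
```

One possible use is `keys_file_is_safe ~/.ssh/authorized_keys || chmod 600 ~/.ssh/authorized_keys`, run inside the container before committing the image.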

Deploying the Hadoop cluster

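With the Hadoop image built above, creating several Hadoop nodes only takes starting several containers (docker run) from it and then adjusting the configuration in each so that it acts as the master or as a slave.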

Creating the Hadoop nodes

# Run each of the following commands in its own terminal tab
docker run -it -h master -p 50070:50070 ubuntu:hadoop
docker run -it -h slave1 ubuntu:hadoop
docker run -it -h slave2 ubuntu:hadoop
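
As an alternative to three terminal tabs, the nodes could be started in the background from one shell. The sketch below only prints the commands so they can be reviewed first. The function name print_node_commands is hypothetical, and the -d and --name flags are additions that do not appear in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one docker run command per node.
# The first argument is the image; the remaining arguments are hostnames.
# Only the node named "master" gets the 50070 port mapping, as above.
# -d (detached) and --name are additions so that the nodes run in the
# background and can be addressed by name afterwards.
print_node_commands() {
    image=$1
    shift
    for node in "$@"; do
        if [ "$node" = master ]; then
            echo "docker run -itd -h $node --name $node -p 50070:50070 $image"
        else
            echo "docker run -itd -h $node --name $node $image"
        fi
    done
}
```

Calling `print_node_commands ubuntu:hadoop master slave1 slave2` lists the three commands; a shell inside a started node can then be opened with `docker exec -it master bash`.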

Configuring hosts

# Run ifconfig in each node's container to find its IP address, then add the following entries to /etc/hosts on every node
vim /etc/hosts
172.17.0.4 master
172.17.0.5 slave1
172.17.0.6 slave2
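
Since the same entries have to be written on every node, it can be convenient to add them in a way that is safe to repeat. This is a minimal sketch; the function name add_host_entry is hypothetical, and it appends an entry only when no uncommented line already mentions that hostname.

```shell
#!/bin/sh
# Hypothetical helper: append "ip name" to a hosts file unless an
# uncommented line for that hostname is already present.
# Arguments: file, ip, hostname.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    # Compare the hostname against every field after the address.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

For instance, `add_host_entry /etc/hosts 172.17.0.4 master` can be run on each node, using the addresses reported by ifconfig in your own containers.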

Configuring the master node

cd $HADOOP_CONFIG_HOME
vim slaves    # list the hostname of every slave node, one per line
slave1
slave2

Starting the services

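# Before the first start, format HDFS once on the master node with: hdfs namenode -format
# Then run start-all.sh on the master node; output similar to the following indicates success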
slave1: starting nodemanager, logging to /root/soft/hadoop/hadoop-2.6.5/logs/yarn-root-nodemanager-slave1.out

Checking service status

# Run jps on each node to list its service processes

jps    # output on master
1762 NodeManager
3592 Jps
974 NameNode
1563 SecondaryNameNode
336 ResourceManager
1418 DataNode
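
Reading the jps output by eye works for three nodes, but the comparison can also be scripted. This is a minimal sketch: the function name missing_daemons is hypothetical, and it assumes the two-column "pid name" output shown above.

```shell
#!/bin/sh
# Hypothetical helper: read jps output on stdin and print every expected
# daemon name (given as arguments) that does not appear in it.
# Returns non-zero if at least one daemon is missing.
missing_daemons() {
    # Keep only the second column (the process name).
    running=$(awk '{ print $2 }')
    rc=0
    for daemon in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
            rc=1
        fi
    done
    return $rc
}
```

On the master this could be called as `jps | missing_daemons NameNode SecondaryNameNode ResourceManager`, and on a slave with `DataNode NodeManager` as the expected names.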

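If Docker was not installed through Toolbox, the master node can be opened directly in a browser at http://172.17.0.4:50070

If Docker was installed through Toolbox, open http://192.168.99.100:50070 instead

This case requires a port mapping (see the docker run command for the master node above). If the mapping was not set up, the following can be run on the host to add it without recreating the container:

iptables -t nat -A PREROUTING -p tcp -m tcp --dport 50070 -j DNAT --to-destination 172.17.0.4:50070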

![](/assets/屏幕快照 2016-11-19 下午10.21.04.png)

ifreer@2016.11.19