Hadoop Cluster Setup

Notes from setting up a Hadoop cluster.

Environment: Ubuntu 16.04 Server

Hadoop version: Apache Hadoop 3.0.0-alpha4

OpenJDK version: 1.8.0_131
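
Before starting, it is worth confirming that the JDK is in place; a minimal check (the version string is what this machine reports and may differ elsewhere):

    java -version   # should report openjdk version "1.8.0_131" on this setup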

Local Mode (Single Node)

  1. Install ssh:

    sudo apt install ssh
  2. Download and unpack Hadoop:

    wget http://mirror.metrocast.net/apache/hadoop/common/hadoop-3.0.0-alpha4/hadoop-3.0.0-alpha4.tar.gz
    tar xvf hadoop-3.0.0-alpha4.tar.gz
  3. Point Hadoop at the local Java installation (etc/hadoop/hadoop-env.sh):

    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre
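
    If the exact JDK path is not known in advance, it can usually be recovered from the java binary itself (a general tip, not part of the original notes; the path will vary by machine):

    readlink -f "$(which java)"   # e.g. /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java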
  4. Prepare the input files:

    mkdir input
    cp etc/hadoop/*.xml input
  5. Run the word-count example that ships with Hadoop:

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount input output
  6. View the output:

    cat output/*

    Sample output:

    "*"	18
    "AS 9
    "License"); 9
    "alice,bob 18
    "clumping" 1
    (ASF) 1
    (root 1
    (the 9
    --> 18
    -1 1
    0.0 1
    1-MAX_INT. 1
    1. 1
    1.0. 1
    2.0 9
    40 2
    40+20=60 1
    <!-- 18
    </configuration> 9
    </description> 29
    </property> 50
    <?xml 8
    <?xml-stylesheet 4
    <configuration> 9
    <description> 28
    <description>ACL 21
    ...

Pseudo-Distributed Mode

Building on the single-node setup, perform the following steps:

  1. Configure etc/hadoop/core-site.xml:

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>
  2. Configure etc/hadoop/hdfs-site.xml:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
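
    To confirm that Hadoop actually picks up these values, hdfs getconf can echo them back (a quick sanity check; the expected values assume the two files above):

    bin/hdfs getconf -confKey fs.defaultFS     # expect hdfs://localhost:9000
    bin/hdfs getconf -confKey dfs.replication  # expect 1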
  3. Set up passwordless ssh login:

    ssh-keygen
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
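
    If the next step still prompts for a password, permissions on ~/.ssh are a common culprit; tightening them (a general troubleshooting step, not recorded in the original notes) usually resolves it:

    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys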
  4. Test the passwordless login:

    ssh localhost

    If no password prompt appears, the configuration succeeded.

  5. Format the namenode:

    bin/hdfs namenode -format
  6. Start the namenode and datanode:

    sbin/start-dfs.sh

    Once started, the namenode web UI can be reached at http://localhost:9870/.
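
    A quick way to confirm the daemons are actually up is jps; on this setup the NameNode, DataNode, and SecondaryNameNode processes should all appear (PIDs will differ):

    jps   # expect NameNode, DataNode and SecondaryNameNode in the list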

  7. Create the directory structure in HDFS:

    bin/hdfs dfs -mkdir /user
    bin/hdfs dfs -mkdir /user/f
  8. Copy the input files into the distributed file system:

    bin/hdfs dfs -mkdir input
    bin/hdfs dfs -put etc/hadoop/*.xml input
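
    As an optional sanity check, list the directory to verify the files landed in HDFS:

    bin/hdfs dfs -ls input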
  9. Run the word-count example that ships with Hadoop:

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount input output
  10. View the output:

    bin/hdfs dfs -cat output/*

    Sample output:

    "*"	18
    "AS 9
    "License"); 9
    "alice,bob 18
    "clumping" 1
    (ASF) 1
    (root 1
    (the 9
    --> 18
    -1 1
    0.0 1
    1-MAX_INT. 1
    1. 1
    1.0. 1
    2.0 9
    40 2
    40+20=60 1
    <!-- 18
    </configuration> 9
    </description> 29
    </property> 50
    <?xml 8
    <?xml-stylesheet 4
    <configuration> 9
    <description> 28
    <description>ACL 21
    ...
  11. Shut down the distributed file system:

    sbin/stop-dfs.sh

Fully-Distributed Mode (Cluster)

This experiment runs across several machines; for it we created three virtual machines (master, slave, slave2).

  1. Install the operating system on the two slave nodes.

  2. Perform the configuration in steps 3-8 on the master node.

  3. Get the IP addresses of the machines and edit /etc/hosts so that each host can be referred to by name:

    127.0.0.1	localhost
    127.0.1.1 ubuntu
    192.168.124.129 slave
    192.168.124.130 slave2
    192.168.124.128 master

    # The following lines are desirable for IPv6 capable hosts
    ::1 localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
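
    With the hosts file in place, each name should resolve and be reachable from every node; a quick connectivity check (assuming ping is available on the VMs):

    ping -c 1 master
    ping -c 1 slave
    ping -c 1 slave2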
  4. Modify core-site.xml:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://master:9000</value>
    </property>
  5. Modify hdfs-site.xml:

    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>
  6. Modify yarn-site.xml:

    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>master:8025</value>
    </property>
    <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>master:8035</value>
    </property>
    <property>
      <name>yarn.resourcemanager.address</name>
      <value>master:8050</value>
    </property>
  7. Modify mapred-site.xml:

    <property>
      <name>mapreduce.job.tracker</name>
      <value>master:5431</value>
    </property>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  8. Modify etc/hadoop/workers:

    master
    slave
    slave2
  9. As in the previous two modes, configure ssh on master, slave, and slave2 so that each pair of nodes can log in to one another without a password.

  10. Use scp to copy the Hadoop directory from the master node to the same location on slave and slave2:

    scp -r /usr/local/hadoop f@slave:/usr/local
    scp -r /usr/local/hadoop f@slave2:/usr/local
  11. Stop DFS and re-format the namenode:

    /usr/local/hadoop/sbin/stop-dfs.sh
    /usr/local/hadoop/bin/hdfs namenode -format
  12. Start HDFS:

    /usr/local/hadoop/sbin/start-dfs.sh
  13. Start YARN:

    /usr/local/hadoop/sbin/start-yarn.sh
  14. Running jps on the master node gives:

    f@ubuntu:~$ jps
    2357 ResourceManager
    7511 Jps
    2667 NodeManager
    1772 NameNode
    2110 SecondaryNameNode
    1903 DataNode
  15. Running jps on the slave node gives:

    f@ubuntu:~$ jps
    2244 Jps
    1288 NodeManager
    1164 DataNode
  16. Running jps on the slave2 node gives:

    f@ubuntu:~$ jps
    1291 NodeManager
    1167 DataNode
    4015 Jps
  17. Running /usr/local/hadoop/bin/hdfs dfsadmin -report on the master node gives:

    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfsadmin -report
    Configured Capacity: 76822855680 (71.55 GB)
    Present Capacity: 45398016000 (42.28 GB)
    DFS Remaining: 45397708800 (42.28 GB)
    DFS Used: 307200 (300 KB)
    DFS Used%: 0.00%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Pending deletion blocks: 0

    -------------------------------------------------
    Live datanodes (3):

    Name: 192.168.124.128:9866 (master)
    Hostname: ubuntu
    Decommission Status : Normal
    Configured Capacity: 39043194880 (36.36 GB)
    DFS Used: 106794 (104.29 KB)
    Non DFS Used: 20659420886 (19.24 GB)
    DFS Remaining: 16376745984 (15.25 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 41.95%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Sun Sep 24 06:23:22 PDT 2017
    Last Block Report: Sun Sep 24 04:41:17 PDT 2017

    Name: 192.168.124.129:9866 (slave)
    Hostname: ubuntu
    Decommission Status : Normal
    Configured Capacity: 18889830400 (17.59 GB)
    DFS Used: 73728 (72 KB)
    Non DFS Used: 3396104192 (3.16 GB)
    DFS Remaining: 14510510080 (13.51 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 76.82%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Sun Sep 24 06:23:22 PDT 2017
    Last Block Report: Sun Sep 24 06:21:10 PDT 2017

    Name: 192.168.124.130:9866 (slave2)
    Hostname: ubuntu
    Decommission Status : Normal
    Configured Capacity: 18889830400 (17.59 GB)
    DFS Used: 126678 (123.71 KB)
    Non DFS Used: 3396108586 (3.16 GB)
    DFS Remaining: 14510452736 (13.51 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 76.82%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Sun Sep 24 06:23:22 PDT 2017
    Last Block Report: Sun Sep 24 06:21:07 PDT 2017
  18. Create the directory structure in HDFS:

    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfs -mkdir /user
    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfs -mkdir /user/f
  19. Copy the input files into the distributed file system:

    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfs -mkdir input
    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input
  20. Run the word-count test program on the master node:

    f@ubuntu:~$ /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount input output
  21. View the output:

    f@ubuntu:~$ /usr/local/hadoop/bin/hdfs dfs -cat output/*
  22. Resulting output:

    "*"	18
    "AS 9
    "License"); 9
    "alice,bob 18
    "clumping" 1
    (ASF) 1
    (root 1
    (the 9
    --> 18
    -1 1
    0.0 1
    1-MAX_INT. 1
    1. 1
    1.0. 1
    2.0 9
    40 2
    40+20=60 1
    <!-- 18
    </configuration> 9
    </description> 29
    </property> 50
    <?xml 8
    <?xml-stylesheet 4
    <configuration> 9
    <description> 28
    <description>ACL 21
    ...
  23. On the master node, open the resource manager at http://127.0.0.1:8088/cluster/nodes to see information about all the nodes. The same view is available from the shell, as sketched below.
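
    A minimal command-line equivalent, handy when no browser is available inside the VM (yarn node -list is part of the standard yarn CLI):

    /usr/local/hadoop/bin/yarn node -list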

  24. On the master node, view the namenode status at http://127.0.0.1:9870/dfshealth.html#tab-overview.