Accessing HDFS on a Dual-NIC Hadoop Cluster from a Client - CFANZ Programming Community

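Suppose a big data platform spans two networks. The internal network carries data exchange and computation and is built on 10-gigabit fiber NICs and fiber switches. The external network exposes services and data interfaces to other departments and, in our case, is a gigabit network.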

HDFS supports multihomed networks; for details see the official documentation: Apache Hadoop 2.8.0 – HDFS Support for Multihomed Networks

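During cluster deployment we ran into the following problem: hosts on the external network could not reach the Hadoop NameNode and DataNodes directly by their internal-network IPs. For example, we run a standalone Flink cluster on the external network with a streaming job that writes data to the remote HDFS. The sink was able to create the hidden in-progress file on HDFS, but the file size stayed at 0, and at checkpoint time the job failed because the in-progress file could not be turned into a finished file. The error was: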

org.apache.hadoop.hdfs.DFSClient                              - Exception in createBlockOutputStream
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/192.168.16.2:9866]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1717)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1447)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1400)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)
2022-01-04 22:23:39,647 Thread-237 INFO  org.apache.hadoop.hdfs.DFSClient

Here, 192.168.16.2 is an internal-network IP of the Hadoop cluster.
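When triaging a trace like the one above, it can help to pull out the endpoint the client was trying to reach and check whether it is a private address. The following is a minimal sketch and not part of any Hadoop tooling: the regular expression and the helper name `remote_endpoint` are assumptions made for illustration, and it only handles IPv4 addresses in the `remote=/ip:port` form shown in the trace.

```python
import ipaddress
import re
from typing import Optional, Tuple

# Matches "remote=/192.168.16.2:9866" and also "remote=host/192.168.16.2:9866".
REMOTE_RE = re.compile(r"remote=[^/\]\s]*/(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def remote_endpoint(log_line: str) -> Optional[Tuple[ipaddress.IPv4Address, int]]:
    """Extract the (ip, port) the client tried to connect to, or None if absent."""
    match = REMOTE_RE.search(log_line)
    if match is None:
        return None
    return ipaddress.IPv4Address(match.group(1)), int(match.group(2))


# Fragment copied from the exception message in this article.
SAMPLE = "java.nio.channels.SocketChannel[connection-pending remote=/192.168.16.2:9866]"

endpoint = remote_endpoint(SAMPLE)
if endpoint is not None:
    ip, port = endpoint
    # is_private is True for RFC 1918 ranges such as 192.168.0.0/16.
    print(f"target={ip}:{port} private={ip.is_private}")
```

A private target address in a trace produced by an external client is a strong hint that the client was handed an internal-network address it cannot route to.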

1. First, let's look at how the client communicates with the NameNode and DataNodes:

The client first contacts the NameNode to obtain block information. The NameNode maintains the namespace, manages block replication, and tracks which DataNodes hold each block.

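If the client only queries the directory structure, it needs to talk to the NameNode alone. To read or write data blocks, it must also contact the DataNodes, using the information returned by the NameNode. The client communicates with the Hadoop cluster in the following steps: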

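The client sends a file write or read request to the NameNode.
The NameNode returns the DataNode locations for the file's blocks.
Using the information from the NameNode, the client writes blocks to, or reads blocks from, the DataNodes directly.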

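Note: this explains why the sink could create the in-progress file but could not write any data, so the checkpoint failed. Creating the file only requires talking to the NameNode, whereas writing data requires connecting to a DataNode.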
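The difference between NameNode and DataNode reachability can be checked from the client host with a plain TCP probe. This is a minimal sketch under stated assumptions: `can_connect` is a hypothetical helper, and the NameNode host name and port used in the note below are placeholders (only 192.168.16.2:9866 comes from the log in this article).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    DNS failures, refused connections, and timeouts are all subclasses of
    OSError, so each of them is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For instance, if `can_connect("namenode.example.com", 8020)` returns True while `can_connect("192.168.16.2", 9866)` returns False from the Flink host, that matches the symptom described here: metadata operations succeed, but the block write pipeline cannot be opened.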

2. Communication protocols:

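All HDFS communication protocols are layered on top of TCP/IP. The client-to-NameNode protocol and the DataNode-to-NameNode protocol are both wrapped in an RPC abstraction. By design, the NameNode never initiates any RPCs; it only responds to RPC requests issued by DataNodes or clients.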

3. According to the official documentation, Hadoop can support multihomed networks through the following configuration parameters:

3.1) Ensuring HDFS Daemons Bind All Interfaces:

<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the RPC server will bind to. If this optional address is
    set, it overrides only the hostname portion of dfs.namenode.rpc-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node listen on all interfaces by
    setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.servicerpc-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the service RPC server will bind to. If this optional address is
    set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node listen on all interfaces by
    setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.http-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the HTTP server will bind to. If this optional address
    is set, it overrides only the hostname portion of dfs.namenode.http-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node HTTP server listen on all
    interfaces by setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.https-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the HTTPS server will bind to. If this optional address
    is set, it overrides only the hostname portion of dfs.namenode.https-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node HTTPS server listen on all
    interfaces by setting it to 0.0.0.0.
  </description>
</property>

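By default, HDFS endpoints are specified as either hostnames or IP addresses. In either case, the HDFS daemons bind to a single IP address, which makes them unreachable from other networks.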

The solution is to configure the server endpoints separately so that they are forced to bind to the wildcard IP address INADDR_ANY, that is, 0.0.0.0.
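What binding to the wildcard address means can be seen with an ordinary socket, independent of Hadoop. In this sketch (the function name is made up for illustration), a listener bound to 0.0.0.0 accepts a connection addressed to a specific local IP, and the accepted socket reports which local address the client actually used.

```python
import socket


def local_address_seen_by_server(bind_host: str, connect_host: str) -> str:
    """Listen on bind_host, connect to connect_host on the same machine, and
    return the local address on which the server accepted the connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.settimeout(2.0)
        server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.create_connection((connect_host, port), timeout=2.0):
            conn, _peer = server.accept()
            with conn:
                # The accepted socket's local address is the destination IP
                # that the client connected to.
                return conn.getsockname()[0]
```

Calling `local_address_seen_by_server("0.0.0.0", "127.0.0.1")` returns `"127.0.0.1"`. On a dual-NIC host, a listener bound this way likewise accepts connections addressed to either NIC's IP, which is the behavior the bind-host settings ask of the NameNode.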

3.2) Clients use Hostnames when connecting to DataNodes

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether clients should use datanode hostnames when
    connecting to datanodes.
  </description>
</property>

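By default, HDFS clients connect to DataNodes using the IP address provided by the NameNode. Depending on the network configuration, this IP address may be unreachable from the client. The fix is to let clients perform their own DNS resolution of the DataNode hostname.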
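The effect of this setting can be modeled in a few lines. The sketch below is a toy model, not Hadoop's client code: `DatanodeInfo`, `connect_target`, the hostname `dn1.example.com`, and the external address 203.0.113.2 are all hypothetical, and a plain dictionary stands in for the client's DNS or /etc/hosts lookup.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class DatanodeInfo:
    """What the NameNode tells the client about one DataNode (simplified)."""
    ip_addr: str    # address as known to the NameNode (internal network)
    hostname: str   # hostname of the DataNode
    xfer_port: int  # data transfer port


def connect_target(dn: DatanodeInfo,
                   use_datanode_hostname: bool,
                   client_resolver: Mapping[str, str]) -> Tuple[str, int]:
    """Pick the address the client dials for block transfer."""
    if use_datanode_hostname:
        # The client resolves the hostname itself, so it gets an address
        # that is valid on its own network.
        return client_resolver[dn.hostname], dn.xfer_port
    # Default behavior: use the IP handed out by the NameNode as-is.
    return dn.ip_addr, dn.xfer_port


dn = DatanodeInfo(ip_addr="192.168.16.2", hostname="dn1.example.com", xfer_port=9866)
external_view = {"dn1.example.com": "203.0.113.2"}  # placeholder external IP

print(connect_target(dn, False, external_view))
print(connect_target(dn, True, external_view))
```

With the flag off, the external client dials the internal address from the log and times out; with the flag on, it dials whatever its own resolver returns for the hostname. Note that this property has to be set in the configuration the client actually loads, for example the one used by the Flink job.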

3.3) DataNodes use HostNames when connecting to other DataNodes

<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether datanodes should use datanode hostnames when
    connecting to other datanodes for data transfer.
  </description>
</property>

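In rare cases, the NameNode-resolved IP address of a DataNode may be unreachable from other DataNodes. The fix is to force DataNodes to perform their own DNS resolution for inter-DataNode connections.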

Summary:

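In other words, this configuration makes the NameNode listen on the IPs of both of its network interfaces and makes socket connections to DataNodes go through hostnames. On the internal network, we only need to map the hostnames in /etc/hosts to internal IPs, and data exchange stays on the internal network. Hosts on the external network resolve the same hostnames to the external IPs and communicate over the external network without problems.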
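The split name resolution described in this summary can be sketched as two hosts files that map the same DataNode hostname to different IPs. The hostnames and the external address 203.0.113.2 are placeholders (only 192.168.16.2 appears in this article), and `parse_hosts` is a simplified parser written for illustration, not a full implementation of the hosts file format.

```python
from typing import Dict

# Hypothetical /etc/hosts content on cluster nodes (internal network).
INTERNAL_HOSTS = """\
# internal 10-gigabit network
192.168.16.2   dn1.example.com dn1
"""

# Hypothetical /etc/hosts content on external clients (gigabit network).
EXTERNAL_HOSTS = """\
# external network, placeholder address
203.0.113.2    dn1.example.com dn1
"""


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


print(parse_hosts(INTERNAL_HOSTS)["dn1.example.com"])
print(parse_hosts(EXTERNAL_HOSTS)["dn1.example.com"])
```

The same hostname yields the internal address inside the cluster and the external address on client machines, so each side connects over its own network.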

References:

https://toutiao.io/posts/473478/app_preview

https://www.modb.pro/db/81172

https://www.zhihu.com/question/56421655

Official documentation:

https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html

https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/datastream/streamfile_sink/