0
点赞
收藏
分享

微信扫一扫

Flink1.11.0 flink on yarn 模式部署详解

at小涛 2023-02-01 阅读 75


Flink发布了新版本1.11.0,增加了很多重要新特性,包括增加了对Hadoop3.0.0以及更高版本Hadoop的支持,不再提供“flink-shaded-hadoop-*” jars,而是通过配置YARN_CONF_DIR或者HADOOP_CONF_DIR和HADOOP_CLASSPATH环境变量完成与yarn集群的对接。
具体步骤如下:

  1. 下载Flink1.11.0安装包
    ​​​flink安装包下载地址​​
  2. 确保安装有Hadoop集群,版本至少Hadoop 2.4.1
  3. 配置环境变量
    增加环境变量如下:

HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`

  1. 启动Hadoop集群(HDFS和Yarn)
  2. 解压Flink安装包,并进入conf目录,修改flink-conf.yaml文件,修改以下配置,若在提交命令中不特定指明,这些配置将作为默认配置

# The total process memory size for the JobManager.
#
# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.

jobmanager.memory.process.size: 1600m


# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.

taskmanager.memory.process.size: 1728m

# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
#
# taskmanager.memory.flink.size: 1280m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.

taskmanager.numberOfTaskSlots: 8

# The parallelism used for programs that did not specify and other parallelism.

parallelism.default: 1

  1. 向yarn集群申请资源
    执行以下命令向yarn集群申请资源

bin/yarn-session -nm test

可以使用以下参数:

-D <arg>                        Dynamic properties
-d,--detached Start detached
-jm,--jobManagerMemory <arg> Memory for JobManager Container with optional unit (default: MB)
-nm,--name Set a custom name for the application on YARN
-at,--applicationType Set a custom application type on YARN
-q,--query Display available YARN resources (memory, cores)
-qu,--queue <arg> Specify YARN queue.
-s,--slots <arg> Number of slots per TaskManager
-tm,--taskManagerMemory <arg> Memory per TaskManager Container with optional unit (default: MB)
-z,--zookeeperNamespace <arg> Namespace to create the Zookeeper sub-paths for HA mode

  1. 提交任务

yarn session启动之后会给出一个web UI地址以及一个yarn application ID,用户可以通过web UI或者命令行两种方式提交任务。

Flink1.11.0 flink on yarn 模式部署详解_flink


web UI提交任务比较简单,通过命令行提交需要执行以下命令

bin/flink run ./examples/batch/WordCount.jar

可选参数如下:

-c,--class <classname>           Class with the program entry point ("main"
method or "getPlan()" method. Only needed
if the JAR file does not specify the class
in its manifest.
-m,--jobmanager <host:port> Address of the JobManager to
which to connect. Use this flag to connect
to a different JobManager than the one
specified in the configuration.
-p,--parallelism <parallelism> The parallelism with which to run the
program. Optional flag to override the
default value specified in the
configuration

客户端可以自行确定JobManager的地址,也可以通过-m或者–jobmanager 参数指定JobManager的地址,JobManager的地址在yarn session的启动页面中可以找到。


举报

相关推荐

0 条评论