Contents
- 1. Extract the Spark installation package
- 2. Configure the Spark system environment
- 3. Configure the cluster nodes
- 4. Configure spark-env.sh
- 5. Distribute Spark
- 6. Start the Spark cluster
Prerequisites:
- A fully distributed Hadoop cluster
- Scala installation package: https://www.scala-lang.org/download/all.html
- Spark installation package: http://archive.apache.org/dist/spark/
- For installing Scala, see my earlier post: 【CentOS】scala安装
- For the Spark Local and Standalone modes, see my earlier post: 【CentOS】Spark 运行环境(Local、Standalone)
1. Extract the Spark installation package
Upload the local Spark installation package to the VM.
Extract it, then rename the directory (the rename command is sketched after the listing below):
[root@server download]# tar -zxvf spark-2.4.2-bin-hadoop2.6.tgz -C /usr/local/src/
[root@server download]# cd /usr/local/src/
[root@server src]# ll
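The rename step itself is not shown in the listing; a minimal sketch, assuming the extracted directory is renamed to match the SPARK_HOME path used below:
[root@server src]# mv spark-2.4.2-bin-hadoop2.6/ spark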
2. Configure the Spark system environment
Open /etc/profile and add the following to set up the Spark environment:
# set spark environment
export SPARK_HOME=/usr/local/src/spark
export PATH=$PATH:$SPARK_HOME/bin
After editing, save and exit, then run the source command so the changes take effect.
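A minimal sketch of applying and verifying the change (the echo check is purely illustrative):
[root@server ~]# source /etc/profile
[root@server ~]# echo $SPARK_HOME
/usr/local/src/spark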
3. Configure the cluster nodes
Go to the conf directory under the extracted path, rename slaves.template to slaves, delete localhost from it, and add the hostnames of the VMs, one per line (the commands are sketched after the file listing below):
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# A Spark Worker will be started on each of the machines listed below.
server # the server (master node) entry can be omitted here
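The rename and edit described above might look like this; a sketch, where agent1 and agent2 are the worker hostnames used in the scp step later on:
[root@server src]# cd /usr/local/src/spark/conf
[root@server conf]# mv slaves.template slaves
[root@server conf]# vi slaves    # remove localhost, add one hostname per line (agent1, agent2, ...)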
4. Configure spark-env.sh
Copy spark-env.sh.template to spark-env.sh, then add the JAVA_HOME environment variable, the cluster's master node, and the Hadoop settings it relies on (the copy command is sketched after the listing below):
# Environment settings
export JAVA_HOME=/usr/local/src/java
export SCALA_HOME=/usr/local/src/scala
export HADOOP_HOME=/usr/local/src/hadoop
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
# Master node settings
export SPARK_MASTER_HOST=server
export SPARK_MASTER_IP=192.168.64.183
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_CORES=2
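The copy step itself, as a sketch:
[root@server conf]# cp spark-env.sh.template spark-env.sh
[root@server conf]# vi spark-env.sh    # append the lines above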
5. Distribute Spark
Distribute Spark from the master node to each worker VM:
[root@server sbin]# scp -r /usr/local/src/spark root@agent1:/usr/local/src/
[root@server sbin]# scp -r /usr/local/src/spark root@agent2:/usr/local/src/
After the distribution finishes, configure the Spark environment variables on each VM and make them take effect (a sketch follows).
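On each worker, the same /etc/profile entries can be appended and sourced; a sketch, assuming the same install path as on the master:
[root@agent1 ~]# echo 'export SPARK_HOME=/usr/local/src/spark' >> /etc/profile
[root@agent1 ~]# echo 'export PATH=$PATH:$SPARK_HOME/bin' >> /etc/profile
[root@agent1 ~]# source /etc/profile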
6. Start the Spark cluster
First start the Hadoop cluster, then run the sbin/start-all.sh script on the master node; the worker nodes will be started along with it:
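A sketch of the start-up sequence and a quick jps check, assuming the install paths used above:
[root@server ~]# /usr/local/src/hadoop/sbin/start-all.sh     # Hadoop first
[root@server ~]# /usr/local/src/spark/sbin/start-all.sh      # then the Spark Master and Workers
[root@server ~]# jps    # Master should be listed here; Worker on agent1/agent2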
Open http://server:8080 in a browser to view the Master's resource-monitoring web UI.
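One way to confirm that the Workers have registered with the Master is to submit the bundled SparkPi example to the standalone master URL; a sketch, assuming the examples jar shipped with this 2.4.2 / Scala 2.11 build:
[root@server spark]# bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://server:7077 \
  examples/jars/spark-examples_2.11-2.4.2.jar 100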