Table of Contents
- Installing Software on the Master Node
- (1) Download and extract the Scala and Spark packages
- Configuring Spark Parameters
- (1) Edit the spark-env.sh file
- (2) Edit the slaves file
- (3) Update the environment variables and apply them
- Installing Software on the Slave Nodes
- (1) Log in to slave node 1 and install the software
- (2) Log in to slave node 2 and install the software
- (3) Update the environment variables on slave nodes 1 and 2 and apply them
- Testing Spark
- (1) Log in to each cluster node, start the ZooKeeper service, and check its status
- (2) Start the Hadoop services on the master node
- (3) Start the Spark services on the master node
- (4) Check the processes on each cluster node
- (5) Open a browser and go to "http://master:8080" to view the Spark cluster status
- If a 404 page appears even with the firewall disabled, the port may already be in use
- (6) Open a browser and go to "http://slave1:8081" to view the Worker status
- (7) Start spark-shell first, then go to "http://master:4040" in a browser to view the "Spark Jobs" page
Installing Software on the Master Node
(1) Download and extract the Scala and Spark packages
hadoop-master:~$ cd /opt/
hadoop-master:/opt$ sudo tar xvzf /home/hadoop/scala-2.12.11.tgz
hadoop-master:/opt$ sudo chown -R hadoop:hadoop /opt/scala-2.12.11/
hadoop-master:/opt$ sudo tar xvzf /home/hadoop/spark-2.1.0-bin-hadoop2.7.tgz
hadoop-master:/opt$ sudo chown -R hadoop:hadoop /opt/spark-2.1.0-bin-hadoop2.7/
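An optional quick check that both packages landed in /opt and are now owned by the hadoop user:
hadoop-master:/opt$ ls -ld /opt/scala-2.12.11 /opt/spark-2.1.0-bin-hadoop2.7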
Configuring Spark Parameters
(1) Edit the spark-env.sh file
hadoop-master:~$ cd /opt/spark-2.1.0-bin-hadoop2.7/conf/
hadoop-master:/opt/spark-2.1.0-bin-hadoop2.7/conf$ mv spark-env.sh.template spark-env.sh
Add the following to the end of spark-env.sh:
export JAVA_HOME=/opt/jdk1.8.0_221
export HADOOP_HOME=/opt/hadoop-2.8.5
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SCALA_HOME=/opt/scala-2.12.11
export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7
export SPARK_MASTER_IP=hadoop-master
export SPARK_WORKER_MEMORY=2g
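Before moving on, it can be worth confirming that every directory referenced above actually exists on the master (versions assumed to match the packages used in this guide):
hadoop-master:~$ ls -d /opt/jdk1.8.0_221 /opt/hadoop-2.8.5 /opt/scala-2.12.11 /opt/spark-2.1.0-bin-hadoop2.7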
(2) Edit the slaves file
hadoop-master:/opt/spark-2.1.0-bin-hadoop2.7/conf$ mv slaves.template slaves
Replace the default localhost entry with the two worker hostnames:
hadoop-slave1
hadoop-slave2
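Ignoring comment lines, the finished file should list only the two workers; a quick check:
hadoop-master:/opt/spark-2.1.0-bin-hadoop2.7/conf$ grep -v '^#' slaves
hadoop-slave1
hadoop-slave2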
(3) Update the environment variables and apply them
hadoop-master:~$ vim /home/hadoop/.profile
Append the following:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SCALA_HOME=/opt/scala-2.12.11
export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7
export PATH=$PATH:$SCALA_HOME/bin:$SPARK_HOME/bin
hadoop-master:~$ source /home/hadoop/.profile
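After sourcing the profile, a quick check that the variables took effect; echo should print the Spark install path and scala -version should report 2.12.11:
hadoop-master:~$ echo $SPARK_HOME
hadoop-master:~$ scala -version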
Installing Software on the Slave Nodes
(1) Log in to slave node 1 and install the software
hadoop-slave1:~$ sudo scp -r hadoop-master:/opt/scala-2.12.11 /opt
hadoop-slave1:~$ sudo scp -r hadoop-master:/opt/spark-2.1.0-bin-hadoop2.7 /opt
hadoop-slave1:~$ sudo chown -R hadoop:hadoop /opt/scala-2.12.11/
hadoop-slave1:~$ sudo chown -R hadoop:hadoop /opt/spark-2.1.0-bin-hadoop2.7/
(2) Log in to slave node 2 and install the software
(Same steps as above.)
(3) Update the environment variables on slave nodes 1 and 2 and apply them
Append the following to /home/hadoop/.profile on each slave; the apply step is shown after the list:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SCALA_HOME=/opt/scala-2.12.11
export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7
export PATH=$PATH:$SCALA_HOME/bin:$SPARK_HOME/bin
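Once the lines are in place on both slaves, apply them the same way as on the master:
hadoop-slave1:~$ source /home/hadoop/.profile
hadoop-slave2:~$ source /home/hadoop/.profile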
Testing Spark
(1) Log in to each cluster node, start the ZooKeeper service, and check its status
zkServer.sh start
zkServer.sh status
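With a healthy three-node ensemble, the status command on each node reports its mode; exactly one node should report leader and the others follower. The output looks roughly like this (trimmed, paths depend on your ZooKeeper install):
ZooKeeper JMX enabled by default
...
Mode: follower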
(2) Start the Hadoop services on the master node
start-all.sh
mr-jobhistory-daemon.sh start historyserver
(3) Start the Spark services on the master node
hadoop-master:~$ /opt/spark-2.1.0-bin-hadoop2.7/sbin/start-all.sh
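If the script succeeds, it prints one line for the Master and one per Worker, roughly like the following (log file names follow Spark's spark-<user>-<class>-<n>-<host>.out pattern and will differ on your machines):
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-2.1.0-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-hadoop-master.out
hadoop-slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.1.0-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-hadoop-slave1.out
hadoop-slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-2.1.0-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-hadoop-slave2.out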
(4) Check the processes on each cluster node
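A common way to do this is jps on every node: the master should show a Master process and each slave a Worker process, alongside the Hadoop and ZooKeeper daemons. The listing below is only a sketch; process IDs are placeholders and the exact Hadoop daemons depend on your cluster layout:
hadoop-master:~$ jps
2001 NameNode
2150 ResourceManager
2290 QuorumPeerMain
2410 JobHistoryServer
2630 Master
2855 Jps
hadoop-slave1:~$ jps
1801 DataNode
1920 NodeManager
2005 QuorumPeerMain
2210 Worker
2440 Jps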
(5) Open a browser and go to "http://master:8080" to view the Spark cluster status
If a 404 page appears even though the firewall is disabled, the port may already be in use.
In that case, modify the following configuration; all participating cluster nodes need the change:
vim /opt/spark-2.1.0-bin-hadoop2.7/sbin/start-master.sh
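In Spark 2.1.0, start-master.sh falls back to port 8080 when SPARK_MASTER_WEBUI_PORT is not set; the relevant block looks roughly like the one below. Changing 8080 to a free port (8090 here is only an example), or exporting SPARK_MASTER_WEBUI_PORT in spark-env.sh instead, and then restarting Spark should resolve the conflict:
if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8090   # default was 8080; use any free port
fi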