- spark-shell
- Enter at the command prompt:
spark-shell
- Include dependency jars:
spark-shell --jars /path/myjar1.jar,/path/myjar2.jar
- Specify resources:
spark-shell --master yarn-client --driver-memory 16g --num-executors 60 --executor-memory 20g --executor-cores 2
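- Note: in Spark 2.x the yarn-client master string is deprecated; the equivalent form splits the master and the deploy mode:
spark-shell --master yarn --deploy-mode client --driver-memory 16g --num-executors 60 --executor-memory 20g --executor-cores 2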
- Automatically created objects: spark-shell pre-defines spark (a SparkSession) and sc (a SparkContext)
- Set the log level:
spark.sparkContext.setLogLevel("ERROR")
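- A quick sanity check using the pre-defined sc and spark (a minimal sketch: word count over an in-memory collection, the sample strings are only illustrative):
sc.parallelize(Seq("hello spark", "hello scala")).flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect()
spark.range(5).show()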
- IntelliJ configuration
- Edit the pom file to add the dependencies:
<properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.8</scala.version>
    <spark.version>2.2.0</spark.version>
    <hadoop.version>2.7.1</hadoop.version>
    <scala.compat.version>2.11</scala.compat.version>
</properties>
<!-- declare and import the shared dependencies -->
<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <!-- spark-sql provides SparkSession, which is used below -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
</dependencies>
- Define spark and sc
- Define spark:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("Word Count").getOrCreate()
- Define sc:
val sc = spark.sparkContext
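- Putting the two definitions together, a minimal runnable word-count skeleton (the object name, the local[*] master for running inside IntelliJ, and the sample strings are illustrative assumptions, not from the original):
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // master("local[*]") lets the job run directly inside IntelliJ; drop it when submitting to a cluster
    val spark = SparkSession.builder()
      .appName("Word Count")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // count words over a small in-memory collection
    val counts = sc.parallelize(Seq("hello spark", "hello scala"))
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
    counts.collect().foreach(println)

    spark.stop()
  }
}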