Spark SQL Notes (16): Spark on YARN
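1 The four run modes of Spark
Whichever mode is used, the Spark application code stays the same; the mode is chosen only at submit time, through the --master parameter.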
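As a rough illustration of that point, the sketch below prints the spark-submit line for each master URL while the application stays untouched. It assumes the four modes are local, standalone, YARN and Mesos; the jar name, main class, hosts and ports are hypothetical placeholders, not values from these notes.

```shell
# Hypothetical application jar and main class; replace with your own.
APP_JAR="my-app.jar"
APP_CLASS="com.example.MyApp"

# Print (rather than run) the spark-submit line for a given master URL.
# Only the --master value changes; the application itself is identical.
submit_cmd() {
  echo "spark-submit --master $1 --class $APP_CLASS $APP_JAR"
}

submit_cmd "local[2]"            # local mode with 2 worker threads
submit_cmd "spark://node1:7077"  # standalone cluster (host and port assumed)
submit_cmd "yarn"                # YARN; the cluster is located via HADOOP_CONF_DIR
submit_cmd "mesos://node1:5050"  # Mesos (host and port assumed)
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without Spark installed.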
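1.1 Overview
- Spark supports pluggable cluster managers;
- From YARN's point of view, a Spark application is just another client;
1.2 Spark on YARN modes
1.2.1 client mode
- The Driver runs on the client side (the machine that submits the Spark job);
- The client communicates with the containers it has been granted in order to schedule and execute the job, so the client must not exit;
- Logs are printed to the console, which is convenient for testing.
1.2.2 cluster mode
- The Driver runs inside the Application Master;
- The client can be closed as soon as the job has been submitted, because the job is already running on YARN;
- Logs are not visible in the terminal, because they live with the Driver; they can only be retrieved with yarn logs -applicationId <app ID>.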
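1.3 Setting HADOOP_CONF_DIR or YARN_CONF_DIR
Spark reads the Hadoop client configuration from the directory these variables point to. There are several ways to set them: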
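One common approach, shown here as a sketch rather than as the author's own list, is to export the variable in the shell that runs spark-submit. The etc/hadoop subdirectory is an assumption based on the Hadoop install path that appears in the logs in section 1.4.1; adjust it to wherever core-site.xml and yarn-site.xml actually live.

```shell
# Assumed location: etc/hadoop under the Hadoop install seen in the startup logs.
export HADOOP_CONF_DIR=/home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/etc/hadoop

# Sanity check: report whether the client-side config files are present.
for f in core-site.xml yarn-site.xml; do
  if [ -f "$HADOOP_CONF_DIR/$f" ]; then
    echo "found $f"
  else
    echo "missing $f in $HADOOP_CONF_DIR"
  fi
done
```

The same export line can also be placed in $SPARK_HOME/conf/spark-env.sh so that it applies to every submission from that machine.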
1.4 Testing
1.4.1 Starting YARN
[hadoop@node1 ~]$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/11/16 20:36:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [node1]
node1: starting namenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-node1.out
node2: starting datanode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-node2.out
node3: starting datanode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-node3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-node1.out
18/11/16 20:36:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-node1.out
node2: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node2.out
node3: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node3.out
[hadoop@node1 ~]$
YARN web UI: http://node1:8088/cluster
1.4.2 Submitting
- client mode
- cluster mode
https://spark.apache.org/docs/2.1.3/running-on-yarn.html
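The two submissions differ only in the --deploy-mode value. The sketch below prints a SparkPi submission for each mode; it assumes SPARK_HOME points at a Spark 2.x install and that the bundled examples jar sits under examples/jars, which may differ between distributions. The /path/to/spark default is a placeholder.

```shell
# Assumed: SPARK_HOME points at a Spark 2.x install; the default is a placeholder.
SPARK_HOME="${SPARK_HOME:-/path/to/spark}"
# Typical location of the bundled examples jar; verify it on your install.
EXAMPLES_JAR="$SPARK_HOME/examples/jars/spark-examples*.jar"

# Print the submit line for a given deploy mode (client or cluster).
yarn_submit_cmd() {
  echo "$SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode $1 $EXAMPLES_JAR 10"
}

yarn_submit_cmd client   # Driver runs on the submitting machine, logs on the console
yarn_submit_cmd cluster  # Driver runs inside the Application Master on YARN
```

To actually submit, run the printed line in a shell where HADOOP_CONF_DIR or YARN_CONF_DIR is set.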
[hadoop@node1 ~]$ yarn logs -applicationId application_1542371790854_0006
18/11/16 20:58:55 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.30.131:8032
18/11/16 20:58:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/tmp/logs/hadoop/logs/application_1542371790854_0006does not exist.
Log aggregation has not completed or is not enabled.
[hadoop@node1 ~]$
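The last message of that output suggests checking whether log aggregation is switched on. In YARN this is controlled by the yarn.log-aggregation-enable property in yarn-site.xml. The following sketch checks for it; the helper name aggregation_enabled and the file location are assumptions, and the check expects the name and value elements to be on adjacent lines.

```shell
# Succeeds if yarn.log-aggregation-enable is set to true in the given yarn-site.xml.
# Assumes <name> and <value> sit on adjacent lines, as in typical hand-written configs.
aggregation_enabled() {
  grep -A1 '<name>yarn.log-aggregation-enable</name>' "$1" | grep -q '<value>true</value>'
}

# Assumed location, derived from the Hadoop install path seen in the startup logs.
YARN_SITE="${HADOOP_CONF_DIR:-/home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/etc/hadoop}/yarn-site.xml"

if [ -f "$YARN_SITE" ] && aggregation_enabled "$YARN_SITE"; then
  echo "log aggregation enabled"
else
  echo "log aggregation not enabled (or $YARN_SITE not found)"
fi
```

If the property is missing or false, setting it to true and restarting YARN is the usual remedy; applications that finished before the change will not have aggregated logs.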