Spark SQL Notes (16): Spark on YARN

Published: 2023/12/15

1 The four Spark run modes

Whichever mode is used, the Spark application code does not change; the mode is chosen at submission time through the --master parameter.

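  • Local: used during development;
  • Standalone: Spark's built-in cluster manager; a Standalone cluster requires the Spark environment to be deployed on each of the cluster's machines;
  • YARN: recommended for production;
  • Mesos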
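
Since only the --master value changes between modes, the mapping can be sketched as a small shell function. This is an illustration, not part of the original notes: master_for_mode is a hypothetical helper, master-host is a placeholder host name, and 7077 and 5050 are the usual default ports for a Standalone master and a Mesos master.

```shell
#!/usr/bin/env bash
# Map a run mode to the value passed to spark-submit --master.
# master-host is a hypothetical host name; 7077 and 5050 are the usual
# default ports for a Standalone master and a Mesos master.
master_for_mode() {
  case "$1" in
    local)      echo "local[*]" ;;
    standalone) echo "spark://master-host:7077" ;;
    yarn)       echo "yarn" ;;
    mesos)      echo "mesos://master-host:5050" ;;
    *)          echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the --master value for each mode; the application jar stays the same.
for mode in local standalone yarn mesos; do
  printf '%-10s --master %s\n' "$mode" "$(master_for_mode "$mode")"
done
```

For YARN the value carries no host or port, because the ResourceManager address is read from the Hadoop configuration directory.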

    1.1 Overview

    • Spark supports pluggable cluster managers;
    • To YARN, a Spark application is just another client.

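    1.2 Spark on YARN deploy modes

    1.2.1 client mode

    • The Driver runs on the client side (the machine that submits the Spark job);
    • The client communicates with the containers it was granted in order to schedule and run the job, so the client must not exit;
    • Logs are printed to the console, which is convenient for testing.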

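    1.2.2 cluster mode

    • The Driver runs inside the ApplicationMaster;
    • The client can be closed as soon as the job has been submitted, because the job is already running on YARN;
    • Logs do not appear in the terminal because they stay with the Driver; they can only be retrieved with yarn logs -applicationId <app ID>.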
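
The two modes differ only in where the Driver runs, and on the command line that comes down to the --deploy-mode flag used together with --master yarn. The sketch below prints the command for each mode rather than running it; submit_cmd is a hypothetical helper, and the class, jar, and resource settings are the ones from this article's SparkPi test.

```shell
#!/usr/bin/env bash
# Print (not run) a spark-submit command for the given YARN deploy mode.
# The class and jar are the ones from this article's SparkPi example.
submit_cmd() {
  local deploy_mode="$1"   # "client" or "cluster"
  case "$deploy_mode" in
    client|cluster) ;;
    *) echo "deploy mode must be client or cluster" >&2; return 1 ;;
  esac
  echo ./bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode "$deploy_mode" \
    --executor-memory 1G \
    --num-executors 1 \
    ./examples/jars/spark-examples_2.11-2.1.3.jar 5
}

submit_cmd client    # Driver runs on the submitting machine
submit_cmd cluster   # Driver runs inside the ApplicationMaster
```

Spark 2.x reports the older --master yarn-cluster spelling as deprecated and asks for master "yarn" with an explicit deploy mode, which is the form printed here.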

    1.3 Setting HADOOP_CONF_DIR or YARN_CONF_DIR

    The variable can be set in either of the following ways:

  • Export it in the shell: export HADOOP_CONF_DIR=/home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/etc/hadoop
  • Set it in spark-env.sh
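
Before submitting, it can help to confirm that one of the two variables really points at a usable configuration directory. The function below is a minimal sketch: check_yarn_conf is a hypothetical helper, and treating the presence of yarn-site.xml as the test for a usable directory is an assumption, not something Spark itself documents as a requirement.

```shell
#!/usr/bin/env bash
# Succeed if HADOOP_CONF_DIR (or, failing that, YARN_CONF_DIR) names a
# directory that contains yarn-site.xml; otherwise explain why and fail.
# Looking for yarn-site.xml is an assumption made for this sketch.
check_yarn_conf() {
  local dir="${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"
  if [ -z "$dir" ]; then
    echo "neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set" >&2
    return 1
  fi
  if [ ! -f "$dir/yarn-site.xml" ]; then
    echo "no yarn-site.xml under $dir" >&2
    return 1
  fi
  echo "using YARN config from $dir"
}
```

One way to use it is as a guard in front of the submission, for example check_yarn_conf && ./bin/spark-submit followed by the usual arguments.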

    1.4 Testing

    1.4.1 Starting YARN

    [hadoop@node1 ~]$ start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    18/11/16 20:36:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [node1]
    node1: starting namenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-node1.out
    node2: starting datanode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-node2.out
    node3: starting datanode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-node3.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: starting secondarynamenode, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-node1.out
    18/11/16 20:36:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    starting yarn daemons
    starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-node1.out
    node2: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node2.out
    node3: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node3.out
    [hadoop@node1 ~]$
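
The first line of that output says start-all.sh is deprecated in favour of start-dfs.sh and start-yarn.sh. A possible wrapper around the two recommended scripts is sketched below; start_cluster is a hypothetical name, and the sketch assumes Hadoop's sbin directory is on PATH, as it appears to be in the session above.

```shell
#!/usr/bin/env bash
# Start HDFS first, then YARN, using the two scripts that the
# deprecation notice recommends. Stops at the first missing or failing script.
start_cluster() {
  local script
  for script in start-dfs.sh start-yarn.sh; do
    if ! command -v "$script" >/dev/null 2>&1; then
      echo "$script not found on PATH" >&2
      return 1
    fi
    "$script" || return 1
  done
}
```

Calling start_cluster on node1 should bring up the same daemons as start-all.sh without triggering the deprecation message.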

    Once YARN is up, the ResourceManager web UI is available at http://node1:8088/cluster

    1.4.2 Submitting a job

    • client mode
    [hadoop@node1 spark-2.1.3-bin-2.6.0-cdh5.7.0]$ ./bin/spark-submit \
    > --class org.apache.spark.examples.SparkPi \
    > --master yarn \
    > --executor-memory 1G \
    > --num-executors 1 \
    > ./examples/jars/spark-examples_2.11-2.1.3.jar \
    > 5
    18/11/16 20:49:35 INFO spark.SparkContext: Running Spark version 2.1.3
    18/11/16 20:49:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/11/16 20:49:36 INFO spark.SecurityManager: Changing view acls to: hadoop
    18/11/16 20:49:36 INFO spark.SecurityManager: Changing modify acls to: hadoop
    18/11/16 20:49:36 INFO spark.SecurityManager: Changing view acls groups to:
    18/11/16 20:49:36 INFO spark.SecurityManager: Changing modify acls groups to:
    • cluster mode
    [hadoop@node1 spark-2.1.3-bin-2.6.0-cdh5.7.0]$ ./bin/spark-submit \
    > --class org.apache.spark.examples.SparkPi \
    > --master yarn-cluster \
    > --executor-memory 1G \
    > --num-executors 1 \
    > ./examples/jars/spark-examples_2.11-2.1.3.jar \
    > 5
    Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
    18/11/16 20:53:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/11/16 20:53:19 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.30.131:8032
    18/11/16 20:53:19 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
    ............................
    18/11/16 20:53:38 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:39 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:40 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:41 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:42 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:43 INFO yarn.Client: Application report for application_1542371790854_0006 (state: RUNNING)
    18/11/16 20:53:44 INFO yarn.Client: Application report for application_1542371790854_0006 (state: FINISHED)
    18/11/16 20:53:44 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: 192.168.30.133
         ApplicationMaster RPC port: 0
         queue: root.hadoop
         start time: 1542372803673
         final status: SUCCEEDED
         tracking URL: http://node1:8088/proxy/application_1542371790854_0006/A
         user: hadoop
    18/11/16 20:53:44 INFO util.ShutdownHookManager: Shutdown hook called
    18/11/16 20:53:44 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-619e92b6-4fb4-47ac-ab8f-4836ccf9d086

    https://spark.apache.org/docs/2.1.3/running-on-yarn.html

    [hadoop@node1 ~]$ yarn logs -applicationId application_1542371790854_0006
    18/11/16 20:58:55 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.30.131:8032
    18/11/16 20:58:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    /tmp/logs/hadoop/logs/application_1542371790854_0006does not exist.
    Log aggregation has not completed or is not enabled.
    [hadoop@node1 ~]$
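
The last message suggests that log aggregation may simply be switched off. In YARN this is governed by the yarn.log-aggregation-enable property in yarn-site.xml, which defaults to false. The function below is a rough way to read that setting. It is a sketch: log_aggregation_setting is a hypothetical helper, and it relies on plain text matching that assumes <name> and <value> sit on separate lines, as in typical Hadoop config files.

```shell
#!/usr/bin/env bash
# Print the value of yarn.log-aggregation-enable from a yarn-site.xml file,
# or "false" (YARN's default) if the property is absent.
# Rough text matching, not a real XML parser: assumes <name> and <value>
# each sit on their own line.
log_aggregation_setting() {
  local file="$1"
  awk '
    /<name>yarn\.log-aggregation-enable<\/name>/ { found = 1; next }
    found && /<value>/ {
      # Strip everything up to <value> and from </value> onward.
      gsub(/.*<value>[ \t]*|[ \t]*<\/value>.*/, "")
      print
      printed = 1
      exit
    }
    END { if (!printed) print "false" }
  ' "$file"
}
```

With the configuration directory used in this article, the call would be log_aggregation_setting "$HADOOP_CONF_DIR/yarn-site.xml". If it prints false, setting the property to true and restarting YARN should make the logs of later applications available to yarn logs once they finish.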
