

阿龙学堂: Two Ways to Launch a Spark Job


1. Introduction

Spark sees heavy use in today's data analysis, both in offline analysis and in micro-batch processing. In most cases the Spark job is packaged and launched as a standalone project, but when Spark is integrated into a web application or a similar project, a separate submission step is not always wanted. That gives rise to the following two ways of launching a Spark job:

1.1 Method 1: spark-submit

1. Create a traditional Maven project that keeps the Spark code separate, and start by adding the POM coordinates:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>cn.alongxuetang.com</groupId>
    <artifactId>alongxuetang-root</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <spark.version>2.0.2</spark.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-compiler</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-reflect</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.5</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.7.5</version>
        </dependency>
        <!-- Used for writing to HBase -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.3.1</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>jcl-over-slf4j</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe</groupId>
            <artifactId>config</artifactId>
            <version>1.3.1</version>
        </dependency>
        <dependency>
            <groupId>it.nerdammer.bigdata</groupId>
            <artifactId>spark-hbase-connector_2.10</artifactId>
            <version>1.0.3</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>false</filtering>
                <!-- Optionally keep application.conf out of the jar so it can be
                     supplied at submit time instead:
                <excludes>
                    <exclude>application.conf</exclude>
                </excludes>
                -->
            </resource>
        </resources>
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.6</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <outputDirectory>target</outputDirectory>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
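
With the maven-scala-plugin and maven-assembly-plugin configured above, a plain Maven build should compile the Scala sources and emit a *-jar-with-dependencies.jar into target/; that self-contained fat jar is what gets submitted in step 3:

mvn clean package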

2. Write the Scala code:

package cn.java.spark.com

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {

  // Hard-coded input; in a real job this would come from HDFS or another source.
  val lines: String = "lines String hive hive hive def main def main Array"

  def main(args: Array[String]): Unit = {
    val sparkConf: SparkConf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(sparkConf)
    sc.setLogLevel("WARN")
    val line: RDD[String] = sc.parallelize(Seq(lines))
    val words: RDD[String] = line.flatMap(_.split(" "))
    val wordAnd1: RDD[(String, Int)] = words.map((_, 1))
    val result: RDD[(String, Int)] = wordAnd1.reduceByKey(_ + _)
    // Sort by count, descending, and bring the result back to the driver.
    val array: Array[(String, Int)] = result.sortBy(_._2, false).collect()
    array.foreach(println)
    sc.stop()
  }
}
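
For the hard-coded input string, the job should print something close to the following (hive occurs three times, def and main twice each; the order among words with equal counts is not guaranteed):

(hive,3)
(def,2)
(main,2)
(lines,1)
(String,1)
(Array,1)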

3. Launch the job:

#!/bin/bash
/usr/local/service/spark/bin/spark-submit \
  --class cn.java.spark.com.WordCount \
  --master yarn \
  --num-executors 10 \
  --driver-memory 2g \
  --executor-memory 5g \
  /home/hadoop/alongxuetang/servers/spark/prod/alongxuetang-rec-spark-1.0-SNAPSHOT-jar-with-dependencies.jar
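
Note that with --master yarn and no --deploy-mode flag, spark-submit defaults to client mode, so the driver (and its 2g of memory) runs on the machine executing this script; add --deploy-mode cluster if the driver should run inside YARN instead.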

1.2 Method 2: SparkLauncher

1. Add the Maven POM dependencies shown above. (The SparkLauncher and SparkAppHandle classes live in Spark's spark-launcher module, which spark-core should already pull in transitively.)

2. Write the Java code:

package cn.java.spark.com;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

import java.io.IOException;

/**
 * Launches the WordCount job programmatically via SparkLauncher,
 * then polls the handle until the application reaches a terminal state.
 */
public class Launcher {
    public static void main(String[] args) throws IOException {
        SparkAppHandle handler = new SparkLauncher()
                .setAppName("zhouyalong-apps-NB100")
                .setSparkHome("/usr/local/service/spark")
                .setMaster("yarn")
                .setConf("spark.driver.memory", "2g")
                .setConf("spark.executor.memory", "1g")
                .setConf("spark.executor.cores", "3")
                .setAppResource("/home/hadoop/data/zhouyalong/aggr/shidian-root-1.0-SNAPSHOT.jar")
                .setMainClass("cn.java.spark.com.WordCount")
                //.addAppArgs("I come from Launcher")
                //.setDeployMode("cluster")
                .startApplication(new SparkAppHandle.Listener() {
                    @Override
                    public void stateChanged(SparkAppHandle handle) {
                        System.out.println("********** state changed **********");
                    }

                    @Override
                    public void infoChanged(SparkAppHandle handle) {
                        System.out.println("********** info changed **********");
                    }
                });

        // Poll every 10 seconds until the application finishes or fails.
        while (!"FINISHED".equalsIgnoreCase(handler.getState().toString())
                && !"FAILED".equalsIgnoreCase(handler.getState().toString())) {
            System.out.println("id " + handler.getAppId());
            System.out.println("state " + handler.getState());
            try {
                Thread.sleep(10000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
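
One caveat about the loop above: it busy-polls every ten seconds and compares states by their string names. A leaner variant, shown below as a minimal sketch (it reuses the jar path, class name, and Spark home from the example, all of which are assumptions about the environment), blocks on a CountDownLatch that the listener releases once SparkAppHandle.State.isFinal() reports a terminal state (FINISHED, FAILED, KILLED, or LOST):

package cn.java.spark.com;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

import java.io.IOException;
import java.util.concurrent.CountDownLatch;

public class LatchLauncher {
    public static void main(String[] args) throws IOException, InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        SparkAppHandle handle = new SparkLauncher()
                .setAppName("zhouyalong-apps-NB100")
                .setSparkHome("/usr/local/service/spark")   // assumed Spark home
                .setMaster("yarn")
                .setAppResource("/home/hadoop/data/zhouyalong/aggr/shidian-root-1.0-SNAPSHOT.jar") // assumed jar path
                .setMainClass("cn.java.spark.com.WordCount")
                .startApplication(new SparkAppHandle.Listener() {
                    @Override
                    public void stateChanged(SparkAppHandle h) {
                        // isFinal() covers FINISHED, FAILED, KILLED and LOST.
                        if (h.getState().isFinal()) {
                            done.countDown();
                        }
                    }

                    @Override
                    public void infoChanged(SparkAppHandle h) {
                        // No-op; the app id becomes available here if needed.
                    }
                });

        done.await();  // block until the listener signals a terminal state
        System.out.println("final state: " + handle.getState());
    }
}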

3. Launch the job:

java -Djava.ext.dirs=/home/hadoop/data/alongxuetang/aggr \
  -cp alongxuetang-root-1.0-SNAPSHOT.jar \
  cn.java.spark.com.Launcher /usr/local/service/spark yarn
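
Two portability notes: the trailing /usr/local/service/spark yarn arguments are ignored by this Launcher, which hard-codes both values; and the java.ext.dirs extension mechanism was removed in JDK 9, so on a modern JDK the equivalent is to put that directory on the classpath with a wildcard, along the lines of:

java -cp "/home/hadoop/data/alongxuetang/aggr/*:alongxuetang-root-1.0-SNAPSHOT.jar" cn.java.spark.com.Launcher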

That completes the two ways of launching a Spark job; pick whichever fits your project. For more programming fundamentals and machine-learning material, follow the WeChat official account 【阿龙学堂】.
