
Stress-testing Flink with HiBench — [Original] Big Data Fundamentals: Benchmark (1) HiBench

Published: 2025/3/19

HiBench 7

Official: https://github.com/intel-hadoop/HiBench

一 Overview

HiBench is a big data benchmark suite that helps evaluate different big data frameworks in terms of speed, throughput and system resource utilizations. It contains a set of Hadoop, Spark and streaming workloads, including Sort, WordCount, TeraSort, Sleep, SQL, PageRank, Nutch indexing, Bayes, Kmeans, NWeight and enhanced DFSIO, etc. It also contains several streaming workloads for Spark Streaming, Flink, Storm and Gearpump.

There are 19 workloads in HiBench in total.

Supported Hadoop/Spark/Flink/Storm/Gearpump releases:

Hadoop: Apache Hadoop 2.x, CDH5, HDP

Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x

Flink: 1.0.3

Storm: 1.0.1

Gearpump: 0.8.1

Kafka: 0.8.2.2

二 Spark SQL testing

1 download

$ wget https://github.com/intel-hadoop/HiBench/archive/HiBench-7.0.tar.gz

$ tar xvf HiBench-7.0.tar.gz

$ cd HiBench-HiBench-7.0

2 build

1) build all

$ mvn -Dspark=2.1 -Dscala=2.11 clean package

2) build hadoopbench and sparkbench

$ mvn -Phadoopbench -Psparkbench -Dspark=2.1 -Dscala=2.11 clean package

3) build only the Spark SQL module

$ mvn -Psparkbench -Dmodules -Psql -Dspark=2.1 -Dscala=2.11 clean package

3 prepare

$ cp conf/hadoop.conf.template conf/hadoop.conf

$ vi conf/hadoop.conf

$ cp conf/spark.conf.template conf/spark.conf

$ vi conf/spark.conf
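At minimum these two files need to point HiBench at the local Hadoop and Spark installations. A sketch with placeholder paths (the key names follow the shipped templates; adjust the homes, the HDFS master, and the Spark master to your cluster):

```
# conf/hadoop.conf — minimal entries (paths are placeholders)
hibench.hadoop.home       /opt/hadoop
hibench.hdfs.master       hdfs://namenode:8020
hibench.hadoop.release    apache

# conf/spark.conf — minimal entries
hibench.spark.home        /opt/spark
hibench.spark.master      yarn-client
```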

$ vi conf/hibench.conf

# Data scale profile. Available value is tiny, small, large, huge, gigantic and bigdata.

# The definition of these profiles can be found in the workload's conf file i.e. conf/workloads/micro/wordcount.conf

hibench.scale.profile bigdata
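For reference, the profiles are defined per workload: conf/workloads/micro/wordcount.conf maps each profile name to a datagen size and the active one is selected through hibench.scale.profile, roughly like this (the sizes below are illustrative — check the file shipped with your HiBench version):

```
hibench.wordcount.tiny.datasize    32000
hibench.wordcount.small.datasize   320000000
hibench.wordcount.large.datasize   3200000000
...
hibench.workload.datasize  ${hibench.wordcount.${hibench.scale.profile}.datasize}
```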

4 run

The SQL tests fall into three categories: scan, aggregation, and join.

$ bin/workloads/sql/scan/prepare/prepare.sh

$ bin/workloads/sql/scan/spark/run.sh

The detailed configuration is in conf/workloads/sql/scan.conf.

After prepare, the test data is generated under /HiBench/Scan/Input on HDFS, and a report is written under report/scan/prepare/.

After run, a report (for example monitor.html) is written under report/scan/spark/, and the test data tables are visible in Hive's default database.

$ bin/workloads/sql/join/prepare/prepare.sh

$ bin/workloads/sql/join/spark/run.sh

$ bin/workloads/sql/aggregation/prepare/prepare.sh

$ bin/workloads/sql/aggregation/spark/run.sh

The remaining workloads follow the same pattern.
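All three workloads can be driven in one pass with a small wrapper script; a sketch, assuming it is run from the HiBench root directory (shown as a dry run that only prints the commands — drop the echo to actually execute them):

```shell
#!/bin/sh
# Run prepare + run for each SQL workload in sequence.
# Dry run: prints the scripts that would be invoked.
for wl in scan join aggregation; do
  echo "bin/workloads/sql/$wl/prepare/prepare.sh"
  echo "bin/workloads/sql/$wl/spark/run.sh"
done
```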

If prepare fails with an out-of-memory error, try modifying the submit command:

$ vi bin/functions/workload_functions.sh

local CMD="${HADOOP_EXECUTABLE} --config ${HADOOP_CONF_DIR} jar $job_jar $job_name $tail_arguments"

Format: hadoop jar -D mapreduce.reduce.memory.mb=5120 -D mapreduce.reduce.java.opts=-Xmx4608m
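Concretely, the edited line in bin/functions/workload_functions.sh would look roughly like this (a sketch — the -D options are inserted before the workload's own arguments so that Hadoop's generic option parsing can pick them up, which assumes the job's driver goes through ToolRunner; as noted below, this did not work for all jobs):

```
local CMD="${HADOOP_EXECUTABLE} --config ${HADOOP_CONF_DIR} jar $job_jar $job_name -D mapreduce.reduce.memory.mb=5120 -D mapreduce.reduce.java.opts=-Xmx4608m $tail_arguments"
```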

This turned out not to take effect, so instead try increasing the number of map tasks:

$ vi bin/functions/hibench_prop_env_mapping.py

NUM_MAPS="hibench.default.map.parallelism",

$ vi conf/hibench.conf

hibench.default.map.parallelism 5000

References:

https://github.com/intel-hadoop/HiBench/blob/master/docs/build-hibench.md

https://github.com/intel-hadoop/HiBench/blob/master/docs/run-sparkbench.md

