Big Data Applications: Apache Doris SSB Benchmark Test

Published: 2023/12/18

  • Download Doris's ssb-tools

    https://github.com/apache/doris

    Upload doris-master\tools\ssb-tools to lsyk01:/softw

  • Download the ssb-dbgen package separately (the virtual machine has no internet access)

    https://palo-cloud-repo-bd.bd.bcebos.com/baidu-doris-release/ssb-dbgen-linux.tar.gz

    Upload it to lsyk01:/softw/ssb-tools

  • Edit the script /softw/ssb-tools/build-ssb-dbgen.sh

    vi /softw/ssb-tools/build-ssb-dbgen.sh

    # Change it as follows: skip the download and extract the package uploaded earlier
    # download ssb-dbgen first
    if [[ -d $SSB_DBGEN_DIR ]]; then
        echo "Dir $CURDIR/ssb-dbgen/ already exists. No need to download."
        echo "If you want to download ssb-dbgen again, please delete this dir first."
    else
        #curl https://palo-cloud-repo-bd.bd.bcebos.com/baidu-doris-release/ssb-dbgen-linux.tar.gz | tar xz -C $CURDIR/
        tar -zxvf $CURDIR/ssb-dbgen-linux.tar.gz -C $CURDIR/
    fi

  • Build ssb-dbgen

    cd /softw/ssb-tools
    sh build-ssb-dbgen.sh

  • Generate the test data

    sh gen-ssb-data.sh -s 40

    du -sh *
    110M    customer.tbl
    228K    date.tbl
    2.4G    lineorder.tbl.1
    2.4G    lineorder.tbl.10
    2.4G    lineorder.tbl.2
    2.4G    lineorder.tbl.3
    2.4G    lineorder.tbl.4
    2.4G    lineorder.tbl.5
    2.4G    lineorder.tbl.6
    2.4G    lineorder.tbl.7
    2.4G    lineorder.tbl.8
    2.4G    lineorder.tbl.9
    99M     part.tbl
    6.5M    supplier.tbl

    wc -l *
    1200000  customer.tbl
    2556     date.tbl
    23996604 lineorder.tbl.1
    24001837 lineorder.tbl.10
    23992403 lineorder.tbl.2
    23996070 lineorder.tbl.3
    24003563 lineorder.tbl.4
    24005968 lineorder.tbl.5
    24005179 lineorder.tbl.6
    23998304 lineorder.tbl.7
    24002460 lineorder.tbl.8
    24009902 lineorder.tbl.9
    1200000  part.tbl
    80000    supplier.tbl
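
As a quick sanity check on the generated data, the per-shard line counts can be summed to get the number of lineorder rows to expect after loading. The following is a minimal sketch, assuming `wc -l` style input (count first, then file name); the helper name `sum_lineorder_rows` is made up for illustration and is not part of ssb-tools.

```shell
#!/bin/sh
# Sum the line counts of all lineorder shards from `wc -l`-style input.
# sum_lineorder_rows is a hypothetical helper, not part of ssb-tools.
sum_lineorder_rows() {
    awk '$2 ~ /^lineorder\.tbl\./ { total += $1 } END { printf "%d\n", total }'
}

# Line counts copied from the wc -l listing above.
sum_lineorder_rows <<'EOF'
1200000 customer.tbl
2556 date.tbl
23996604 lineorder.tbl.1
24001837 lineorder.tbl.10
23992403 lineorder.tbl.2
23996070 lineorder.tbl.3
24003563 lineorder.tbl.4
24005968 lineorder.tbl.5
24005179 lineorder.tbl.6
23998304 lineorder.tbl.7
24002460 lineorder.tbl.8
24009902 lineorder.tbl.9
1200000 part.tbl
80000 supplier.tbl
EOF
```

With these counts the function prints 240012290, the same figure that Q5.1 reports as sum(cnt) in the test results. On the real files, something like `wc -l * | sum_lineorder_rows` in the data directory should give the same number, since the "total" line that wc appends does not match the shard pattern.
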
  • Configure doris-cluster.conf

    # Any of FE host
    export FE_HOST='lsyk01'
    # http_port in fe.conf
    export FE_HTTP_PORT=8030
    # query_port in fe.conf
    export FE_QUERY_PORT=9030
    # Doris username
    export USER='root'
    # Doris password
    export PASSWORD='fa'
    # The database where SSB tables located
    export DB='ssb'

  • Create the tables

    sh ./create-ssb-tables.sh
    sh ./create-ssb-flat-table.sh

  • Load the data

    sh ./load-ssb-dimension-data.sh
    sh ./load-ssb-fact-data.sh -c 5

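    This step uses a lot of memory.

    The load took 8 minutes. The loaded data takes up about 6.8 GB, compared with 24 GB for the original files.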
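
To put the two sizes in proportion, the ratio can be worked out directly. This is only a sketch of the arithmetic: both figures are the approximate ones quoted above, and `size_ratio` is a hypothetical helper name.

```shell
#!/bin/sh
# Ratio of raw size to stored size; both arguments must use the same unit.
# size_ratio is a hypothetical helper used only for this illustration.
size_ratio() {
    awk -v raw="$1" -v stored="$2" 'BEGIN { printf "%.2f\n", raw / stored }'
}

size_ratio 24 6.8   # sizes in GB as quoted above; both are approximate
```

This prints 3.53, so the stored data is roughly 3.5 times smaller than the source files.
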

    mysql> select count(1) from ssb.lineorder;

    Judging by this result, Apache Doris's caching is impressive.

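  • Load the flat wide table

    sh ./load-ssb-flat-data.sh

    The script fails with an error.

    Looking at the code shows that it does not specify a password.

    Adding the -p password option fixes this.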
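
The exact line in load-ssb-flat-data.sh is not shown here, so the following is only a sketch of how the mysql client options could be assembled from the variables in doris-cluster.conf so that the password is included. The helper name `doris_mysql_opts` is hypothetical; `-h`, `-P`, `-u` and `-p` are standard mysql client options.

```shell
#!/bin/sh
# Build mysql client options from the doris-cluster.conf variables.
# doris_mysql_opts is a hypothetical helper; it only prints the options.
doris_mysql_opts() {
    opts="-h$FE_HOST -P$FE_QUERY_PORT -u$USER"
    # Add -p only when a password is set; there must be no space after -p.
    if [ -n "$PASSWORD" ]; then
        opts="$opts -p$PASSWORD"
    fi
    printf '%s\n' "$opts"
}

# Values from the doris-cluster.conf shown earlier.
FE_HOST='lsyk01' FE_QUERY_PORT=9030 USER='root' PASSWORD='fa'
doris_mysql_opts
```

A call such as `mysql $(doris_mysql_opts) -D"$DB" -e 'select 1'` would then authenticate. The unquoted substitution relies on word splitting, so this sketch assumes the password contains no spaces.
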

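    The load then ran for 25 minutes and failed again, this time because it ran out of memory.

    Pulling the statement out of the script and running it by hand on half a year of data at a time takes about 100 seconds per batch, which is faster than the official script.

    It crashed.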

    INSERT INTO ssb.lineorder_flat
    SELECT
        LO_ORDERDATE, LO_ORDERKEY, LO_LINENUMBER, LO_CUSTKEY, LO_PARTKEY, LO_SUPPKEY,
        LO_ORDERPRIORITY, LO_SHIPPRIORITY, LO_QUANTITY, LO_EXTENDEDPRICE, LO_ORDTOTALPRICE,
        LO_DISCOUNT, LO_REVENUE, LO_SUPPLYCOST, LO_TAX, LO_COMMITDATE, LO_SHIPMODE,
        C_NAME, C_ADDRESS, C_CITY, C_NATION, C_REGION, C_PHONE, C_MKTSEGMENT,
        S_NAME, S_ADDRESS, S_CITY, S_NATION, S_REGION, S_PHONE,
        P_NAME, P_MFGR, P_CATEGORY, P_BRAND, P_COLOR, P_TYPE, P_SIZE, P_CONTAINER
    FROM (
        SELECT
            lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_suppkey, lo_orderdate,
            lo_orderpriority, lo_shippriority, lo_quantity, lo_extendedprice, lo_ordtotalprice,
            lo_discount, lo_revenue, lo_supplycost, lo_tax, lo_commitdate, lo_shipmode
        FROM ssb.lineorder
        -- WHERE ${con}
    ) l
    INNER JOIN ssb.customer c ON (c.c_custkey = l.lo_custkey)
    INNER JOIN ssb.supplier s ON (s.s_suppkey = l.lo_suppkey)
    INNER JOIN ssb.part p ON (p.p_partkey = l.lo_partkey);

    select 'part', count(*) from ssb.part union all
    select 'customer', count(*) from ssb.customer union all
    select 'supplier', count(*) from ssb.supplier union all
    select 'date', count(*) from ssb.dates union all
    select 'lineorder', count(*) from ssb.lineorder union all
    select 'lineorder_flat', count(*) from ssb.lineorder_flat;
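
The half-year batching mentioned above can be sketched as a generator of date-range conditions, one per batch, each of which could stand in for `${con}` in the commented-out WHERE clause. How the batches were actually produced is not shown in this post, so this is an assumption: `half_year_conditions` is a hypothetical helper, and the 1992 to 1998 range is taken from the dates that appear in the SSB queries.

```shell
#!/bin/sh
# Print one WHERE condition per half-year for the given range of years.
# half_year_conditions is a hypothetical helper; the 1992-1998 range used
# below is an assumption based on the dates that appear in the SSB queries.
half_year_conditions() {
    year=$1
    while [ "$year" -le "$2" ]; do
        echo "lo_orderdate >= ${year}0101 AND lo_orderdate <= ${year}0630"
        echo "lo_orderdate >= ${year}0701 AND lo_orderdate <= ${year}1231"
        year=$((year + 1))
    done
}

half_year_conditions 1992 1998
```

This prints 14 conditions, starting with `lo_orderdate >= 19920101 AND lo_orderdate <= 19920630`. Running the INSERT once per condition keeps each join small, which is one plausible reason the batched run stays within memory where the single statement does not.
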

  • Test results

    set global enable_vectorized_engine=1;
    set global parallel_fragment_exec_instance_num=8;
    set global exec_mem_limit=48G;
    set global batch_size=4096;
    set global enable_projection=true;
    set global runtime_filter_mode=global;

    --Q1.1 0.68
    SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
    FROM ssb.lineorder_flat
    WHERE LO_ORDERDATE >= 19930101 AND LO_ORDERDATE <= 19931231
      AND LO_DISCOUNT BETWEEN 1 AND 3
      AND LO_QUANTITY < 25;

    --Q1.2 0.12
    SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
    FROM ssb.lineorder_flat
    WHERE LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19940131
      AND LO_DISCOUNT BETWEEN 4 AND 6
      AND LO_QUANTITY BETWEEN 26 AND 35;

    --Q1.3 0.78
    SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
    FROM ssb.lineorder_flat
    WHERE weekofyear(LO_ORDERDATE) = 6
      AND LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19941231
      AND LO_DISCOUNT BETWEEN 5 AND 7
      AND LO_QUANTITY BETWEEN 26 AND 35;

    --Q2.1 4.47
    SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
    FROM ssb.lineorder_flat
    WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
    GROUP BY YEAR, P_BRAND
    ORDER BY YEAR, P_BRAND;

    --Q2.2 2.69
    SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
    FROM ssb.lineorder_flat
    WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228' AND S_REGION = 'ASIA'
    GROUP BY YEAR, P_BRAND
    ORDER BY YEAR, P_BRAND;

    --Q2.3 2.07
    SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
    FROM ssb.lineorder_flat
    WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE'
    GROUP BY YEAR, P_BRAND
    ORDER BY YEAR, P_BRAND;

    --Q3.1 4.10
    SELECT C_NATION, S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
    FROM ssb.lineorder_flat
    WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA'
      AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
    GROUP BY C_NATION, S_NATION, YEAR
    ORDER BY YEAR ASC, revenue DESC;

    --Q3.2 3.99
    SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
    FROM ssb.lineorder_flat
    WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES'
      AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
    GROUP BY C_CITY, S_CITY, YEAR
    ORDER BY YEAR ASC, revenue DESC;

    --Q3.3 1.76
    SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
    FROM ssb.lineorder_flat
    WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5')
      AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
    GROUP BY C_CITY, S_CITY, YEAR
    ORDER BY YEAR ASC, revenue DESC;

    --Q3.4 0.1
    SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
    FROM ssb.lineorder_flat
    WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5')
      AND LO_ORDERDATE >= 19971201 AND LO_ORDERDATE <= 19971231
    GROUP BY C_CITY, S_CITY, YEAR
    ORDER BY YEAR ASC, revenue DESC;

    --Q4.1 5.97
    SELECT (LO_ORDERDATE DIV 10000) AS YEAR, C_NATION, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
    FROM ssb.lineorder_flat
    WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA'
      AND P_MFGR IN ('MFGR#1', 'MFGR#2')
    GROUP BY YEAR, C_NATION
    ORDER BY YEAR ASC, C_NATION ASC;

    --Q4.2 1.48
    SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_NATION, P_CATEGORY, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
    FROM ssb.lineorder_flat
    WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA'
      AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231
      AND P_MFGR IN ('MFGR#1', 'MFGR#2')
    GROUP BY YEAR, S_NATION, P_CATEGORY
    ORDER BY YEAR ASC, S_NATION ASC, P_CATEGORY ASC;

    --Q4.3 1.13
    SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_CITY, P_BRAND, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
    FROM ssb.lineorder_flat
    WHERE S_NATION = 'UNITED STATES'
      AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231
      AND P_CATEGORY = 'MFGR#14'
    GROUP BY YEAR, S_CITY, P_BRAND
    ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;

    --Q5.1 58.79
    select count(1), sum(cnt)
    from (
        select LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND, count(1) as cnt, sum(LO_SUPPLYCOST)
        from ssb.lineorder_flat
        group by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
        order by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
    ) t;
    --3218808 240012290

    --Q5.2 5.43
    select count(1), sum(cnt)
    from (
        select LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND, count(1) as cnt, sum(LO_SUPPLYCOST)
        from ssb.lineorder_flat
        where S_NATION = 'UNITED STATES' AND P_CATEGORY = 'MFGR#14'
        group by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
        order by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
    ) t;
    --117571

    --Q6.1 58.79
    select count(1), sum(cnt)
    from (
        select LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND, count(1) as cnt, sum(LO_SUPPLYCOST) as sm, count(distinct S_NAME) as dcnt
        from ssb.lineorder_flat
        group by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
        order by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
    ) t;
    --Error: out of memory

    --Q6.2 10.81
    select count(1), sum(cnt)
    from (
        select LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND, count(1) as cnt, sum(LO_SUPPLYCOST) as sm, count(distinct S_NAME) as dcnt
        from ssb.lineorder_flat
        where S_NATION = 'UNITED STATES' AND P_CATEGORY = 'MFGR#14'
        group by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
        order by LO_ORDERPRIORITY, LO_SHIPMODE, P_COLOR, P_BRAND
    ) t;
    --117571 386092
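
For a single headline figure, the timings of the 13 standard flat-table queries (Q1.1 through Q4.3) can be added up. This is a minimal sketch that assumes the number after each query ID is the elapsed time in seconds; `sum_times` is a hypothetical helper name.

```shell
#!/bin/sh
# Sum the second column of "query time" pairs read from standard input.
# sum_times is a hypothetical helper for this illustration.
sum_times() {
    awk '{ total += $2 } END { printf "%.2f\n", total }'
}

# Timings copied from the results above (assumed to be seconds).
sum_times <<'EOF'
Q1.1 0.68
Q1.2 0.12
Q1.3 0.78
Q2.1 4.47
Q2.2 2.69
Q2.3 2.07
Q3.1 4.10
Q3.2 3.99
Q3.3 1.76
Q3.4 0.1
Q4.1 5.97
Q4.2 1.48
Q4.3 1.13
EOF
```

This prints 29.34. The extra Q5 and Q6 queries are left out because they are not part of the standard SSB set and Q6.1 did not complete.
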
