

Several ways to import and export with Sqoop and DataX


I. Sqoop

1. List databases

sqoop list-databases \
  --connect jdbc:mysql://hadoop01:3306 \
  --username root --password 123456

2. List tables

sqoop list-tables \
  --connect jdbc:mysql://hadoop01:3306/test \
  --username root --password 123456

3. Password files

Instead of passing --password on the command line, Sqoop can read the password from a file:

Local file: file:///home/user/password
HDFS file: /user/user/password

Note: editing the password file directly with vim causes errors, because vim appends a trailing newline and the password no longer matches. Write the file with shell redirection instead (echo -n suppresses the trailing newline):

echo -n "password" >> /home/taojiamin/data/passwd.pwd && chmod 400 /home/taojiamin/data/passwd.pwd

Using the password file in an import:

bin/sqoop import \
  --connect jdbc:mysql://com.hadoop05:3306/testdb \
  --username root \
  --password-file file:///home/hadoop/mypasswd \
  --table user \
  --target-dir /sqoop/input \
  -m 1 \
  --fields-terminated-by '\t' \
  --delete-target-dir
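The HDFS variant mentioned above works the same way. A minimal sketch, reusing the paths from the example above (the exact locations are assumptions):

# upload the password file to HDFS and restrict its permissions
hdfs dfs -put /home/taojiamin/data/passwd.pwd /user/user/password
hdfs dfs -chmod 400 /user/user/password

# reference the HDFS copy without the file:// scheme
sqoop import \
  --connect jdbc:mysql://com.hadoop05:3306/testdb \
  --username root \
  --password-file /user/user/password \
  --table user \
  --target-dir /sqoop/input \
  -m 1 \
  --delete-target-dir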

4. Importing data from MySQL into HDFS

1) Full import

sqoop import \
  --connect jdbc:mysql://hadoop01:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username root --password 123456 \
  --query "select * from user where \$CONDITIONS" \
  -m 1 \
  --split-by user_id \
  --fields-terminated-by '\t' \
  --compress \
  --compression-codec com.hadoop.compression.lzo.LzoCodec \
  --hive-drop-import-delims \
  --null-string '\\N' \
  --null-non-string '\\N' \
  --columns 'user_id,user_name,trade_time' \
  --delete-target-dir \
  --target-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs

(--table user can be used instead of --query; the two options are mutually exclusive.)

Parameter descriptions:
--connect: JDBC URL with host, port, and database name
--driver: JDBC driver class for MySQL
--table: MySQL table name (use either --table or --query, not both)
--query: free-form query; it must contain a WHERE clause with \$CONDITIONS, which Sqoop replaces with the split condition for each mapper (and with 1=0 when it only probes the table metadata)
-m: number of mappers, default 4
--split-by: column used to split the work among mappers; defaults to the primary key and must be specified if the table has none
--fields-terminated-by: field delimiter, default comma
--compress: enable compression (gzip by default); compression is off if this flag is omitted
--compression-codec: compression codec to use
--hive-drop-import-delims: strip Hive's default delimiters (^A, ^B, \n) from the data
--null-string: replacement value for NULLs in string columns
--null-non-string: replacement value for NULLs in non-string columns
--columns: columns to import; optional
--delete-target-dir: delete the target directory first if it already exists
--target-dir: target directory for the import

2) Incremental import

① Append mode

sqoop import \
  --connect jdbc:mysql://hadoop01:3306/test \
  --username root \
  --password 123456 \
  --query "select order_id, name from user where \$CONDITIONS" \
  --target-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs \
  --split-by user_id \
  -m 1 \
  --incremental append \
  --check-column user_id \
  --last-value 0

Parameter descriptions:
--incremental append: incremental import based on a monotonically increasing column; all rows whose value in that column is greater than the threshold are imported into Hadoop
--check-column: the increasing column (an int column such as an id)
--last-value: the threshold (int); only rows above it are imported

② lastmodified mode

sqoop import \
  --connect jdbc:mysql://192.168.xxx.xxx:3316/testdb \
  --username root \
  --password transwarp \
  --query "select order_id, name from order_table where \$CONDITIONS" \
  --target-dir /user/root/order_all \
  --split-by id \
  -m 4 \
  --incremental lastmodified \
  --merge-key order_id \
  --check-column time \
  --last-value "2014-11-09 21:00:00"

(Remember the --last-value timestamp: only rows modified at or after it are imported.)

Parameter descriptions:
--incremental lastmodified: incremental import based on a timestamp column; all rows whose timestamp is greater than or equal to the threshold are imported into Hadoop
--check-column: the timestamp column
--last-value: the threshold timestamp
--merge-key: merge column (usually the primary key); records with the same key are merged so updated rows replace earlier ones

5. Exporting plain (uncompressed) HDFS files into MySQL

sqoop export \
  --connect jdbc:mysql://hadoop01:3306/test \
  --username root \
  --password 123456 \
  --table u2 \
  --input-fields-terminated-by ',' \
  --export-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs2/*

6. Importing from MySQL into Hive (uncompressed)

sqoop import \
  --connect jdbc:mysql://hadoop01:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username root --password 123456 \
  --table user \
  --columns 'user_id,user_name,trade_time' \
  --null-string '\\N' \
  --null-non-string 0 \
  --delete-target-dir \
  --fields-terminated-by '\t' \
  --hive-import \
  --hive-overwrite \
  --hive-table qf24.sq1
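A quick way to spot-check the result, assuming the Hive CLI is available on the node that ran the import:

hive -e "SELECT * FROM qf24.sq1 LIMIT 5;"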

7. Incremental import

append:

sqoop import \
  --connect jdbc:mysql://hadoop01:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username root --password 123456 \
  --table user \
  -m 1 \
  --incremental append \
  --check-column id \
  --last-value 0 \
  --target-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs

lastmodified (using a timestamp):

在這里插入代碼片

8. Saving the import as a Sqoop job

sqoop job:

--create <job-id>          Create a new saved job
--delete <job-id>          Delete a saved job
--exec <job-id>            Run a saved job
--help                     Print usage instructions
--list                     List saved jobs
--meta-connect <jdbc-uri>  Specify JDBC connect string for the metastore
--show <job-id>            Show the parameters for a saved job
--verbose                  Print more information while working

List the saved jobs with: sqoop job --list

Create a job:

sqoop job --create myjob -- import \
  --connect jdbc:mysql://hadoop01:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username root --password 123456 \
  --table user \
  -m 1 \
  --incremental append \
  --check-column id \
  --last-value 0 \
  --target-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs

Run the job:

sqoop job --exec myjob
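The saved job keeps the incremental state, so after each --exec the stored last-value is updated automatically; you can inspect the saved parameters with the --show option listed above (--exec typically prompts for the database password unless the metastore is configured to store it):

sqoop job --show myjob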

The following two options apply when updating rows in a table that has a primary key (see the sketch below):

--update-key <primary-key-column> --update-mode allowinsert
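A minimal sketch of an upsert-style export using these options, assuming the u2 table from section 5 has user_id as its primary key:

sqoop export \
  --connect jdbc:mysql://hadoop01:3306/test \
  --username root \
  --password 123456 \
  --table u2 \
  --update-key user_id \
  --update-mode allowinsert \
  --input-fields-terminated-by ',' \
  --export-dir hdfs://hadoop01:9000/sqooptest/mysql2hdfs2/*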

II. DataX

Note: the target HDFS directory and the target table must be created in advance.
Run a job with: python /home/hadoop/app/datax/datax/bin/datax.py mysql2hive.json

1. MySQL to HDFS

vi mysql2hdfs.json

{"job": {"content": [{"reader": {"name": "mysqlreader", "parameter": {"column": ["user_id","user_name","trade_time"], "connection": [{"jdbcUrl": ["jdbc:mysql://hadoop02:3306/test"], "table": ["user"]}], "username": "root", "password": "123456"}}, "writer": {"name": "hdfswriter","parameter": {"defaultFS": "hdfs://hadoop02:9000","fileType": "orc","path": "/datax/mysql2hdfs/userorc","fileName": "m2huser","column": [{"name": "col1","type": "BIGINT"},{"name": "col2","type": "STRING"},{"name": "col3","type": "STRING"}],"writeMode": "append","fieldDelimiter": "\t","compress":"NONE"}}}], "setting": {"speed": {"channel": "1"}}} }

2. HDFS to MySQL

For MySQL numeric columns (int, bigint, and so on), declare the column type as LONG in the hdfsreader when loading from HDFS into MySQL.
vi hdfs2mysql.json:

{"job": {"content": [{"reader": {"name": "hdfsreader","parameter": {"path": "/datax/mysql2hdfs/userorc/*","defaultFS": "hdfs://hadoop02:9000","column": [ {"index": "0","type": "LONG"},{"index": "1","type": "STRING"},{"index": "2","type": "STRING"}],"fileType": "orc","encoding": "UTF-8","fieldDelimiter": "\t"}},"writer": {"name": "mysqlwriter", "parameter": {"column": ["user_id","user_name","trade_time"], "connection": [{"jdbcUrl": "jdbc:mysql://hadoop01:3306/test", "table": ["u2"]}], "password": "123456", "username": "root"}}}], "setting": {"speed": {"channel": "1"}}} }

3. MySQL to Hive

vi mysql2hive.json:

{"job": {"content": [{"reader": {"name": "mysqlreader","parameter": {"column": ["user_id","user_name","trade_time"], "connection": [{"jdbcUrl": ["jdbc:mysql://hadoop01:3306/test"],"table": ["u2"]}], "username": "root","password": "123456"}},"writer": {"name": "hdfswriter","parameter": {"defaultFS": "hdfs://hadoop02:9000","fileType": "text","path": "/user/hive/warehouse/qf24.db/dataxhive1","fileName": "user","column": [{"name": "col1","type": "BIGINT"},{"name": "col2","type": "STRING"},{"name": "col3","type": "STRING"}],"writeMode": "append","fieldDelimiter": "\t"}}}], "setting": {"speed": {"channel": "1"}}} }

4. Hive to MySQL
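Reading a Hive table back into MySQL is essentially the hdfsreader + mysqlwriter job from section 2, pointed at the Hive warehouse path with a text file type. A minimal sketch, reusing the paths, table, and column types from the configs above as assumptions:

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "hdfsreader",
          "parameter": {
            "path": "/user/hive/warehouse/qf24.db/dataxhive1/*",
            "defaultFS": "hdfs://hadoop02:9000",
            "column": [
              {"index": "0", "type": "LONG"},
              {"index": "1", "type": "STRING"},
              {"index": "2", "type": "STRING"}
            ],
            "fileType": "text",
            "encoding": "UTF-8",
            "fieldDelimiter": "\t"
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "column": ["user_id", "user_name", "trade_time"],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://hadoop01:3306/test",
                "table": ["u2"]
              }
            ],
            "username": "root",
            "password": "123456"
          }
        }
      }
    ],
    "setting": {
      "speed": {"channel": "1"}
    }
  }
}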

