Hadoop: Installing and Configuring Pig, and the Problems Encountered (Unresolved)

Published: 2025/5/22

1. Prerequisite: the Hadoop cluster is already configured and starts normally. My setup is as follows:

First, edit /etc/hosts (vim /etc/hosts) and add:

192.168.1.64 xuegod64

192.168.1.65 xuegod65

192.168.1.63 xuegod63

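(Copy the configured file to the other two machines. I configured it on xuegod64 and copied it with scp /etc/hosts xuegod63:/etc/. This step requires passwordless SSH login to be set up already; passwordless SSH is not covered here.)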
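The copy step above can be repeated for every remaining node with a small loop. The following is a minimal sketch rather than the author's own script: the function name distribute_hosts and the DRY_RUN switch are assumptions introduced for illustration, and a real run presumes passwordless SSH and write permission on /etc at the target.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): copy /etc/hosts to each node given.
# DRY_RUN is an assumed safety switch; it defaults to 1, which only prints the commands.
distribute_hosts() {
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp /etc/hosts ${host}:/etc/"
    else
      scp /etc/hosts "${host}:/etc/"
    fi
  done
}

# Host names are the ones used in this article.
distribute_hosts xuegod63 xuegod65
```

With DRY_RUN left at its default the script only prints the scp commands; setting DRY_RUN=0 performs the copy.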

2. Prepare the installation packages, as listed below

[hadoop@xuegod64 ~]$ ls

hadoop-2.4.1.tar.gz

pig-0.15.0.tar.gz

jdk-8u66-linux-x64.rpm

zookeeper-3.4.7.tar.gz (optional)

3. Configure /etc/profile

[hadoop@xuegod64 ~]$ vim /etc/profile # requires that root has first granted the hadoop user permission to edit this file

export JAVA_HOME=/usr/java/jdk1.8.0_66/

export HADOOP_HOME=/home/hadoop/hadoop-2.4.1/

export HBASE_HOME=/home/hadoop/hbase-1.1.2/

export ZOOKEEPER_HOME=/home/hadoop/zookeeper-3.4.7/

export PIG_HOME=/home/hadoop/pig-0.15.0/

export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$PIG_HOME/bin:$PATH
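After reloading the profile (for example with source /etc/profile or a fresh login), it can help to confirm that the exported variables point at real directories before trying Pig. The sketch below is an illustration only: check_home is a hypothetical helper name, and it inspects only three of the variables exported above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an exported *_HOME variable is set
# and points at an existing directory.
check_home() {
  local name="$1"
  local value
  # printenv only sees exported variables, which is what /etc/profile sets up.
  value="$(printenv "$name" || true)"
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name points to a missing directory: $value"
    return 1
  fi
  echo "$name=$value"
}

# Variable names are the ones exported in /etc/profile in this article.
for name in JAVA_HOME HADOOP_HOME PIG_HOME; do
  check_home "$name" || true
done
```

A "not set" or "missing directory" line usually means the profile was not reloaded or the archive was unpacked to a different path.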

4. Verify that Pig is configured correctly

[hadoop@xuegod64 ~]$ pig -help

Apache Pig version 0.15.0 (r1682971)

compiled Jun 01 2015, 11:44:35

USAGE: Pig [options] [-] : Run interactively in grunt shell.

Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).

Pig [options] [-f[ile]] file : Run cmds found in file.

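5. Pig execution modes

Pig has two execution modes:

1) Local mode. Pig runs in a single JVM and reads files from the local file system. This mode suits small data sets and learning.

Run the following command to start Pig in local mode:

pig -x local

2) MapReduce mode. Pig translates queries into MapReduce jobs and submits them to Hadoop (either a full cluster or a pseudo-distributed setup). Check that your Pig version supports the Hadoop version you run: a given Pig release supports only specific Hadoop versions, and the Pig website lists them.

Pig uses the HADOOP_HOME environment variable. If it is not set, Pig can fall back on its bundled Hadoop libraries, but then there is no guarantee that the bundled libraries are compatible with the Hadoop version you actually use, so setting HADOOP_HOME explicitly is recommended. The following variable must also be set: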

export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop

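Pig's default mode is mapreduce; you can also select it explicitly with the following command:

[hadoop@xuegod64 ~]$ pig -x mapreduce

(output omitted)

grunt>

Next, Pig needs to know the NameNode and JobTracker of the Hadoop cluster it uses (under YARN in Hadoop 2.x, the ResourceManager takes the JobTracker's place). Normally, once Hadoop is installed and configured correctly, this information is already available and no extra configuration is needed.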

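6. Running Pig programs
There are three ways to run a Pig program:

1) Script mode

Run a file that contains a Pig script directly. For example, the following command runs all the commands in the local file scripts.pig: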

pig scripts.pig
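To make script mode concrete, the sketch below writes a small scripts.pig and runs it in local mode when the pig launcher is available. It is an illustration under stated assumptions: the file sample.txt and its contents are made up for the example, and the Pig Latin statements simply mirror the ones used later in this article.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Made-up sample data (an assumption for this sketch): year <TAB> temperature.
printf '1990\t21\n1990\t23\n1991\t21\n' > sample.txt

# Write a small Pig Latin script; the statements mirror those used later in the article.
cat > scripts.pig <<'EOF'
records = LOAD 'sample.txt' AS (year:chararray, temperature:int);
grouped_records = GROUP records BY year;
max_temperature = FOREACH grouped_records GENERATE group, MAX(records.temperature);
DUMP max_temperature;
EOF

# Run it in local mode only if the pig launcher is on PATH.
if command -v pig >/dev/null 2>&1; then
  pig -x local scripts.pig
else
  echo "pig not found on PATH; script written to $workdir/scripts.pig"
fi
```

Keeping the statements in a file rather than typing them at the grunt prompt makes a run repeatable, which is useful when comparing results after a change to the LOAD statement.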

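2) Grunt mode

a) Grunt provides an interactive environment in which you can edit and run commands on the command line.

b) Grunt keeps a command history, which you can browse with the up and down arrow keys.

c) Grunt supports command auto-completion. For example, if you type a = foreach b g and press Tab, the line becomes a = foreach b generate. You can even customize how completion behaves; see the relevant documentation for details.

3) Embedded mode

Pig programs can be run from Java, much as SQL is run through JDBC.

(I am not familiar with this approach.)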

7. Start the cluster

[hadoop@xuegod64 ~]$ start-all.sh

[hadoop@xuegod64 ~]$ jps

4722 DataNode

5062 DFSZKFailoverController

5159 ResourceManager

4905 JournalNode

5321 Jps

4618 NameNode

2428 QuorumPeerMain

5279 NodeManager

[hadoop@xuegod64 ~]$ ssh xuegod63

Last login: Sat Jan 2 23:10:21 2016 from xuegod64

[hadoop@xuegod63 ~]$ jps

2130 QuorumPeerMain

3125 Jps

2982 NodeManager

2886 JournalNode

2795 DataNode

[hadoop@xuegod64 ~]$ ssh xuegod65

Last login: Sat Jan 2 15:11:33 2016 from xuegod64

[hadoop@xuegod65 ~]$ jps

3729 Jps

2401 QuorumPeerMain

3415 JournalNode

3484 DFSZKFailoverController

3325 DataNode

3583 NodeManager

3590 SecondNameNode

8. A simple example

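We use finding the maximum temperature as an example to show how Pig can compute the highest temperature for each year. Assume the data file looks as follows (one record per line, tab-separated):

Start Pig in local mode and enter the following commands in order (note that each statement ends with a semicolon):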

[hadoop@xuegod64 ~]$ pig -x local

grunt> records = load '/home/hadoop/zuigaoqiwen.txt' as (year:chararray, temperature:int);

2016-01-02 16:12:05,700 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

2016-01-02 16:12:05,701 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS

grunt> dump records;

(1930:28:1,)

(1930:0:1,)

(1930:22:1,)

(1930:22:1,)

(1930:22:1,)

(1930:22:1,)

(1930:28:1,)

(1930:0:1,)

(1930:0:1,)

(1930:0:1,)

(1930:11:1,)

(1930:0:1,)

(remaining output omitted)

grunt> describe records;

records: {year: chararray,temperature: int}

grunt> valid_records = filter records by temperature!=999;

grunt> grouped_records = group valid_records by year;

grunt> dump grouped_records;

grunt> describe grouped_records;

grouped_records: {group: chararray,valid_records: {(year: chararray,temperature: int)}}

grunt> grouped_records = group valid_records by year;

grunt> dump grouped_records;

.

.

2016-01-02 16:16:02,974 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1

Total Length = 7347344

Input split[0]:

Length = 7347344

ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit

Locations:

2016-01-02 16:16:08,011 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete

2016-01-02 16:16:08,012 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features

2.4.1 0.15.0 hadoop 2016-01-02 16:16:02 2016-01-02 16:16:08 GROUP_BY,FILTER

Success!

Job Stats (time in seconds):

JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs

job_local798558500_0002 1 1 n/a n/a n/a n/a n/a n/a n/a n/a grouped_records,records,valid_records GROUP_BY file:/tmp/temp-206603117/tmp-1002834084,

Input(s):

Successfully read 642291 records from: "/home/hadoop/zuigaoqiwen.txt"

Output(s):

Successfully stored 0 records in: "file:/tmp/temp-206603117/tmp-1002834084"

Counters:

Total records written : 0

Total bytes written : 0

Spillable Memory Manager spill count : 0

Total bags proactively spilled: 0

Total records proactively spilled: 0

Job DAG:

job_local798558500_0002

grunt> describe grouped_records;

grouped_records: {group: chararray,valid_records: {(year: chararray,temperature: int)}}

grunt> max_temperature = foreach grouped_records generate group,MAX(valid_records.temperature);

grunt> dump max_temperature;

(1990,23)

(1991,21)

(1992,30)

grunt> quit

2016-01-02 16:24:25,303 [main] INFO org.apache.pig.Main - Pig script completed in 14 minutes, 27 seconds and 123 milliseconds (867123 ms)
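The filter, group and MAX steps in the session above can be cross-checked outside Pig with a few lines of awk. This is only a sketch for sanity-checking results, not a replacement for the Pig job: max_by_year is a hypothetical helper name, the sample rows are made up, and the 999 sentinel is taken from the filter statement in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the maximum temperature per year.
#   $1 = input file, $2 = field delimiter (defaults to a tab)
max_by_year() {
  local file="$1"
  local delim="${2:-\t}"
  awk -F"$delim" '
    # Skip rows with no temperature and rows carrying the 999 sentinel.
    $2 != "" && $2 != 999 {
      if (!($1 in max) || $2 + 0 > max[$1]) max[$1] = $2 + 0
    }
    END { for (year in max) printf "%s\t%d\n", year, max[year] }
  ' "$file" | sort
}

# Made-up sample rows (an assumption for this sketch): year <TAB> temperature.
sample="$(mktemp)"
printf '1990\t21\n1990\t23\n1991\t999\n1991\t21\n1992\t30\n1992\t-5\n' > "$sample"
max_by_year "$sample"
rm -f "$sample"
```

Passing a second argument such as ':' makes the same helper work on colon-separated input.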

I ran into some problems along the way that I could not resolve:

Error messages:

2016-01-02 16:18:28,049 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

2016-01-02 16:18:28,050 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

2016-01-02 16:18:28,050 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

2016-01-02 16:18:28,055 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 642291 time(s).

2016-01-02 16:18:28,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

2016-01-02 16:18:28,055 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

2016-01-02 16:18:28,056 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS

2016-01-02 16:18:28,056 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized

2016-01-02 16:18:28,246 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1

2016-01-02 16:18:28,246 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
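One possible reading of the ACCESSING_NON_EXISTENT_FIELD warning, offered as a hypothesis rather than a confirmed diagnosis: the dump records output in this article shows tuples such as (1930:28:1,), which suggests that each input line is colon-separated rather than tab-separated. PigStorage splits on tabs by default, so the whole line would end up in year and temperature would be null. A null temperature does not pass the filter, which would be consistent with "Successfully stored 0 records", and the warning count (642291) matches the number of records read. The sketch below counts the fields both ways and prints a candidate LOAD statement. The helper count_fields and the column name flag are assumptions; the article does not say what the third field means.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the fields in one line for a given delimiter.
count_fields() {
  printf '%s\n' "$1" | awk -F"$2" '{ print NF }'
}

# Sample line taken from the "dump records" output in this article.
line='1930:28:1'

echo "tab-delimited fields:   $(count_fields "$line" '\t')"
echo "colon-delimited fields: $(count_fields "$line" ':')"

# Untested suggestion: tell PigStorage to split on ':' and declare three columns.
# "flag" is a placeholder name; the article does not say what the third field means.
cat <<'EOF'
records = LOAD '/home/hadoop/zuigaoqiwen.txt' USING PigStorage(':')
          AS (year:chararray, temperature:int, flag:int);
EOF
```

If the field counts differ as expected, rerunning describe records and dump records after the modified LOAD should show whether temperature is now populated.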

Error log:

at java.lang.reflect.Method.invoke(Method.java:497)

Pig Stack Trace

at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1082)

at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)

at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)

at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)

at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)

at org.apache.pig.Main.run(Main.java:565)

at org.apache.pig.Main.main(Main.java:177)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

==============

Reposted from: https://www.cnblogs.com/zd520pyx1314/p/6534005.html
