Setting Up a Hadoop Pseudo-Cluster Environment

Published: 2025/6/15
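After working through several online guides and repeatedly fixing the problems that came up while starting and running Hadoop, I finally got Hadoop 2.5.2 installed in pseudo-distributed mode and ran the wordcount example. One reason installing Hadoop is hard is that there are plenty of installation guides, but very few can be followed from start to finish, especially because configuration differs so much between Hadoop versions. Hadoop 2.5.2 was released two days ago, but its configuration is the same as for 2.5.0 and 2.5.1.

System environment: Ubuntu 12.04 LTS x86_32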

1. Create the user group and user

• Create the user group: add a group named hadoop to the system

    sudo addgroup hadoop

• Create the user: add a user named hadoop to the system, with -m so that the home directory /home/hadoop used in later steps exists

    sudo useradd -m -g hadoop hadoop

• Log in as the hadoop user

    su hadoop

2. Passwordless SSH login

• Run the following command to generate an SSH key; the key is created at /home/hadoop/.ssh/id_rsa

    ssh-keygen -t rsa -P ""

• Append the contents of /home/hadoop/.ssh/id_rsa.pub to /home/hadoop/.ssh/authorized_keys and save the file
• Run ssh localhost to verify that you can log in without a password
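
The append step can be scripted. The sketch below is one possible way to do it, not the only one: authorize_key is a made-up helper name, and the permission changes assume sshd runs with its usual strict permission checks.

```shell
#!/bin/sh
# authorize_key is a hypothetical helper, not part of OpenSSH.
# It appends a public key to an authorized_keys file unless it is already there.
authorize_key() {
    pub_key=$1
    auth_file=$2
    ssh_dir=$(dirname "$auth_file")

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -x whole-line match, -F fixed string, -f read the pattern from the key file
    if ! grep -qxFf "$pub_key" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
    # Keep the directory and file private to the owner
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# With the paths used in this article:
# authorize_key /home/hadoop/.ssh/id_rsa.pub /home/hadoop/.ssh/authorized_keys
```

Running the helper twice leaves a single copy of the key, so it is safe to repeat.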

3. Disable IPv6

• Run the following command to check whether IPv6 is currently disabled: 1 means disabled, 0 means not disabled, and the default is 0

    cat /proc/sys/net/ipv6/conf/all/disable_ipv6

• Edit the following file and add three lines to disable IPv6

    sudo vim /etc/sysctl.conf

    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1

• Reboot the machine and check again whether IPv6 is disabled
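
The check can be wrapped in a small function. This is only a sketch: ipv6_state is a made-up name, and the optional file argument exists just so the function can be tried against a test file.

```shell
#!/bin/sh
# ipv6_state is a hypothetical helper: it turns the kernel flag into a readable word.
# The optional argument (a path to read instead of the real flag) is for testing only.
ipv6_state() {
    flag_file=${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}
    case "$(cat "$flag_file" 2>/dev/null)" in
        1) echo disabled ;;
        0) echo enabled ;;
        *) echo unknown ;;
    esac
}

# After editing /etc/sysctl.conf, the settings can usually be loaded without a reboot:
#   sudo sysctl -p
# and then checked with:
#   ipv6_state
```

sysctl -p reloads /etc/sysctl.conf; rebooting as described above works as well.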

4. Install and configure the JDK

• Edit /etc/profile and set the Java-related environment variables

    export JAVA_HOME=/software/devsoftware/jdk1.7.0_55
    export PATH=$JAVA_HOME/bin:$PATH

5. Install and configure Hadoop 2.5.2

• Edit /etc/profile and set the Hadoop-related environment variables

    export HADOOP_HOME=/home/hadoop/hadoop-2.5.2
    export PATH=$HADOOP_HOME/bin:$PATH

• Run the following so that the variables configured above take effect

    source /etc/profile

• Point Hadoop at the JDK by appending one line to the environment script /home/hadoop/hadoop-2.5.2/etc/hadoop/hadoop-env.sh

    export JAVA_HOME=/software/devsoftware/jdk1.7.0_55

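6. Hadoop 2.5.2 configuration files

Hadoop 2.5.2 has four configuration files that need to be set up, all located in /home/hadoop/hadoop-2.5.2/etc/hadoop. The four files are:
• core-site.xml
• yarn-site.xml
• mapred-site.xml
• hdfs-site.xml

Some of these files refer to directories that must be created by hand, and some need a hostname that matches your system. The hostname must not be an IP address or localhost, and it has to be defined in /etc/hosts. One more point: several guides say that 127.0.0.1 should be bound to only one hostname (the one Hadoop uses), with the others commented out. I have not tested whether this actually makes a difference; I simply followed that advice and kept a single hostname.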
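
To see which names /etc/hosts currently binds to 127.0.0.1, something like the following sketch may help. names_for_ip is a made-up helper name, and the second argument is only there so the function can be pointed at a test file.

```shell
#!/bin/sh
# names_for_ip is a hypothetical helper: it lists the hostnames that a hosts
# file binds to one IP address, ignoring comments.
names_for_ip() {
    ip=$1
    hosts_file=${2:-/etc/hosts}
    awk -v ip="$ip" '
        { sub(/#.*/, "") }                                # strip comments
        $1 == ip { for (i = 2; i <= NF; i++) print $i }   # print every name on the line
    ' "$hosts_file"
}

# Example: names_for_ip 127.0.0.1
```

If the command prints more than one name, those are the extra bindings the guides suggest commenting out.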

6.1 core-site.xml

    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <!-- This directory must be created by hand -->
        <value>/home/hadoop/data/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>

      <!-- file system properties -->
      <property>
        <name>fs.defaultFS</name>
        <!-- HDFS service address; use a hostname only, not an IP address or localhost -->
        <value>hdfs://hostname:9000</value>
      </property>

      <property>
        <!-- Use the native .so libraries bundled with Hadoop -->
        <name>hadoop.native.lib</name>
        <value>true</value>
        <description>Should native hadoop libraries, if present, be used.</description>
      </property>
    </configuration>

6.2 mapred-site.xml

mapred-site.xml does not exist by default; create it by copying mapred-site.xml.template with cp

    cp mapred-site.xml.template mapred-site.xml

Then set the following:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <!-- yarn is all lowercase, not Yarn -->
        <value>yarn</value>
      </property>
    </configuration>

6.3 yarn-site.xml

Note that Hadoop property names are case-sensitive. The four resourcemanager properties below are written with a capital Y, so Hadoop ignores them and the ResourceManager stays on its default ports (8030, 8031, 8032 and 8088, which is what the netstat output in section 7 shows). Spell them yarn.resourcemanager.* if you want the custom addresses to take effect.

    <configuration>

      <!-- Site specific YARN configuration properties -->

      <property>
        <!-- Property names are case-sensitive; yarn must be lowercase -->
        <name>yarn.nodemanager.aux-services</name>
        <!-- Not mapreduce.shuffle -->
        <value>mapreduce_shuffle</value>
      </property>

      <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>

      <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>Yarn.resourcemanager.address</name>
        <!-- Set the hostname to match your system -->
        <value>hostname:18040</value>
      </property>

      <property>
        <description>The address of the scheduler interface.</description>
        <name>Yarn.resourcemanager.scheduler.address</name>
        <!-- Set the hostname to match your system -->
        <value>hostname:18030</value>
      </property>

      <property>
        <description>The address of the RM web application.</description>
        <name>Yarn.resourcemanager.webapp.address</name>
        <!-- Set the hostname to match your system -->
        <value>hostname:18088</value>
      </property>

      <property>
        <description>The address of the resource tracker interface.</description>
        <name>Yarn.resourcemanager.resource-tracker.address</name>
        <!-- Set the hostname to match your system -->
        <value>hostname:8025</value>
      </property>

    </configuration>
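
Because Hadoop treats property names as case-sensitive, a name such as Yarn.resourcemanager.address in the listing above is not the same key as yarn.resourcemanager.address. The sketch below is a rough heuristic for spotting such names; capitalized_property_names is a made-up helper, and it assumes each <name> element sits on a single line.

```shell
#!/bin/sh
# capitalized_property_names is a hypothetical helper: it prints every <name>
# value in a Hadoop *-site.xml file that starts with an uppercase letter.
capitalized_property_names() {
    grep -o '<name>[^<]*</name>' "$1" |
        sed -e 's#<name>##' -e 's#</name>##' |
        grep '^[[:upper:]]'
}

# Example:
# capitalized_property_names /home/hadoop/hadoop-2.5.2/etc/hadoop/yarn-site.xml
```

Any name it prints is worth comparing against the spelling in the Hadoop documentation.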

6.4 hdfs-site.xml

    <configuration>

      <property>
        <name>dfs.namenode.name.dir</name>
        <!-- Create this directory by hand -->
        <value>/home/hadoop/data/hdfs/name</value>
      </property>

      <property>
        <name>dfs.datanode.data.dir</name>
        <!-- Create this directory by hand -->
        <value>/home/hadoop/data/hdfs/data</value>
      </property>

      <property>
        <!-- Number of HDFS file replicas -->
        <name>dfs.replication</name>
        <value>1</value>
      </property>

    </configuration>
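
The configuration files reference three directories that have to exist before Hadoop is formatted and started: the hadoop.tmp.dir from 6.1 and the name and data directories from 6.4. A minimal sketch for creating them in one go follows; make_hadoop_dirs is a made-up helper name.

```shell
#!/bin/sh
# make_hadoop_dirs is a hypothetical helper: it creates the tmp, name and data
# directories under one base directory. mkdir -p also creates missing parents
# and does nothing for directories that already exist.
make_hadoop_dirs() {
    base=$1
    mkdir -p "$base/tmp" "$base/hdfs/name" "$base/hdfs/data"
}

# With the paths used in this article (run as the hadoop user):
# make_hadoop_dirs /home/hadoop/data
```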

7. Initialize and start Hadoop

• Format the Hadoop NameNode

    hadoop namenode -format

Watch the log: if the output includes "Storage directory /home/hadoop/data/hdfs/name has been successfully formatted", formatting succeeded.

• Start Hadoop

    /home/hadoop/hadoop-2.5.2/sbin/start-all.sh

• Check Hadoop's status with the JDK's jps tool; if the result looks like the following, the installation succeeded

    10682 DataNode
    10463 NameNode
    11229 ResourceManager
    24647 Jps
    11040 SecondaryNameNode
    11455 NodeManager
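
Instead of reading the jps listing by eye, the check can be automated. The sketch below assumes the five daemon names shown in the listing above; missing_daemons is a made-up helper name.

```shell
#!/bin/sh
# missing_daemons is a hypothetical helper: it reads `jps` output on standard
# input and prints each expected Hadoop daemon that is not listed.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the second field exactly so that
        # SecondaryNameNode is not mistaken for NameNode
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage: jps | missing_daemons
```

No output means all five daemons are running.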

• Use netstat -anp | grep java to see which ports Hadoop is using

    tcp        0      0 0.0.0.0:8042            0.0.0.0:*               LISTEN      11455/java
    tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      11040/java
    tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      10463/java
    tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      11229/java
    tcp        0      0 0.0.0.0:34456           0.0.0.0:*               LISTEN      11455/java
    tcp        0      0 0.0.0.0:13562           0.0.0.0:*               LISTEN      11455/java
    tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      10682/java
    tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      10682/java
    tcp        0      0 0.0.0.0:8030            0.0.0.0:*               LISTEN      11229/java
    tcp        0      0 0.0.0.0:8031            0.0.0.0:*               LISTEN      11229/java
    tcp        0      0 0.0.0.0:8032            0.0.0.0:*               LISTEN      11229/java
    tcp        0      0 0.0.0.0:8033            0.0.0.0:*               LISTEN      11229/java
    tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      10682/java
    tcp        0      0 0.0.0.0:8040            0.0.0.0:*               LISTEN      11455/java

• Browse the NameNode and DataNode information to view the HDFS status

    http://hostname:50070

• Browse the ResourceManager's status to follow the execution of MapReduce jobs

    http://hostname:8088

8. Run the WordCount example bundled with Hadoop

• Create a local file whose words will be counted

    echo "My first hadoop example. Hello Hadoop in input. " > /home/hadoop/input

• Create the HDFS input directory that the file above will be written to (-p also creates the parent /user if it does not exist yet)

    hadoop fs -mkdir -p /user/hadooper

• Upload the file to the HDFS input directory

    hadoop fs -put /home/hadoop/input /user/hadooper

• Run the WordCount example bundled with Hadoop

    hadoop jar /home/hadoop/hadoop-2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount /user/hadooper/input /user/hadooper/output

• Output of the MapReduce run

    hadoop@hostname:~/hadoop-2.5.2/share/hadoop/mapreduce$ hadoop jar /home/hadoop/hadoop-2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount /user/hadooper/input /user/hadooper/output
    14/11/23 19:45:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    14/11/23 19:45:05 INFO input.FileInputFormat: Total input paths to process : 1
    14/11/23 19:45:05 INFO mapreduce.JobSubmitter: number of splits:1
    14/11/23 19:45:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1416742510596_0001
    14/11/23 19:45:06 INFO impl.YarnClientImpl: Submitted application application_1416742510596_0001
    14/11/23 19:45:07 INFO mapreduce.Job: The url to track the job: http://hostname:8088/proxy/application_1416742510596_0001/
    14/11/23 19:45:07 INFO mapreduce.Job: Running job: job_1416742510596_0001
    14/11/23 19:45:18 INFO mapreduce.Job: Job job_1416742510596_0001 running in uber mode : false
    14/11/23 19:45:18 INFO mapreduce.Job:  map 0% reduce 0%
    14/11/23 19:45:26 INFO mapreduce.Job:  map 100% reduce 0%
    14/11/23 19:45:36 INFO mapreduce.Job:  map 100% reduce 100%
    14/11/23 19:45:37 INFO mapreduce.Job: Job job_1416742510596_0001 completed successfully
    14/11/23 19:45:37 INFO mapreduce.Job: Counters: 49
        File System Counters
            FILE: Number of bytes read=102
            FILE: Number of bytes written=195793
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=168
            HDFS: Number of bytes written=64
            HDFS: Number of read operations=6
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=5994
            Total time spent by all reduces in occupied slots (ms)=6925
            Total time spent by all map tasks (ms)=5994
            Total time spent by all reduce tasks (ms)=6925
            Total vcore-seconds taken by all map tasks=5994
            Total vcore-seconds taken by all reduce tasks=6925
            Total megabyte-seconds taken by all map tasks=6137856
            Total megabyte-seconds taken by all reduce tasks=7091200
        Map-Reduce Framework
            Map input records=1
            Map output records=8
            Map output bytes=80
            Map output materialized bytes=102
            Input split bytes=119
            Combine input records=8
            Combine output records=8
            Reduce input groups=8
            Reduce shuffle bytes=102
            Reduce input records=8
            Reduce output records=8
            Spilled Records=16
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=101
            CPU time spent (ms)=2640
            Physical memory (bytes) snapshot=422895616
            Virtual memory (bytes) snapshot=2055233536
            Total committed heap usage (bytes)=308281344
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters
            Bytes Read=49
        File Output Format Counters
            Bytes Written=64

• View the result of the MapReduce run

    hadoop@hostname:~/hadoop-2.5.2/share/hadoop/mapreduce$ hadoop fs -cat /user/hadooper/output/part-r-00000
    Hadoop      1
    Hello       1
    My          1
    example.    1
    first       1
    hadoop      1
    in          1
    input.      1
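
To sanity-check the result without Hadoop, the same count can be approximated locally with standard Unix tools. This is only a rough equivalent under the assumption that words are separated by spaces, tabs or newlines; it is not what the Hadoop example runs internally, and local_wordcount is a made-up name.

```shell
#!/bin/sh
# local_wordcount is a hypothetical helper: it reads text on standard input and
# prints "word<TAB>count" lines sorted in byte order.
local_wordcount() {
    tr -s ' \t' '\n' |             # one word per line
        sed '/^$/d' |              # drop empty lines
        LC_ALL=C sort |            # byte order: uppercase sorts before lowercase
        uniq -c |                  # count identical adjacent lines
        awk '{ print $2 "\t" $1 }' # reorder to word, then count
}

# Example with the input file from this section:
# local_wordcount < /home/hadoop/input
```

For the one-line input file used here, this lists the same eight words, each with a count of 1, as the part-r-00000 output above.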

9. Run Hadoop's PI program

    hadoop jar /home/hadoop/hadoop-2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar pi 10 10

The result is 3.200000000000000000

10. Common Hadoop problems

1. If Hadoop exits abnormally, the NameNode enters Safe Mode after the restart and jobs cannot be submitted. To fix this:

    hadoop dfsadmin -safemode leave