[Learning Hadoop] Setting Up Distributed Hadoop (Ubuntu 17.04)
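Before the main text
As someone who will definitely be working in big data, will my advisor kill me for never having touched Java or Hadoop? So I figured I would set up Hadoop on my overseas cloud server, later set up another one under Ubuntu on my Dell, one more on an old Dell, and one on my Mac, which would just about count as a distributed cluster? Whatever. For today, let's just get Hadoop running on Ubuntu 17.04!
Main text
The Chinese-language material I found was all too old, so I searched on Google instead, and that worked really well!!
I then picked this tutorial: www.admintome.com/blog/instal… Now on to the installation: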
1. Install required software
    # apt update && apt upgrade -y
    # reboot
    # apt install -y openjdk-8-jdk
    # apt install ssh pdsh -y
2. Download Hadoop
    # wget http://apache.cs.utah.edu/hadoop/common/stable/hadoop-2.8.2.tar.gz
    # tar -xzvf hadoop-2.8.2.tar.gz
    # cd hadoop-2.8.2/
The URL above seems to be dead now. I found a few newer ones; pick whichever suits your situation:
apache.claz.org/hadoop/comm…
apache.claz.org/hadoop/comm…
[Screenshots: the download in progress and the finished installation]
Now on to the configuration:
We need to make some additions to our configuration, so edit the next couple of files with the appropriate contents:
etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr
etc/hadoop/core-site.xml
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>
etc/hadoop/hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
Now, in order to make the scripts work, we need to set up passwordless SSH to localhost:
    $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ chmod 0600 ~/.ssh/authorized_keys
Format the HDFS filesystem.
    # bin/hdfs namenode -format
And finally, start up HDFS.
    # sbin/start-dfs.sh
After it starts up you can access the web interface for the NameNode at this URL: http://{server-ip}:50070 .
Because mine is a cloud server, I can also reach it directly in a browser, the same way I would open any website.
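A quick way to confirm that the web interface is answering, without opening a browser, is to probe it with curl. The sketch below is only an illustration: the helper names hadoop_ui_url and ui_is_up are made up for this post, and 203.0.113.10 is a placeholder address, not my actual server.

```shell
# Hypothetical helper: build a Hadoop web UI address from a host and a port.
hadoop_ui_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Hypothetical helper: succeed only if the URL answers within 5 seconds.
# --fail makes curl return non-zero on HTTP error responses.
ui_is_up() {
  curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Usage (placeholder address; 50070 is the NameNode UI port used above):
#   ui_is_up "$(hadoop_ui_url 203.0.113.10 50070)" && echo "NameNode UI reachable"
```

If the check fails from outside but works on the server itself, the cloud provider's firewall rules would be my first suspect, though that is a guess rather than something this tutorial covers.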
Configure YARN
Create the directories we will need for YARN.
    # bin/hdfs dfs -mkdir /user
    # bin/hdfs dfs -mkdir /user/root
Edit etc/hadoop/mapred-site.xml (in Hadoop 2.x you may first need to copy it from mapred-site.xml.template) and add the following contents:
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
And edit etc/hadoop/yarn-site.xml, placing these properties inside its <configuration> element:
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>
Start YARN:
    # sbin/start-yarn.sh
If it fails to start and reports something like this:
    root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# sbin/start-yarn.sh
    starting yarn daemons
    resourcemanager running as process 16803. Stop it first.
    localhost: starting nodemanager, logging to /home/hustwolf/Hadoop/hadoop-2.8.2/logs/yarn-root-nodemanager-HustWolfzzb.out
    root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# kill -9 16803
    root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# sbin/start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /home/hustwolf/Hadoop/hadoop-2.8.2/logs/yarn-root-resourcemanager-HustWolfzzb.out
    localhost: nodemanager running as process 17374. Stop it first.
    root@HustWolfzzb:/home/hustwolf/Hadoop/hadoop-2.8.2# ls
then shut everything down first. Killing one process by hand, as I tried above, only moves the complaint to the other daemon. Use the stop scripts under sbin instead: either run stop-all.sh directly, or use stop-dfs.sh and stop-yarn.sh together. Then start everything again and it should be OK.
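The stop-then-start sequence can be scripted. This is a minimal sketch, assuming the four sbin scripts that ship with Hadoop 2.8.2; the function name restart_hadoop and the DRY_RUN switch are my own inventions for illustration and are not part of Hadoop.

```shell
# Hypothetical helper: stop YARN and HDFS, then start them again.
# $1 is the Hadoop install directory. With DRY_RUN=1 the script paths
# are only printed, which is handy for checking the order first.
restart_hadoop() {
  hadoop_home=$1
  for script in stop-yarn.sh stop-dfs.sh start-dfs.sh start-yarn.sh; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$hadoop_home/sbin/$script"
    else
      "$hadoop_home/sbin/$script"
    fi
  done
}

# Usage, from inside the hadoop-2.8.2 directory:
#   DRY_RUN=1 restart_hadoop "$PWD"   # show what would run
#   restart_hadoop "$PWD"             # actually restart
```

Stopping YARN before HDFS and starting HDFS before YARN is one reasonable order; I have not tested whether the reverse causes problems.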
You can now view the web interface at http://{server-ip}:8088 .
Testing our installation
In order to test that everything is working we can run a MapReduce job using YARN:
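    # bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar pi 16 1000
This runs the bundled pi example, which estimates PI with the quasiMonteCarlo method. The two arguments are not decimal places: 16 is the number of map tasks and 1000 is the number of samples per map, so the estimate is built from 16,000 sample points. After a minute or two you should get your response: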
    Job Finished in 96.095 seconds
    Estimated value of Pi is 3.14250000000000000000
This is where I ran into a really frustrating problem:
    Number of Maps = 16
    Samples per Map = 1000
    17/11/24 07:49:52 WARN ipc.Client: Failed to connect to server: localhost/127.0.0.1:9000: try once and fail.
    java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:682)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:778)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1544)
        at org.apache.hadoop.ipc.Client.call(Client.java:1375)
        at org.apache.hadoop.ipc.Client.call(Client.java:1339)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:792)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1704)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:358)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
    java.net.ConnectException: Call From HustWolfzzb/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1487)
        at org.apache.hadoop.ipc.Client.call(Client.java:1429)
        at org.apache.hadoop.ipc.Client.call(Client.java:1339)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:792)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1704)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:358)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
    Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:682)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:778)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1544)
        at org.apache.hadoop.ipc.Client.call(Client.java:1375)
        ... 38 more
I still have not solved this, but there is really nothing left to change in the article. If I find a fix later, I will post it in the comments or just update the post. Bye for now, off to the gym!!
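"Connection refused" on localhost:9000 generally means nothing is listening on that port. Since 9000 is the fs.defaultFS port set in core-site.xml, a likely (but here unconfirmed) culprit is a NameNode that is not running. One way to check is the jps tool that comes with the JDK. The helper below is a sketch with a made-up name, missing_daemons; it reads jps output and reports which of the five daemons this setup expects are absent.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of
# every expected Hadoop daemon that does not appear in it.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; compare the second column exactly so that
    # SecondaryNameNode is not mistaken for NameNode.
    printf '%s\n' "$jps_output" |
      awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$daemon"
  done
}

# Usage:
#   jps | missing_daemons
```

If NameNode shows up as missing, the NameNode log in the logs/ directory is the next place I would look.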
This should be enough to get you started on your Hadoop journey.
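It seemed to work??? I think I had skipped the step of creating the hdfs user? Beyond that, I rebooted once and filled in a few things I had missed. But just when my hopes were at their highest, reality hit me hard. Oh well, GG. Still, I did find quite a few useful tutorials!!
Hadoop environment installation and setup; Hadoop 2.7.1 installation and configuration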
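After the main text
The original author's English is well written, so I left it mostly unchanged. Even if you cannot follow all of it, you can probably get the gist with the help of Baidu Translate, and if that really fails, leave a comment and ask me. Besides, all the commands are laid out for you, so how could you not manage? No way!!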