

Nutch 2.x + Hadoop 2.5.2 + HBase 0.94.26 (continued, part 2)

Published: 2024/4/17

1. Running bin/nutch generate -topN 5 -crawlId tieba fails with the following error:

java.lang.Exception: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.gora.persistency.Persistent
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.gora.persistency.Persistent
        at org.apache.gora.mapreduce.PersistentDeserializer.deserialize(PersistentDeserializer.java:71)
        at org.apache.gora.mapreduce.PersistentDeserializer.deserialize(PersistentDeserializer.java:35)
        at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146)
        at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
        at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

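My first guess was an Avro version mismatch, but downgrading Avro from 1.7.7 to 1.7.6 did not help. I then noticed that when nutch runs, the jars on the classpath all come from Hadoop 2.5.2, and the Avro jar under hadoop-2.5.2/share/hadoop/common/lib/ is version 1.7.4. Replacing that jar with the 1.7.7 version solved the problem.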
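To spot this kind of mismatch before a job fails, a small script can list the Avro jars found in each library directory. The following is only a sketch: the function names are made up for illustration, and it assumes the jars follow the usual avro-<version>.jar naming.

```python
import re
from pathlib import Path

# Matches core Avro jars such as "avro-1.7.4.jar". Sibling artifacts like
# "avro-mapred-1.7.4.jar" are intentionally not matched.
AVRO_JAR = re.compile(r"^avro-(\d+(?:\.\d+)*)\.jar$")


def find_avro_versions(lib_dir):
    """Return a sorted list of (version, filename) for core Avro jars in lib_dir."""
    directory = Path(lib_dir)
    if not directory.is_dir():
        return []
    found = []
    for path in directory.iterdir():
        match = AVRO_JAR.match(path.name)
        if match:
            found.append((match.group(1), path.name))
    return sorted(found)


def report(lib_dirs):
    """Print the Avro jars in each directory and return the set of versions seen."""
    versions = set()
    for lib_dir in lib_dirs:
        for version, name in find_avro_versions(lib_dir):
            print(f"{lib_dir}: {name}")
            versions.add(version)
    if len(versions) > 1:
        print("Multiple Avro versions found:", ", ".join(sorted(versions)))
    return versions
```

For example, report(["hadoop-2.5.2/share/hadoop/common/lib", "runtime/local/lib"]) would compare Hadoop's copy with the one shipped with Nutch. The second path is an assumption about where a local Nutch build keeps its jars; adjust it to your layout.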

2. Next, run bin/nutch fetch 1421804965-1372033824 -crawlId tieba -threads 50. The value 1421804965-1372033824 comes from running get 'tieba_webpage','com.baidu.tieba:http/' in the hbase shell, which returns f:bid timestamp=1421804970851, value=1421804965-1372033824

This fails with the error: No agents listed in 'http.agent.name' property

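To fix it, set a value for the <name>http.agent.name</name> property in the Nutch configuration (conf/nutch-default.xml, or preferably an override in conf/nutch-site.xml). Any non-empty string will do.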
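As a rough illustration of what the property entry looks like and how to confirm it is non-empty, here is a minimal sketch. The sample XML, the placeholder agent name "MyNutchSpider", and the helper names are all assumptions made for this example, not taken from the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical override file content; "MyNutchSpider" is a placeholder value.
SAMPLE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>MyNutchSpider</value>
  </property>
</configuration>
"""


def get_property(xml_text, name):
    """Return the stripped <value> of the named property, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def agent_name_is_set(xml_text):
    """True when http.agent.name exists and has a non-empty value."""
    return bool(get_property(xml_text, "http.agent.name"))


if __name__ == "__main__":
    print("http.agent.name set:", agent_name_is_set(SAMPLE_SITE_XML))
```

To check a real file, pass its contents instead of the sample, for example agent_name_is_set(open("conf/nutch-site.xml").read()), assuming that is where your override lives.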

Reposted from: https://www.cnblogs.com/mactech/p/4239163.html
