
Hadoop Distributed Cache


The purpose of Hadoop's distributed cache is to let every MapReduce task share a single common configuration (lookup) file. The cache file is first placed in HDFS; during job execution the framework then copies it to each task node's local disk. The setup in the driver looks like this:

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://192.168.1.45:9000");
    FileSystem fs = FileSystem.get(conf);
    // clean up the output directory of an earlier run (recursive delete)
    fs.delete(new Path("CASICJNJP/gongda/Test_gd20140104"), true);

    conf.set("mapred.job.tracker", "192.168.1.45:9001");
    conf.set("mapred.jar", "/home/hadoop/workspace/jar/OBDDataSelectWithImeiTxt.jar");
    Job job = new Job(conf, "myTaxiAnalyze");

    // create symlinks to the cached files in each task's working directory
    DistributedCache.createSymlink(job.getConfiguration());
    try {
        // register the HDFS file as a distributed-cache file
        DistributedCache.addCacheFile(new URI("/user/hadoop/CASICJNJP/DistributeFiles/imei.txt"),
                job.getConfiguration());
    } catch (URISyntaxException e1) {
        e1.printStackTrace();
    }

    job.setMapperClass(OBDDataSelectMaper.class);
    job.setReducerClass(OBDDataSelectReducer.class);
    //job.setNumReduceTasks(10);
    //job.setCombinerClass(IntSumReducer.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    FileInputFormat.addInputPath(job, new Path("/user/hadoop/CASICJNJP/SortedData/20140104"));
    FileOutputFormat.setOutputPath(job, new Path("CASICJNJP/gongda/SelectedData"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

In the code above, the DistributedCache.addCacheFile call registers the HDFS file /user/hadoop/CASICJNJP/DistributeFiles/imei.txt as a distributed-cache file.
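
Note that in Hadoop 2.x and later the DistributedCache class is deprecated and the same functionality is exposed directly on Job. The following is a minimal sketch, not part of the original article, of what the driver could look like with that newer API. The class name CacheDriverSketch and the "#imei" fragment (which names the symlink each task sees in its working directory) are illustrative assumptions; the paths and the mapper/reducer classes are taken from the code above.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CacheDriverSketch {  // hypothetical class name, not from the original article

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "myTaxiAnalyze");

        // Register the HDFS file as a cache file. The "#imei" fragment is an added
        // assumption: it names the symlink created in each task's working directory.
        job.addCacheFile(new URI("/user/hadoop/CASICJNJP/DistributeFiles/imei.txt#imei"));

        job.setMapperClass(OBDDataSelectMaper.class);
        job.setReducerClass(OBDDataSelectReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path("/user/hadoop/CASICJNJP/SortedData/20140104"));
        FileOutputFormat.setOutputPath(job, new Path("CASICJNJP/gongda/SelectedData"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}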


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class OBDDataSelectMaper extends Mapper<Object, Text, Text, Text> {

    // IMEIs loaded from the distributed-cache file; only records whose IMEI is in this list are kept
    private List<Integer> imeiList = new ArrayList<Integer>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        try {
            // locate the local copies of the distributed-cache files on this task node
            Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (cacheFiles != null && cacheFiles.length > 0) {
                BufferedReader br = new BufferedReader(new FileReader(cacheFiles[0].toString()));
                try {
                    String line;
                    while ((line = br.readLine()) != null) {
                        imeiList.add(Integer.parseInt(line));
                    }
                } finally {
                    br.close();
                }
            }
        } catch (IOException e) {
            System.err.println("Exception reading DistributedCache: " + e);
        }
    }

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        try {
            String[] strs = value.toString().split("\t");
            String[] imeiTimes = strs[0].split("_");
            String imei = imeiTimes[0];
            // keep only records whose IMEI appears in the cached list
            if (imeiList.contains(Integer.parseInt(imei))) {
                context.write(new Text(strs[0]), value);
            }
        } catch (Exception ex) {
            // skip malformed records
        }
    }
}

In the mapper above, the distributed-cache file is loaded in the setup method, so the IMEI list is read once per task rather than once per record.
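
With the newer API the mapper side can skip DistributedCache entirely. Below is a minimal sketch under the same assumptions as the driver sketch above (Hadoop 2.x+, cache file registered with a "#imei" fragment); the class name CacheMapperSketch is hypothetical, error handling is omitted for brevity, and the filtering logic mirrors OBDDataSelectMaper.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical Hadoop 2.x+ counterpart of OBDDataSelectMaper: same IMEI filter,
// but the cache file is opened through the "imei" symlink registered by the
// driver sketch above (an added assumption, not the original article's code).
public class CacheMapperSketch extends Mapper<Object, Text, Text, Text> {

    private final List<Integer> imeiList = new ArrayList<Integer>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // The registered URIs are also available via context.getCacheFiles();
        // here the localized file is simply opened by its symlink name.
        BufferedReader br = new BufferedReader(new FileReader("imei"));
        try {
            String line;
            while ((line = br.readLine()) != null) {
                imeiList.add(Integer.parseInt(line.trim()));
            }
        } finally {
            br.close();
        }
    }

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        String imei = fields[0].split("_")[0];
        if (imeiList.contains(Integer.parseInt(imei))) {
            context.write(new Text(fields[0]), value);
        }
    }
}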

Reposted from: https://www.cnblogs.com/mfryf/p/5360306.html
