MapReduce Java API Example: Sorting
Published: 2025/3/19
豆豆
Scenario
MapReduce Java API Example: Counting Word Frequency:
https://blog.csdn.net/BADAO_LIUMANG_QIZHI/article/details/119410169
Building on the project environment set up in the post above, this example shows how to sort a set of numeric data.
Note:
Blog:
https://blog.csdn.net/badao_liumang_qizhi
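Implementation
The input data contains one numeric value per line, and the sort is implemented with MapReduce.
MapReduce sorts the map output by key before it reaches the reducer, so the map function emits each value to be sorted as its output key,
and the reduce function keeps a counter that gives each value its position in the sorted order.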
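The idea can be tried without a cluster. The following is a minimal plain-Java sketch, not the Hadoop code itself: a TreeMap stands in for the framework's sort-and-group step, and a counter assigns the rank as in the reduce function. The class name SortSketch, the method rank and the sample numbers are hypothetical, and emitting duplicate values once per occurrence is an assumption about the intended behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the job's data flow; no Hadoop classes are used.
final class SortSketch {

    // Returns "rank<TAB>value" lines for input that holds one number per line.
    static List<String> rank(List<String> inputLines) {
        // "Map" step: each parsed number becomes a key. The TreeMap stands in
        // for the framework's sort-by-key and grouping; the stored count plays
        // the role of the list of 1s a reducer would receive for that key.
        Map<Integer, Integer> grouped = new TreeMap<>();
        for (String line : inputLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            grouped.merge(Integer.parseInt(trimmed), 1, Integer::sum);
        }

        // "Reduce" step: keys are visited in ascending order and a counter
        // assigns the rank, once per occurrence of the value.
        List<String> output = new ArrayList<>();
        int rank = 1;
        for (Map.Entry<Integer, Integer> entry : grouped.entrySet()) {
            for (int n = 0; n < entry.getValue(); n++) {
                output.add(rank + "\t" + entry.getKey());
                rank++;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the data set from the original post.
        List<String> sample = List.of("32", "5", "654", "5", "-1");
        rank(sample).forEach(System.out::println);
    }
}
```

In the real job the sorting is done by the framework between the map and reduce phases, so neither the mapper nor the reducer contains any sorting code of its own.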
1. Map implementation
package com.badao.sort;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class SortMapper extends Mapper<Object, Text, IntWritable, IntWritable> {

    // Reused for every record to avoid allocating a new object per line
    private final IntWritable data = new IntWritable();
    private static final IntWritable ONE = new IntWritable(1);

    // Converts each input line to an IntWritable and emits it as the output key
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString().trim();
        // Skip blank lines, which Integer.parseInt would reject
        if (line.isEmpty()) {
            return;
        }
        data.set(Integer.parseInt(line));
        // Emit (number, 1); the framework sorts map output by key before the reduce phase
        context.write(data, ONE);
    }
}

2. Reduce implementation
package com.badao.sort;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class SortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

    // Rank of the next value to write; keys arrive in ascending order
    private int rank = 1;
    private final IntWritable rankWritable = new IntWritable();

    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // Write the key once per occurrence so that duplicate input values
        // are not collapsed into a single output line
        for (IntWritable ignored : values) {
            rankWritable.set(rank++);
            context.write(rankWritable, key);
        }
    }
}

3. Job implementation
package com.badao.sort;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class SortJob {

    public static void main(String[] args) throws InterruptedException, IOException, ClassNotFoundException {
        jobLocal();
    }

    public static void jobLocal() throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        // Create a job; "jobsort" is the job name
        Job job = Job.getInstance(conf, "jobsort");
        // Class used to locate the jar that contains the job
        job.setJarByClass(SortJob.class);
        // Mapper class for the job
        job.setMapperClass(SortMapper.class);
        // Reducer class for the job
        job.setReducerClass(SortReducer.class);
        // A single reducer (the default) keeps the ranks and the order global
        job.setNumReduceTasks(1);
        // Key class of the job output
        job.setOutputKeyClass(IntWritable.class);
        // Value class of the job output
        job.setOutputValueClass(IntWritable.class);
        // Input path; the file or directory must already exist
        FileInputFormat.addInputPath(job, new Path("D:\\sortData\\sort.txt"));
        // Output path; the directory must not exist yet
        FileOutputFormat.setOutputPath(job, new Path("D:\\sortdataout"));
        // Exit with a non-zero status if the job fails
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

After the job runs, check the result in the output file.
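One way to check the result without reading it by eye is sketched below. It assumes the job wrote its output with Hadoop's default text output, where each line holds the key and the value separated by a tab, and that the single reducer's file is named part-r-00000 inside the output directory. The class name OutputCheck and the method isSorted are hypothetical helpers, not part of the original post.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Checks "rank<TAB>value" lines: ranks must count up from 1 and
// values must never decrease.
final class OutputCheck {

    static boolean isSorted(List<String> lines) {
        int expectedRank = 1;
        long previous = Long.MIN_VALUE;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                return false; // not a key/value line
            }
            int rank = Integer.parseInt(parts[0].trim());
            long value = Long.parseLong(parts[1].trim());
            if (rank != expectedRank || value < previous) {
                return false;
            }
            expectedRank++;
            previous = value;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: the job's output directory plus the usual file
        // name for the first reducer; pass another path as the first argument.
        Path result = Paths.get(args.length > 0 ? args[0] : "D:\\sortdataout\\part-r-00000");
        System.out.println(isSorted(Files.readAllLines(result)) ? "sorted" : "not sorted");
    }
}
```

The check accepts equal neighbouring values, so it works whether or not duplicate inputs appear more than once in the output.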