Running Hadoop locally

Published: 2024/9/18

The local development environment is IntelliJ on Windows, with no Hadoop installed on the machine. The setup follows a separate tutorial.
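
Without a Hadoop installation, Hadoop's Windows support code still expects to find winutils.exe in the bin folder of the directory named by the hadoop.home.dir system property (the job code later in this post sets that property). The following is a minimal sketch of a pre-flight check, assuming that layout; the class name WinutilsCheck and its helper methods are hypothetical and not part of the original project.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WinutilsCheck {

    // Expected location of winutils.exe under a Hadoop home directory.
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    // True when winutils.exe exists as a regular file at the expected location.
    static boolean hasWinutils(String hadoopHome) {
        return Files.isRegularFile(winutilsPath(hadoopHome));
    }

    public static void main(String[] args) {
        // Same directory as in the job code of this post; adjust for your machine.
        String hadoopHome = "C:/Users/user/Desktop/tools/winutils";
        if (hasWinutils(hadoopHome)) {
            System.setProperty("hadoop.home.dir", hadoopHome);
        } else {
            System.err.println("winutils.exe not found at " + winutilsPath(hadoopHome));
        }
    }
}
```

Running a check like this before submitting the job gives a clearer message than the error Hadoop raises when the binary is missing.
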
Add the dependencies to the project's pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.vincent</groupId>
    <artifactId>hadoop-pro</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <!-- Hadoop version -->
        <hadoop-version>2.6.0-cdh5.15.1</hadoop-version>
        <java.version>1.8</java.version>
        <maven.compiler.source>${java.version}</maven.compiler.source>
        <maven.compiler.target>${java.version}</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <!-- CDH repository -->
    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        </repository>
    </repositories>

    <dependencies>
        <!-- Hadoop dependency -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <!-- JUnit dependency -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.10</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.41</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Java compiler -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                    <encoding>utf8</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Write the MapReduce job:

import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ETLApp {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // On Windows, point Hadoop at the directory that contains bin/winutils.exe.
        System.setProperty("hadoop.home.dir", "C:/Users/user/Desktop/tools/winutils");

        Configuration configuration = new Configuration();
        FileSystem fileSystem = FileSystem.get(configuration);

        // Delete the output directory if it already exists; otherwise the job fails on startup.
        Path outputPath = new Path("projectData/input/etl");
        if (fileSystem.exists(outputPath)) {
            fileSystem.delete(outputPath, true);
        }

        Job job = Job.getInstance(configuration);
        job.setJarByClass(ETLApp.class);
        job.setMapperClass(ETLApp.MyMapper.class);
        job.setMapOutputKeyClass(NullWritable.class);
        job.setMapOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(job, new Path("projectData/raw/trackinfo_20130721.data"));
        FileOutputFormat.setOutputPath(job, outputPath);

        job.waitForCompletion(true);
    }

    static class MyMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String log = value.toString();

            // logParser and ContentUtils are helpers from the project; they are not shown in this post.
            Map<String, String> info = logParser.parse(log);
            String ip = info.get("ip");
            String country = info.get("country");
            String province = info.get("province");
            String city = info.get("city");
            String url = info.get("url");
            String time = info.get("time");
            String pageId = ContentUtils.getPageId(url);

            // Emit one tab-separated record per input line.
            StringBuilder builder = new StringBuilder();
            builder.append(ip).append("\t");
            builder.append(country).append("\t");
            builder.append(province).append("\t");
            builder.append(city).append("\t");
            builder.append(url).append("\t");
            builder.append(time).append("\t");
            builder.append(pageId).append("\t");

            context.write(NullWritable.get(), new Text(builder.toString()));
        }
    }
}
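
The record format the mapper emits can be tried out without Hadoop on the classpath. The following is a minimal sketch, assuming the same field order as the mapper above; the class name EtlRecordSketch, the toLine method, and the sample values are hypothetical, and the page id is passed in directly because ContentUtils is not shown in this post.

```java
import java.util.HashMap;
import java.util.Map;

public class EtlRecordSketch {

    // Field order used by the mapper in this post.
    static final String[] FIELDS = {"ip", "country", "province", "city", "url", "time"};

    // Builds one output record: every field, then the page id, each followed by a tab.
    static String toLine(Map<String, String> info, String pageId) {
        StringBuilder builder = new StringBuilder();
        for (String field : FIELDS) {
            builder.append(info.get(field)).append("\t");
        }
        builder.append(pageId).append("\t");
        return builder.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample values, only for illustration.
        Map<String, String> info = new HashMap<>();
        info.put("ip", "10.0.0.1");
        info.put("country", "CN");
        info.put("province", "Beijing");
        info.put("city", "Beijing");
        info.put("url", "http://example.com/page?id=42");
        info.put("time", "2013-07-21 00:00:00");
        System.out.println(toLine(info, "42"));
    }
}
```

As in the mapper, a field that the parser does not return is written as the literal text "null", and every record ends with a trailing tab; whether either is acceptable depends on what reads the ETL output downstream.
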
