Hadoop Official Documentation Translation: MapReduce Tutorial

Published: 2023/12/18
  • MapReduce Tutorial
    • Purpose
    • Prerequisites
    • Overview
    • Inputs and Outputs
    • MapReduce - User Interfaces
      • Payload
        • Mapper
        • Reducer
        • Partitioner
        • Counter
      • Job Configuration
      • Task Execution & Environment
        • Memory Management
        • Map Parameters
        • Shuffle/Reduce Parameters
        • Configured Parameters
        • Task Logs
        • Distributing Libraries
      • Job Submission and Monitoring
        • Job Control
      • Job Input
        • InputSplit
        • RecordReader
      • Job Output
        • OutputCommitter
        • Task Side-Effect Files
        • RecordWriter
      • Other Useful Features
        • Submitting Jobs to Queues
        • Counters
        • DistributedCache
        • Profiling
        • Debugging
        • Data Compression
        • Skipping Bad Records

Purpose

This document comprehensively describes all user-facing facets of the Hadoop MapReduce framework and serves as a tutorial.

Prerequisites

Ensure that Hadoop is installed, configured and running. More details:

    • Single Node Setup for first-time users.
    • Cluster Setup for large, distributed clusters.

Overview

Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.

A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them, and re-executing failed tasks.

Typically the compute nodes and the storage nodes are the same, that is, the MapReduce framework and the Hadoop Distributed File System (see HDFS Architecture Guide) are running on the same set of nodes. This configuration allows the framework to effectively schedule tasks on the nodes where data is already present, resulting in very high aggregate bandwidth across the cluster.

The MapReduce framework consists of a single master ResourceManager, one slave NodeManager per cluster-node, and one MRAppMaster per application (see YARN Architecture Guide).

Minimally, applications specify the input/output locations and supply map and reduce functions via implementations of appropriate interfaces and/or abstract classes. These, and other job parameters, comprise the job configuration.

The Hadoop job client then submits the job (jar/executable etc.) and configuration to the ResourceManager, which then assumes the responsibility of distributing the software/configuration to the slaves, scheduling tasks and monitoring them, and providing status and diagnostic information to the job client.

Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in Java.

    • Hadoop Streaming is a utility which allows users to create and run jobs with any executables (e.g. shell utilities) as the mapper and/or the reducer.
    • Hadoop Pipes is a SWIG-compatible C++ API to implement MapReduce applications (not based on JNI).

Inputs and Outputs

The MapReduce framework operates exclusively on <key, value> pairs, that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types.

The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.

Input and Output types of a MapReduce job:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
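
To make the type flow above concrete, here is a minimal plain-Java sketch of word count. It deliberately uses no Hadoop classes: the class name TypeFlowSketch and its methods are hypothetical, the TreeMap only stands in for the framework's sort and grouping, and the optional combine stage is skipped (for word count it would simply be the same summing function as reduce).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of <k1, v1> -> map -> <k2, v2> -> reduce -> <k3, v3> (not Hadoop code).
final class TypeFlowSketch {

    // map: <Long offset, String line> -> zero or more <String word, Integer 1> pairs
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // reduce: <String word, list of counts> -> <String word, Integer total>
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Stand-in for the framework: call map per line, sort and group by key, then reduce.
    static Map<String, Integer> run(List<String> lines) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(offset, line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
            offset += line.length() + 1; // pretend each line ends with one newline byte
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(run(Arrays.asList("hello world", "hello hadoop")));
    }
}
```

Here k1/v1 are the line offset and line text, k2/v2 are a word and the count 1, and k3/v3 are a word and its total, which shows how the three pair types can differ.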

MapReduce - User Interfaces

This section provides a reasonable amount of detail on every user-facing aspect of the MapReduce framework. This should help users implement, configure and tune their jobs in a fine-grained manner. However, please note that the javadoc for each class/interface remains the most comprehensive documentation available; this is only meant to be a tutorial.

Let us first take the Mapper and Reducer interfaces. Applications typically implement them to provide the map and reduce methods.

We will then discuss other core interfaces including Job, Partitioner, InputFormat, OutputFormat, and others.

Finally, we will wrap up by discussing some useful features of the framework such as the DistributedCache, IsolationRunner etc.

Payload

Applications typically implement the Mapper and Reducer interfaces to provide the map and reduce methods. These form the core of the job.

Mapper

Mapper maps input key/value pairs to a set of intermediate key/value pairs.

Maps are the individual tasks that transform input records into intermediate records. The transformed intermediate records do not need to be of the same type as the input records. A given input pair may map to zero or many output pairs.

The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the InputFormat for the job.

Overall, Mapper implementations are passed to the job via the Job.setMapperClass(Class) method. The framework then calls map(WritableComparable, Writable, Context) for each key/value pair in the InputSplit for that task. Applications can then override the cleanup(Context) method to perform any required cleanup.

Output pairs do not need to be of the same types as input pairs. A given input pair may map to zero or many output pairs. Output pairs are collected with calls to context.write(WritableComparable, Writable).

Applications can use the Counter to report their statistics.

All intermediate values associated with a given output key are subsequently grouped by the framework, and passed to the Reducer(s) to determine the final output. Users can control the grouping by specifying a Comparator via Job.setGroupingComparatorClass(Class).

The Mapper outputs are sorted and then partitioned per Reducer. The total number of partitions is the same as the number of reduce tasks for the job. Users can control which keys (and hence records) go to which Reducer by implementing a custom Partitioner.

Users can optionally specify a combiner, via Job.setCombinerClass(Class), to perform local aggregation of the intermediate outputs, which helps to cut down the amount of data transferred from the Mapper to the Reducer.

The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format. Applications can control if, and how, the intermediate outputs are to be compressed and the CompressionCodec to be used via the Configuration.

How Many Maps?

The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files.

The right level of parallelism for maps seems to be around 10-100 maps per node, although it has been set up to 300 maps for very CPU-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute.

Thus, if you expect 10TB of input data and have a blocksize of 128MB, you'll end up with about 82,000 maps, unless Configuration.set(MRJobConfig.NUM_MAPS, int) (which only provides a hint to the framework) is used to set it even higher.
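
The 82,000 figure is just the input size divided by the block size. A small sketch of that arithmetic follows, assuming binary units (1TB = 2^40 bytes, 1MB = 2^20 bytes) and one map per block; the class and method names are hypothetical, and the real split count depends on the InputFormat and on file boundaries.

```java
// Back-of-the-envelope estimate of the number of map tasks (not Hadoop code).
final class MapCountEstimate {

    // Number of block-sized chunks needed to cover the input, rounding up.
    static long estimateMaps(long inputBytes, long blockBytes) {
        return (inputBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long tenTerabytes = 10L * 1024 * 1024 * 1024 * 1024;
        long blockSize = 128L * 1024 * 1024;
        // prints 81920, which the text rounds to 82,000
        System.out.println(estimateMaps(tenTerabytes, blockSize));
    }
}
```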

Reducer

Reducer reduces a set of intermediate values which share a key to a smaller set of values.

The number of reduces for the job is set by the user via Job.setNumReduceTasks(int).

Overall, Reducer implementations are passed to the job via the Job.setReducerClass(Class) method and can override the setup(Context) method to initialize themselves. The framework then calls the reduce(WritableComparable, Iterable<Writable>, Context) method for each <key, (list of values)> pair in the grouped inputs. Applications can then override the cleanup(Context) method to perform any required cleanup.

Reducer has 3 primary phases: shuffle, sort and reduce.

Shuffle

Input to the Reducer is the sorted output of the mappers. In this phase the framework fetches the relevant partition of the output of all the mappers, via HTTP.

Sort

The framework groups Reducer inputs by keys (since different mappers may have output the same key) in this stage.

The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.

Secondary Sort

If equivalence rules for grouping the intermediate keys are required to be different from those for grouping keys before reduction, then one may specify a Comparator via Job.setSortComparatorClass(Class). Since Job.setGroupingComparatorClass(Class) can be used to control how intermediate keys are grouped, these can be used in conjunction to simulate secondary sort on values.
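
The interplay of the two comparators is easier to see in a small model. The sketch below is plain Java, not Hadoop code: CompositeKey, SORT and GROUP are hypothetical names that play the roles of a composite map output key, the sort comparator and the grouping comparator. The sort comparator orders by natural key and then by value, while the grouping comparator compares the natural key only, so each group receives its values already sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of secondary sort (not Hadoop code).
final class SecondarySortSketch {

    // Composite key: the natural key plus the value to be sorted within each group.
    static final class CompositeKey {
        final String naturalKey;
        final int value;

        CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Role of the sort comparator: order by natural key, then by value.
    static final Comparator<CompositeKey> SORT =
        Comparator.comparing((CompositeKey k) -> k.naturalKey)
                  .thenComparingInt(k -> k.value);

    // Role of the grouping comparator: compare the natural key only.
    static final Comparator<CompositeKey> GROUP =
        Comparator.comparing((CompositeKey k) -> k.naturalKey);

    // Sorts with SORT, then opens a new group whenever GROUP sees a different key.
    static Map<String, List<Integer>> groupSorted(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        CompositeKey previous = null;
        List<Integer> current = null;
        for (CompositeKey k : sorted) {
            if (previous == null || GROUP.compare(previous, k) != 0) {
                current = new ArrayList<>();
                groups.put(k.naturalKey, current);
            }
            current.add(k.value);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = Arrays.asList(
            new CompositeKey("b", 3), new CompositeKey("a", 9),
            new CompositeKey("a", 1), new CompositeKey("b", 2));
        // prints {a=[1, 9], b=[2, 3]}
        System.out.println(groupSorted(keys));
    }
}
```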

Reduce

In this phase the reduce(WritableComparable, Iterable<Writable>, Context) method is called for each <key, (list of values)> pair in the grouped inputs.

The output of the reduce task is typically written to the FileSystem via Context.write(WritableComparable, Writable).

Applications can use the Counter to report their statistics.

The output of the Reducer is not sorted.

How Many Reduces?

The right number of reduces seems to be 0.95 or 1.75 multiplied by (<no. of nodes> * <no. of maximum containers per node>).

With 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. With 1.75 the faster nodes will finish their first round of reduces and launch a second wave of reduces, doing a much better job of load balancing.

Increasing the number of reduces increases the framework overhead, but improves load balancing and lowers the cost of failures.

The scaling factors above are slightly less than whole numbers to reserve a few reduce slots in the framework for speculative tasks and failed tasks.
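
As a quick illustration of the rule of thumb above, the sketch below computes both suggested values for an example cluster. The class name, method name and the cluster size of 10 nodes with 8 containers each are made up for the example; the result would be passed to Job.setNumReduceTasks(int).

```java
// Rule-of-thumb reduce count: factor * nodes * max containers per node (not Hadoop code).
final class ReduceCountEstimate {

    static int reduces(double factor, int nodes, int maxContainersPerNode) {
        return (int) Math.round(factor * (nodes * maxContainersPerNode));
    }

    public static void main(String[] args) {
        int nodes = 10;               // example value
        int containersPerNode = 8;    // example value
        // Single wave: every reduce can start as soon as the maps finish.
        System.out.println(reduces(0.95, nodes, containersPerNode)); // prints 76
        // Two waves: faster nodes pick up a second round of reduces.
        System.out.println(reduces(1.75, nodes, containersPerNode)); // prints 140
    }
}
```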

Reducer NONE

It is legal to set the number of reduce-tasks to zero if no reduction is desired.

In this case the outputs of the map-tasks go directly to the FileSystem, into the output path set by FileOutputFormat.setOutputPath(Job, Path). The framework does not sort the map-outputs before writing them out to the FileSystem.

Partitioner

Partitioner partitions the key space.

Partitioner controls the partitioning of the keys of the intermediate map-outputs. The key (or a subset of the key) is used to derive the partition, typically by a hash function. The total number of partitions is the same as the number of reduce tasks for the job. Hence this controls which of the m reduce tasks the intermediate key (and hence the record) is sent to for reduction.

HashPartitioner is the default Partitioner.
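
Hash partitioning boils down to one line of arithmetic: clear the sign bit of the key's hash code and take the remainder modulo the number of reduce tasks. The sketch below shows that calculation in plain Java with a hypothetical class name; it uses java.lang hash codes, so the partition numbers it prints will not match those of Hadoop key types such as Text, which compute their hash differently.

```java
// Plain-Java sketch of hash partitioning (not the Hadoop Partitioner class itself).
final class HashPartitionSketch {

    // Clears the sign bit so the value is non-negative, then maps it onto [0, numReduceTasks).
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4; // example value
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> partition " + partitionFor(key, reducers));
        }
    }
}
```

Because the result depends only on the key, every record with the same key lands in the same partition and therefore at the same reduce task.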

Counter

Counter is a facility for MapReduce applications to report their statistics.

Mapper and Reducer implementations can use the Counter to report statistics.

Hadoop MapReduce comes bundled with a library of generally useful mappers, reducers, and partitioners.

Job Configuration

Job represents a MapReduce job configuration.

Job is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution. The framework tries to faithfully execute the job as described by Job, however:

    • Some configuration parameters may have been marked as final by administrators (see Final Parameters) and hence cannot be altered.
    • While some job parameters are straightforward to set (e.g. Job.setNumReduceTasks(int)), other parameters interact subtly with the rest of the framework and/or job configuration and are more complex to set (e.g. Configuration.set(JobContext.NUM_MAPS, int)).

Job is typically used to specify the Mapper, combiner (if any), Partitioner, Reducer, InputFormat and OutputFormat implementations. FileInputFormat indicates the set of input files (FileInputFormat.setInputPaths(Job, Path...) / FileInputFormat.addInputPath(Job, Path), or FileInputFormat.setInputPaths(Job, String...) / FileInputFormat.addInputPaths(Job, String)), and FileOutputFormat indicates where the output files should be written (FileOutputFormat.setOutputPath(Job, Path)).

Optionally, Job is used to specify other advanced facets of the job such as the Comparator to be used, files to be put in the DistributedCache, whether intermediate and/or job outputs are to be compressed (and how), whether job tasks can be executed in a speculative manner (setMapSpeculativeExecution(boolean) / setReduceSpeculativeExecution(boolean)), the maximum number of attempts per task (setMaxMapAttempts(int) / setMaxReduceAttempts(int)) etc.

Of course, users can use Configuration.set(String, String) / Configuration.get(String) to set/get arbitrary parameters needed by applications. However, use the DistributedCache for large amounts of (read-only) data.

Task Execution & Environment

The MRAppMaster executes the Mapper/Reducer task as a child process in a separate JVM.

The child task inherits the environment of the parent MRAppMaster. The user can specify additional options to the child JVM via the mapreduce.{map|reduce}.java.opts configuration parameters in the Job, for example non-standard paths for the run-time linker to search for shared libraries via -Djava.library.path=<>. If the mapreduce.{map|reduce}.java.opts parameters contain the symbol @taskid@, it is interpolated with the value of the taskid of the MapReduce task.

Here is an example with multiple arguments and substitutions, showing JVM GC logging and the start of a passwordless JVM JMX agent so that jconsole and the like can connect to it to watch child memory and threads and to get thread dumps. It also sets the maximum heap size of the map and reduce child JVMs to 512MB and 1024MB respectively, and adds an additional path to the java.library.path of the child JVM.

    <property>
      <name>mapreduce.map.java.opts</name>
      <value>
      -Xmx512M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
      -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
      </value>
    </property>

    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>
      -Xmx1024M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
      -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
      </value>
    </property>

Memory Management

Users/admins can also specify the maximum virtual memory of the launched child task, and any sub-process it launches recursively, using mapreduce.{map|reduce}.memory.mb. Note that the value set here is a per-process limit. The value for mapreduce.{map|reduce}.memory.mb should be specified in megabytes (MB). The value must also be greater than or equal to the -Xmx passed to the Java VM, else the VM might not start.

Note: mapreduce.{map|reduce}.java.opts are used only for configuring the child tasks launched from the MRAppMaster. Configuring the memory options for daemons is documented in Configuring the Environment of the Hadoop Daemons.

The memory available to some parts of the framework is also configurable. In map and reduce tasks, performance may be influenced by adjusting parameters that affect the concurrency of operations and the frequency with which data will hit disk. Monitoring the filesystem counters for a job, particularly relative to byte counts from the map and into the reduce, is invaluable to the tuning of these parameters.
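
The constraint that memory.mb must be at least as large as -Xmx can be checked mechanically before a job is submitted. The following is a hypothetical helper, not part of Hadoop: it only understands -Xmx values with a k, m or g suffix and treats an option string without -Xmx as fitting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-submission check: is the container limit >= the requested JVM heap?
final class HeapVersusContainerCheck {

    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG])");

    // Returns the -Xmx value in megabytes, or -1 if the option string has no -Xmx flag.
    static long heapMb(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long n = Long.parseLong(m.group(1));
        switch (Character.toLowerCase(m.group(2).charAt(0))) {
            case 'k': return n / 1024;
            case 'g': return n * 1024;
            default:  return n; // 'm'
        }
    }

    // True when memoryMb (the memory.mb setting) covers the heap requested in javaOpts.
    static boolean fits(String javaOpts, long memoryMb) {
        return heapMb(javaOpts) <= memoryMb;
    }

    public static void main(String[] args) {
        System.out.println(fits("-Xmx512M -verbose:gc", 1024)); // prints true
        System.out.println(fits("-Xmx1024M", 512));             // prints false
    }
}
```

In practice the limit is usually set somewhat above -Xmx, since the JVM also needs memory outside the heap.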

Map Parameters

A record emitted from a map will be serialized into a buffer and metadata will be stored into accounting buffers. As described in the following options, when either the serialization buffer or the metadata exceeds a threshold, the contents of the buffers will be sorted and written to disk in the background while the map continues to output records. If either buffer fills completely while the spill is in progress, the map thread will block. When the map is finished, any remaining records are written to disk and all on-disk segments are merged into a single file. Minimizing the number of spills to disk can decrease map time, but a larger buffer also decreases the memory available to the mapper.

Name | Type | Description
mapreduce.task.io.sort.mb | int | The cumulative size of the serialization and accounting buffers storing records emitted from the map, in megabytes.
mapreduce.map.sort.spill.percent | float | The soft limit in the serialization buffer. Once reached, a thread will begin to spill the contents to disk in the background.

Other notes

    • If either spill threshold is exceeded while a spill is in progress, collection will continue until the spill is finished. For example, if mapreduce.map.sort.spill.percent is set to 0.33, and the remainder of the buffer is filled while the spill runs, the next spill will include all the collected records, or 0.66 of the buffer, and will not generate additional spills. In other words, the thresholds are defining triggers, not blocking.
    • A record larger than the serialization buffer will first trigger a spill, then be spilled to a separate file. It is undefined whether or not this record will first pass through the combiner.
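
To get a feel for how the two map-side parameters combine, the sketch below computes the amount of buffered data at which a background spill would start. It is a simplification with hypothetical names: it treats the whole mapreduce.task.io.sort.mb buffer as one pool and ignores how the framework divides it between serialized records and accounting metadata. The values 100 and 0.80 are example inputs.

```java
// Simplified model of the map-side spill trigger (not Hadoop code).
final class SpillThresholdSketch {

    // Bytes of buffered output at which a background spill would begin.
    static long spillTriggerBytes(int sortMb, double spillPercent) {
        return Math.round(sortMb * 1024L * 1024L * spillPercent);
    }

    public static void main(String[] args) {
        int sortMb = 100;           // example mapreduce.task.io.sort.mb
        double spillPercent = 0.80; // example mapreduce.map.sort.spill.percent
        long trigger = spillTriggerBytes(sortMb, spillPercent);
        // prints 83886080 bytes (80 MB)
        System.out.println(trigger + " bytes (" + trigger / (1024 * 1024) + " MB)");
    }
}
```

The remaining part of the buffer (20 MB in this example) is what lets the map keep emitting records while the spill runs.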

Shuffle/Reduce Parameters

As described previously, each reduce fetches the output assigned to it by the Partitioner via HTTP into memory and periodically merges these outputs to disk. If intermediate compression of map outputs is turned on, each output is decompressed into memory. The following options affect the frequency of these merges to disk prior to the reduce and the memory allocated to map output during the reduce.

Name | Type | Description
mapreduce.task.io.sort.factor | int | Specifies the number of segments on disk to be merged at the same time. It limits the number of open files and compression codecs during merge. If the number of files exceeds this limit, the merge will proceed in several passes. Though this limit also applies to the map, most jobs should be configured so that hitting this limit is unlikely there.
mapreduce.reduce.merge.inmem.threshold | int | The number of sorted map outputs fetched into memory before being merged to disk. Like the spill thresholds in the preceding note, this is not defining a unit of partition, but a trigger. In practice, this is usually set very high (1000) or disabled (0), since merging in-memory segments is often less expensive than merging from disk (see notes following this table). This threshold influences only the frequency of in-memory merges during the shuffle.
mapreduce.reduce.shuffle.merge.percent | float | The memory threshold for fetched map outputs before an in-memory merge is started, expressed as a percentage of memory allocated to storing map outputs in memory. Since map outputs that can't fit in memory can be stalled, setting this high may decrease parallelism between the fetch and merge. Conversely, values as high as 1.0 have been effective for reduces whose input can fit entirely in memory. This parameter influences only the frequency of in-memory merges during the shuffle.
mapreduce.reduce.shuffle.input.buffer.percent | float | The percentage of memory, relative to the maximum heap size as typically specified in mapreduce.reduce.java.opts, that can be allocated to storing map outputs during the shuffle. Though some memory should be set aside for the framework, in general it is advantageous to set this high enough to store large and numerous map outputs.
mapreduce.reduce.input.buffer.percent | float | The percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce. When the reduce begins, map outputs will be merged to disk until those that remain are under the resource limit this defines. By default, all map outputs are merged to disk before the reduce begins to maximize the memory available to the reduce. For less memory-intensive reduces, this should be increased to avoid trips to disk.

Other notes

    • If a map output is larger than 25 percent of the memory allocated to copying map outputs, it will be written directly to disk without first staging through memory.
    • When running with a combiner, the reasoning about high merge thresholds and large buffers may not hold. For merges started before all map outputs have been fetched, the combiner is run while spilling to disk. In some cases, one can obtain better reduce times by spending resources combining map outputs (making disk spills small and parallelizing spilling and fetching) rather than aggressively increasing buffer sizes.
    • When merging in-memory map outputs to disk to begin the reduce, if an intermediate merge is necessary because there are segments to spill and at least mapreduce.task.io.sort.factor segments already on disk, the in-memory map outputs will be part of the intermediate merge.
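
The 25 percent rule from the notes above can be modelled in a few lines. This is a rough sketch with hypothetical names: it takes the shuffle buffer to be the maximum heap size multiplied by mapreduce.reduce.shuffle.input.buffer.percent, as the table describes, and ignores any other bookkeeping the framework does.

```java
// Rough model of reduce-side shuffle memory and the 25 percent rule (not Hadoop code).
final class ShuffleMemorySketch {

    // Memory that may hold fetched map outputs during the shuffle.
    static long shuffleBufferBytes(long maxHeapBytes, double inputBufferPercent) {
        return Math.round(maxHeapBytes * inputBufferPercent);
    }

    // A single map output larger than 25 percent of the shuffle buffer skips memory.
    static boolean goesDirectlyToDisk(long mapOutputBytes, long shuffleBufferBytes) {
        return mapOutputBytes > shuffleBufferBytes / 4;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long buffer = shuffleBufferBytes(1024 * mb, 0.70); // example: 1024MB heap, 0.70
        System.out.println(goesDirectlyToDisk(200 * mb, buffer)); // prints true
        System.out.println(goesDirectlyToDisk(100 * mb, buffer)); // prints false
    }
}
```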

Configured Parameters

The following properties are localized in the job configuration for each task's execution:

Name | Type | Description
mapreduce.job.id | String | The job id
mapreduce.job.jar | String | job.jar location in job directory
mapreduce.job.local.dir | String | The job specific shared scratch space
mapreduce.task.id | String | The task id
mapreduce.task.attempt.id | String | The task attempt id
mapreduce.task.is.map | boolean | Is this a map task
mapreduce.task.partition | int | The id of the task within the job
mapreduce.map.input.file | String | The filename that the map is reading from
mapreduce.map.input.start | long | The offset of the start of the map input split
mapreduce.map.input.length | long | The number of bytes in the map input split
mapreduce.task.output.dir | String | The task's temporary output directory

Note: During the execution of a streaming job, the names of the "mapreduce" parameters are transformed. The dots ( . ) become underscores ( _ ). For example, mapreduce.job.id becomes mapreduce_job_id and mapreduce.job.jar becomes mapreduce_job_jar. To get the values in a streaming job's mapper/reducer, use the parameter names with the underscores.
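
The renaming rule in the note can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and it assumes the transformed names are visible to the streaming process as environment variables, which the note itself does not state.

```java
import java.util.Map;

/** Sketch: map a job property name to the name a streaming task would look up. */
class StreamingParamNames {

    /** Replaces every dot with an underscore, as described in the note. */
    static String toStreamingName(String propertyName) {
        return propertyName.replace('.', '_');
    }

    /** Looks the transformed name up in the given environment; null if absent. */
    static String lookup(Map<String, String> env, String propertyName) {
        return env.get(toStreamingName(propertyName));
    }

    public static void main(String[] args) {
        // Assumption: System.getenv() stands in for the streaming task's environment.
        String jobId = lookup(System.getenv(), "mapreduce.job.id");
        System.out.println("mapreduce_job_id = " + jobId);
    }
}
```

Outside a real streaming task the lookup simply yields null, so the sketch is safe to run anywhere.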

Task Logs

The standard output (stdout) and error (stderr) streams and the syslog of the task are read by the NodeManager and logged to ${HADOOP_LOG_DIR}/userlogs.

Distributing Libraries

The DistributedCache can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks. The child JVM always has its current working directory added to the java.library.path and LD_LIBRARY_PATH. Hence the cached libraries can be loaded via System.loadLibrary or System.load. More details on how to load shared libraries through the distributed cache are documented at Native Libraries.

Job Submission and Monitoring

Job is the primary interface by which a user job interacts with the ResourceManager.

Job provides facilities to submit jobs, track their progress, access component-tasks' reports and logs, get the MapReduce cluster's status information and so on.

The job submission process involves:

  • Checking the input and output specifications of the job.
  • Computing the InputSplit values for the job.
  • Setting up the requisite accounting information for the DistributedCache of the job, if necessary.
  • Copying the job's jar and configuration to the MapReduce system directory on the FileSystem.
  • Submitting the job to the ResourceManager and optionally monitoring its status.

    Job history files are also logged to the user-specified directories mapreduce.jobhistory.intermediate-done-dir and mapreduce.jobhistory.done-dir, which default to the job output directory.

    The user can view a summary of the history logs in the specified directory using the following command:
    $ mapred job -history output.jhist
    This command prints job details, along with failed and killed TIP (task-in-progress) details.
    More details about the job, such as successful tasks and the task attempts made for each task, can be viewed using the following command:
    $ mapred job -history all output.jhist

    Normally the user uses Job to create the application, describe various facets of the job, submit the job, and monitor its progress.

    Job Control

    Users may need to chain MapReduce jobs to accomplish complex tasks which cannot be done via a single MapReduce job. This is fairly easy since the output of a job typically goes to the distributed file system, and that output, in turn, can be used as the input for the next job.

    However, this also means that the onus of ensuring jobs are complete (success/failure) lies squarely on the clients. In such cases, the various job-control options are:

      • Job.submit(): Submit the job to the cluster and return immediately.
      • Job.waitForCompletion(boolean): Submit the job to the cluster and wait for it to finish.

    Job Input

    InputFormat describes the input-specification for a MapReduce job.

    The MapReduce framework relies on the InputFormat of the job to:

  • Validate the input-specification of the job.
  • Split up the input file(s) into logical InputSplit instances, each of which is then assigned to an individual Mapper.
  • Provide the RecordReader implementation used to glean input records from the logical InputSplit for processing by the Mapper.

    The default behavior of file-based InputFormat implementations, typically sub-classes of FileInputFormat, is to split the input into logical InputSplit instances based on the total size, in bytes, of the input files. However, the FileSystem blocksize of the input files is treated as an upper bound for input splits. A lower bound on the split size can be set via mapreduce.input.fileinputformat.split.minsize.
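
The relationship between the block size and the lower bound can be sketched as a clamping rule. The code below is a simplified illustration, not the framework's source: the class name, the upper-bound parameter maxSize, and the plain ceiling division used to count splits are assumptions made for the example. Note that when the configured minimum exceeds the block size, the minimum wins in this sketch.

```java
/** Sketch of the split-size clamping rule; not the framework's own code. */
class SplitSizeSketch {

    /** Clamps the block size between a lower bound and an (assumed) upper bound. */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Simplified split count for one file: ceiling of fileLength / splitSize. */
    static long countSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb;   // hypothetical block size
        long fileLength = 1024 * mb; // hypothetical 1 GB input file

        // With no meaningful lower bound, the block size decides the split size.
        long defaultSplit = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        // Raising the lower bound above the block size yields fewer, larger splits.
        long raisedSplit = computeSplitSize(blockSize, 256 * mb, Long.MAX_VALUE);

        System.out.println(countSplits(fileLength, defaultSplit)); // prints 8
        System.out.println(countSplits(fileLength, raisedSplit));  // prints 4
    }
}
```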

    Clearly, logical splits based on input size are insufficient for many applications, since record boundaries must be respected. In such cases, the application should implement a RecordReader, which is responsible for respecting record boundaries and presents a record-oriented view of the logical InputSplit to the individual task.

    TextInputFormat is the default InputFormat.

    If TextInputFormat is the InputFormat for a given job, the framework detects input files with the .gz extension and automatically decompresses them using the appropriate CompressionCodec. However, it must be noted that compressed files with this extension cannot be split, and each compressed file is processed in its entirety by a single mapper.

    InputSplit

    InputSplit represents the data to be processed by an individual Mapper.

    Typically InputSplit presents a byte-oriented view of the input, and it is the responsibility of RecordReader to process and present a record-oriented view.

    FileSplit is the default InputSplit. It sets mapreduce.map.input.file to the path of the input file for the logical split.

    RecordReader

    RecordReader reads <key, value> pairs from an InputSplit.

    Typically the RecordReader converts the byte-oriented view of the input, provided by the InputSplit, and presents a record-oriented view to the Mapper implementations for processing. RecordReader thus assumes the responsibility of processing record boundaries and presents the tasks with keys and values.

    Job Output

    OutputFormat describes the output-specification for a MapReduce job.

    The MapReduce framework relies on the OutputFormat of the job to:

  • Validate the output-specification of the job; for example, check that the output directory doesn't already exist.
  • Provide the RecordWriter implementation used to write the output files of the job. Output files are stored in a FileSystem.

    TextOutputFormat is the default OutputFormat.

    OutputCommitter

    OutputCommitter describes the commit of task output for a MapReduce job.

    The MapReduce framework relies on the OutputCommitter of the job to:

  • Set up the job during initialization. For example, create the temporary output directory for the job during the initialization of the job. Job setup is done by a separate task when the job is in PREP state and after initializing tasks. Once the setup task completes, the job will be moved to RUNNING state.
  • Clean up the job after the job completion. For example, remove the temporary output directory after the job completion. Job cleanup is done by a separate task at the end of the job. The job is declared SUCCEEDED/FAILED/KILLED after the cleanup task completes.
  • Set up the task temporary output. Task setup is done as part of the same task, during task initialization.
  • Check whether a task needs a commit. This is to avoid the commit procedure if a task does not need to commit.
  • Commit the task output. Once the task is done, it will commit its output if required.
  • Discard the task commit. If the task has failed or been killed, the output will be cleaned up. If the task could not clean up (in an exception block), a separate task will be launched with the same attempt-id to do the cleanup.

    FileOutputCommitter is the default OutputCommitter. Job setup/cleanup tasks occupy map or reduce containers, whichever is available on the NodeManager. The JobCleanup task, TaskCleanup tasks and JobSetup task have the highest priority, in that order.

    Task Side-Effect Files

    In some applications, component tasks need to create and/or write to side-files, which differ from the actual job-output files.

    In such cases there could be issues with two instances of the same Mapper or Reducer running simultaneously (for example, speculative tasks) trying to open and/or write to the same file (path) on the FileSystem. Hence the application-writer will have to pick unique names per task-attempt (using the attempt id, say attempt_200709221812_0001_m_000000_0), not just per task.

    To avoid these issues the MapReduce framework, when the OutputCommitter is FileOutputCommitter, maintains a special ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid} sub-directory, accessible via ${mapreduce.task.output.dir}, for each task-attempt on the FileSystem where the output of the task-attempt is stored. On successful completion of the task-attempt, the files in ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid} (only) are promoted to ${mapreduce.output.fileoutputformat.outputdir}. Of course, the framework discards the sub-directory of unsuccessful task-attempts. This process is completely transparent to the application.

    The application-writer can take advantage of this feature by creating any side-files required in ${mapreduce.task.output.dir} during execution of a task via FileOutputFormat.getWorkOutputPath(Context), and the framework will promote them similarly for successful task-attempts, thus eliminating the need to pick unique paths per task-attempt.

    Note: The value of ${mapreduce.task.output.dir} during execution of a particular task-attempt is actually ${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}, and this value is set by the MapReduce framework. So, just create any side-files in the path returned by FileOutputFormat.getWorkOutputPath(Context) from a MapReduce task to take advantage of this feature.

    The entire discussion holds true for maps of jobs with reducer=NONE (i.e. 0 reduces) since the output of the map, in that case, goes directly to HDFS.
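
The directory layout described in this section can be sketched with plain string handling. This is only an illustration of the naming scheme as the text states it: the class, the output directory /user/demo/out and the side-file name stats.txt are hypothetical, and a real task would obtain its directory from the framework rather than build it by hand.

```java
/** Sketch of the per-attempt side-file layout described in the text. */
class SideFilePathSketch {

    /** Builds ${outputdir}/_temporary/_${taskid} from its two parts. */
    static String workOutputDir(String outputDir, String taskAttemptId) {
        return outputDir + "/_temporary/_" + taskAttemptId;
    }

    /** Path of a side-file inside the attempt's private directory. */
    static String sideFile(String outputDir, String taskAttemptId, String name) {
        return workOutputDir(outputDir, taskAttemptId) + "/" + name;
    }

    /** Where the same file lives once a successful attempt has been promoted. */
    static String promoted(String outputDir, String name) {
        return outputDir + "/" + name;
    }

    public static void main(String[] args) {
        String out = "/user/demo/out"; // hypothetical job output directory
        // Two attempts of the same task, e.g. an original and a speculative one.
        String first = sideFile(out, "attempt_200709221812_0001_m_000000_0", "stats.txt");
        String second = sideFile(out, "attempt_200709221812_0001_m_000000_1", "stats.txt");

        System.out.println(first);
        System.out.println(second);
        // Both attempts use the same file name without colliding on the FileSystem.
        System.out.println(first.equals(second)); // prints false
        System.out.println(promoted(out, "stats.txt"));
    }
}
```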

    RecordWriter

    RecordWriter writes the output <key, value> pairs to an output file.

    RecordWriter implementations write the job outputs to the FileSystem.

    Other Useful Features

    Submitting Jobs to Queues

    Users submit jobs to queues. Queues, as collections of jobs, allow the system to provide specific functionality. For example, queues use ACLs to control which users can submit jobs to them. Queues are expected to be primarily used by Hadoop Schedulers.

    Hadoop comes configured with a single mandatory queue, called 'default'. Queue names are defined in the mapreduce.job.queuename property of the Hadoop site configuration. Some job schedulers, such as the Capacity Scheduler, support multiple queues.

    A job defines the queue it needs to be submitted to through the mapreduce.job.queuename property, or through the Configuration.set(MRJobConfig.QUEUE_NAME, String) API. Setting the queue name is optional. If a job is submitted without an associated queue name, it is submitted to the 'default' queue.

    Counters

    Counters represent global counters, defined either by the MapReduce framework or applications. Each Counter can be of any Enum type. Counters of a particular Enum are bunched into groups of type Counters.Group.

    Applications can define arbitrary Counters (of type Enum) and update them via Counters.incrCounter(Enum, long) or Counters.incrCounter(String, String, long) in the map and/or reduce methods. These counters are then globally aggregated by the framework.

    DistributedCache

    DistributedCache distributes application-specific, large, read-only files efficiently.

    DistributedCache is a facility provided by the MapReduce framework to cache files (text, archives, jars and so on) needed by applications.

    Applications specify the files to be cached via URLs (hdfs://) in the Job. The DistributedCache assumes that the files specified via hdfs:// URLs are already present on the FileSystem.

    The framework will copy the necessary files to the slave node before any tasks for the job are executed on that node. Its efficiency stems from the fact that the files are only copied once per job, and from the ability to cache archives, which are un-archived on the slaves.

    DistributedCache tracks the modification timestamps of the cached files. Clearly the cache files should not be modified by the application or externally while the job is executing.

    DistributedCache can be used to distribute simple, read-only data/text files and more complex types such as archives and jars. Archives (zip, tar, tgz and tar.gz files) are un-archived at the slave nodes. Files have execution permissions set.

    The files/archives can be distributed by setting the property mapreduce.job.cache.{files|archives}. If more than one file/archive has to be distributed, they can be added as comma-separated paths. The properties can also be set by the APIs Job.addCacheFile(URI)/Job.addCacheArchive(URI) and Job.setCacheFiles(URI[])/Job.setCacheArchives(URI[]), where URI is of the form hdfs://host:port/absolute-path#link-name. In Streaming, the files can be distributed through the command-line options -cacheFile/-cacheArchive.
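
The URI form hdfs://host:port/absolute-path#link-name can be taken apart with the standard java.net.URI class. The sketch below is illustrative only: the host, port and paths are hypothetical, and falling back to the file name when no #link-name is present is an assumption of this example rather than something stated in the text.

```java
import java.net.URI;

/** Sketch: inspect cache URIs of the form hdfs://host:port/absolute-path#link-name. */
class CacheUriSketch {

    /** Returns the link name: the URI fragment, or (assumption) the file name if absent. */
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Joins several URIs into the comma-separated form used by the property value. */
    static String joinForProperty(URI... uris) {
        StringBuilder joined = new StringBuilder();
        for (URI uri : uris) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(uri);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Hypothetical host, port and paths, used purely for illustration.
        URI dict = URI.create("hdfs://namenode:8020/apps/lookup/dict.txt#dict");
        URI stop = URI.create("hdfs://namenode:8020/apps/lookup/stopwords.txt");

        System.out.println(linkName(dict)); // prints dict
        System.out.println(linkName(stop)); // prints stopwords.txt
        System.out.println(joinForProperty(dict, stop));
    }
}
```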

    The DistributedCache can also be used as a rudimentary software distribution mechanism for use in the map and/or reduce tasks. It can be used to distribute both jars and native libraries. The Job.addArchiveToClassPath(Path) or Job.addFileToClassPath(Path) API can be used to cache files/jars and also add them to the classpath of the child JVM. The same can be done by setting the configuration properties mapreduce.job.classpath.{files|archives}. Similarly, the cached files that are symlinked into the working directory of the task can be used to distribute native libraries and load them.

    Private and Public DistributedCache Files

    DistributedCache files can be private or public, which determines how they can be shared on the slave nodes.

      • "Private" DistributedCache files are cached in a local directory private to the user whose jobs need these files. These files are shared by all tasks and jobs of the specific user only and cannot be accessed by jobs of other users on the slaves. A DistributedCache file becomes private by virtue of its permissions on the file system where the files are uploaded, typically HDFS. If the file has no world-readable access, or if the directory path leading to the file has no world-executable access for lookup, then the file becomes private.
      • "Public" DistributedCache files are cached in a global directory and the file access is set up such that they are publicly visible to all users. These files can be shared by tasks and jobs of all users on the slaves. A DistributedCache file becomes public by virtue of its permissions on the file system where the files are uploaded, typically HDFS. If the file has world-readable access, AND if the directory path leading to the file has world-executable access for lookup, then the file becomes public. In other words, if the user intends to make a file publicly available to all users, the file permissions must be set to be world readable, and the directory permissions on the path leading to the file must be world executable.
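
The public/private rule above reduces to a check on permission bits. The following sketch expresses that rule with the JDK's POSIX permission helpers; it is an illustration of the rule as stated, not the framework's implementation, and the class name and the permission strings in the example are hypothetical.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Sketch: decide public vs. private from "rwxr-xr-x"-style permission strings. */
class CacheVisibilitySketch {

    /**
     * A file is public only if it is world readable AND every directory
     * on the path leading to it is world executable.
     */
    static boolean isPublic(String filePerms, List<String> ancestorDirPerms) {
        Set<PosixFilePermission> file = PosixFilePermissions.fromString(filePerms);
        if (!file.contains(PosixFilePermission.OTHERS_READ)) {
            return false;
        }
        for (String dirPerms : ancestorDirPerms) {
            Set<PosixFilePermission> dir = PosixFilePermissions.fromString(dirPerms);
            if (!dir.contains(PosixFilePermission.OTHERS_EXECUTE)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // World-readable file under world-executable directories: public.
        System.out.println(isPublic("rw-r--r--", Arrays.asList("rwxr-xr-x", "rwxr-xr-x"))); // prints true
        // Same file, but one directory on the path blocks lookup by others: private.
        System.out.println(isPublic("rw-r--r--", Arrays.asList("rwxr-xr-x", "rwx------"))); // prints false
        // File itself is not world readable: private.
        System.out.println(isPublic("rw-r-----", Arrays.asList("rwxr-xr-x"))); // prints false
    }
}
```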

    Profiling

    Profiling is a utility for obtaining a representative sample (2 or 3 tasks) of built-in Java profiler output for a sample of maps and reduces.

    The user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property mapreduce.task.profile. The value can be set using the API Configuration.set(MRJobConfig.TASK_PROFILE, boolean). If the value is set to true, task profiling is enabled. The profiler information is stored in the user log directory. By default, profiling is not enabled for the job.

    Once the user configures that profiling is needed, she/he can use the configuration property mapreduce.task.profile.{maps|reduces} to set the ranges of MapReduce tasks to profile. The value can be set using the API Configuration.set(MRJobConfig.NUM_{MAP|REDUCE}_PROFILES, String). By default, the specified range is 0-2.

    The user can also specify the profiler configuration arguments by setting the configuration property mapreduce.task.profile.params. The value can be specified using the API Configuration.set(MRJobConfig.TASK_PROFILE_PARAMS, String). If the string contains a %s, it will be replaced with the name of the profiling output file when the task runs. These parameters are passed to the task child JVM on the command line. The default value for the profiling parameters is -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s.
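
Two details of the profiling settings lend themselves to a small sketch: the %s placeholder in the parameter string and the task range such as 0-2. The code below is a hedged illustration only. The class name and the output file name profile.out are hypothetical, the range parser is a simplification, and treating 0-2 as inclusive on both ends is an assumption.

```java
/** Sketch: expand the %s placeholder and test a task index against a range string. */
class ProfileParamsSketch {

    /** Default value of the profiling parameters, as quoted in the text. */
    static final String DEFAULT_PARAMS =
        "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s";

    /** Replaces %s with the name of the profiling output file. */
    static String expand(String params, String profileFile) {
        return params.replace("%s", profileFile);
    }

    /** Simplified parser for ranges such as "0-2" or "0-2,5" (bounds assumed inclusive). */
    static boolean inRange(String ranges, int taskIndex) {
        for (String part : ranges.split(",")) {
            String[] bounds = part.trim().split("-");
            int low = Integer.parseInt(bounds[0]);
            int high = bounds.length > 1 ? Integer.parseInt(bounds[1]) : low;
            if (taskIndex >= low && taskIndex <= high) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "profile.out" is a hypothetical file name chosen for the example.
        System.out.println(expand(DEFAULT_PARAMS, "profile.out"));
        System.out.println(inRange("0-2", 2)); // prints true
        System.out.println(inRange("0-2", 3)); // prints false
    }
}
```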

    Debugging

    The MapReduce framework provides a facility to run user-provided scripts for debugging. When a MapReduce task fails, a user can run a debug script, to process task logs for example. The script is given access to the task's stdout and stderr outputs, syslog and jobconf. The output from the debug script's stdout and stderr is displayed on the console diagnostics and also as part of the job UI.

    In the following sections we discuss how to submit a debug script with a job. The script file needs to be distributed and submitted to the framework.

    How to distribute the script file:

    The user needs to use DistributedCache to distribute and symlink the script file.

    How to submit the script:

    A quick way to submit the debug script is to set values for the properties mapreduce.map.debug.script and mapreduce.reduce.debug.script, for debugging map and reduce tasks respectively. These properties can also be set by using the APIs Configuration.set(MRJobConfig.MAP_DEBUG_SCRIPT, String) and Configuration.set(MRJobConfig.REDUCE_DEBUG_SCRIPT, String). In streaming mode, a debug script can be submitted with the command-line options -mapdebug and -reducedebug, for debugging map and reduce tasks respectively.

    The arguments to the script are the task's stdout, stderr, syslog and jobconf files. The debug command, run on the node where the MapReduce task failed, is:
    $script $stdout $stderr $syslog $jobconf

    Pipes programs have the C++ program name as a fifth argument for the command. Thus for pipes programs the command is:
    $script $stdout $stderr $syslog $jobconf $program
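
The argument order of the debug command can be sketched as a small helper that assembles the command line. This only mirrors the order given above; the class name and all file names in the example are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: assemble the debug-script command line in the order given in the text. */
class DebugCommandSketch {

    /** Pass null for pipesProgram when the failed task is not a pipes program. */
    static List<String> buildCommand(String script, String stdout, String stderr,
                                     String syslog, String jobconf, String pipesProgram) {
        List<String> command =
            new ArrayList<>(Arrays.asList(script, stdout, stderr, syslog, jobconf));
        if (pipesProgram != null) {
            command.add(pipesProgram); // fifth argument, pipes programs only
        }
        return command;
    }

    public static void main(String[] args) {
        // All file names below are placeholders for illustration.
        System.out.println(String.join(" ",
            buildCommand("./debug.sh", "stdout", "stderr", "syslog", "job.xml", null)));
        System.out.println(String.join(" ",
            buildCommand("./debug.sh", "stdout", "stderr", "syslog", "job.xml", "wordcount")));
    }
}
```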

    Default Behavior:

    For pipes, a default script is run that processes core dumps under gdb, prints the stack trace and gives information about running threads.

    Data Compression

    Hadoop MapReduce provides facilities for the application-writer to specify compression for both intermediate map-outputs and the job-outputs, i.e. the output of the reduces. It also comes bundled with a CompressionCodec implementation for the zlib compression algorithm. The gzip, bzip2, snappy, and lz4 file formats are also supported.

    Hadoop also provides native implementations of the above compression codecs for reasons of both performance (zlib) and non-availability of Java libraries. More details on their usage and availability are available here.

    Intermediate Outputs

    Applications can control compression of intermediate map-outputs via the Configuration.set(MRJobConfig.MAP_OUTPUT_COMPRESS, boolean) API, and the CompressionCodec to be used via the Configuration.set(MRJobConfig.MAP_OUTPUT_COMPRESS_CODEC, Class) API.

    Job Outputs

    Applications can control compression of job-outputs via the FileOutputFormat.setCompressOutput(Job, boolean) API, and the CompressionCodec to be used can be specified via the FileOutputFormat.setOutputCompressorClass(Job, Class) API.

    If the job outputs are to be stored in the SequenceFileOutputFormat, the required SequenceFile.CompressionType (i.e. RECORD / BLOCK, defaulting to RECORD) can be specified via the SequenceFileOutputFormat.setOutputCompressionType(Job, SequenceFile.CompressionType) API.

    Skipping Bad Records

    Hadoop provides an option where a certain set of bad input records can be skipped when processing map inputs. Applications can control this feature through the?SkipBadRecords?class.

    This feature can be used when map tasks crash deterministically on certain input. This usually happens due to bugs in the map function. Usually, the user would have to fix these bugs. This is, however, not possible sometimes. The bug may be in third party libraries, for example, for which the source code is not available. In such cases, the task never completes successfully even after multiple attempts, and the job fails. With this feature, only a small portion of data surrounding the bad records is lost, which may be acceptable for some applications (those performing statistical analysis on very large data, for example).

    By default this feature is disabled. For enabling it, refer to?SkipBadRecords.setMapperMaxSkipRecords(Configuration, long)?and?SkipBadRecords.setReducerMaxSkipGroups(Configuration, long).

    With this feature enabled, the framework gets into 'skipping mode' after a certain number of map failures. For more details, seeSkipBadRecords.setAttemptsToStartSkipping(Configuration, int). In 'skipping mode', map tasks maintain the range of records being processed. To do this, the framework relies on the processed record counter. See?SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS?and?SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS. This counter enables the framework to know how many records have been processed successfully, and hence, what record range caused a task to crash. On further attempts, this range of records is skipped.

    The number of records skipped depends on how frequently the processed record counter is incremented by the application. It is recommended that this counter be incremented after every record is processed. This may not be possible in some applications that typically batch their processing. In such cases, the framework may skip additional records surrounding the bad record. Users can control the number of skipped records through SkipBadRecords.setMapperMaxSkipRecords(Configuration, long) and SkipBadRecords.setReducerMaxSkipGroups(Configuration, long). The framework tries to narrow the range of skipped records using a binary search-like approach. The skipped range is divided into two halves and only one half gets executed. On subsequent failures, the framework figures out which half contains bad records. A task will be re-executed till the acceptable skipped value is met or all task attempts are exhausted. To increase the number of task attempts, use Job.setMaxMapAttempts(int) and Job.setMaxReduceAttempts(int).
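The binary search-like narrowing can be sketched as a plain Java simulation. This is an illustration of the idea only and is not taken from Hadoop's implementation: the class, the `narrow` method, and its parameters are hypothetical, and the sketch assumes the suspect range contains exactly one bad record. Each loop iteration stands for one task attempt that executes the first half of the range and skips the second.

```java
import java.util.function.LongPredicate;

// Hypothetical simulation, not part of Hadoop.
final class SkipRangeNarrowingSketch {

    /** Final half-open range [start, end) plus bookkeeping about the search. */
    static final class Result {
        final long start;
        final long end;
        final int attemptsUsed;
        final boolean acceptable; // true if the range fits within maxSkip

        Result(long start, long end, int attemptsUsed, boolean acceptable) {
            this.start = start;
            this.end = end;
            this.attemptsUsed = attemptsUsed;
            this.acceptable = acceptable;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ") after " + attemptsUsed
                    + " attempt(s), acceptable=" + acceptable;
        }
    }

    /**
     * Halves the suspect range [start, end) until it holds at most maxSkip
     * records or maxAttempts attempts have been spent.
     */
    static Result narrow(long start, long end, long maxSkip, int maxAttempts,
                         LongPredicate isBad) {
        if (maxSkip < 1) {
            throw new IllegalArgumentException("maxSkip must be at least 1");
        }
        int attempts = 0;
        while (end - start > maxSkip && attempts < maxAttempts) {
            long mid = start + (end - start) / 2;
            attempts++;

            // This attempt executes only [start, mid); [mid, end) is skipped.
            boolean crashed = false;
            for (long record = start; record < mid; record++) {
                if (isBad.test(record)) {
                    crashed = true;
                    break;
                }
            }

            if (crashed) {
                end = mid;   // the bad record is in the executed half
            } else {
                start = mid; // the executed half was clean; suspect the other
            }
        }
        return new Result(start, end, attempts, end - start <= maxSkip);
    }

    public static void main(String[] args) {
        System.out.println(narrow(40, 50, 1, 10, r -> r == 42));
        System.out.println(narrow(0, 1024, 1, 4, r -> r == 1000));
    }
}
```

In the second call the attempt budget runs out before the range is small enough, which mirrors the case where raising the maximum number of task attempts would let the search finish.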

    Skipped records are written to HDFS in the sequence file format, for later analysis. The location can be changed through SkipBadRecords.setSkipOutputPath(JobConf, Path).

    Hadoop offers an option to skip a certain set of bad records while map input is being processed. Applications control this feature through the SkipBadRecords class.

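    Use this feature when map tasks are certain to crash on particular input. This usually stems from a bug in the map function, and normally the user would fix it. Sometimes that is not possible: the bug may sit in a third-party library whose source code is unavailable, for example. In that case the task never completes however many times it is retried, and the job fails. With this feature, only a small amount of data around the bad records is lost, which some applications can accept (statistical analysis over very large data sets, for example).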

    The feature is disabled by default. It is enabled through SkipBadRecords.setMapperMaxSkipRecords(Configuration, long) and SkipBadRecords.setReducerMaxSkipGroups(Configuration, long).

    Once the feature is enabled, the framework enters "skipping mode" after a certain number of map failures; see SkipBadRecords.setAttemptsToStartSkipping(Configuration, int) for details. In skipping mode, map tasks keep track of the range of records being processed. To do so, the framework relies on the processed record counter; see SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS and SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS. The counter tells the framework how many records were processed successfully, and therefore which record range caused the task to crash. On later attempts, that range of records is skipped.

    How many records are skipped depends on how often the application increments the processed record counter. Incrementing it after every record is recommended. That may not be feasible for applications that batch their processing, in which case the framework may skip additional records around the bad one. Users control the number of skipped records through SkipBadRecords.setMapperMaxSkipRecords(Configuration, long) and SkipBadRecords.setReducerMaxSkipGroups(Configuration, long). The framework tries to narrow the skipped range with a binary search-like approach: the range is split into two halves and only one half is executed. On subsequent failures, the framework works out which half contains the bad records. A task is re-executed until the acceptable skipped value is met or all task attempts are used up. The number of attempts can be raised with Job.setMaxMapAttempts(int) and Job.setMaxReduceAttempts(int).

    Skipped records are written to HDFS in sequence file format for later analysis. The location can be changed through SkipBadRecords.setSkipOutputPath(JobConf, Path).

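    *The translator's ability is limited, so this translation is bound to contain inaccurate wording. Please bear with it, and please point out anything that is translated incorrectly or imprecisely so that we can discuss it and improve together. Thank you.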

    Reposted from: https://www.cnblogs.com/simple-focus/p/6108737.html
