

Hadoop job count: Hadoop job task allocation

Published: 2024/7/23 · Programming Q&A · 豆豆

1. Motivation

Hadoop provides a number of configuration parameters that let administrators and users set memory limits flexibly. Some parameters have default values; others are cluster-specific and exist to support memory-intensive jobs.

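When building a cluster, the administrator can first set appropriate default values. The remaining parameters can be set according to the cluster's hardware configuration (such as the total physical and virtual memory available to tasks, the number of slots configured on each slave, and the requirements of other processes running on the slave) and the job type (such as memory-intensive jobs).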

2. Memory monitoring

(1) Purpose of monitoring task memory

Monitoring prevents a MapReduce task from consuming memory beyond a limit and thereby affecting other processes on the same slave, including other tasks and daemons such as the DataNode or TaskTracker.

(2) Virtual memory and physical memory

Hadoop can monitor a node's virtual memory and physical memory, and the two are monitored independently. In streaming applications, however, the program has to load libraries to run its tasks, so virtual memory usage tends to be high. In that case, monitoring physical memory is more accurate.

(3) Hadoop allows a job to specify the maximum amount of memory it expects to need. Through resource aware scheduling and monitoring, Hadoop tries to ensure that only as many tasks run as can satisfy two constraints:

(a) an individual job's memory requirement

(b) the total amount of memory available for all MapReduce tasks

(4) Task monitoring by the TaskTracker

(a) Periodic monitoring

Step 1: Check that the cumulative virtual memory and physical memory used by a task and its child processes do not exceed the specified limits. Virtual memory is checked first, then physical memory. If a limit is exceeded, the task and its child processes are killed and the task is marked as failed.

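Step 2: Check the cumulative virtual memory and physical memory used by all running tasks of a job and their child processes. If the total exceeds the limit, enough tasks are killed to bring cumulative usage back below the limit. (If the virtual memory limit is exceeded, the tasks that have made the least progress are killed; if the physical memory limit is exceeded, the tasks using the most physical memory are killed.) Tasks killed this way are marked as killed.

(5) Resource aware scheduling

Resource aware scheduling ensures that before a task is scheduled onto a slave, the slave can satisfy the task's memory requirement.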
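The two monitoring steps in (4) can be sketched roughly as follows. This is only an illustration of the described logic, not the TaskTracker's actual code: the `Task` class, its fields, and `monitor_cycle` are hypothetical names, and memory figures are assumed to be already collected per task (including child processes).

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical snapshot of one running task plus its child processes."""
    task_id: str
    vmem_mb: int        # virtual memory currently used
    pmem_mb: int        # physical memory currently used
    vmem_limit_mb: int  # per-task virtual memory limit
    pmem_limit_mb: int  # per-task physical memory limit
    progress: float     # fraction of work done, 0.0 to 1.0


def monitor_cycle(tasks, total_vmem_limit_mb, total_pmem_limit_mb):
    """Run one monitoring cycle and return (failed_ids, killed_ids)."""
    failed, survivors = [], []

    # Step 1: per-task check, virtual memory first, then physical memory.
    for t in tasks:
        if t.vmem_mb > t.vmem_limit_mb or t.pmem_mb > t.pmem_limit_mb:
            failed.append(t.task_id)  # killed and marked as failed
        else:
            survivors.append(t)

    killed = []

    # Step 2, virtual memory: kill the tasks with the least progress first.
    survivors.sort(key=lambda t: t.progress)
    while survivors and sum(t.vmem_mb for t in survivors) > total_vmem_limit_mb:
        killed.append(survivors.pop(0).task_id)

    # Step 2, physical memory: kill the largest physical memory users first.
    survivors.sort(key=lambda t: t.pmem_mb, reverse=True)
    while survivors and sum(t.pmem_mb for t in survivors) > total_pmem_limit_mb:
        killed.append(survivors.pop(0).task_id)

    return failed, killed
```

Keeping the failed and killed lists separate mirrors the distinction in the text: a task that breaks its own limit is marked failed, while a task removed to relieve cumulative pressure is marked killed.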

Capacity Scheduling takes virtual memory requirements into account when scheduling jobs.
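The check behind resource aware scheduling in (5) might look like the sketch below. It is a simplified illustration under the assumption that the scheduler knows each slave's memory budget and the requirements of the tasks already placed there; `can_schedule` and its parameters are hypothetical names, not part of any Hadoop API.

```python
def can_schedule(task_mem_mb, slave_budget_mb, running_task_mem_mb):
    """Return True if the slave still has room for the task's requirement.

    task_mem_mb:         memory the new task asks for
    slave_budget_mb:     memory available to MapReduce tasks on the slave
    running_task_mem_mb: requirements of tasks already placed on the slave
    """
    free_mb = slave_budget_mb - sum(running_task_mem_mb)
    return task_mem_mb <= free_mb
```

For example, a slave with an 8192 MB budget that already hosts three 2048 MB tasks can accept one more 2048 MB task but not a 3072 MB one.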

(7) Cluster-related memory configuration

These settings apply to the JobTracker and TaskTracker, and no job can change them. In addition, the configuration parameters are the same on every slave.

mapreduce.cluster.{map|reduce}memory.mb: These options define the default amount of virtual memory that should be allocated for MapReduce tasks running in the cluster. They typically match the default values set for the options mapreduce.{map|reduce}.memory.mb. They help in the calculation of the total amount of virtual memory available for MapReduce tasks on a slave, using the following equation:

Total virtual memory for all MapReduce tasks = (mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum) + (mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum)

Typically, reduce tasks require more memory than map tasks. Hence a higher value is recommended for mapreduce.cluster.reducememory.mb. The value is specified in MB. To set a value of 2GB for reduce tasks, set mapreduce.cluster.reducememory.mb to 2048.
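As a worked instance of the total virtual memory equation above, the following sketch plugs in example numbers. The slot counts and the 1024 MB map value are assumptions chosen for illustration; only the 2048 MB reduce value comes from the text.

```python
def total_virtual_memory_mb(map_mem_mb, map_slots, reduce_mem_mb, reduce_slots):
    """Total virtual memory for all MapReduce tasks on one slave, in MB.

    map_mem_mb / reduce_mem_mb stand for mapreduce.cluster.{map|reduce}memory.mb,
    map_slots / reduce_slots for mapreduce.tasktracker.{map|reduce}.tasks.maximum.
    """
    return map_mem_mb * map_slots + reduce_mem_mb * reduce_slots


# Assumed example: 4 map slots at 1024 MB and 2 reduce slots at 2048 MB.
total = total_virtual_memory_mb(1024, 4, 2048, 2)
print(total)  # 1024*4 + 2048*2 = 8192
```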

mapreduce.jobtracker.max{map|reduce}memory.mb: These options define the maximum amount of virtual memory that can be requested by jobs using the parameters mapreduce.{map|reduce}.memory.mb. The system will reject any job that is submitted requesting more memory than these limits. Typically, the values for these options should be set to satisfy the following constraint:

mapreduce.jobtracker.maxmapmemory.mb = mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum

mapreduce.jobtracker.maxreducememory.mb = mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum

The value is specified in MB. If mapreduce.cluster.reducememory.mb is set to 2GB and there are 2 reduce slots configured in the slaves, the value for mapreduce.jobtracker.maxreducememory.mb should be set to 4096.
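The constraint and the rejection rule above can be illustrated with a short sketch. The function names are hypothetical, and the acceptance check only mirrors the stated rule (requests above the maximum are rejected); it is not the JobTracker's actual implementation.

```python
def max_request_mb(cluster_default_mb, slots):
    """Recommended mapreduce.jobtracker.max{map|reduce}memory.mb value.

    cluster_default_mb: mapreduce.cluster.{map|reduce}memory.mb
    slots:              mapreduce.tasktracker.{map|reduce}.tasks.maximum
    """
    return cluster_default_mb * slots


def job_is_accepted(requested_mb, max_mb):
    """A job is rejected only if it requests more than the maximum."""
    return requested_mb <= max_mb


# The example from the text: 2 GB per reduce task, 2 reduce slots.
max_reduce_mb = max_request_mb(2048, 2)
print(max_reduce_mb)  # 4096
```

With that limit, a job requesting 4096 MB per reduce task would still be accepted, while one requesting 5000 MB would be rejected.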

mapreduce.tasktracker.reserved.physicalmemory.mb: This option defines the amount of physical memory that is marked for system and daemon processes. Using this, the amount of physical memory available for MapReduce tasks is calculated using the following equation:

Total physical memory for all MapReduce tasks = Total physical memory available on the system - mapreduce.tasktracker.reserved.physicalmemory.mb

The value is specified in MB. To set this value to 2GB, specify the value as 2048.
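The physical memory equation above works out as follows. The 16384 MB machine size is an assumed example value, and the function name is hypothetical.

```python
def physical_memory_for_tasks_mb(total_physical_mb, reserved_mb):
    """Physical memory left for MapReduce tasks after the reservation.

    reserved_mb stands for mapreduce.tasktracker.reserved.physicalmemory.mb.
    """
    if reserved_mb > total_physical_mb:
        raise ValueError("reservation exceeds the memory on the system")
    return total_physical_mb - reserved_mb


# Assumed example: a 16 GB slave with 2 GB reserved for system and daemons.
print(physical_memory_for_tasks_mb(16384, 2048))  # 16384 - 2048 = 14336
```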

mapreduce.tasktracker.taskmemorymanager.monitoringinterval: This option defines the time the TaskTracker waits between two cycles of memory monitoring. The value is specified in milliseconds.

Note: The virtual memory monitoring function is only enabled if the variables mapreduce.cluster.{map|reduce}memory.mb and mapreduce.jobtracker.max{map|reduce}memory.mb are set to values greater than zero. Likewise, the physical memory monitoring function is only enabled if the variable mapreduce.tasktracker.reserved.physicalmemory.mb is set to a value greater than zero.
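The enabling rule in the note can be expressed as a small check over a configuration dictionary. This is a sketch only: the function name is hypothetical, and treating an unset option as -1 (that is, disabled) is an assumption made here for illustration.

```python
VMEM_KEYS = [
    "mapreduce.cluster.mapmemory.mb",
    "mapreduce.cluster.reducememory.mb",
    "mapreduce.jobtracker.maxmapmemory.mb",
    "mapreduce.jobtracker.maxreducememory.mb",
]
PMEM_KEY = "mapreduce.tasktracker.reserved.physicalmemory.mb"


def monitoring_enabled(conf):
    """Return (virtual_enabled, physical_enabled) for a config dict.

    Unset options are assumed to be -1, which counts as disabled.
    """
    virtual = all(int(conf.get(key, -1)) > 0 for key in VMEM_KEYS)
    physical = int(conf.get(PMEM_KEY, -1)) > 0
    return virtual, physical
```

Under this rule, a configuration that sets all four virtual memory options but leaves the physical memory reservation unset would enable only virtual memory monitoring.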

Reposted from http://blog.csdn.net/amaowolf/article/details/7188504
