irqbalance

Published: 2023/12/8. Collected and reposted by 生活随笔 (category: programming Q&A).

http://www.bubuko.com/infodetail-1129360.html

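irqbalance, in theory:
Enabling the irqbalance service can both improve performance and reduce power consumption.
irqbalance optimizes interrupt distribution. It automatically collects system data to analyze usage patterns and, depending on the system load, puts itself into either Performance mode or Power-save mode.
In Performance mode, irqbalance spreads interrupts as evenly as possible across the CPU cores, to make full use of multiple cores and improve performance.
In Power-save mode, irqbalance concentrates interrupts on the first CPU, so that the other idle CPUs can stay asleep longer and power consumption drops.
In practice, however, it often upsets the balance of CPU usage, and disabling it is recommended in server environments.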


http://www.it165.net/os/html/201301/4427.html

The irqbalance project's home page is here.


On RHEL distributions this daemon is enabled at boot by default. How do we check its status?

# service irqbalance status
irqbalance (pid PID) is running…

In practice, however, our dedicated applications are usually pinned to specific CPUs, so we do not actually need it. If it is already running, we can stop it with the following command:

# service irqbalance stop
Stopping irqbalance: [ OK ]

Or simply disable it at boot:

# chkconfig irqbalance off

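Next, let's analyze how irqbalance works, so that we know exactly when to use it and when not to.

Since irqbalance optimizes interrupt distribution, we start with interrupts. This is a long article, so take a deep breath, and here we go!


Key excerpt: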

SMP affinity is controlled by manipulating files in the /proc/irq/ directory.
In /proc/irq/ are directories that correspond to the IRQs present on your
system (not all IRQs may be available). In each of these directories is
the “smp_affinity” file, and this is where we will work our magic.


Put plainly: you write the mask of the CPUs you want the IRQ to have affinity with into the file /proc/irq/N/smp_affinity! See also: how to set interrupt affinity by hand.
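To make the "mask" concrete, here is a minimal sketch in POSIX sh. The helper name cpus_to_mask is made up for illustration and is not part of irqbalance or the kernel. It assumes a single hexadecimal word is enough, which holds for CPU numbers below 63; larger systems use comma-separated 32-bit groups, which this sketch does not produce.

```shell
# Hypothetical helper: turn a list of CPU numbers into a hex affinity mask.
# Bit N of the mask stands for CPU N (assumes CPU numbers below 63).
cpus_to_mask() (
    mask=0
    for cpu in "$@"; do
        mask=$(( $mask | (1 << $cpu) ))
    done
    printf '%x\n' "$mask"
)

# Possible use, as root, with N replaced by a real IRQ number:
#   cpus_to_mask 0 1 > /proc/irq/N/smp_affinity
```

For example, CPUs 0 and 1 give the mask 3, and CPUs 1 and 3 give the mask a.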

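Now for some background concepts. Let's look at CPU topology, starting with how the parts of an Intel CPU relate to one another:

A NUMA node contains one or more Sockets, together with the local memory attached to them. A multi-core Socket has several Cores. If the CPU supports HT, the OS additionally sees each Core as 2 Logical Processors.

Plenty of tools can show the topology, for example lscpu or Intel's cpu_topology64.

This time we use likwid-topology from the Likwid toolbox that we introduced earlier:

./likwid-topology

CPU topology is something you have to understand before doing CPU affinity binding on any high-performance server. Getting a feel for it?

With all that groundwork and terminology in place, we can look into how irqbalance works: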


//irqbalance.c
int main(int argc, char** argv)
{
    /* ... */
    while (keep_going) {
        sleep_approx(SLEEP_INTERVAL); //#define SLEEP_INTERVAL 10
        /* ... */
        clear_work_stats();
        parse_proc_interrupts();
        parse_proc_stat();
        /* ... */
        calculate_placement();
        activate_mappings();
        /* ... */
    }
    /* ... */
}

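The main loop shows the logic clearly. Until it exits, it does the following every 10 seconds:
1. Clear the statistics
2. Parse the current interrupts
3. Parse the interrupt load
4. Work out from the load how to balance the interrupts
5. Apply the interrupt affinity changes

OK, a quick look at how irqbalance is used:

man irqbalance

--oneshot
Causes irqbalance to be run once, after which the daemon exits
--debug
Causes irqbalance to run in the foreground and extra debug information to be printed

Running irqbalance in debug mode gives us a lot of detailed information:

# ./irqbalance --oneshot --debug

Have a sip of water, and then we go through each step in detail.

First, look at how interrupts are distributed across the CPUs: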


$ cat /proc/interrupts | tr -s ' ' '\t' | cut -f 1-3
CPU0    CPU1
0:      2622846291
1:      7
4:      234
8:      1
9:      0
12:     4
50:     6753
66:     228
90:     497
98:     31
209:    2       0
217:    0       0
225:    29      556
233:    0       0
NMI:    7395302 4915439
LOC:    2622846035      2622833187
ERR:    0
MIS:    0

The first column of the output is the IRQ number; the next 2 columns are the interrupt counts on CPU0 and CPU1.
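As a rough illustration of reading this table programmatically, the sketch below sums the per-CPU counters of each IRQ. The function name irq_totals is hypothetical. It reads text in /proc/interrupts format on standard input and assumes that the counters are the leading all-digit columns after the IRQ label, so a purely numeric description column would be counted by mistake.

```shell
# Hypothetical helper: print "IRQ total" for every line of /proc/interrupts-style
# input, where total is the sum of the per-CPU counters of that IRQ.
irq_totals() (
    set -f                       # no filename expansion while splitting fields
    read -r _header || exit 0    # the first line only lists CPU0 CPU1 ...
    while read -r name rest; do
        [ -n "$name" ] || continue
        total=0
        for field in $rest; do
            case $field in
                *[!0-9]*) break ;;   # first non-numeric column ends the counters
            esac
            total=$(( $total + $field ))
        done
        printf '%s %s\n' "${name%:}" "$total"
    done
)

# Possible use on a live system:
#   irq_totals < /proc/interrupts
```

With the sample output above, IRQ 225 would be reported with a total of 585 (29 on CPU0 plus 556 on CPU1).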

But how do we know what kind of device, say, IRQ 98 belongs to? Enough talk, on to the code!


//classify.c
char *classes[] = {
    "other",
    "legacy",
    "storage",
    "timer",
    "ethernet",
    "gbit-ethernet",
    "10gbit-ethernet",
    0
};

#define MAX_CLASS 0x12
/*
 * Class codes lifted from pci spec, appendix D.
 * and mapped to irqbalance types here
 */
static short class_codes[MAX_CLASS] = {
    IRQ_OTHER,
    IRQ_SCSI,
    IRQ_ETH,
    IRQ_OTHER,
    IRQ_OTHER,
    IRQ_OTHER,
    IRQ_LEGACY,
    IRQ_OTHER,
    IRQ_OTHER,
    IRQ_LEGACY,
    IRQ_OTHER,
    IRQ_OTHER,
    IRQ_LEGACY,
    IRQ_ETH,
    IRQ_SCSI,
    IRQ_OTHER,
    IRQ_OTHER,
    IRQ_OTHER,
};
int map_class_to_level[7] =
{ BALANCE_PACKAGE, BALANCE_CACHE, BALANCE_CACHE, BALANCE_NONE, BALANCE_CORE, BALANCE_CORE, BALANCE_CORE };

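irqbalance divides interrupts into 7 classes. Different classes are balanced within different scopes: some at PACKAGE level, some at CACHE level, some at CORE level.
So where does the class information come from? Enough talk, on to the code!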


//#define SYSDEV_DIR "/sys/bus/pci/devices"
static struct irq_info *add_one_irq_to_db(const char *devpath, int irq, struct user_irq_policy *pol)
{
    ...
    sprintf(path, "%s/class", devpath);

    fd = fopen(path, "r");

    if (!fd) {
        perror("Can't open class file: ");
        goto get_numa_node;
    }

    rc = fscanf(fd, "%x", &class);
    fclose(fd);

    if (!rc)
        goto get_numa_node;

    /*
     * Restrict search to major class code
     */
    class >>= 16;

    if (class >= MAX_CLASS)
        goto get_numa_node;

    new->class = class_codes[class];
    if (pol->level >= 0)
        new->level = pol->level;
    else
        new->level = map_class_to_level[class_codes[class]];
get_numa_node:
    numa_node = -1;
    sprintf(path, "%s/numa_node", devpath);
    fd = fopen(path, "r");
    if (!fd)
        goto assign_node;

    rc = fscanf(fd, "%d", &numa_node);
    fclose(fd);

assign_node:
    new->numa_node = get_numa_node(numa_node);

    sprintf(path, "%s/local_cpus", devpath);
    fd = fopen(path, "r");
    if (!fd) {
        cpus_setall(new->cpumask);
        goto assign_affinity_hint;
    }
    lcpu_mask = NULL;
    ret = getline(&lcpu_mask, &blen, fd);
    fclose(fd);
    if (ret <= 0) {
        cpus_setall(new->cpumask);
    } else {
        cpumask_parse_user(lcpu_mask, ret, new->cpumask);
    }
    free(lcpu_mask);

assign_affinity_hint:
    cpus_clear(new->affinity_hint);
    sprintf(path, "/proc/irq/%d/affinity_hint", irq);
    fd = fopen(path, "r");
    if (!fd)
        goto out;
    lcpu_mask = NULL;
    ret = getline(&lcpu_mask, &blen, fd);
    fclose(fd);
    if (ret <= 0)
        goto out;
    cpumask_parse_user(lcpu_mask, ret, new->affinity_hint);
    free(lcpu_mask);
out:
    ...
}

Translated into a shell script, the C code above looks like this:


$ cat > x.sh
SYSDEV_DIR="/sys/bus/pci/devices/"
for dev in `ls $SYSDEV_DIR`
do
    IRQ=`cat $SYSDEV_DIR$dev/irq`
    CLASS=$(((`cat $SYSDEV_DIR$dev/class`)>>16))
    printf "irq %s: class[%s] " $IRQ $CLASS
    if [ -f "/proc/irq/$IRQ/affinity_hint" ]; then
        printf "affinity_hint[%s] " `cat /proc/irq/$IRQ/affinity_hint`
    fi
    if [ -f "$SYSDEV_DIR$dev/local_cpus" ]; then
        printf "local_cpus[%s] " `cat $SYSDEV_DIR$dev/local_cpus`
    fi
    if [ -f "$SYSDEV_DIR$dev/numa_node" ]; then
        printf "numa_node[%s]" `cat $SYSDEV_DIR$dev/numa_node`
    fi
    echo
done
CTRL+D
$ tree /sys/bus/pci/devices
/sys/bus/pci/devices
|-- 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
|-- 0000:00:01.0 -> ../../../devices/pci0000:00/0000:00:01.0
|-- 0000:00:03.0 -> ../../../devices/pci0000:00/0000:00:03.0
|-- 0000:00:07.0 -> ../../../devices/pci0000:00/0000:00:07.0
|-- 0000:00:09.0 -> ../../../devices/pci0000:00/0000:00:09.0
|-- 0000:00:13.0 -> ../../../devices/pci0000:00/0000:00:13.0
|-- 0000:00:14.0 -> ../../../devices/pci0000:00/0000:00:14.0
|-- 0000:00:14.1 -> ../../../devices/pci0000:00/0000:00:14.1
|-- 0000:00:14.2 -> ../../../devices/pci0000:00/0000:00:14.2
|-- 0000:00:14.3 -> ../../../devices/pci0000:00/0000:00:14.3
|-- 0000:00:1a.0 -> ../../../devices/pci0000:00/0000:00:1a.0
|-- 0000:00:1a.7 -> ../../../devices/pci0000:00/0000:00:1a.7
|-- 0000:00:1d.0 -> ../../../devices/pci0000:00/0000:00:1d.0
|-- 0000:00:1d.1 -> ../../../devices/pci0000:00/0000:00:1d.1
|-- 0000:00:1d.2 -> ../../../devices/pci0000:00/0000:00:1d.2
|-- 0000:00:1d.7 -> ../../../devices/pci0000:00/0000:00:1d.7
|-- 0000:00:1e.0 -> ../../../devices/pci0000:00/0000:00:1e.0
|-- 0000:00:1f.0 -> ../../../devices/pci0000:00/0000:00:1f.0
|-- 0000:00:1f.2 -> ../../../devices/pci0000:00/0000:00:1f.2
|-- 0000:00:1f.3 -> ../../../devices/pci0000:00/0000:00:1f.3
|-- 0000:00:1f.5 -> ../../../devices/pci0000:00/0000:00:1f.5
|-- 0000:01:00.0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
|-- 0000:01:00.1 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.1
|-- 0000:04:00.0 -> ../../../devices/pci0000:00/0000:00:09.0/0000:04:00.0
`-- 0000:05:00.0 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:05:00.0

$ chmod +x x.sh
$ ./x.sh | grep 98
irq 98: class[2] local_cpus[00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000]

A quick look at the numbers: class_codes[2]=IRQ_ETH, which means this interrupt belongs to a network card.
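The lookup can be mimicked in a few lines of POSIX sh. This is only a sketch that mirrors the class_codes table quoted above; the function name irq_type_for_class is invented here, and the "unclassified" output for major classes at or beyond MAX_CLASS is a choice made for this example (the C code simply skips the classification in that case).

```shell
# Hypothetical helper: map the contents of a PCI "class" file (e.g. 0x020000)
# to the irqbalance type name from the class_codes[] table quoted above.
irq_type_for_class() {
    major=$(( $1 >> 16 ))        # keep only the major class code
    case $major in
        1|14)   echo IRQ_SCSI ;;
        2|13)   echo IRQ_ETH ;;
        6|9|12) echo IRQ_LEGACY ;;
        0|3|4|5|7|8|10|11|15|16|17) echo IRQ_OTHER ;;
        *)      echo unclassified ;;   # assumption: major >= MAX_CLASS (0x12)
    esac
}

# Possible use on a live system, with a device directory from the tree above:
#   irq_type_for_class "$(cat /sys/bus/pci/devices/0000:04:00.0/class)"
```

A class value of 0x020000 has major class 2 and therefore comes out as IRQ_ETH, matching the result for IRQ 98.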

So how is the interrupt load calculated? Back to the code!


//procinterrupts.c
void parse_proc_stat(void)
{
    ...
    file = fopen("/proc/stat", "r");
    if (!file) {
        log(TO_ALL, LOG_WARNING, "WARNING cant open /proc/stat.  balacing is broken\n");
        return;
    }

    /* first line is the header we don't need; nuke it */
    if (getline(&line, &size, file)==0) {
        free(line);
        log(TO_ALL, LOG_WARNING, "WARNING read /proc/stat. balancing is broken\n");
        fclose(file);
        return;
    }
    cpucount = 0;
    while (!feof(file)) {
        if (getline(&line, &size, file)==0)
            break;

        if (!strstr(line, "cpu"))
            break;

        cpunr = strtoul(&line[3], NULL, 10);

        if (cpu_isset(cpunr, banned_cpus))
            continue;

        rc = sscanf(line, "%*s %*d %*d %*d %*d %*d %d %d", &irq_load, &softirq_load);
        if (rc < 2)
            break;

        cpu = find_cpu_core(cpunr);

        if (!cpu)
            break;

        cpucount++;
        /*
         * For each cpu add the irq and softirq load and propagate that
         * all the way up the device tree
         */
        if (cycle_count) {
            cpu->load = (irq_load + softirq_load) - (cpu->last_load);
            /*
             * the [soft]irq_load values are in jiffies, which are
             * units of 10ms, multiply by 1000 to convert that to
             * 1/10 milliseconds.  This give us a better integer
             * distribution of load between irqs
             */
            cpu->load *= 1000;
        }
        cpu->last_load = (irq_load + softirq_load);
    }
    ...
}

This is equivalent to the following command:

$ grep cpu15 /proc/stat
cpu15 30068830 85841 22995655 3212064899 536154 91145 2789328 0

Let's learn the format of /proc/stat!

The description of the cpu line is excerpted below:

cpu — Measures the number of jiffies (1/100 of a second for x86 systems) that the system has been in user mode, user mode with low priority (nice), system mode, idle task, I/O wait, IRQ (hardirq), and softirq respectively. The IRQ (hardirq) is the direct response to a hardware event. The IRQ takes minimal work for queuing the “heavy” work up for the softirq to execute. The softirq runs at a lower priority than the IRQ and therefore may be interrupted more frequently. The total for all CPUs is given at the top, while each individual CPU is listed below with its own statistics. The following example is a 4-way Intel Pentium Xeon configuration with multi-threading enabled, therefore showing four physical processors and four virtual processors totaling eight processors.

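So fields 7 and 8 of this line (counting the cpuN label as field 1) are the time, in jiffies, spent handling hard interrupts and soft interrupts. Added together, they are what we call the CPU load here.
This result agrees with the interrupt load that irqbalance reports (the original article shows this in a figure).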
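The same arithmetic can be reproduced by hand. The sketch below uses two made-up helper names, irq_jiffies and cpu_load, and follows the calculation in parse_proc_stat quoted above: add the irq and softirq fields, then scale the difference between two samples by 1000.

```shell
# Hypothetical helper: sum of the irq and softirq fields of one "cpuN ..." line.
irq_jiffies() (
    set -f
    set -- $1                    # split the line into fields: $1 is the cpuN label
    echo $(( $7 + $8 ))          # field 7 = irq, field 8 = softirq
)

# Hypothetical helper: load between two samples, scaled like cpu->load above.
cpu_load() {
    echo $(( ($2 - $1) * 1000 )) # $1 = previous sum, $2 = current sum
}

# Possible use on a live system:
#   irq_jiffies "$(grep '^cpu15 ' /proc/stat)"
```

For the cpu15 line shown earlier, irq_jiffies yields 91145 + 2789328 = 2880473.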

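Feeling a bit dizzy? Have some water!
Let's continue with how irqbalance calculates load at the Package level. The figure below, taken together with the CPU topology from earlier, makes it quite clear:

The load of each CORE is the sum of the loads of the interrupts attached to it, each DOMAIN is the sum of the COREs it contains, and each PACKAGE is the sum of the DOMAINs it contains, calculated level by level like a tree.
Once the load of every CORE, DOMAIN and PACKAGE is known, all that remains is to find the least loaded object within the scope that applies to the interrupt's class and migrate the interrupt there.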
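A toy version of these two steps, under the assumption that each object is described by one "name load" line on standard input. The function names total_load and least_loaded are illustrative only, and irqbalance's real placement code does more than this.

```shell
# Hypothetical helper: load of a parent object = sum of the loads of its children.
total_load() (
    total=0
    while read -r name load; do
        [ -n "$name" ] || continue
        total=$(( $total + $load ))
    done
    echo "$total"
)

# Hypothetical helper: name of the least loaded object (first one wins on ties).
least_loaded() (
    best_name=
    best_load=
    while read -r name load; do
        [ -n "$name" ] || continue
        if [ -z "$best_load" ] || [ "$load" -lt "$best_load" ]; then
            best_name=$name
            best_load=$load
        fi
    done
    [ -n "$best_name" ] && printf '%s\n' "$best_name"
)
```

Feeding the lines "core0 300", "core1 120" and "core2 450" into total_load gives 870 for the enclosing object, and least_loaded picks core1 as the migration target.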

The migration is based on exactly what we saw earlier:

int map_class_to_level[7] =
{ BALANCE_PACKAGE, BALANCE_CACHE, BALANCE_CACHE, BALANCE_NONE, BALANCE_CORE, BALANCE_CORE, BALANCE_CORE };

Too much water, hold on, quick break, back in a moment!

Finally, how does irqbalance actually apply the interrupt affinity changes? More code:


// activate.c
static void activate_mapping(struct irq_info *info, void *data __attribute__((unused)))
{
    ...
    if ((hint_policy == HINT_POLICY_EXACT) &&
        (!cpus_empty(info->affinity_hint))) {
        applied_mask = info->affinity_hint;
        valid_mask = 1;
    } else if (info->assigned_obj) {
        applied_mask = info->assigned_obj->mask;
        valid_mask = 1;
        if ((hint_policy == HINT_POLICY_SUBSET) &&
            (!cpus_empty(info->affinity_hint)))
            cpus_and(applied_mask, applied_mask, info->affinity_hint);
    }

    /*
     * only activate mappings for irqs that have moved
     */
    if (!info->moved && (!valid_mask || check_affinity(info, applied_mask)))
        return;

    if (!info->assigned_obj)
        return;

    sprintf(buf, "/proc/irq/%i/smp_affinity", info->irq);
    file = fopen(buf, "w");
    if (!file)
        return;

    cpumask_scnprintf(buf, PATH_MAX, applied_mask);
    fprintf(file, "%s", buf);
    fclose(file);
    info->moved = 0; /*migration is done*/
}

void activate_mappings(void)
{
    for_each_irq(NULL, activate_mapping, NULL);
}

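Roughly translated into shell, the code above is:

# echo MASK > /proc/irq/N/smp_affinity

Of course, if the user-configured policy is HINT_POLICY_EXACT, the mask is taken from /proc/irq/N/affinity_hint (when that hint is not empty).
If the policy is HINT_POLICY_SUBSET, the mask is /proc/irq/N/affinity_hint & applied_mask, that is, the intersection of the hint and the mask of the assigned object (the code uses cpus_and).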
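The mask selection can be sketched as follows. The function name applied_mask and the policy labels exact and subset are names chosen for this example; masks are passed as plain hex words without the 0x prefix and without comma-separated groups, which is a simplification.

```shell
# Hypothetical helper: choose the mask to write, following activate_mapping above.
#   $1 = policy label ("exact", "subset", anything else = no hint handling)
#   $2 = mask of the assigned object (hex), $3 = affinity_hint (hex, 0 = empty)
applied_mask() (
    policy=$1
    assigned=$(( 0x$2 ))
    hint=$(( 0x$3 ))
    mask=$assigned
    if [ "$hint" -ne 0 ]; then
        case $policy in
            exact)  mask=$hint ;;                        # use the hint as-is
            subset) mask=$(( $assigned & $hint )) ;;     # intersection, like cpus_and
        esac
    fi
    printf '%x\n' "$mask"
)
```

With an assigned mask of f0f and a hint of 0c, the subset policy yields c, while the exact policy would yield the hint itself; with an empty hint both fall back to the assigned mask.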

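OK, the analysis is finally done!

Summary:
irqbalance automatically migrates interrupts according to the system's interrupt load in order to keep them balanced, and it also takes factors such as power saving into account. In real-time systems, however, this makes interrupts drift on their own and introduces performance instability, so disabling it is recommended for high-performance use cases.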



