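Page cache and buffer cache concepts in Linux
The page cache is explained in detail in section 5.6, "Writing and reading files", of the book 《linux内核情景分析》 (Linux Kernel Scenario Analysis). The relevant points are excerpted here.
The filesystem layer has three main data structures: the file structure, the dentry structure, and the inode structure.
file structure: represents one context on the target file. Different processes can set up different contexts on the same file, and a single process can set up several contexts by opening the same file more than once. A buffer queue therefore cannot be attached to the file structure, because file structures are not shared with one another.
dentry structure: this is the file-name structure. Through hard links, several dentry structures can refer to the same file (a symbolic link also gives the file another name, although the link itself has its own inode). Since dentry structures and files are not in one-to-one correspondence either, the buffer queue cannot be attached here.
inode structure: that leaves only the inode. An inode and a file are in one-to-one correspondence, so the inode can be said to represent the file. The inode has an i_mapping pointer to an address_space structure, which in general is inode->i_data, and the buffer queue lives in that structure.
What hangs on the buffer queue is memory pages, not record blocks. So when a process calls mmap() to map a file into its user space, it only has to set up the corresponding page-table entries, and the cached pages are mapped into the process's user space quite naturally. That is where the name i_mapping comes from.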
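The argument above (several file and dentry structures, one inode, one cache) can be sketched in user-space C. This is only an illustration: every `toy_` name is hypothetical, the structs keep just the fields mentioned in the text, and the real kernel definitions contain far more.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only fields named in the text. */
struct toy_address_space {
    unsigned long nrpages;                /* pages currently cached for the file */
};

struct toy_inode {
    struct toy_address_space *i_mapping;  /* normally points at i_data below */
    struct toy_address_space  i_data;
};

struct toy_dentry {
    const char       *name;               /* one name of the file */
    struct toy_inode *d_inode;
};

struct toy_file {
    struct toy_dentry *f_dentry;          /* the name that was opened */
    long               f_pos;             /* per-open context: current offset */
};

/* Initialise an inode so that i_mapping refers to its embedded i_data. */
static void toy_inode_init(struct toy_inode *inode)
{
    inode->i_data.nrpages = 0;
    inode->i_mapping = &inode->i_data;
}

/* Every open file reaches the cache through dentry -> inode -> i_mapping. */
static struct toy_address_space *toy_file_mapping(const struct toy_file *f)
{
    assert(f != NULL && f->f_dentry != NULL && f->f_dentry->d_inode != NULL);
    return f->f_dentry->d_inode->i_mapping;
}
```

Two toy_file objects opened through two different toy_dentry names of the same toy_inode get the same pointer back from toy_file_mapping(), which is the reason the cache has to sit at the inode level.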
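The radix tree also needs to be understood here. The figure in the original post is taken from 《深入linux内核架构》 (Professional Linux Kernel Architecture).
A radix tree is not a balanced tree. The tree is built from two different data structures: the root and the non-leaf nodes. The root is a simple structure that holds the height of the tree and a pointer to the first node of the tree. A node is essentially an array: count is the number of pointers in use in that node, and the remaining entries are pointers to nodes in the next level down. The leaf entries are pointers to pages.
The node structure also carries search tags, such as the dirty tag and the writeback tag, so that the parts of the tree containing tagged pages can be located quickly.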
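The description of the radix tree can be made concrete with a small user-space sketch. It is not the kernel implementation: the `toy_` names are hypothetical, the height is fixed instead of growing on demand, 64 slots per node is an assumption, and the dirty tag is kept as one byte per slot rather than as a bitmap.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAP_SHIFT 6                       /* 64 slots per node (assumed) */
#define TOY_MAP_SIZE  (1UL << TOY_MAP_SHIFT)
#define TOY_MAP_MASK  (TOY_MAP_SIZE - 1)

struct toy_node {
    unsigned int  count;                      /* slots in use */
    void         *slots[TOY_MAP_SIZE];        /* child nodes, or pages at level 0 */
    unsigned char dirty[TOY_MAP_SIZE];        /* tag: a dirty page lies below this slot */
};

struct toy_root {
    unsigned int     height;                  /* number of levels, fixed in this sketch */
    struct toy_node *rnode;                   /* first node of the tree */
};

/* Slot used for index at a given level (level 0 is the lowest). */
static unsigned long toy_slot(unsigned long index, unsigned int level)
{
    return (index >> (level * TOY_MAP_SHIFT)) & TOY_MAP_MASK;
}

/* Store page at index, tagging the whole path if dirty is set. 0 on success.
 * Assumes height * TOY_MAP_SHIFT is smaller than the width of unsigned long. */
static int toy_insert(struct toy_root *root, unsigned long index, void *page, int dirty)
{
    struct toy_node *node;
    unsigned int level;

    assert(root->height > 0 && (index >> (root->height * TOY_MAP_SHIFT)) == 0);
    if (root->rnode == NULL) {
        root->rnode = calloc(1, sizeof(*root->rnode));
        if (root->rnode == NULL)
            return -1;
    }
    node = root->rnode;
    for (level = root->height - 1; ; level--) {
        unsigned long slot = toy_slot(index, level);

        if (level > 0 && node->slots[slot] == NULL) {
            node->slots[slot] = calloc(1, sizeof(struct toy_node));
            if (node->slots[slot] == NULL)
                return -1;
            node->count++;
        }
        if (dirty)
            node->dirty[slot] = 1;            /* propagate the tag along the path */
        if (level == 0) {
            if (node->slots[slot] == NULL)
                node->count++;
            node->slots[slot] = page;
            return 0;
        }
        node = node->slots[slot];
    }
}

/* Follow one slot per level; returns the page or NULL. */
static void *toy_lookup(const struct toy_root *root, unsigned long index)
{
    const struct toy_node *node = root->rnode;
    unsigned int level = root->height;

    if (level == 0 || (index >> (level * TOY_MAP_SHIFT)) != 0)
        return NULL;
    while (node != NULL) {
        void *entry = node->slots[toy_slot(index, --level)];
        if (level == 0)
            return entry;
        node = entry;
    }
    return NULL;
}

/* Lowest index tagged dirty, or -1; untagged subtrees are never visited. */
static long toy_first_dirty(const struct toy_node *node, unsigned int level, unsigned long base)
{
    unsigned long slot;

    if (node == NULL)
        return -1;
    for (slot = 0; slot < TOY_MAP_SIZE; slot++) {
        unsigned long index = base | (slot << (level * TOY_MAP_SHIFT));
        long found;

        if (!node->dirty[slot])
            continue;
        if (level == 0)
            return (long)index;
        found = toy_first_dirty(node->slots[slot], level - 1, index);
        if (found >= 0)
            return found;
    }
    return -1;
}
```

Because the tag is set on every slot along the path to a dirty page, toy_first_dirty() can skip a whole subtree by looking at a single slot in its parent, which is the point of the search tags.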
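Buffer cache
Structurally, a block buffer consists of two parts:
1. The buffer head: holds all the management data related to the state of the buffer, such as the block number, the block length, and the access counter. The buffer data are not stored directly after the buffer head; they live in a separate area of physical memory that a pointer in the buffer head refers to.
2. The useful data, which are kept in specially allocated pages. These pages may also be present in the page cache at the same time.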
Buffer head:
/*
 * Historically, a buffer_head was used to map a single block
 * within a page, and of course as the unit of I/O through the
 * filesystem and block layers. Nowadays the basic I/O unit
 * is the bio, and buffer_heads are used for extracting block
 * mappings (via a get_block_t call), for tracking state within
 * a page (via a page_mapping) and for wrapping bio submission
 * for backward compatibility reasons (e.g. submit_bh).
 */
struct buffer_head {
    unsigned long b_state;              /* buffer state bitmap (see above) */ // buffer state flags, listed below
    struct buffer_head *b_this_page;    /* circular list of page's buffers */ // points to the next buffer head
    struct page *b_page;                /* the page this bh is mapped to */ // descriptor of the page that owns this block buffer
    sector_t b_blocknr;                 /* start block number */ // logical block number on the block device
    size_t b_size;                      /* size of mapping */ // block size
    char *b_data;                       /* pointer to data within the page */ // location of the block inside the buffer page
    struct block_device *b_bdev;        // points to the block device descriptor
    bh_end_io_t *b_end_io;              /* I/O completion */ // I/O completion callback
    void *b_private;                    /* reserved for b_end_io */ // data argument for the I/O completion callback
    struct list_head b_assoc_buffers;   /* associated with another mapping */
    struct address_space *b_assoc_map;  /* mapping this buffer is associated with */
    atomic_t b_count;                   /* users using this buffer_head */ // usage counter of the block
};
Common flags of the buffer head
enum bh_state_bits {
    BH_Uptodate,      /* Contains valid data */ // the buffer contains valid data
    BH_Dirty,         /* Is dirty */ // the buffer is dirty
    BH_Lock,          /* Is locked */ // the buffer is locked
    BH_Req,           /* Has been submitted for I/O */ // a data transfer has been requested to initialise the buffer
    BH_Uptodate_Lock, /* Used by the first bh in a page, to serialise
                       * IO completion of other buffers in the page
                       */
    BH_Mapped,        /* Has a disk mapping */ // b_bdev and b_blocknr are valid
    BH_New,           /* Disk mapping was newly created by get_block */ // just allocated, not accessed yet
    BH_Async_Read,    /* Is under end_buffer_async_read I/O */ // the buffer is being read asynchronously
    BH_Async_Write,   /* Is under end_buffer_async_write I/O */ // the buffer is being written asynchronously
    BH_Delay,         /* Buffer is not yet allocated on disk */ // no block has been allocated on disk yet
    BH_Boundary,      /* Block is followed by a discontiguity */
    BH_Write_EIO,     /* I/O error on write */ // I/O error
    BH_Unwritten,     /* Buffer is allocated on disk but not written */
    BH_Quiet,         /* Buffer Error Printks to be quiet */
    BH_Meta,          /* Buffer contains metadata */
    BH_Prio,          /* Buffer should be submitted with REQ_PRIO */
    BH_PrivateStart,  /* not a state bit, but the first bit available
                       * for private allocation by other entities
                       */
};
If a page is used as a buffer page, all the buffer heads belonging to its block buffers are collected in a singly linked circular list. The private field of the buffer page's descriptor points to the buffer head of the first block in the page, and the b_this_page field of each buffer head points to the next buffer head in the list. The b_page field of each buffer head points to the descriptor of the buffer page it belongs to.
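The values of enum bh_state_bits are bit numbers inside b_state, not masks. A minimal user-space sketch of that convention follows; the `toy_` names are hypothetical and the helpers are plain, non-atomic bit operations, whereas the kernel uses atomic bit operations and generated helpers such as buffer_dirty() and set_buffer_dirty().

```c
#include <assert.h>
#include <limits.h>

/* First few bit numbers, in the same order as enum bh_state_bits. */
enum toy_bh_state_bits {
    TOY_BH_Uptodate,   /* bit 0 */
    TOY_BH_Dirty,      /* bit 1 */
    TOY_BH_Lock,       /* bit 2 */
    TOY_BH_Req         /* bit 3 */
};

struct toy_buffer_head {
    unsigned long b_state;   /* one bit per flag */
};

#define TOY_STATE_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Set flag number nr in the state word (not atomic in this sketch). */
static void toy_set_bit(struct toy_buffer_head *bh, int nr)
{
    assert(nr >= 0 && nr < TOY_STATE_BITS);
    bh->b_state |= 1UL << nr;
}

/* Clear flag number nr in the state word. */
static void toy_clear_bit(struct toy_buffer_head *bh, int nr)
{
    assert(nr >= 0 && nr < TOY_STATE_BITS);
    bh->b_state &= ~(1UL << nr);
}

/* Return 1 if flag number nr is set, 0 otherwise. */
static int toy_test_bit(const struct toy_buffer_head *bh, int nr)
{
    assert(nr >= 0 && nr < TOY_STATE_BITS);
    return (int)((bh->b_state >> nr) & 1UL);
}
```

Keeping bit numbers in the enum is what lets BH_PrivateStart mark the first bit that other code may claim for its own flags.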
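In the figure of the original post, one buffer page corresponds to four buffers (for example, a 4 KiB page divided into 1 KiB blocks). This is how the page cache and the buffer cache are unified: the buffers and the buffer page share the same memory, so modifying either one affects the other.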
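The circular list of buffer heads hanging off a buffer page can be sketched as follows. This is a user-space illustration under stated assumptions: the `toy_` names are hypothetical, the page size is taken to be 4096 bytes, and a plain int stands in for the dirty bit of b_state.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

struct toy_page;

struct toy_bh {
    int              dirty;        /* stand-in for the BH_Dirty bit */
    struct toy_bh   *b_this_page;  /* next buffer head of the same page */
    struct toy_page *b_page;       /* page that owns this buffer */
    char            *b_data;       /* where the block sits inside the page */
    size_t           b_size;       /* block size */
};

struct toy_page {
    struct toy_bh *private;        /* first buffer head of the page */
    char           data[TOY_PAGE_SIZE];
};

/* Link n caller-supplied buffer heads into a ring that covers the page. */
static void toy_attach_buffers(struct toy_page *page, struct toy_bh *bhs, size_t n)
{
    size_t i, size;

    assert(n > 0 && TOY_PAGE_SIZE % n == 0);
    size = TOY_PAGE_SIZE / n;
    for (i = 0; i < n; i++) {
        bhs[i].dirty = 0;
        bhs[i].b_page = page;
        bhs[i].b_size = size;
        bhs[i].b_data = page->data + i * size;   /* data lives in the page itself */
        bhs[i].b_this_page = &bhs[(i + 1) % n];  /* last one points back to the first */
    }
    page->private = &bhs[0];
}

/* Walk the ring once and report whether any buffer of the page is dirty. */
static int toy_page_has_dirty_buffer(const struct toy_page *page)
{
    const struct toy_bh *head = page->private;
    const struct toy_bh *bh = head;

    if (head == NULL)
        return 0;
    do {
        if (bh->dirty)
            return 1;
        bh = bh->b_this_page;
    } while (bh != head);
    return 0;
}
```

Since b_data points into the page's own memory, writing through a buffer changes the page contents and vice versa; only the bookkeeping (buffer heads versus page descriptor) is separate.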
The address_space structure:
struct address_space {
    struct inode            *host;              /* owner: inode, block_device */ // inode of the host file
    struct radix_tree_root  page_tree;          /* radix tree of all pages */ // root of the radix tree
    spinlock_t              tree_lock;          /* and lock protecting it */ // lock for the radix tree
    unsigned int            i_mmap_writable;    /* count VM_SHARED mappings */ // number of VM_SHARED mappings
    struct rb_root          i_mmap;             /* tree of private and shared mappings */ // tree of private and shared mappings
    struct list_head        i_mmap_nonlinear;   /* list VM_NONLINEAR mappings */ // list of VM_NONLINEAR mappings
    struct mutex            i_mmap_mutex;       /* protect tree, count, list */ // mutex protecting the tree, the count and the list
    /* Protected by tree_lock together with the radix tree */
    unsigned long           nrpages;            /* number of total pages */ // total number of pages
    pgoff_t                 writeback_index;    /* writeback starts here */ // where writeback starts
    const struct address_space_operations *a_ops; /* methods */ // function pointers
    unsigned long           flags;              /* error bits/gfp mask */ // error bits and gfp mask
    struct backing_dev_info *backing_dev_info;  /* device readahead, etc */ // device readahead
    spinlock_t              private_lock;       /* for use by the address_space */
    struct list_head        private_list;       /* ditto */
    void                    *private_data;      /* ditto */
} __attribute__((aligned(sizeof(long))));
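struct inode *host and struct radix_tree_root page_tree are the two members that tie the file to its memory pages: host points back to the file's inode, and page_tree holds the cached pages of that file.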
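Pages in page_tree are looked up by their page index within the file, so a byte offset first has to be split into a page index and an offset inside that page. A minimal sketch of the arithmetic, assuming 4 KiB pages; the `toy_` names are hypothetical and not kernel functions.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12                        /* 4 KiB pages (assumed) */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Index of the page that holds byte offset pos of the file. */
static unsigned long toy_page_index(unsigned long long pos)
{
    return (unsigned long)(pos >> TOY_PAGE_SHIFT);
}

/* Offset of byte pos inside its page. */
static unsigned long toy_page_offset(unsigned long long pos)
{
    return (unsigned long)(pos & (TOY_PAGE_SIZE - 1));
}

/* Number of cached pages touched by an access of len bytes starting at pos. */
static unsigned long toy_pages_spanned(unsigned long long pos, unsigned long len)
{
    assert(len > 0);
    return toy_page_index(pos + len - 1) - toy_page_index(pos) + 1;
}
```

For example, byte 10000 of a file falls in page index 2 at offset 1808, and a 200-byte read starting at byte 4000 touches two pages because it crosses the boundary at 4096.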
總結(jié)
以上是生活随笔為你收集整理的linux中页缓冲和块缓冲之概念的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 电子地图开发方式
- 下一篇: iOS MRC下的setter方法