Setting up a GlusterFS cluster
Introduction to GlusterFS
GlusterFS is the core of Gluster, a scale-out storage solution. It is an open-source distributed file system with strong horizontal scalability, able to grow to several petabytes of capacity and serve thousands of clients. GlusterFS aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA networks and manages the data under a single global namespace.
GlusterFS may be less familiar than NFS, GFS, or HDFS. Of those, NFS is probably the most widely used — simple and easy to manage — but NFS (like MooseFS) has a single point of failure, which is usually worked around by pairing it with DRBD for block replication. GlusterFS sidesteps the problem entirely, because it is a fully decentralized system with no central node.
GlusterFS website
GlusterFS documentation
Features of GlusterFS
- Scalability and high performance
  GlusterFS combines two traits to deliver highly scalable storage, from a few terabytes up to several petabytes. The scale-out architecture raises capacity and performance simply by adding resources — disk, compute, and I/O can each be grown independently — over high-speed interconnects such as 10GbE and InfiniBand. Gluster's elastic hashing removes the need for a metadata server, eliminating that single point of failure and performance bottleneck and enabling truly parallel data access.
- High availability
  GlusterFS can replicate files automatically — mirroring or keeping multiple copies — so data stays accessible even through hardware failures. Self-healing restores data to the correct state, and repairs run incrementally in the background with almost no performance overhead. GlusterFS defines no private on-disk format; it stores files on mainstream standard disk file systems (such as EXT3 or ZFS), so the data can be copied and accessed with ordinary tools.
- Elastic volume management
  Data lives in logical volumes, carved independently out of a virtualized pool of physical storage. Storage servers can be added and removed online without interrupting applications. Logical volumes can grow and shrink across all configured servers and migrate between servers for capacity balancing, and systems can be added or removed — all online. File-system configuration changes also take effect online in real time, allowing adaptation to changing workloads and online performance tuning.
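The elastic-hash idea above can be illustrated with a toy sketch. This is a deliberate simplification, not Gluster's real DHT (which hashes within per-directory ranges); the point is that every client derives a file's brick from a hash of the file name alone, so no metadata server is consulted on lookup:

```python
import hashlib

def brick_for(path, bricks):
    """Pick a brick for a file purely from a hash of its name.

    Toy model of elastic hashing: all clients compute the same
    placement locally, so lookups need no metadata server.
    """
    h = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return bricks[h % len(bricks)]

bricks = ["server1:/node", "server2:/node", "server3:/node", "server4:/node"]

# Deterministic: every client resolves the same path to the same brick.
assert brick_for("/data/a.log", bricks) == brick_for("/data/a.log", bricks)
placement = {p: brick_for(p, bricks) for p in ("/a", "/b", "/c", "/d")}
```

Because placement is pure computation, adding a client adds lookup capacity for free — the trade-off (which Gluster's directory-range scheme addresses) is that naive modulo hashing would move most files when the brick count changes.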
System overview
GlusterFS splits into a server side and a client side. The server runs two kinds of processes: glusterd, which manages the GlusterFS system itself (listening on port 24007), and glusterfsd, which serves the storage bricks — one glusterfsd per brick, listening on ports from 49152 upward.
The client side supports several access methods: NFS, CIFS, FTP, libgfapi, and the FUSE-based native client. In production the FUSE client is the usual choice (feel free to try the others).
GlusterFS keeps its configuration under /var/lib/glusterd and its logs under /var/log. For production, six or more server nodes are recommended, using a distributed-replicated volume with a replica count of 3 and xfs as the underlying file system. (When the brick count is a multiple of the replica count, a replicated volume automatically becomes distributed-replicated.)
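The parenthetical rule — brick count a multiple of the replica count yields a distributed-replicated volume — can be written down as a small helper. This is an illustrative rule-of-thumb only (`volume_type` is not part of the gluster CLI):

```python
def volume_type(brick_count, replica=1):
    """Classify a Gluster volume layout from brick and replica counts.

    replica == 1            -> plain distribute volume
    bricks == replica       -> pure replicate volume
    bricks == k * replica   -> distributed-replicate (k distribution legs)
    """
    if replica < 1 or brick_count < 1:
        raise ValueError("counts must be positive")
    if replica == 1:
        return "Distribute"
    if brick_count == replica:
        return "Replicate"
    if brick_count % replica == 0:
        return "Distributed-Replicate"
    raise ValueError("brick count must be a multiple of the replica count")
```

For example, the 4-brick `replica 4` volume created later in this walkthrough classifies as "Replicate", while the recommended production layout — 6 nodes with replica 3 — classifies as "Distributed-Replicate".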
Installation
Environment
| IP | Hostname | Data disk | Filesystem |
| --- | --- | --- | --- |
| 192.168.2.10 | server1 | /dev/sdb | ext4 |
| 192.168.2.11 | server2 | /dev/sdb | ext4 |
| 192.168.2.12 | server3 | /dev/sdb | ext4 |
| 192.168.2.13 | server4 | /dev/sdb | ext4 |
Set hostnames and passwordless SSH login (shown for server1; repeat on every server):
    [root@server1 ~]# vim /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.2.10 server1
    192.168.2.11 server2
    192.168.2.12 server3
    192.168.2.13 server4
    [root@server1 ~]# ssh-keygen
    [root@server1 ~]# for i in {10..13}
    > do
    >   scp /etc/hosts root@192.168.2.$i:/etc/hosts
    >   ssh-copy-id root@192.168.2.$i
    > done
    [root@server1 ~]# ssh root@server2
    [root@server2 ~]# logout
    Connection to server2 closed.

Disable the firewall
    [root@server1 ~]# systemctl stop firewalld && systemctl disable firewalld
    Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
    Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.

Configure SELinux
    [root@server1 ~]# vim /etc/selinux/config
    SELINUX=disabled
    [root@server1 ~]# setenforce 0
    [root@server1 ~]# getenforce
    Permissive
    [root@server1 ~]# yum install -y flex bison openssl-devel libacl-devel sqlite-devel libxml2-devel libtool automake autoconf gcc attr gcc-c++ libuuid-devel

    # liburcu-bp must be built from source; it is not in the yum repos
    [root@server1 ~]# wget https://github.com/urcu/userspace-rcu/archive/v0.7.16.tar.gz -O userspace-rcu-0.7.16.tar.gz
    [root@server1 ~]# tar -xf userspace-rcu-0.7.16.tar.gz
    [root@server1 ~]# cd /root/userspace-rcu-0.7.16
    # The usual build commands, run inside the source directory
    [root@server1 userspace-rcu-0.7.16]# ./bootstrap
    [root@server1 userspace-rcu-0.7.16]# ./configure && make && make install
    # After installing, run the two commands below so the system can find urcu
    [root@server1 userspace-rcu-0.7.16]# ldconfig    # refresh the dynamic linker cache -- do NOT skip this, or starting glusterd later fails with errors that are maddening to track down
    [root@server1 userspace-rcu-0.7.16]# pkg-config --libs --cflags liburcu-bp.pc liburcu.pc
    -I/usr/local/include -L/usr/local/lib -lurcu-bp -lurcu
    # If you want geo-replication, install these extras and enable the SSH service:
    [root@server1 ~]# yum -y install passwd openssh-client openssh-server

With the dependencies in place, download the GlusterFS source from the official site and build it with the usual commands; passing --enable-debug to configure produces a build with debug information. Download the GlusterFS source package:
    [root@server1 ~]# wget https://download.gluster.org/pub/gluster/glusterfs/8/8.2/glusterfs-8.2.tar.gz
    [root@server1 ~]# tar -xf glusterfs-8.2.tar.gz
    [root@server1 ~]# cd glusterfs-8.2/
    [root@server1 glusterfs-8.2]# ./autogen.sh
    ... GlusterFS autogen ...
    Running aclocal...
    Running autoheader...
    Running libtoolize...
    Running autoconf...
    Running automake...
    Please proceed with configuring, compiling, and installing.
    [root@server1 glusterfs-8.2]# ./configure --prefix=/usr/local
    GlusterFS configure summary
    ===========================
    FUSE client          : yes
    epoll IO multiplex   : yes
    fusermount           : yes
    readline             : no
    georeplication       : yes
    Linux-AIO            : no
    Enable Debug         : no
    Enable ASAN          : no
    Enable TSAN          : no
    Use syslog           : yes
    XML output           : yes
    Unit Tests           : no
    Track priv ports     : yes
    POSIX ACLs           : yes
    SELinux features     : yes
    firewalld-config     : no
    Events               : yes
    EC dynamic support   : x64 sse avx
    Use memory pools     : yes
    Nanosecond m/atimes  : yes
    Server components    : yes
    Legacy gNFS server   : no
    IPV6 default         : no
    Use TIRPC            : missing
    With Python          : 2.7
    Cloudsync            : yes
    Link with TCMALLOC   : no
    [root@server1 glusterfs-8.2]# make && make install

Note: when using GlusterFS, hostnames within the LAN must be unique and must be resolvable.
    # Start glusterd
    [root@server1 ~]# systemctl start glusterd.service
    [root@server1 ~]# systemctl enable glusterd.service
    Created symlink from /etc/systemd/system/multi-user.target.wants/glusterd.service to /usr/local/lib/systemd/system/glusterd.service.
    [root@server1 ~]# systemctl status glusterd.service
    ● glusterd.service - GlusterFS, a clustered file-system server
       Loaded: loaded (/usr/local/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
       Active: active (running) since Wed 2020-11-11 11:16:54 CST; 34s ago
         Docs: man:glusterd(8)
     Main PID: 80492 (glusterd)
       CGroup: /system.slice/glusterd.service
               └─80492 /usr/local/sbin/glusterd -p /usr/local/var/run/glusterd.pid --log-level INFO
    [root@server1 ~]# ps -ef | grep gluster
    root  80492     1  0 11:16 ?      00:00:00 /usr/local/sbin/glusterd -p /usr/local/var/run/glusterd.pid --log-level INFO
    root  80571  2060  0 11:17 pts/0  00:00:00 grep --color=auto gluster

GlusterFS cluster planning and configuration
Overall flow: partition → format → mount.
Partitioning
    [root@server1 ~]# fdisk -l    # current disks and partitions
    Disk /dev/sda: 21.5 GB, 21474836480 bytes, 41943040 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk label type: dos
    Disk identifier: 0x000c1619

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *        2048     2099199     1048576   83  Linux
    /dev/sda2         2099200    41943039    19921920   8e  Linux LVM

    Disk /dev/mapper/centos-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
    Disk /dev/mapper/centos-swap: 2147 MB, 2147483648 bytes, 4194304 sectors

    [root@server1 ~]# lsblk
    NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda               8:0    0   20G  0 disk
    ├─sda1            8:1    0    1G  0 part /boot
    └─sda2            8:2    0   19G  0 part
      ├─centos-root 253:0    0   17G  0 lvm  /
      └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
    sr0              11:0    1  9.6G  0 rom  /etc/gz

    # For this walkthrough I added a new disk; with VMware, just power the VM off and attach one.
    [root@server1 ~]# lsblk
    NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda               8:0    0   20G  0 disk
    ├─sda1            8:1    0    1G  0 part /boot
    └─sda2            8:2    0   19G  0 part
      ├─centos-root 253:0    0   17G  0 lvm  /
      └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
    sdb               8:16   0   20G  0 disk
    sr0              11:0    1  9.6G  0 rom  /etc/gz
    # Next, /dev/sdb gets partitioned, formatted, and mounted.
    [root@server1 ~]# fdisk /dev/sdb    # partition the new disk
    Welcome to fdisk (util-linux 2.23.2).
    Changes will remain in memory only, until you decide to write them.
    Be careful before using the write command.
    Device does not contain a recognized partition table
    Building a new DOS disklabel with disk identifier 0x218b7141.

    Command (m for help): n    # new partition
    Partition type:
       p   primary (0 primary, 0 extended, 4 free)
       e   extended
    Select (default p):    # Enter to accept the default (primary)
    Using default response p
    Partition number (1-4, default 1):    # Enter
    First sector (2048-41943039, default 2048):    # Enter
    Using default value 2048
    Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039): +5G    # partition size, suffixed with M, G, etc.
    Partition 1 of type Linux and of size 5 GiB is set

    Command (m for help): p    # print the partition table
    Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
    Disk label type: dos
    Disk identifier: 0x218b7141

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1            2048    10487807     5242880   83  Linux

    Command (m for help): w    # write and quit
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    Syncing disks.

    # Format
    [root@server1 ~]# mkfs.ext4 /dev/sdb1
    [root@server1 ~]# lsblk
    NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda               8:0    0   20G  0 disk
    ├─sda1            8:1    0    1G  0 part /boot
    └─sda2            8:2    0   19G  0 part
      ├─centos-root 253:0    0   17G  0 lvm  /
      └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
    sdb               8:16   0   20G  0 disk
    └─sdb1            8:17   0    5G  0 part
    sr0              11:0    1  9.6G  0 rom  /etc/gz

    # Mount
    [root@server1 ~]# mkdir /node
    [root@server1 ~]# mount /dev/sdb1 /node/
    [root@server1 ~]# df -h /node
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdb1       4.8G   20M  4.6G   1% /node
    # Mount automatically at boot
    [root@server1 ~]# vim /etc/fstab
    /dev/sdb1 /node ext4 defaults 0 0
    [root@server1 ~]# mount -a

Configuring the GlusterFS cluster
    [root@server1 ~]# gluster peer status    # check cluster state; the trusted pool holds no other hosts yet
    Number of Peers: 0
    [root@server1 ~]# gluster peer probe server2    # build the trusted pool (probing from one side is enough)
    peer probe: success.
    [root@server1 ~]# gluster peer probe server3
    peer probe: success
    [root@server1 ~]# gluster peer probe server4
    peer probe: success
    [root@server1 ~]# gluster peer status
    Number of Peers: 3

    Hostname: server2
    Uuid: 687334ba-bec2-41eb-a51d-36779607bf59
    State: Peer in Cluster (Connected)

    Hostname: server3
    Uuid: 11cfe205-643e-4237-b171-7569f0cf1b57
    State: Peer in Cluster (Connected)

    Hostname: server4
    Uuid: 2282466a-14c8-4356-b645-b287c7929abd
    State: Peer in Cluster (Connected)
    [root@server1 ~]# gluster pool list    # inspect the storage pool
    UUID                                  Hostname   State
    687334ba-bec2-41eb-a51d-36779607bf59  server2    Connected
    11cfe205-643e-4237-b171-7569f0cf1b57  server3    Connected
    2282466a-14c8-4356-b645-b287c7929abd  server4    Connected
    37dc4314-61a3-4e37-be76-a4075ca59a71  localhost  Connected

    # Create a volume
    [root@server1 ~]# gluster volume list    # no volumes in the cluster yet
    No volumes present in cluster
    [root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node
    volume create: data: failed: The brick server1:/data/glusterfs is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.

The create fails because the brick sits on the system partition, which gluster refuses by default — and in production, too, bricks should be kept off the system disk whenever possible. If you really must use it, append force to the command; by default, volumes cannot be created on the root partition.
    [root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node force    # append force as the message suggests
    volume create: data: success: please start the volume to access data

Alternatively, create a dedicated user at build time and run GlusterFS as that user, giving it full rights over the cluster.

    # Single disk -- recommended for debugging
    [root@server1 ~]# gluster vol create test server1:/test force
    volume create: test: success: please start the volume to access data

    # Multiple disks, no RAID -- recommended for experiments and test environments
    [root@server1 ~]# gluster vol create testdata server1:/testdata server2:/testdata server3:/testdata server4:/testdata force
    volume create: testdata: success: please start the volume to access data

    # Multiple disks with RAID 1 -- recommended for high-concurrency production use
    [root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node
    volume create: data: success: please start the volume to access data

Note: in the commands above, the disk count must be an integer multiple of the replica count. RAID 0, 10, 5, and 6 are also options, but they are not recommended for small-file production clusters.

    [root@server1 ~]# gluster volume list    # the newly created volumes appear
    data
    test
    testdata
    [root@server1 ~]# gluster volume info    # detailed volume information

    Volume Name: data
    Type: Replicate
    Volume ID: 3c711bfd-599d-463b-bec4-51ef49a5be21
    Status: Created
    Snapshot Count: 0
    Number of Bricks: 1 x 4 = 4
    Transport-type: tcp
    Bricks:
    Brick1: server1:/node
    Brick2: server2:/node
    Brick3: server3:/node
    Brick4: server4:/node
    Options Reconfigured:
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on
    performance.client-io-threads: off

    Volume Name: test
    Type: Distribute
    Volume ID: e0161069-8913-43f6-abb6-f172441bfe35
    Status: Created
    Snapshot Count: 0
    Number of Bricks: 1
    Transport-type: tcp
    Bricks:
    Brick1: server1:/test
    Options Reconfigured:
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on

    Volume Name: testdata
    Type: Distribute
    Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
    Status: Created
    Snapshot Count: 0
    Number of Bricks: 4
    Transport-type: tcp
    Bricks:
    Brick1: server1:/testdata
    Brick2: server2:/testdata
    Brick3: server3:/testdata
    Brick4: server4:/testdata
    Options Reconfigured:
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on

Start the volumes:

    [root@server1 ~]# gluster volume start data
    volume start: data: success
    [root@server1 ~]# gluster vol start test
    volume start: test: success
    [root@server1 ~]# gluster vol start testdata
    volume start: testdata: success

Mount test
    [root@server1 ~]# mkdir /mount_data
    [root@server1 ~]# mount -t glusterfs -o acl server1:/data /mount_data/    # mount the data volume on server1
    [root@server1 ~]# mkdir /mount_test
    [root@server1 ~]# mount -t glusterfs -o acl server1:/test /mount_test/
    [root@server1 ~]# mkdir /mount_testdata
    [root@server1 ~]# mount -t glusterfs -o acl server1:/testdata/ /mount_testdata/
    [root@server1 ~]# df -h
    Filesystem               Size  Used Avail Use% Mounted on
    devtmpfs                 475M     0  475M   0% /dev
    tmpfs                    487M     0  487M   0% /dev/shm
    tmpfs                    487M  7.7M  479M   2% /run
    tmpfs                    487M     0  487M   0% /sys/fs/cgroup
    /dev/mapper/centos-root   37G  1.9G   36G   5% /
    /dev/sr0                 9.6G  9.6G     0 100% /mnt/gz
    /dev/sda1               1014M  137M  878M  14% /boot
    tmpfs                     98M     0   98M   0% /run/user/0
    /dev/sdc1                4.8G   22M  4.6G   1% /node
    server1:/data            4.8G   71M  4.6G   2% /mount_data
    server1:/test             37G  2.2G   35G   6% /mount_test
    server1:/testdata        148G  8.8G  140G   6% /mount_testdata

    [root@server2 ~]# mount -t glusterfs -o acl server1:/data /mount_data/    # mount the volume on server2; here it warns that the attr package is missing
    WARNING: getfattr not found, certain checks will be skipped..
    [root@server2 ~]# yum -y install attr    # install the missing package
    [root@server2 ~]# touch /mount_data/{1..10}test.txt    # create files from server2

    [root@server1 ~]# ls /mount_data/    # once created, every host in the cluster -- including the local one -- sees the new files under the mount point
    10test.txt  2test.txt  4test.txt  6test.txt  8test.txt  lost+found
    1test.txt   3test.txt  5test.txt  7test.txt  9test.txt
    [root@server4 ~]# ls /mount_data/
    10test.txt  2test.txt  4test.txt  6test.txt  8test.txt  lost+found
    1test.txt   3test.txt  5test.txt  7test.txt  9test.txt

Online expansion
As the business grows and cluster capacity runs short, more machines and disks need to be added to the cluster.
a. In the usual case it is enough to widen the distribution. The number of disks added must be an integer multiple of the minimal expansion unit, i.e. replica × stripe, or a multiple of the disperse count:
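The multiple-of-the-expansion-unit rule above can be sketched as a quick check (an illustrative helper, not part of the gluster CLI):

```python
def can_expand(new_bricks, replica=1, stripe=1, disperse=0):
    """Check whether an add-brick request matches the minimal expansion unit.

    Bricks must be added in multiples of replica * stripe
    (or of the disperse count, for dispersed volumes).
    """
    unit = disperse if disperse else replica * stripe
    return new_bricks > 0 and new_bricks % unit == 0

# A replica-3 volume grows three bricks at a time:
assert can_expand(3, replica=3)
assert not can_expand(2, replica=3)
```

With this rule, a replica 2 × stripe 2 volume, for example, only accepts expansions of 4, 8, 12, … bricks.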
Online shrinking
Perhaps the original sizing was off, or some of the storage machines are wanted for other purposes. As with expansion, there are two cases.
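Both cases below hinge on which bricks form a removable unit. As a sketch (hypothetical helper, not the gluster CLI): in a distributed-replicated volume, consecutive runs of `replica` bricks in `gluster vol info` order make up one storage unit, and shrinking by breadth must remove whole units:

```python
def replica_sets(bricks, replica):
    """Group a volume's brick list into its replica sets.

    Consecutive runs of `replica` bricks (in volume-info order)
    form one storage unit; breadth reduction removes whole units.
    """
    if len(bricks) % replica:
        raise ValueError("brick count is not a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

bricks = ["server1:/node", "server2:/node", "server3:/node", "server4:/node"]
units = replica_sets(bricks, 2)  # two units, each a mirrored pair
```

Removing `units[1]` (the second mirrored pair) would be a valid breadth reduction; removing one brick from each pair instead corresponds to case b, lowering the replica count.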
a. Reducing the distribution breadth. The bricks removed must form one or more whole storage units — consecutive entries in the volume info brick list. The command rebalances the data automatically.

    [root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode start
    It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. Do you want to continue with your current cluster.force-migration settings? (y/n) y
    volume remove-brick start: success
    ID: 943f61e1-02da-4b79-a08a-55f06b7c468a

After starting, watch the removal status — really the automatic rebalance — until it changes from in progress to completed:

    [root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode status
       Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
    -------  ----------------  ------  -------  --------  -------  ---------  -----------------
    server3                 0  0Bytes        0         0        0  completed            0:00:00
    server4                 0  0Bytes        0         0        0  completed            0:00:00

Once the status shows completed, commit the removal:

    [root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode commit
    volume remove-brick commit: success
    Check the removed bricks to ensure all files are migrated. If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.

b. Reducing the replica count. The bricks removed must together make up complete replica sets; in the volume info brick list they are generally scattered (though their IPs may be consecutive). This command does not rebalance data.

    [root@server1 ~]# gluster vol remove-brick test server1:/datanode server2:/datanode force    # remove
    Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue? (y/n) y
    volume remove-brick commit force: success
    [root@server1 ~]# gluster vol info test

    Volume Name: test
    Type: Distribute
    Volume ID: e0161069-8913-43f6-abb6-f172441bfe35
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1
    Transport-type: tcp
    Bricks:
    Brick1: server1:/test
    Options Reconfigured:
    performance.client-io-threads: on
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on

When reducing the replica count, the bricks are simply deleted — and the command ends in force — so if the data had not been fully replicated beforehand, part of it is lost. Treat this operation with extreme caution: first make sure the data is complete by running gluster volume heal vol_name full to repair, then check with gluster volume heal vol_name info and gluster volume status, and only proceed once everything is healthy.

Configuring rebalance
    [root@server1 ~]# gluster vol reblance status    # typo: the subcommand is "rebalance"
    unrecognized word: reblance (position 1)
    [root@server1 ~]# gluster vol reblance testdata status
    unrecognized word: reblance (position 1)
    [root@server1 ~]# gluster vol rebalance testdata status    # rebalance has not been started yet
    volume rebalance: testdata: failed: Rebalance not started for volume testdata.
    [root@server1 ~]# gluster vol rebalance testdata start    # start rebalancing
    volume rebalance: testdata: success: Rebalance on testdata has been started successfully. Use rebalance status command to check status of the rebalance process.
    ID: 7c7dd7d7-1515-4637-805d-dc5dc43f471b
    [root@server1 ~]# gluster vol rebalance testdata status    # status after starting
         Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
    ---------  ----------------  ------  -------  --------  -------  ---------  -----------------
      server2                 0  0Bytes        0         0        0  completed            0:00:00
      server3                 0  0Bytes        0         0        0  completed            0:00:00
      server4                 0  0Bytes        0         0        0  completed            0:00:00
    localhost                 0  0Bytes        0         0        0  completed            0:00:00
    volume rebalance: testdata: success

Setting volume options
    [root@server1 ~]# gluster vol set testdata performance.cache-size 256MB    # set the cache size (size this to your environment; too large a value can make client mounts fail later)
    volume set: success
    [root@server1 ~]# gluster vol info

    Volume Name: testdata
    Type: Distribute
    Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 4
    Transport-type: tcp
    Bricks:
    Brick1: server1:/testdata
    Brick2: server2:/testdata
    Brick3: server3:/testdata
    Brick4: server4:/testdata
    Options Reconfigured:
    performance.cache-size: 256MB
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on

Configuring the client
Requirement: GFS client nodes must be able to reach the GFS server nodes.

    [root@client ~]# yum install -y glusterfs glusterfs-fuse
    # On the GFS client, create a local directory:
    [root@client ~]# mkdir -p /test/gluster-test
    # Mount the GFS volume onto it:
    [root@client ~]# mount.glusterfs 192.168.2.10:/data /test/gluster-test/
    # Check the mount:
    [root@client ~]# df -h /test/gluster-test/
    Filesystem          Size  Used Avail Use% Mounted on
    192.168.2.10:/data   34G  5.1G   29G  15% /test/gluster-test

Testing
Single-file tests
Method: create a 1 GB file from the client.

- DHT mode (the default), also called a distribute volume: each file is hashed to a single server node for storage.

      [root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
      1+0 records in
      1+0 records out
      1048576000 bytes (1.0 GB) copied, 9.7207 s, 108 MB/s

      real  0m9.858s
      user  0m0.002s
      sys   0m7.171s

- AFR (replicated) mode, created with replica x: the file is copied to x nodes.

      [root@client ~]# time dd if=/dev/zero of=hello.txt bs=1024M count=1
      1+0 records in
      1+0 records out
      1073741824 bytes (1.1 GB) copied, 5.06884 s, 212 MB/s

      real  0m5.206s
      user  0m0.001s
      sys   0m3.194s

- Striped mode, created with stripe x: the file is split into chunks stored across x nodes (similar to RAID 0).

      [root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
      1+0 records in
      1+0 records out
      1048576000 bytes (1.0 GB) copied, 4.92539 s, 213 MB/s

      real  0m5.047s
      user  0m0.001s
      sys   0m3.036s

- Striped-replicated mode (Number of Bricks: 1 x 2 x 2 = 4), a combined layout needing at least 4 servers; created with stripe 2 across 4 nodes — DHT combined with Striped.

      [root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
      1+0 records in
      1+0 records out
      1048576000 bytes (1.0 GB) copied, 5.0472 s, 208 MB/s

      real  0m5.173s
      user  0m0.000s
      sys   0m3.098s

- Distributed-replicated mode (Number of Bricks: 2 x 2 = 4), a combined layout needing at least 4 servers; created with replica 2 across 4 nodes — DHT combined with AFR.

      [root@client ~]# time dd if=/dev/zero of=haha bs=100M count=10
      10+0 records in
      10+0 records out
      1048576000 bytes (1.0 GB) copied, 1.00275 s, 1.0 GB/s

      real  0m1.018s
      user  0m0.001s
      sys   0m0.697s

Additional tests were run against the distributed-replicated volume.

4K random write test:

    # Install fio
    [root@client ~]# yum -y install libaio-devel.x86_64
    [root@client ~]# yum -y install fio
    [root@client ~]# fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=10G -filename=1.txt -name="EBS 4KB randwrite test" -iodepth=32 -runtime=60
    write: IOPS=4111, BW=16.1MiB/s (16.8MB/s)(964MiB/60001msec)
    WRITE: bw=16.1MiB/s (16.8MB/s), 16.1MiB/s-16.1MiB/s (16.8MB/s-16.8MB/s), io=964MiB (1010MB), run=60001-60001msec

4K random read test:

    [root@client ~]# fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=10G -filename=1.txt -name="EBS 4KB randread test" -iodepth=8 -runtime=60
    read: IOPS=77.5k, BW=303MiB/s (318MB/s)(10.0GiB/33805msec)
    READ: bw=303MiB/s (318MB/s), 303MiB/s-303MiB/s (318MB/s-318MB/s), io=10.0GiB (10.7GB), run=33805-33805msec

512K sequential write test:

    [root@client ~]# fio -ioengine=libaio -bs=512k -direct=1 -thread -rw=write -size=10G -filename=512.txt -name="EBS 512KB seqwrite test" -iodepth=64 -runtime=60
    write: IOPS=1075, BW=531MiB/s (556MB/s)(2389MiB/4501msec)
    WRITE: bw=531MiB/s (556MB/s), 531MiB/s-531MiB/s (556MB/s-556MB/s), io=2389MiB (2505MB), run=4501-4501msec

Other maintenance commands
Unmounting a volume. Mount and unmount come in pairs; a volume can be stopped without unmounting first, but doing so causes problems — on a larger cluster the volume may later fail to start.

    [root@server1 ~]# umount /mount_test

Stopping a volume. Stop and start come in pairs; unmount all clients before stopping.

    [root@server1 ~]# gluster vol stop test
    Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
    volume stop: test: success

Deleting a volume:

    [root@server1 ~]# gluster vol delete test
    Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
    volume delete: test: success

Note: after deleting a volume, also delete the .glusterfs/ and .trashcan/ directories from the brick directory (e.g. /opt/gluster/data). Otherwise a new volume created on the same disk will suffer from files not being distributed or the volume type getting mixed up.

Detaching a node's GlusterFS disk:

    [root@server1 ~]# gluster peer detach server4    # detaching is refused until the node's bricks are removed with remove-brick
    All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
    peer detach: failed: Peer server4 hosts one or more bricks. If the peer is in not recoverable state then use either replace-brick or remove-brick command with force to remove all bricks from the peer and attempt the peer detach again.
    [root@server1 ~]# gluster vol info testdata

    Volume Name: testdata
    Type: Distribute
    Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 4
    Transport-type: tcp
    Bricks:
    Brick1: server1:/testdata
    Brick2: server2:/testdata
    Brick3: server3:/testdata
    Brick4: server4:/testdata
    Options Reconfigured:
    performance.cache-size: 256MB
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    nfs.disable: on
    [root@server1 ~]# gluster vol remove-brick testdata server4:/testdata start    # on replicated volumes, remove-brick cannot simply drop bricks below the replica count, or commit will fail
    It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. Do you want to continue with your current cluster.force-migration settings? (y/n) y
    volume remove-brick start: success
    ID: 24084420-9676-4f10-ac43-72ddc35caccf
    [root@server1 ~]# gluster vol remove-brick testdata server4:/testdata status
       Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
    -------  ----------------  ------  -------  --------  -------  ---------  -----------------
    server4                 0  0Bytes        0         0        0  completed            0:00:00
    [root@server1 ~]# gluster vol remove-brick testdata server4:/testdata commit
    volume remove-brick commit: success
    Check the removed bricks to ensure all files are migrated. If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
    [root@server1 ~]# gluster peer detach server4    # now the node can be detached
    All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
    peer detach: success