Greenplum in Practice [Installation and Deployment] 2: Deploying a 5.10 Cluster

Published: 2023/12/14

Chapter 1 Environment

1.1 Official documentation

Official Greenplum installation guide:

https://gpdb.docs.pivotal.io/5160/install_guide/install_extensions.html

1.2 System requirements

Operating System:
  • Red Hat Enterprise Linux 64-bit 7.x (see the Note in the official guide)
  • Red Hat Enterprise Linux 64-bit 6.x
  • SuSE Linux Enterprise Server 64-bit 12 SP2 and SP3 with kernel version greater than 4.4.73-5 (see the Note in the official guide)
  • SuSE Linux Enterprise Server 64-bit 11 SP4 (see the Note in the official guide)
  • CentOS 64-bit 7.x
  • CentOS 64-bit 6.x

File Systems:
  • xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system)

Minimum CPU: Pentium Pro compatible (P3/Athlon and above)

Minimum Memory: 16 GB RAM per server

Disk Requirements:
  • 150MB per host for Greenplum installation
  • Approximately 300MB per segment instance for meta data
  • Appropriate free space for data with disks at no more than 70% capacity
  • High-speed, local storage

Network Requirements:
  • 10 Gigabit Ethernet within the array
  • Dedicated, non-blocking switch
  • NIC bonding is recommended when multiple interfaces are present

Software and Utilities:
  • zlib compression libraries
  • bash shell
  • GNU tar
  • GNU zip
  • GNU sed (used by Greenplum Database gpinitsystem)
  • perl
  • secure shell

Important: SSL is supported only on the Greenplum Database master host system.

1.3 Test environment

Operating system: CentOS Linux release 7.4.1708 (Core)

Machine model   PowerEdge R330 x 4
CPU             Intel(R) Xeon(R) CPU E3-1220 v5 @ 3.00GHz, 4 physical cores (16 cores)
Memory          16G
Disk            4T
Swap            32G

Chapter 2 Installation architecture

2.1 Installation outline

Perform the following tasks in order:

 1. Make sure your systems meet the System Requirements
 2. Setting the Greenplum Recommended OS Parameters
 3. (master only) Creating the Greenplum Database Administrative User Account
 4. (master only) Installing the Greenplum Database Software
 5. Installing and Configuring Greenplum on all Hosts
 6. (Optional) Installing Oracle Compatibility Functions
 7. (Optional) Installing Optional Modules
 8. (Optional) Installing Greenplum Database Extensions
 9. (Optional) Installing and Configuring the Greenplum Platform Extension Framework (PXF)
10. Creating the Data Storage Areas
11. Synchronizing System Clocks
12. Next Steps

2.2 Software

greenplum-db-5.10.2-rhel6-x86_64.zip is the MPP database software.

greenplum-cc-web-4.3.1-LINUX-x86_64.zip is the web monitoring platform (Greenplum Command Center).

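2.3 Architecture

Prepare four servers: one master and three storage hosts, which together run 6 primary segments and their mirrors.

IP             Hostname  CPU  Memory  Planned components
10.102.254.24  sdw1      16   16      4 x segment
10.102.254.25  sdw2      16   16      4 x segment
10.102.254.26  sdw3      16   16      4 x segment
10.102.254.27  mdw1      16   16      master

(The master is named mdw in /etc/hosts and in every command in this guide.)

== Target layout ==

mdw     sdw1                     sdw2                     sdw3
master  seg0p seg1p seg5m seg4m  seg2p seg3p seg0m seg1m  seg4p seg5p seg2m seg3m smdw

(p = primary, m = mirror, smdw = standby master)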
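The target layout above places the mirrors of each host's primaries on the next host in the list, wrapping around at the end (sdw1 to sdw2, sdw2 to sdw3, sdw3 to sdw1). A minimal sketch of that rule follows; the function name `group_mirror_host` is hypothetical and only illustrates the table, it is not a Greenplum utility.

```shell
# Hypothetical helper: print the host that holds the mirrors for the primaries
# of the host at a given position, following the "next host, wrap around"
# pattern seen in the target layout table.
# $1      = zero-based index of the primary's host
# $2 ...  = segment hosts in order (at least one)
group_mirror_host() {
    idx="$1"
    shift
    next=$(( (idx + 1) % $# ))   # position of the following host, wrapping to 0
    shift "$next"                # drop the hosts before it
    echo "$1"
}

group_mirror_host 0 sdw1 sdw2 sdw3
group_mirror_host 2 sdw1 sdw2 sdw3
```

The two calls print the mirror host for sdw1 and for sdw3 respectively, which can be compared against the seg0m/seg1m and seg4m/seg5m cells of the table.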

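Chapter 3 System configuration

3.1 Disable the firewall and SELinux (all nodes)

    systemctl stop firewalld.service
    systemctl disable firewalld.service
    iptables -F

    vi /etc/selinux/config
    SELINUX=disabled

The change in /etc/selinux/config takes effect after a reboot. Verify with:

    # sestatus
    SELinux status: disabled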
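Because the edit to /etc/selinux/config has to be repeated on every node, it can help to read the configured mode back instead of opening the file by hand. The sketch below is one way to do that; `selinux_config_mode` is a hypothetical helper name, and it only reports what the file says, not the mode the running kernel is in.

```shell
# Hypothetical helper: print the mode set by the SELINUX= line of an SELinux
# config file, lowercased. Comment lines and SELINUXTYPE= are ignored.
# $1 = path to the config file (defaults to /etc/selinux/config)
selinux_config_mode() {
    config_file="${1:-/etc/selinux/config}"
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*$/\1/p' "$config_file" \
        | tail -n 1 \
        | tr '[:upper:]' '[:lower:]'
}

# Example against a throwaway copy of the file
demo_file=$(mktemp)
printf '# SELINUX= can take enforcing, permissive, disabled\nSELINUX=disabled\nSELINUXTYPE=targeted\n' > "$demo_file"
selinux_config_mode "$demo_file"
rm -f "$demo_file"
```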

3.2 Edit the hosts file (all nodes)

    vi /etc/hosts

    10.102.254.24     sdw1
    10.102.254.25     sdw2
    10.102.254.26     sdw3
    10.102.254.27     mdw

3.2 Set the hostname

    hostnamectl set-hostname sdw1
    hostnamectl status              # show the current hostname

On RHEL/CentOS 6 the hostname is kept in /etc/sysconfig/network.

3.3 Kernel parameter tuning

Either edit /etc/sysctl.conf by hand and add the following settings:

    vi /etc/sysctl.conf

    kernel.shmmax = 500000000
    kernel.shmmni = 4096
    kernel.shmall = 4000000000
    kernel.sem = 250 512000 100 2048
    kernel.sysrq = 1
    kernel.core_uses_pid = 1
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    kernel.msgmni = 2048
    net.ipv4.tcp_syncookies = 1
    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.accept_source_route = 0
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_max_syn_backlog = 4096
    net.ipv4.conf.all.arp_filter = 1
    net.ipv4.ip_local_port_range = 1025 65535
    net.core.netdev_max_backlog = 10000
    net.core.rmem_max = 2097152
    net.core.wmem_max = 2097152
    vm.overcommit_memory = 2

    sysctl -p

Or overwrite the file in one step. This variant uses larger kernel.sem values and adds vm.swappiness and kernel.pid_max. Note that "cat >" replaces the existing contents of /etc/sysctl.conf:

    cat > /etc/sysctl.conf << EOF
    # sysctl settings are defined through files in
    # /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
    #
    # Vendors settings live in /usr/lib/sysctl.d/.
    # To override a whole file, create a new file with the same in
    # /etc/sysctl.d/ and put new settings there. To override
    # only specific settings, add a file with a lexically later
    # name in /etc/sysctl.d/ and put new settings there.
    #
    # For more information, see sysctl.conf(5) and sysctl.d(5).

    kernel.shmmax = 500000000
    kernel.shmmni = 4096
    kernel.shmall = 4000000000
    kernel.sem = 500 1024000 200 4096
    kernel.sysrq = 1
    kernel.core_uses_pid = 1
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    kernel.msgmni = 2048
    net.ipv4.tcp_syncookies = 1
    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.accept_source_route = 0
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_max_syn_backlog = 4096
    net.ipv4.conf.all.arp_filter = 1
    net.ipv4.ip_local_port_range = 1025 65535
    net.core.netdev_max_backlog = 10000
    net.core.rmem_max = 2097152
    net.core.wmem_max = 2097152
    vm.overcommit_memory = 2
    vm.swappiness = 1
    kernel.pid_max = 655350
    EOF

    sysctl -p
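Since the file is edited on every host and the two variants above differ (for example in kernel.sem), it may be worth reading a key back from the file and comparing it with the running value. The following is a sketch only; `sysctl_conf_value` is a hypothetical helper, and it assumes simple `key = value` lines like the ones shown in this section.

```shell
# Hypothetical helper: print the value assigned to a key in a sysctl-style
# file. If the key appears more than once, the last assignment is printed.
# $1 = key (for example kernel.sem), $2 = file
sysctl_conf_value() {
    key="$1"
    file="$2"
    awk -v k="$key" '
        /^[ \t]*[#;]/ { next }                      # skip comment lines
        index($0, "=") > 0 {
            name = substr($0, 1, index($0, "=") - 1)
            gsub(/[ \t]/, "", name)                 # key without whitespace
            if (name == k) {
                val = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim the value
                found = val
            }
        }
        END { if (found != "") print found }' "$file"
}

# Example against a throwaway file
demo_conf=$(mktemp)
printf 'kernel.sem = 250 512000 100 2048\nvm.overcommit_memory = 2\n' > "$demo_conf"
sysctl_conf_value kernel.sem "$demo_conf"
rm -f "$demo_conf"
```

On a real host the printed value could be compared with the output of `sysctl -n kernel.sem` after running `sysctl -p`.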

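3.4 Raise Linux resource limits

Check the current file, then either edit it and add the four limit lines:

    cat /etc/security/limits.conf

    vi /etc/security/limits.conf

    * soft nofile 65536
    * hard nofile 65536
    * soft nproc 131072
    * hard nproc 131072

or overwrite it in one step (this replaces the existing contents):

    cat > /etc/security/limits.conf << EOF
    * soft nofile 65536
    * hard nofile 65536
    * soft nproc 131072
    * hard nproc 131072
    EOF

On RHEL 6.x, also check /etc/security/limits.d/90-nproc.conf; see the official documentation for details.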

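3.5 Devices and I/O: file system

Create and mount an XFS file system.

ext4 (Fourth Extended Filesystem) is a journaling file system for Linux and the successor to ext3.

ext4 supports file systems of up to 1 EB and files of up to 16 TB. That matters little for ordinary desktops and servers, but it is important for users of large disk arrays.

XFS is a 64-bit file system that supports a single file system of up to 8 EB minus one byte; the practical limit depends on the maximum block limits of the host operating system. On a 32-bit Linux system, files and file systems are limited to 16 TB.

Each has its own strengths, and their performance is broadly similar. For example, when Google decided to upgrade from ext2 it settled on ext4. Google said it had also considered XFS and JFS; ext4 and XFS performed similarly in its tests, but upgrading from ext2 to ext4 was easier than moving to XFS.

Example:

    cat >> /etc/fstab << EOF
    /dev/sdb1 /greenplum xfs rw,nodev,noatime,inode64,allocsize=16m 0 0
    EOF

An alternative set of mount options (adds nobarrier, omits allocsize):

    rw,nodev,noatime,nobarrier,inode64

Verify the result:

    cat /etc/fstab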

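3.6 Disk I/O scheduler

The Linux disk I/O scheduler supports several access policies. The default is CFQ; Greenplum recommends deadline.

Check the current I/O scheduler. On a default installation the active entry, shown in brackets, is [cfq]:

    cat /sys/block/sda/queue/scheduler

The deadline scheduler option is recommended. To specify a scheduler until the next system reboot, run the following:

    # echo schedulername > /sys/block/devname/queue/scheduler

For example:

    echo deadline > /sys/block/sda/queue/scheduler

On RHEL/CentOS 7, make the setting persistent with grubby and check the result:

    # grubby --update-kernel=ALL --args="elevator=deadline"
    grubby --info=ALL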
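The scheduler file lists every available policy and marks the active one with square brackets, for example `noop deadline [cfq]`. A small sketch for pulling out just the active entry is shown below; `active_bracketed` is a hypothetical helper name. The same bracket convention is used by the THP `enabled` file in section 3.8, so the helper applies there as well.

```shell
# Hypothetical helper: print the entry enclosed in [ ] from a line such as
# "noop deadline [cfq]" or "always madvise [never]".
# $1 = the line to inspect
active_bracketed() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

active_bracketed "noop deadline [cfq]"
active_bracketed "always [never]"
```

On a real host the argument would typically be "$(cat /sys/block/sda/queue/scheduler)", with sda replaced by the data disk.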

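3.7 Adjust disk read-ahead (sectors)

List the disks:

    fdisk -l

Check the current value:

    /sbin/blockdev --getra /dev/sda

Set it:

    /sbin/blockdev --setra 16384 /dev/sda

To make the setting persistent, add the command to /etc/rc.d/rc.local. The device path depends on the hardware:

    DELL: blockdev --setra 16384 /dev/sd*          (sd* is the disk device identifier)
    HP:   blockdev --setra 16384 /dev/cciss/c?d?*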

3.8 Disable Transparent Huge Pages (THP)

On systems that use grub2 such as RHEL 7.x or CentOS 7.x, use the system utility grubby. This command adds the parameter when run as root.

    # grubby --update-kernel=ALL --args="transparent_hugepage=never"

After adding the parameter, reboot the system.

This cat command checks the state of THP. The output indicates that THP is disabled.

    $ cat /sys/kernel/mm/*transparent_hugepage/enabled
    always [never]

Registering the settings as services

THP and the read-ahead value can also be applied at boot through small init.d scripts wrapped in systemd units.

    # Create the init.d script that disables THP
    echo '#!/bin/sh
    case $1 in
      start)
        if [ -d /sys/kernel/mm/transparent_hugepage ]; then
          thp_path=/sys/kernel/mm/transparent_hugepage
        elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
          thp_path=/sys/kernel/mm/redhat_transparent_hugepage
        else
          exit 0
        fi

        echo never > ${thp_path}/enabled
        echo never > ${thp_path}/defrag

        unset thp_path
        ;;
    esac' > /etc/init.d/disable-transparent-hugepages

    # Register the systemd unit file
    echo '[Unit]
    Description=Disable Transparent Hugepages
    After=multi-user.target

    [Service]
    ExecStart=/etc/init.d/disable-transparent-hugepages start
    Type=simple

    [Install]
    WantedBy=multi-user.target' > /etc/systemd/system/disable-thp.service

    # Disk read-ahead sectors
    /sbin/blockdev --getra /dev/sdb1         # show the current value
    /sbin/blockdev --setra 65535 /dev/sdb1   # set the value

    # Create the init.d script that sets read-ahead once the device is mounted
    echo '#!/bin/sh
    device_name=/dev/sdb1
    case $1 in
      start)
        if mount | grep "^${device_name}" > /dev/null; then
          /sbin/blockdev --setra 65535 ${device_name}
        else
          exit 0
        fi

        unset device_name
        ;;
    esac' > /etc/init.d/blockdev-setra-sdb

    # Register the systemd unit file
    echo '[Unit]
    Description=blockdev --setra N
    After=multi-user.target

    [Service]
    ExecStart=/etc/init.d/blockdev-setra-sdb start
    Type=simple

    [Install]
    WantedBy=multi-user.target' > /etc/systemd/system/blockdev-setra-sdb.service

    # Set permissions and enable both services at boot
    chmod 755 /etc/init.d/disable-transparent-hugepages
    chmod 755 /etc/init.d/blockdev-setra-sdb
    chmod 755 /etc/systemd/system/disable-thp.service
    chmod 755 /etc/systemd/system/blockdev-setra-sdb.service
    systemctl enable disable-thp blockdev-setra-sdb

3.9 Disable IPC object removal for RHEL 7 or CentOS 7

Set this parameter in /etc/systemd/logind.conf on the Greenplum Database host systems.

    RemoveIPC=no

The setting takes effect after restarting the systemd-logind service or rebooting the system. To restart the service, run this command as the root user.

    service systemd-logind restart

Check the file:

    cat /etc/systemd/logind.conf

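3.10 Time synchronization

Add the time server to /etc/chrony.conf:

    server 10.1.3.1 prefer

Start chronyd and enable it at boot:

    systemctl status chronyd.service     # check the status
    systemctl start chronyd.service      # start the service
    systemctl enable chronyd.service     # start automatically at boot
    systemctl status chronyd.service

Check the time sources:

    chronyc sources -v
    chronyc sourcestats -v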

3.11 Limit SSH connections

Set the following in /etc/ssh/sshd_config, then restart sshd:

    MaxStartups 10:30:200

    systemctl restart sshd.service

3.12 System dependency packages

    yum -y install epel-release
    yum -y install wget cmake3 git gcc gcc-c++ bison flex libedit-devel zlib zlib-devel perl-devel perl-ExtUtils-Embed python-devel libevent libevent-devel libxml2 libxml2-devel libcurl libcurl-devel bzip2 bzip2-devel net-tools libffi-devel openssl-devel

Chapter 4 Installing the software

4.1 Create the user and group

Master only.

    # groupadd gpadmin
    # useradd gpadmin -g gpadmin
    # passwd gpadmin
    New password: <changeme>
    Retype new password: <changeme>

Or set the password without a prompt:

    echo gpadmin | passwd gpadmin --stdin

4.2 Unpack and install as root

    ./greenplum-db-5.10.2-rhel6-x86_64.bin

    I HAVE READ AND AGREE TO THE TERMS OF THE ABOVE PIVOTAL SOFTWARE
    LICENSE AGREEMENT.

    ********************************************************************************
    Do you accept the Pivotal Database license agreement? [yes|no]
    ********************************************************************************

    yes

    ********************************************************************************
    Provide the installation path for Greenplum Database or press ENTER to
    accept the default installation path: /usr/local/greenplum-db-5.10.2
    ********************************************************************************

    ********************************************************************************
    Install Greenplum Database into /usr/local/greenplum-db-5.10.2? [yes|no]
    ********************************************************************************

    yes

    ********************************************************************************
    /usr/local/greenplum-db-5.10.2 does not exist.
    Create /usr/local/greenplum-db-5.10.2 ? [yes|no]
    (Selecting no will exit the installer)
    ********************************************************************************

After the installation completes, set the ownership (run these after the gpadmin user has been created):

    # chown -R gpadmin /usr/local/greenplum*
    # chgrp -R gpadmin /usr/local/greenplum*

4.3 Set environment variables

For the current user:

    cat >> .bashrc << EOF
    export MASTER_DATA_DIRECTORY=/greenplum/gpdata/master/gpseg-1
    source /usr/local/greenplum-db/greenplum_path.sh
    EOF
    source .bashrc

For gpadmin:

    cat >> /home/gpadmin/.bash_profile << EOF
    export MASTER_DATA_DIRECTORY=/greenplum/gpdata/master/gpseg-1
    source /usr/local/greenplum-db/greenplum_path.sh
    export PGPORT=5432
    export PGDATABASE=archdata
    EOF

    source /home/gpadmin/.bash_profile

4.4 Create the host list files

Switch to root:

    source /usr/local/greenplum-db/greenplum_path.sh

Run the following only on mdw and smdw.

Create the configuration directory:

    mkdir /home/gpadmin/gpconfig
    chown -R gpadmin:gpadmin /home/gpadmin/gpconfig

List all hosts:

    cat >> /home/gpadmin/gpconfig/all_host << EOF
    mdw
    sdw1
    sdw2
    sdw3
    EOF

List the segment hosts:

    cat >> /home/gpadmin/gpconfig/all_segment << EOF
    sdw1
    sdw2
    sdw3
    EOF

    chown -R gpadmin:gpadmin /home/gpadmin/gpconfig/all_host
    chown -R gpadmin:gpadmin /home/gpadmin/gpconfig/all_segment

4.5 Set up passwordless SSH between hosts

    source /usr/local/greenplum-db/greenplum_path.sh

    /usr/local/greenplum-db/bin/gpssh-exkeys -f /home/gpadmin/gpconfig/all_host

4.6 Verify host connectivity

    gpssh -f /home/gpadmin/gpconfig/all_host -e "ls -l"

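4.7 Create the user on the other nodes in bulk

Open an interactive gpssh session on the segment hosts:

    gpssh -f /home/gpadmin/gpconfig/all_segment

Then run the following inside the session:

    groupadd gpadmin
    useradd gpadmin -g gpadmin
    passwd gpadmin
    echo gpadmin | passwd gpadmin --stdin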

4.8 Mutual trust for the gpadmin user

As gpadmin, exchange keys again and confirm that every host can be reached without a password:

    source /usr/local/greenplum-db/greenplum_path.sh

    /usr/local/greenplum-db/bin/gpssh-exkeys -f /home/gpadmin/gpconfig/all_host

    gpssh -f /home/gpadmin/gpconfig/all_host -e "ls -l"

4.9 Check time synchronization

    gpssh -f /home/gpadmin/gpconfig/all_host -e "date"
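Comparing the `date` output of four hosts by eye works, but the check is easier to script if each host reports epoch seconds and the spread is computed. The following is a rough sketch; `max_clock_skew` is a hypothetical helper that expects one epoch-seconds value per line on standard input. How those values are collected (for example with `date +%s` on each host) is left to the reader, and the time taken to contact each host is included in the measured spread.

```shell
# Hypothetical helper: read one epoch-seconds value per line and print the
# difference between the largest and the smallest value.
max_clock_skew() {
    awk 'NR == 1 { min = $1; max = $1 }
         { if ($1 < min) min = $1; if ($1 > max) max = $1 }
         END { if (NR > 0) print max - min }'
}

# Example with three made-up readings
printf '1700000000\n1700000003\n1700000001\n' | max_clock_skew
```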

4.10 Distribute the software to all segment nodes

Run as root:

    source /usr/local/greenplum-db/greenplum_path.sh

    gpseginstall -f /home/gpadmin/gpconfig/all_host -u gpadmin -p gpadmin

4.11 Verify the installation

  o Log in as the gpadmin user and source the environment file:
        source /usr/local/greenplum-db/greenplum_path.sh
  o Use the gpssh utility to see if you can login to all hosts without a password prompt

4.12 Create the data directories (as root)

Master directory:

    mkdir -p /greenplum/gpdata/master
    chown gpadmin:gpadmin /greenplum/gpdata/master

Primary segment directories:

    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'mkdir -p /greenplum/gpdata/primary1'
    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'mkdir -p /greenplum/gpdata/primary2'
    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'chown -R gpadmin:gpadmin /greenplum/gpdata'

Mirror segment directories:

    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'mkdir -p /greenplum/gpdata/mirror1'
    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'mkdir -p /greenplum/gpdata/mirror2'
    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'chown -R gpadmin:gpadmin /greenplum/gpdata'

Or create several directories in one command:

    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'mkdir -p /greenplum/gpdata/primary{1..2}'
    gpssh -f /home/gpadmin/gpconfig/all_segment -e 'chown -R gpadmin:gpadmin /greenplum/gpdata'

4.13 Validate the system

Check the system parameters and test performance.

Check command:

    gpcheck -f host_file -m mdw -s smdw

Validating Hardware Performance

  o gpcheckperf can be used to identify hardware and system-level issues on the machines in your Greenplum Database array.
  o Network Performance (gpnetbench*)
        gpcheckperf -f hostfile_gpchecknet_ic1 -r N -d /tmp > subnet1.out
  o Disk I/O Performance (dd test) & Memory Bandwidth (stream test)
        gpcheckperf -f hostfile_gpcheckperf -r ds -D -d /data/primary -d /data/mirror

4.14 Validate the OS configuration

    source /usr/local/greenplum-db/greenplum_path.sh
    gpcheck -f /home/gpadmin/gpconfig/all_host -m mdw

Validate hardware performance (network and disk I/O). The results below still need to be reviewed.

Network test:

    gpcheckperf -f /home/gpadmin/gpconfig/all_host -r N -d /tmp > checknetwork.out
    [root@mdw greenplum-db]# cat checknetwork.out
    /usr/local/greenplum-db/./bin/gpcheckperf -f /home/gpadmin/gpconfig/all_host -r N -d /tmp

    -------------------
    -- NETPERF TEST
    -------------------

    ====================
    == RESULT
    ====================
    Netperf bisection bandwidth test
    mdw -> sdw1 = 112.340000
    sdw2 -> sdw3 = 112.340000
    sdw1 -> mdw = 112.330000
    sdw3 -> sdw2 = 112.330000

    Summary:
    sum = 449.34 MB/sec
    min = 112.33 MB/sec
    max = 112.34 MB/sec
    avg = 112.33 MB/sec
    median = 112.34 MB/sec

About 112 MB/s per link is what a 1 Gigabit link delivers, which is below the 10 Gigabit Ethernet listed in section 1.2.

Disk I/O and memory bandwidth test:

    gpcheckperf -f /home/gpadmin/gpconfig/all_host -r ds -D -d /greenplum/gpdata/primary1 -d /greenplum/gpdata/mirror1 > checkDISKIO.out
    [root@mdw greenplum-db]# gpcheckperf -f /home/gpadmin/gpconfig/all_host -r ds -D -d /greenplum/gpdata/primary1 -d /greenplum/gpdata/mirror1
    /usr/local/greenplum-db/./bin/gpcheckperf -f /home/gpadmin/gpconfig/all_host -r ds -D -d /greenplum/gpdata/primary1 -d /greenplum/gpdata/mirror1

    --------------------
    -- DISK WRITE TEST
    --------------------

    --------------------
    -- DISK READ TEST
    --------------------

    --------------------
    -- STREAM TEST
    --------------------

    ====================
    == RESULT
    ====================

    disk write avg time (sec): 20.88
    disk write tot bytes: 132920115200
    disk write tot bandwidth (MB/s): 6074.65
    disk write min bandwidth (MB/s): 1476.04 [ mdw]
    disk write max bandwidth (MB/s): 1551.18 [sdw3]
    -- per host bandwidth --
    disk write bandwidth (MB/s): 1476.04 [ mdw]
    disk write bandwidth (MB/s): 1537.63 [sdw1]
    disk write bandwidth (MB/s): 1509.80 [sdw2]
    disk write bandwidth (MB/s): 1551.18 [sdw3]

    disk read avg time (sec): 59.80
    disk read tot bytes: 132920115200
    disk read tot bandwidth (MB/s): 2175.57
    disk read min bandwidth (MB/s): 454.54 [sdw2]
    disk read max bandwidth (MB/s): 700.04 [sdw1]
    -- per host bandwidth --
    disk read bandwidth (MB/s): 520.03 [ mdw]
    disk read bandwidth (MB/s): 700.04 [sdw1]
    disk read bandwidth (MB/s): 454.54 [sdw2]
    disk read bandwidth (MB/s): 500.96 [sdw3]

    stream tot bandwidth (MB/s): 49348.52
    stream min bandwidth (MB/s): 12297.76 [ mdw]
    stream max bandwidth (MB/s): 12388.57 [sdw2]
    -- per host bandwidth --
    stream bandwidth (MB/s): 12297.76 [ mdw]
    stream bandwidth (MB/s): 12321.47 [sdw1]
    stream bandwidth (MB/s): 12388.57 [sdw2]
    stream bandwidth (MB/s): 12340.73 [sdw3]
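When the report is saved to a file such as checkDISKIO.out, the per-host lines can be filtered to find the slowest host without reading the whole output. The sketch below assumes the per-host line format shown in this section (`disk read bandwidth (MB/s): <rate> [<host>]`); `slowest_disk_read` is a hypothetical helper, not part of gpcheckperf.

```shell
# Hypothetical helper: read gpcheckperf output on stdin and print
# "<host> <MB/s>" for the host with the lowest per-host disk read bandwidth.
slowest_disk_read() {
    awk '/^ *disk read bandwidth \(MB\/s\):/ {
             rate = $5
             host = $NF
             gsub(/\[|\]/, "", host)      # "[sdw2]" or "mdw]" becomes the bare name
             if (best == "" || rate + 0 < best + 0) { best = rate; who = host }
         }
         END { if (who != "") print who, best }'
}

# Example using the four per-host lines from the run above
printf '%s\n' \
    'disk read bandwidth (MB/s): 520.03 [ mdw]' \
    'disk read bandwidth (MB/s): 700.04 [sdw1]' \
    'disk read bandwidth (MB/s): 454.54 [sdw2]' \
    'disk read bandwidth (MB/s): 500.96 [sdw3]' | slowest_disk_read
```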

Chapter 5 Initializing the database

  • Copy the configuration file

    cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfig/gpinitsystem_config

  • Create the host list used for initialization

    cat >> /home/gpadmin/gpconfig/hostfile_gpinitsystem << EOF
    sdw1
    sdw2
    sdw3
    EOF

    chown -R gpadmin:gpadmin /home/gpadmin/gpconfig/gpinitsystem_config
    chown -R gpadmin:gpadmin /home/gpadmin/gpconfig/hostfile_gpinitsystem

  • Adjust the parameters

    The parameters that matter for this cluster:

    ARRAY_NAME="EMC Greenplum DW"
    PORT_BASE=40000
    SEG_PREFIX=gpseg
    declare -a DATA_DIRECTORY=(/greenplum/gpdata/primary1 /greenplum/gpdata/primary2)
    MASTER_HOSTNAME=mdw
    MASTER_DIRECTORY=/greenplum/gpdata/master
    MASTER_PORT=5432
    TRUSTED_SHELL=ssh
    CHECK_POINT_SEGMENTS=8
    ENCODING=UNICODE
    MIRROR_PORT_BASE=50000
    REPLICATION_PORT_BASE=41000
    MIRROR_REPLICATION_PORT_BASE=51000
    declare -a MIRROR_DATA_DIRECTORY=(/greenplum/gpdata/mirror1 /greenplum/gpdata/mirror2)

    Edit the file:

    vim /home/gpadmin/gpconfig/gpinitsystem_config

    The file after editing:

    [gpadmin@mdw ~]$ cat /home/gpadmin/gpconfig/gpinitsystem_config
    # FILE NAME: gpinitsystem_config

    # Configuration file needed by the gpinitsystem

    ################################################
    #### REQUIRED PARAMETERS
    ################################################

    #### Name of this Greenplum system enclosed in quotes.
    ARRAY_NAME="Greenplum Data Platform"

    #### Naming convention for utility-generated data directories.
    SEG_PREFIX=gpseg

    #### Base number by which primary segment port numbers
    #### are calculated.
    PORT_BASE=40000

    #### File system location(s) where primary segment data directories
    #### will be created. The number of locations in the list dictate
    #### the number of primary segments that will get created per
    #### physical host (if multiple addresses for a host are listed in
    #### the hostfile, the number of segments will be spread evenly across
    #### the specified interface addresses).

    declare -a DATA_DIRECTORY=(/greenplum/gpdata/primary1 /greenplum/gpdata/primary2)

    #### OS-configured hostname or IP address of the master host.
    MASTER_HOSTNAME=mdw

    #### File system location where the master data directory
    #### will be created.
    MASTER_DIRECTORY=/greenplum/gpdata/master

    #### Port number for the master instance.
    MASTER_PORT=5432

    #### Shell utility used to connect to remote hosts.
    TRUSTED_SHELL=ssh

    #### Maximum log file segments between automatic WAL checkpoints.
    CHECK_POINT_SEGMENTS=8

    #### Default server-side character set encoding.
    ENCODING=UNICODE

    ################################################
    #### OPTIONAL MIRROR PARAMETERS
    ################################################

    #### Base number by which mirror segment port numbers
    #### are calculated.
    MIRROR_PORT_BASE=50000

    #### Base number by which primary file replication port
    #### numbers are calculated.
    REPLICATION_PORT_BASE=41000

    #### Base number by which mirror file replication port
    #### numbers are calculated.
    MIRROR_REPLICATION_PORT_BASE=51000

    #### File system location(s) where mirror segment data directories
    #### will be created. The number of mirror locations must equal the
    #### number of primary locations as specified in the
    #### DATA_DIRECTORY parameter.

    declare -a MIRROR_DATA_DIRECTORY=(/greenplum/gpdata/mirror1 /greenplum/gpdata/mirror2)

    ################################################
    #### OTHER OPTIONAL PARAMETERS
    ################################################

    #### Create a database of this name after initialization.
    #DATABASE_NAME=name_of_database

    #### Specify the location of the host address file here instead of
    #### with the -h option of gpinitsystem.
    #MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
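The four *_BASE parameters are described in the file as base numbers from which ports are calculated. Assuming each additional segment instance on a host takes the next port after the base (an assumption that should be checked against the gpinitsystem documentation), the ports used on one host can be listed with a small sketch like the one below. `segment_ports` is a hypothetical helper name.

```shell
# Hypothetical helper: list the ports used on one host, assuming one port per
# segment instance counted up from the base.
# $1 = base port (for example PORT_BASE), $2 = segment instances per host
segment_ports() {
    base="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo $((base + i))
        i=$((i + 1))
    done
}

# Two data directories per host in this guide, so two instances per base
segment_ports 40000 2    # primaries
segment_ports 50000 2    # mirrors
```

Listing the ranges this way makes it easier to confirm that they do not collide with each other or with net.ipv4.ip_local_port_range usage on the hosts.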

  • Initialize the database

    As the gpadmin user:

    gpinitsystem -c /home/gpadmin/gpconfig/gpinitsystem_config -h /home/gpadmin/gpconfig/hostfile_gpinitsystem

    To add a standby master and switch the mirror placement policy to spread mirroring:

    gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem -s standby_master_hostname -S

    (with a standby master and a spread mirror configuration)

  • Check the environment variables

    Variables to check (example values):

    MASTER_DATA_DIRECTORY=/data/master/gpseg-1
    GPHOME=/usr/local/greenplum-db
    PGDATABASE=gpadmin

    The profile used in this deployment:

    [gpadmin@mdw ~]$ cat .bash_profile
    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
        . ~/.bashrc
    fi

    # User specific environment and startup programs

    PATH=$PATH:$HOME/.local/bin:$HOME/bin

    export PATH
    export MASTER_DATA_DIRECTORY=/greenplum/gpdata/master/gpseg-1
    source /usr/local/greenplum-db/greenplum_path.sh
    export PGPORT=5432
    export PGDATABASE=archdata

Chapter 6 Connection test

6.1 Set the remote access password for gpadmin

    psql postgres gpadmin
    alter user gpadmin encrypted password 'gpadmin';
    \q

6.2 Query test

    psql -h mdw -p 5432 -d postgres -U gpadmin -c 'select dfhostname, dfspace, dfdevice from gp_toolkit.gp_disk_free order by dfhostname;'

    [gpadmin@mdw ~]$ psql -h mdw -p 5432 -d postgres -U gpadmin -c 'select dfhostname, dfspace, dfdevice from gp_toolkit.gp_disk_free order by dfhostname;'
     dfhostname | dfspace  |          dfdevice
    ------------+----------+----------------------------
     sdw1       | 98708120 | /dev/mapper/VolGroup-root
     sdw1       | 98708120 | /dev/mapper/VolGroup-root
     sdw2       | 98705600 | /dev/mapper/VolGroup-root
     sdw2       | 98705600 | /dev/mapper/VolGroup-root
     sdw3       | 98705144 | /dev/mapper/VolGroup-root
     sdw3       | 98705144 | /dev/mapper/VolGroup-root
    (6 rows)

    psql -h mdw -p 5432 -d postgres -U gpadmin -c '\l+'

    [gpadmin@mdw ~]$ psql -h mdw -p 5432 -d postgres -U gpadmin -c '\l+'
                                              List of databases
       Name    |  Owner  | Encoding |  Access privileges  | Size  | Tablespace |        Description
    -----------+---------+----------+---------------------+-------+------------+---------------------------
     postgres  | gpadmin | UTF8     |                     | 73 MB | pg_default |
     template0 | gpadmin | UTF8     | =c/gpadmin          | 72 MB | pg_default |
                                    : gpadmin=CTc/gpadmin
     template1 | gpadmin | UTF8     | =c/gpadmin          | 73 MB | pg_default | default template database
                                    : gpadmin=CTc/gpadmin
    (3 rows)

    [gpadmin@mdw ~]$

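Chapter 7 Common commands

  • Starting the database: gpstart
    Common options:
    -a: start immediately without prompting the user for confirmation
    -m: start only the master instance; mainly used during troubleshooting

  • Stopping the database: gpstop
    Common options:
    -a: stop immediately without prompting the user for confirmation
    -m: stop only the master instance; the counterpart of gpstart -m
    -M fast: stop the database, terminate all connections, and roll back running transactions
    -u: do not stop the database; only reload pg_hba.conf and the runtime parameters in postgresql.conf. Use this after changing configuration.
    Comment: -a is for use in shell scripts; -M fast is the option used most often.

  • Viewing instance configuration and status
    select * from gp_configuration order by 1;
    (gp_configuration comes from older releases; in Greenplum 5 use gp_segment_configuration, listed further down.)
    Key columns:
    content: two instances with the same value form a primary (P) and mirror (M) pair
    isprimary: whether the instance is running as the primary
    valid: whether the instance is valid; false means the instance is down
    port: the port the instance listens on
    datadir: the data directory of the instance

  • gpstate: shows the running status and detailed configuration of the Greenplum database
    Common options:
    -c: the mapping between primary and mirror instances
    -m: list only the status and configuration of mirror instances
    -f: details of the standby master
    -Q: a quick status summary
    By default the command prints a summary of the database status, which is useful for routine checks.
    Comment: early on, a NIC driver problem kept bringing segments down once mirrors were configured, and the -Q summary was quite useful then.

  • Viewing user sessions and submitted queries
    select * from pg_stat_activity;
    This view shows the client IP address, user name, and submitted query for each current connection. The processes can also be inspected on the master host, because the master creates one process per client connection:
    ps -ef | grep -i postgres | grep -i con
    Comment: a frequently used command; I often use it to see which SQL statement the database is stuck on.

  • Viewing the space used by databases and tables
    select pg_size_pretty(pg_relation_size('schema.tablename'));
    select pg_size_pretty(pg_database_size('databasename'));
    Keep at least 30% free space on the storage behind the database, and check the remaining capacity during routine checks.
    Comment: this works for any database object. pg_size_pretty prints human-readable units such as MB, and the functions can be joined with system tables such as pg_tables and pg_indexes to report space usage per object type.

  • Collecting statistics and reclaiming space
    Run vacuum analyze tablename regularly to reclaim dead rows and collect statistics; this is especially important after large deletes or loads.
    Comment: the description above is incomplete. Two things are involved: analyze refreshes the statistics used by the query planner, and vacuum cleans up dead data. A delete in Postgres does not physically remove the rows; it only marks them as deleted, and they are physically removed when vacuum runs. This matters a great deal: queries and updates on frequently updated tables get slower and slower, usually because vacuum has not been run.

  • Checking data distribution
    Two ways:
    - select gp_segment_id, count(*) from tablename group by 1;
    - on the command line: gpskew -t public.ate -a postgres
    If data is unevenly distributed, the system cannot exploit parallel execution and performance suffers badly.
    Comment: used a lot; Greenplum depends on evenly distributed data.

  • Recovering instances: gprecoverseg
    When gpstate or gp_configuration shows that an instance is down, use this command to recover it.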

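  • Viewing lock information
    SELECT locktype, database, c.relname, l.relation, l.transactionid, l.transaction, l.pid, l.mode, l.granted, a.current_query
    FROM pg_locks l, pg_class c, pg_stat_activity a
    WHERE l.relation=c.oid AND l.pid=a.procpid
    ORDER BY c.relname;
    Key columns:
    relname: the table name
    locktype, mode: identify the kind of lock

  • explain: before submitting a large query, use explain to analyze the execution plan and spot optimization opportunities, so that the query does not exhaust system resources.
    Comment: "analyze" is missing here. The timings reported by plain explain are only estimates and can be badly misleading; to get accurate execution times, use explain analyze.

  • Database backup: gp_dump
    Common options:
    -s: dump only object definitions (table structures, functions, and so on)
    -n: dump only one schema
    By default gp_dump writes these files to the data directory of the master:
    gp_catalog_1__: backup of the database system configuration
    gp_cdatabase_1: backup of the database creation statements
    gp_dump_1: DDL statements for database objects
    gp_dump_status_1: log of the backup operation
    Files written to the data directory of each segment instance:
    gp_dump_0: backup of user data
    gp_dump_status_0: backup log

  • Database restore: gp_restore
    Required options:
    --gp-k=key: key is the timestamp suffix of the files written by gp_dump
    -d dbname: restore the backup into dbname

  • Logging in to and out of Greenplum
    # normal login
    psql gpdb
    psql -d gpdb -h gphostm -p 5432 -U gpadmin
    # utility mode
    PGOPTIONS="-c gp_session_role=utility" psql -h hostname -d dbname -p port
    # log out: run \q at the psql prompt

  • Querying parameters
    psql -c 'SHOW ALL;' -d gpdb
    gpconfig --show max_connections
    Comment: handy; the output can be piped to grep.

  • Creating a database
    createdb -h localhost -p 5432 dhdw

  • Creating a Greenplum filespace
    # filespace name: gpfsdw
    # on the segment hosts, create one directory per segment
    mkdir -p /gpfsdw/seg1
    mkdir -p /gpfsdw/seg2
    chown -R gpadmin:gpadmin /gpfsdw
    # on the master
    mkdir -p /gpfsdw/master
    chown -R gpadmin:gpadmin /gpfsdw
    gpfilespace -o gpfilespace_config
    gpfilespace -c gpfilespace_config

  • Creating a Greenplum tablespace
    psql gpdb
    create tablespace TBS_DW_DATA filespace gpfsdw;
    SET default_tablespace = TBS_DW_DATA;

  • Deleting the Greenplum system
    gpdeletesystem -d /gpmaster/gpseg-1 -f

  • Viewing the segment configuration
    select * from gp_segment_configuration;

  • Filespaces
    select * from pg_filespace_entry;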

  • Disk and database space
    SELECT * FROM gp_toolkit.gp_disk_free ORDER BY dfsegment;
    SELECT * FROM gp_toolkit.gp_size_of_database ORDER BY sodddatname;
    Logs:
    SELECT * FROM gp_toolkit.gp_log_master_ext;
    SELECT * FROM gp_toolkit.gp_log_segment_ext;

  • Table data distribution
    SELECT gp_segment_id, count(*) FROM tablename GROUP BY gp_segment_id;

  • Space used by tables
    SELECT relname as name, sotdsize/1024/1024 as size_MB, sotdtoastsize as toast, sotdadditionalsize as other
    FROM gp_toolkit.gp_size_of_table_disk as sotd, pg_class
    WHERE sotd.sotdoid = pg_class.oid
    ORDER BY relname;

  • Space used by indexes
    SELECT soisize/1024/1024 as size_MB, relname as indexname
    FROM pg_class, gp_toolkit.gp_size_of_index
    WHERE pg_class.oid = gp_size_of_index.soioid AND pg_class.relkind='i';

  • Operation statistics for an object
    SELECT schemaname as schema, objname as table, usename as role, actionname as action, subtype as type, statime as time
    FROM pg_stat_operations
    WHERE objname = '';

  • Resource queues
    SELECT * FROM pg_resqueue_status;

  • gpfdist external tables
    # start the service
    gpfdist -d /share/txt -p 8081 -l /share/txt/gpfdist.log &
    # create the external table; the delimiter is a tab ('\t')
    drop EXTERNAL TABLE TD_APP_LOG_BUYER;
    CREATE EXTERNAL TABLE TD_APP_LOG_BUYER ( IP text, ACCESSTIME text, REQMETHOD text, URL text, STATUSCODE int, REF text, name text, VID text)
    LOCATION ('gpfdist://gphostm:8081/xxx.txt')
    FORMAT 'TEXT' (DELIMITER E'\t' FILL MISSING FIELDS)
    SEGMENT REJECT LIMIT 1 percent;

  • Creating an ordinary table
    create table test as select * from TD_APP_LOG_BUYER;
    # index
    CREATE INDEX idx_test ON test USING bitmap (ip);
    # query the data
    select ip, count(*) from test group by ip order by count(*);
    gpload:
    # create a control file, then load the data
    gpload -f my_load.yml
    copy:
    COPY country FROM '/data/gpdb/country_data' WITH DELIMITER '|' LOG ERRORS INTO err_country SEGMENT REJECT LIMIT 10 ROWS;

  • gpfdist external tables: creating a writable external table
    CREATE WRITABLE EXTERNAL TABLE unload_expenses ( LIKE expenses )
    LOCATION ('gpfdist://etlhost-1:8081/expenses1.out', 'gpfdist://etlhost-2:8081/expenses2.out')
    FORMAT 'TEXT' (DELIMITER ',')
    DISTRIBUTED BY (exp_id);
    # grant write permission
    GRANT INSERT ON writable_ext_table TO <role>;
    # write data
    INSERT INTO writable_ext_table SELECT * FROM regular_table;
    copy:
    COPY (SELECT * FROM country WHERE country_name LIKE 'A%') TO '/home/gpadmin/a_list_countries.out';
    Running a SQL file:
    psql gpdbname -f yoursqlfile.sql
    or, after logging in with psql:
    \i yoursqlfile.sql
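The data distribution query in this chapter returns one row count per gp_segment_id, and the numbers still have to be judged by eye. One simple summary, offered here only as a sketch, is the ratio of the largest segment's row count to the average: a value near 1.00 suggests even distribution, and larger values mean one segment holds more than its share. `segment_skew` is a hypothetical helper and the ratio is an illustrative measure, not the metric gpskew reports.

```shell
# Hypothetical helper: read "<gp_segment_id> <row_count>" lines on stdin and
# print max(row_count) / avg(row_count) with two decimals.
segment_skew() {
    awk '{ sum += $2; if ($2 > max) max = $2 }
         END { if (NR > 0 && sum > 0) printf "%.2f\n", max / (sum / NR) }'
}

# Example with three made-up segments, one holding most of the rows
printf '0 100\n1 100\n2 400\n' | segment_skew
```

The input could come from psql in unaligned, tuples-only mode with a space as the field separator, for example `psql -At -F ' ' -c 'select gp_segment_id, count(*) from tablename group by 1'`, with tablename replaced by a real table.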
