GaussDB T 1.0.2: Deploying a One-Primary, One-Standby Cluster in Practice

March 19, 2020


This article is republished with permission from the dbaplus community.


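GaussDB T 1.0.2 has been released. It is the GA version, with better stability and functionality than 1.0.1, and Huawei also plans to release a RAC version later. This article describes how to install a primary/standby cluster with the current GA version.


The primary/standby (one primary, one standby) deployment has no CN (coordinator node). The node that hosts the primary DN is the primary node and serves external traffic.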



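Functions of the cluster components:


  • CM: cluster management module (ClusterManager). It manages and monitors the running status of every functional unit and physical resource in the distributed system to keep the whole system running stably.

  • DN: data node (Datanode). It stores business data, executes data queries, and returns the results.

  • ETCD: a highly available distributed key-value store. It holds the state of every node and instance in the cluster so that CM can manage the instances.

  • Storage: the server's local storage resources, which persist the data.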


I. Environment Preparation


  • Operating system: CentOS Linux release 7.5.1804 (Core)

  • Database software: GaussDB_T_1.0.2-CLUSTER-CentOS-64bit.tar.gz



II. Installation Steps


1. Allow remote root login (on both nodes)


sed -i 's/#PermitRootLogin yes/PermitRootLogin yes/g' /etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication yes/PasswordAuthentication yes/g' /etc/ssh/sshd_config


After the change, restart the sshd service:


/bin/systemctl restart sshd.service
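The sed commands above only match lines that read exactly `#PermitRootLogin yes` or `#PasswordAuthentication yes`, so they do nothing if the file carries a different value. A minimal sketch of a more tolerant variant follows, assuming GNU sed; the helper name `set_sshd_option` is hypothetical and not part of the original procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: force "KEY VALUE" in an sshd_config-style file.
# Every existing line for KEY (commented out or not) is rewritten in place;
# if no such line exists, the setting is appended to the file.
set_sshd_option() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root, followed by the sshd restart):
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   set_sshd_option /etc/ssh/sshd_config PasswordAuthentication yes
```

If the file contains both a commented and an active line for the same key, both end up identical, which sshd accepts.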


2. Disable the firewall and SELinux (on both nodes)


systemctl stop firewalld.service
systemctl disable firewalld.service
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config


The change to /etc/selinux/config takes effect only after a reboot.


3. Configure core files (on both nodes)


echo 'ulimit -c unlimited' >> /etc/profile
echo 'kernel.core_pattern = /opt/gdb/corefile/core-%e-%p-%t' >> /etc/sysctl.conf


Apply the settings:


source /etc/profile
sysctl -p
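The kernel does not create the directory named in kernel.core_pattern, and the steps above do not create /opt/gdb/corefile either. Whether a later installation step creates it is not stated here, so the following sketch simply makes sure it exists. The helper name `ensure_core_dir` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the directory part of a core_pattern value
# if it does not exist yet, then print that directory.
# Patterns that are not absolute paths (for example a "|program" pipe
# handler) are rejected, because they have no directory to create.
ensure_core_dir() {
    local pattern=$1 dir
    case "$pattern" in
        /*) ;;
        *)  echo "not an absolute path: $pattern" >&2
            return 1 ;;
    esac
    dir=$(dirname "$pattern")
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Usage (as root, after sysctl -p):
#   ensure_core_dir "$(sysctl -n kernel.core_pattern)"
```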


4. Check for ntp and lsof (on both nodes)


rpm -qa | grep ntp


If ntp is not installed, install it with yum:


yum install ntp


Then check for lsof:


which lsof


If lsof is not installed, install it with yum:


yum install lsof
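The two checks above can be folded into one loop that reports whatever is missing. This is only a sketch: the helper name `missing_cmds` is hypothetical, and it assumes the ntp package can be detected through its `ntpd` binary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command from the argument list that is
# not found on PATH, one name per line. Prints nothing if all are present.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage on each node:
#   missing=$(missing_cmds ntpd lsof)
#   [ -z "$missing" ] || echo "not installed: $missing"
```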


5. Create the database installation group and user (on both nodes)


groupadd dbgrp
useradd -g dbgrp -d /home/omm -m -s /bin/bash omm


Set the password:


passwd omm


6. Create the software directory /opt/software/gaussdb and upload the installation package


mkdir -p /opt/software/gaussdb
[root@gauss1 gaussdb]# tar -zxvf GaussDB_T_1.0.2-CENTOS7.5-X86.tar.gz
[root@gauss1 gaussdb]# tar -zxvf GaussDB_T_1.0.2-CLUSTER-CentOS-64bit.tar.gz
[root@gauss1 gaussdb]# chmod 755 /opt/software/
[root@gauss1 gaussdb]# chmod -R 755 /opt/software/gaussdb/


7. Take the preset configuration file from /opt/software/gaussdb/template and adjust it for the actual deployment


[root@gauss1 ~]# vi /opt/software/gaussdb/clusterconfig.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
  <!--  -->
  <CLUSTER>
    <PARAM name="clusterName" value="GaussDB_100"/>
    <PARAM name="nodeNames" value="gauss1,gauss2"/>
    <PARAM name="gaussdbAppPath" value="/opt/gaussdb/app"/>
    <PARAM name="gaussdbLogPath" value="/opt/gaussdb/log/"/>
    <PARAM name="tmpMppdbPath" value="/opt/gaussdb/tmp/gaussdb_mppdb"/>
    <PARAM name="gaussdbToolPath" value="/opt/gaussdb/huawei/wisequery"/>
    <PARAM name="datanodeType" value="DN_ZENITH_HA"/>
    <PARAM name="coordinatorType" value="CN_ZENITH_ZSHARDING"/>
    <PARAM name="replicationCount" value="2"/>
    <PARAM name="clusterType" value="mutil-AZ"/>
    <!-- HA2 -->
    <PARAM name="Ha2Node" value="true"/>
    <PARAM name="GatewayIP" value="192.168.238.2"/>
    <PARAM name="CMAgentPingTryTime" value="3"/>
    <PARAM name="CMAgentPingInterval" value="5"/>
    <!--floatip-->
    <PARAM name="ServiceType" value="SingleService"/>
  </CLUSTER>
  <!-- -->
  <DEVICELIST>
    <!-- plat1 -->
    <DEVICE sn="1000001">
      <PARAM name="name" value="gauss1"/>
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <!-- IP -->
      <PARAM name="backIp1" value="192.168.238.130"/>
      <PARAM name="sshIp1" value="192.168.238.130"/>
      <PARAM name="agentlsnPort" value="7020"/>
      <!-- GTS -->
      <!-- ETCD -->
      <PARAM name="etcdNum" value="2"/>
      <PARAM name="etcdListenPort" value="22100"/>
      <PARAM name="etcdHaPort" value="22200"/>
      <PARAM name="etcdListenIp1" value="192.168.238.130"/>
      <PARAM name="etcdHaIp1" value="192.168.238.130"/>
      <PARAM name="etcdDir1" value="/opt/gaussdb/data/data_etcd"/>
      <PARAM name="etcdDir2" value="/opt/gaussdb/data/data_etcd1"/>
      <!--cn-->
      <!-- dn -->
      <PARAM name="dataNum" value="1"/>
      <PARAM name="dataPortBase" value="15402"/>
      <PARAM name="dataNode1" value="/opt/gaussdb/data/data_dn1,gauss2,/opt/gaussdb/data/data_dn1"/>
    </DEVICE>
    <!-- plat2 -->
    <DEVICE sn="1000002">
      <PARAM name="name" value="gauss2"/>
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <!--IP -->
      <PARAM name="backIp1" value="192.168.238.131"/>
      <PARAM name="sshIp1" value="192.168.238.131"/>
      <PARAM name="agentlsnPort" value="7020"/>
      <!-- ETCD -->
      <PARAM name="etcdNum" value="1"/>
      <PARAM name="etcdListenPort" value="22100"/>
      <PARAM name="etcdHaPort" value="22200"/>
      <PARAM name="etcdListenIp1" value="192.168.238.131"/>
      <PARAM name="etcdHaIp1" value="192.168.238.131"/>
      <PARAM name="etcdDir1" value="/opt/gaussdb/data/data_etcd"/>
      <!--cn-->
      <!-- dn -->
      <!-- cm -->
      <PARAM name="cmsNum" value="1"/>
      <PARAM name="cmServerPortBase" value="21900"/>
      <PARAM name="cmServerListenIp1" value="192.168.238.131,192.168.238.130"/>
      <PARAM name="cmServerHaIp1" value="192.168.238.131,192.168.238.130"/>
      <PARAM name="cmServerlevel" value="1"/>
      <PARAM name="cmServerRelation" value="gauss2,gauss1"/>
    </DEVICE>
  </DEVICELIST>
</ROOT>


8. Install Python 3.7 (skip this step if it is already installed)


Install the dependencies:


yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
yum install libffi-devel -y
yum install -y gcc


Download and unpack the Python 3 source package:


cd /root
wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tar.xz
tar -xvJf Python-3.7.0.tar.xz


Build and install:


mkdir /usr/local/python3   # create the installation directory
cd Python-3.7.0
./configure --prefix=/usr/local/python3
make && make install


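Create the symbolic links (pip3 is linked as well so that the check below can find it):


ln -s /usr/local/python3/bin/python3 /usr/local/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/local/bin/pip3


Verify the installation:


python3 -V
pip3 -V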


9. Run gs_preinstall to prepare the environment


[root@gauss1 gaussdb]# cd /opt/software/gaussdb/script/
[root@gauss1 script]# ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml




10. Check cluster time consistency


[root@gauss1 script]# ./gs_checkos -i B -h gauss1,gauss2 -X /opt/software/gaussdb/clusterconfig.xml



11. Run the installation script


[root@gauss1 script]# su - omm
[omm@gauss1 ~]$ gs_install -X /opt/software/gaussdb/clusterconfig.xml



12. Check the cluster status



Installation complete!


III. Handling Errors During Installation


1. ./gs_preinstall reports GAUSS-50204


[root@gauss1 script]# ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml
[GAUSS-50204] : Failed to read cmServerRelation. The first item of cmServerRelation must be nodename.


Fix: in clusterconfig.xml, set cmServerRelation to gauss2,gauss1, using the node names listed in nodeNames.
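A quick sanity check before running gs_preinstall can catch this kind of mismatch early. The sketch below is not the validation that gs_preinstall itself performs; it only assumes that every entry of cmServerRelation should appear in nodeNames. The helper names `xml_param` and `relation_matches_nodes` are hypothetical, and the extraction is a plain text match rather than a real XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the first
# <PARAM name="NAME" value="..."/> found in FILE. Relies on the
# name/value attribute order used in clusterconfig.xml.
xml_param() {
    local file=$1 name=$2
    grep -o "<PARAM name=\"${name}\" value=\"[^\"]*\"" "$file" \
        | head -n 1 \
        | sed 's/.*value="\([^"]*\)"$/\1/'
}

# Return 0 if every entry of cmServerRelation is listed in nodeNames,
# otherwise report the first unknown entry on stderr and return 1.
relation_matches_nodes() {
    local file=$1 nodes relation entry
    nodes=",$(xml_param "$file" nodeNames),"
    relation=$(xml_param "$file" cmServerRelation)
    [ -n "$relation" ] || return 1
    for entry in $(printf '%s' "$relation" | tr ',' ' '); do
        case "$nodes" in
            *",${entry},"*) ;;
            *)  echo "cmServerRelation entry not in nodeNames: ${entry}" >&2
                return 1 ;;
        esac
    done
    return 0
}

# Usage:
#   relation_matches_nodes /opt/software/gaussdb/clusterconfig.xml && echo OK
```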


2. ./gs_preinstall reports GAUSS-53011


[root@gauss1 script]# ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)? yes
Please enter password for root.
Password:
Creating SSH trust for the root permission user.
Successfully created SSH trust for the root permission user.
All host RAM is consistent
Distributing package.
Successfully distributed package.
Are you sure you want to create the user[omm] and create trust for it (yes/no)? yes
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Checking system resource.
[FAILURE] gauss1:[GAUSS-53011] : Failed to check gatewayIP [192.168.238.10]. Please check gatewayIP config is vaild in your environment.
[FAILURE] gauss2:[GAUSS-53011] : Failed to check gatewayIP [192.168.238.10]. Please check gatewayIP config is vaild in your environment.


Fix: in clusterconfig.xml, change GatewayIP to the virtual machines' gateway address, 192.168.238.2.
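If the gateway address is not known offhand, it can be read from the node's routing table. A minimal sketch, assuming the iproute2 `ip` command is available; the helper name `default_gateway` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ip route` output on stdin and print the
# gateway address of the first default route it finds.
default_gateway() {
    awk '$1 == "default" {
             for (i = 2; i < NF; i++)
                 if ($i == "via") { print $(i + 1); exit }
         }'
}

# Usage on a node:
#   ip route show default | default_gateway
```

The printed address is the value to compare against GatewayIP in clusterconfig.xml.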


3. The cluster time consistency check reports a warning


[root@gauss1 script]# ./gs_checkos -i A12 -h gauss1,gauss2 -X /opt/software/gaussdb/clusterconfig.xml --detail
Root permission user has not SSH trust, create it when do checkos in remote node.
Creating SSH trust for the root permission user.
Please enter password for root.
Password:
Successfully creating SSH trust for the root permission user.
Checking items
    A12.[ Time consistency status ]                             : Warning
        [gauss1]
        The current system time = (2020-03-06 16:37:43")
        [gauss2]
        The current system time = (2020-03-06 16:37:45")
Total numbers:1. Abnormal numbers:0. Warning numbers:1.
Clean SSH trust for the root permission user.
Successfully clean SSH trust for the root permission user.


Fix: resynchronize the system time.


[root@gauss1 ~]# cd /opt/software/gaussdb/script
[root@gauss1 script]# ./gs_checkos -i C1 -h gauss1,gauss2 -X /opt/software/gaussdb/clusterconfig.xml
Root permission user has not SSH trust, create it when do checkos in remote node.
Creating SSH trust for the root permission user.
Please enter password for root.
Password:
Successfully creating SSH trust for the root permission user.
C1. [ Set NTP Service ] : Normal
NOTICE: MTU value and some warning items can NOT be set. Please do it manually.
Total numbers:1. Abnormal numbers:0. Warning numbers:0.
Clean SSH trust for the root permission user.
Successfully clean SSH trust for the root permission user.
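In the warning above the two clocks differ by two seconds. The threshold gs_checkos applies is not stated in its output, so the sketch below only measures the difference between two readings. It assumes GNU date, and the helper name `skew_seconds` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: absolute difference, in seconds, between two
# timestamps. Requires GNU date (-d). Both inputs are interpreted as UTC,
# so the local time zone does not influence the result.
skew_seconds() {
    local a b
    a=$(date -u -d "$1" +%s) || return 1
    b=$(date -u -d "$2" +%s) || return 1
    if [ "$a" -ge "$b" ]; then
        echo $((a - b))
    else
        echo $((b - a))
    fi
}

# Usage with the two readings from the A12 warning (prints 2):
#   skew_seconds "2020-03-06 16:37:43" "2020-03-06 16:37:45"
```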


IV. Uninstalling the Cluster


1. As the omm user, run gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml


[omm@gauss1 ~]$ gs_uninstall --delete-data -X /opt/software/gaussdb/clusterconfig.xml
The data will be deleted and cannot be recovered. Are you sure you want to uninstall the cluster(yes/no)?yes
Check preinstall on every node.
Successfully checked preinstall on every node.
Stop cluster.
Check logfile path.
Clean crontab.
Clean crontab successfully.
Kill process for components.
Kill process for components successfully.
Uninstall components
Uninstall components successfully.
Modifying user's environmental variable.
Successfully modified user's environmental variable.
Clean tmp files and logs.
Successfully clean cluster's tmp and logs.
Successful uninstallation


2. Clean up the host environment


[root@gauss1 script]# ./gs_postuninstall -U omm -X /opt/software/gaussdb/clusterconfig.xml
The environment will be cleaned up and cannot be recovered.
Are you sure you want to clean up the environment(yes/no)?yes
Parsing the configuration file.
Successfully parsed the configuration file.
Checking unpreinstallation.
Success checking unpreinstallation.
Deleting the instance paths.
Deleting the instance paths successfully.
Clean log and dependency.
Clean up the user environment variables
Clean up the user environment variable successfully
Clean up the remote system tool environment variables
Clean up the remote system tool environment variable successfully
Clean up the local system tool environment variables
Clean up the local system tool environment variable successfully
Clean log and dependency successfully.
Postuninstall successfully.
Please close the terminal and login again, ensure that the environment variable takes effect.


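About the author


Liu Bin (刘滨) is a database operations expert at 中移信息 (China Mobile Information Technology). Holder of the Oracle OCP and OCM certifications, with many years of front-line operations experience in the banking, finance, and insurance industries, and currently active in researching domestic Chinese databases and promoting their adoption in China Mobile scenarios.


Original link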


https://mp.weixin.qq.com/s?__biz=MzI4NTA1MDEwNg==&mid=2650786311&idx=2&sn=4402b6dea5a887805a1e1022f0cb7d99&chksm=f3f97f92c48ef6842996101c0ae60f04074535dc35690878275f3f3514a5734c7c7d3aa657f7&scene=27#wechat_redirect

