Simulating Node Failures in an openGauss One-Primary, Two-Standby Cluster

飞鸟不急 2022-06-04

Shutting down node 1 and node 2


# Shut down both nodes at the same time
[root@opengauss01 ~]# halt -p
[root@opengauss02 ~]# halt -p


Cluster status before and after the shutdown


Before the shutdown


[opengauss@opengauss03:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Primary
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Standby
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 6001 /opengauss/opengaussdb/data/dn P Primary Normal
2 opengauss02 192.168.75.62 6002 /opengauss/opengaussdb/data/dn S Standby Normal
3 opengauss03 192.168.75.63 6003 /opengauss/opengaussdb/data/dn S Standby Normal
[opengauss@opengauss03:/home/opengauss]$


After the shutdown


[opengauss@opengauss03:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Down
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Down
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Standby

cm_ctl: can't connect to cm_server.
Maybe cm_server is not running, or timeout expired. Please try again.
[opengauss@opengauss03:/home/opengauss]$
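
The Down count in the output above can also be extracted by a script. The sketch below is only an illustration: `count_cm_state` is a hypothetical helper name, and it assumes that section headers look like "[ CMServer State ]" and that the state is the last field of each instance row, as in the output shown in this post.

```shell
#!/usr/bin/env bash
# Count CMServer instances in a given state, reading the text of
# `gs_om -t status --detail` on stdin.
# Assumption: instance rows start with a numeric node id and end with the state.
count_cm_state() {
  awk -v want="$1" '
    /^\[/ { in_cm = ($0 ~ /CMServer State/); next }    # track the current section
    in_cm && $1 ~ /^[0-9]+$/ && $NF == want { n++ }     # count matching instance rows
    END { print n + 0 }
  '
}

# Hypothetical usage on a cluster host:
#   gs_om -t status --detail | count_cm_state Down
```

Reading from stdin keeps the helper independent of how the status text is obtained, so it can also be run against saved output.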


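Conclusion


After two nodes of the one-primary, two-standby cluster go down, the cluster can no longer provide normal service.

The remaining node still accepts connections and read-only commands, but it cannot write to the database: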

[opengauss@opengauss03:/home/opengauss]$gsql -p 15400 -d postgres
gsql ((openGauss 3.0.0 build 02c14696) compiled at 2022-04-01 18:12:34 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.

openGauss=# \d
No relations found.
openGauss=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+-----------+-----------+---------+-------+-------------------------
postgres | opengauss | SQL_ASCII | C | C |
template0 | opengauss | SQL_ASCII | C | C | =c/opengauss +
| | | | | opengauss=CTc/opengauss
template1 | opengauss | SQL_ASCII | C | C | =c/opengauss +
| | | | | opengauss=CTc/opengauss
test | opengauss | SQL_ASCII | C | C |
(4 rows)

openGauss=# \c test;
Non-SSL connection (SSL connection is recommended when requiring high-security)
You are now connected to database "test" as user "opengauss".
test=#
test=#
test=#
test=# \d
List of relations
Schema | Name | Type | Owner | Storage
--------+---------+-------+-----------+----------------------------------
public | testlzf | table | opengauss | {orientation=row,compression=no}
(1 row)

test=# create table testlzf1(id int);
ERROR: cannot execute CREATE TABLE in a read-only transaction
test=#
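
Whether a node is still acting as a standby can also be checked without attempting a write. The sketch below assumes that `pg_is_in_recovery()` is available and that gsql accepts the psql-style `-t -A -c` options; treat both as assumptions to verify on your version. `recovery_role` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Map the result of "SELECT pg_is_in_recovery();" (read from stdin) to a role label.
# "t" means the instance is in recovery, i.e. acting as a standby, which rejects
# writes as in the session above. "f" means it is acting as a primary.
recovery_role() {
  local flag
  flag=$(tr -d '[:space:]')   # strip the newline and any padding
  case "$flag" in
    t) echo standby ;;
    f) echo primary ;;
    *) echo unknown ;;
  esac
}

# Hypothetical usage, with the port and database used in this post:
#   gsql -p 15400 -d postgres -t -A -c "SELECT pg_is_in_recovery();" | recovery_role
```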


Starting the hosts of node 1 and node 2


The services start automatically. The CMServer primary is now on node 3, and the Datanode primary is on node 1.

[opengauss@opengauss03:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Standby
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Standby
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Primary

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 6001 /opengauss/opengaussdb/data/dn P Primary Normal
2 opengauss02 192.168.75.62 6002 /opengauss/opengaussdb/data/dn S Standby Normal
3 opengauss03 192.168.75.63 6003 /opengauss/opengaussdb/data/dn S Standby Normal
[opengauss@opengauss03:/home/opengauss]$
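
Which host holds each primary role can be read out of the status text with a small filter. This is a sketch under the assumption that rows follow the layout shown above (node id, host name, IP, then further fields including the role). `primary_host` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the host name of the Primary instance in one section of
# `gs_om -t status --detail` output read from stdin.
# $1 is the section title, e.g. "CMServer State" or "Datanode State".
primary_host() {
  awk -v section="$1" '
    /^\[/ { in_sec = (index($0, section) > 0); next }   # section header or prompt line
    in_sec && $1 ~ /^[0-9]+$/ {
      # The role is not in a fixed column: Datanode rows carry extra fields.
      for (i = 4; i <= NF; i++) if ($i == "Primary") { print $2; exit }
    }
  '
}

# Hypothetical usage on a cluster host:
#   gs_om -t status --detail | primary_host "Datanode State"
```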


Shutting down node 1 and node 2 a second time


This tests whether, with the CMServer primary on node 3, the Datanode primary can be switched over to node 3 normally.

[root@opengauss01 ~]# halt -p
[root@opengauss02 ~]# halt -p


Cluster status before and after the shutdown


Before the shutdown


[opengauss@opengauss03:/home/opengauss]$gs_om -t status --detail                
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Standby
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Standby
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Primary

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 6001 /opengauss/opengaussdb/data/dn P Primary Normal
2 opengauss02 192.168.75.62 6002 /opengauss/opengaussdb/data/dn S Standby Normal
3 opengauss03 192.168.75.63 6003 /opengauss/opengaussdb/data/dn S Standby Normal
[opengauss@opengauss03:/home/opengauss]$


After the shutdown


[opengauss@opengauss03:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Down
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Down
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Standby

cm_ctl: can't connect to cm_server.
Maybe cm_server is not running, or timeout expired. Please try again.
[opengauss@opengauss03:/home/opengauss]$
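
In both shutdown tests, the cm_server on node 3 reports Standby and cm_ctl cannot reach a cm_server once two of the three instances are Down. One way to read this, which is an assumption rather than something this post states, is that a cm_server primary needs a strict majority of the cm_server instances. Under that assumption the arithmetic looks like this (`has_majority` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Strict-majority check: succeeds (exit status 0) when alive > total / 2.
# Assumption: cm_server arbitration needs a majority; the post does not confirm this.
has_majority() {
  local total=$1 alive=$2
  [ $(( alive * 2 )) -gt "$total" ]
}

has_majority 3 1 || echo "1 of 3 alive: no majority"
has_majority 3 2 && echo "2 of 3 alive: majority"
```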


Starting the hosts of node 1 and node 2


Restoring the initial primary/standby roles


# Run on any node
[opengauss@opengauss02:/home/opengauss]$cm_ctl switchover -a
cm_ctl: cmserver is rebalancing the cluster automatically.
......
cm_ctl: switchover successfully.
[opengauss@opengauss02:/home/opengauss]$
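
Before running the switchover, a script might first look at the `balanced` field of the Cluster State section. The sketch below only extracts a "key : value" field from the status text. That `balanced : No` means the primaries have moved away from their initial nodes is an assumption here, not something shown in this post. `cluster_field` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the value of one "key : value" field from `gs_om -t status --detail`
# output on stdin, e.g. cluster_state or balanced.
cluster_field() {
  awk -F':' -v key="$1" '
    { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }                       # trim the key
    k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v; exit }
  '
}

# Hypothetical usage: rebalance only when the cluster reports balanced = No.
#   if [ "$(gs_om -t status --detail | cluster_field balanced)" = "No" ]; then
#     cm_ctl switchover -a
#   fi
```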


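After the restore


Note: cm_ctl switchover -a restores only the initial primary/standby roles of the Datanodes; it does not affect the primary/standby roles of the CMServer instances.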

[opengauss@opengauss02:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Standby
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Standby
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Primary

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 6001 /opengauss/opengaussdb/data/dn P Primary Normal
2 opengauss02 192.168.75.62 6002 /opengauss/opengaussdb/data/dn S Standby Normal
3 opengauss03 192.168.75.63 6003 /opengauss/opengaussdb/data/dn S Standby Normal
[opengauss@opengauss02:/home/opengauss]$


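Shutting down node 2 and node 3


This tests whether the lone primary can keep providing write service when both Datanode standby nodes go down.

Before the shutdown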

[opengauss@opengauss01:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Standby
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Standby
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Primary

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 6001 /opengauss/opengaussdb/data/dn P Primary Normal
2 opengauss02 192.168.75.62 6002 /opengauss/opengaussdb/data/dn S Standby Normal
3 opengauss03 192.168.75.63 6003 /opengauss/opengaussdb/data/dn S Standby Normal
[opengauss@opengauss01:/home/opengauss]$


After the shutdown

[opengauss@opengauss01:/home/opengauss]$gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
------------------------------------------------------------------------------------------
1 opengauss01 192.168.75.61 1 /opengauss/opengaussdb/data/cmserver/cm_server Standby
2 opengauss02 192.168.75.62 2 /opengauss/opengaussdb/data/cmserver/cm_server Down
3 opengauss03 192.168.75.63 3 /opengauss/opengaussdb/data/cmserver/cm_server Down

cm_ctl: can't connect to cm_server.
Maybe cm_server is not running, or timeout expired. Please try again.
[opengauss@opengauss01:/home/opengauss]$
# The existing connection can still read and write the database
test=# create table testlzf2(id int);
CREATE TABLE
test=# create table testlzf3(id int);
CREATE TABLE
test=# \dt
List of relations
Schema | Name | Type | Owner | Storage
--------+----------+-------+-----------+----------------------------------
public | testlzf | table | opengauss | {orientation=row,compression=no}
public | testlzf1 | table | opengauss | {orientation=row,compression=no}
public | testlzf2 | table | opengauss | {orientation=row,compression=no}
public | testlzf3 | table | opengauss | {orientation=row,compression=no}
(4 rows)

test=# \q
[opengauss@opengauss01:/home/opengauss]$
# A new connection can also read and write the database
[opengauss@opengauss01:/home/opengauss]$gsql -p 15400 -d postgres
gsql ((openGauss 3.0.0 build 02c14696) compiled at 2022-04-01 18:12:34 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.

openGauss=# \c test;
Non-SSL connection (SSL connection is recommended when requiring high-security)
You are now connected to database "test" as user "opengauss".
test=# create table testlzf4(id int);
CREATE TABLE
test=#


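Conclusion:

In this case the single remaining node, which is the Datanode primary, can still provide write service.

Overall conclusion

In summary, a single remaining node cannot provide write service in one specific case, but can in most others:

1. Only when the Datanode primary node and one Datanode standby node go down at the same time is the single remaining node unable to provide write service;

2. If the two nodes go down one after the other with a time interval in between, the remaining node can provide write service;

3. However the non-primary nodes go down, write service remains available as long as the primary node stays up.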
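
The three conclusions can be written down as a small lookup. This is a sketch that only restates what this post reports for its own test cluster and is not a general availability rule. The function name `write_service_expected` and the labels `yes`/`no`, `simultaneous` and `staggered` are hypothetical.

```shell
#!/usr/bin/env bash
# Restate the three conclusions for a two-node failure in a
# one-primary, two-standby cluster.
#   $1 = "yes" if the failed nodes include the Datanode primary, else "no"
#   $2 = "simultaneous" or "staggered" (failures separated by a time interval)
# Prints "yes" if the remaining node is expected to provide write service.
write_service_expected() {
  local primary_failed=$1 timing=$2
  if [ "$primary_failed" = "no" ]; then
    echo yes    # conclusion 3: the primary stays up
  elif [ "$timing" = "staggered" ]; then
    echo yes    # conclusion 2: failures with a time interval in between
  else
    echo no     # conclusion 1: primary and one standby fail together
  fi
}

write_service_expected yes simultaneous
write_service_expected no simultaneous
```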
