Switch backup to main
This page explains how to switch the backup node to main.
At HA node start, or if the main node fails, an operator (the cluster manager) is in charge of switching backup and main roles.
A REST API is provided to control HA status. It is secured by an access token in the header (specified by the property com.systar.electron.ha.token).
Resource | Verb | Description | Parameters |
/rest/ha/status | GET | Get HA status | - |
/rest/ha/status | POST | Change HA status | change: New status. Allowed values include MAIN and BACKUP. |
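The two calls in the table can be wrapped in small shell helpers. This is a minimal sketch, not part of the product: the function names, the HA_TOKEN and HA_PORT variables, and the default port 8080 (taken from the examples on this page) are assumptions to adapt to your installation.

```shell
#!/bin/sh
# Hypothetical helpers around the /rest/ha/status resource.
# HA_TOKEN and HA_PORT are illustrative variable names, not product settings.
HA_PORT="${HA_PORT:-8080}"

# Build the status URL for a host, optionally with a "change" parameter.
ha_status_url() {   # usage: ha_status_url HOST [NEW_STATUS]
    if [ -n "${2:-}" ]; then
        printf 'http://%s:%s/rest/ha/status?change=%s\n' "$1" "$HA_PORT" "$2"
    else
        printf 'http://%s:%s/rest/ha/status\n' "$1" "$HA_PORT"
    fi
}

# GET the current HA status of a node.
ha_get_status() {   # usage: ha_get_status HOST
    curl -fsS -H "Authorization: token ${HA_TOKEN:-}" -X GET "$(ha_status_url "$1")"
}

# POST a status change (for example MAIN or BACKUP) to a node.
ha_set_status() {   # usage: ha_set_status HOST NEW_STATUS
    curl -fsS -H "Authorization: token ${HA_TOKEN:-}" -X POST "$(ha_status_url "$1" "$2")"
}
```

With HA_TOKEN exported, `ha_get_status node1host` queries a node and `ha_set_status node2host MAIN` requests a role change. The `-f` option makes curl return a non-zero exit code on HTTP errors, which is convenient when these helpers are called from scripts.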
When switching
- At main node start-up: an HA node always starts as a backup, to avoid a split-brain situation.
- If the main node fails
The following table lists potential causes of main node failure.
Failure | Detection | Remediation |
Hard disk failure | Hardware monitoring | Switch to backup. |
Network failure | Hardware monitoring | If backup and load balancer are on another network layer, switch to backup. Make sure the old main node is shut down. |
Software failure (not due to DI) | Process monitoring | Switch to backup. |
Software failure (due to DI) | Computing heartbeat | Switch to backup. |
Manual / Maintenance | | Switch to backup if you need to perform operations on the main node computer. |
Steps to start HA cluster
1. Start main node
2. Switch main node
Turn your node into the main node with the cluster manager REST API.
curl -H "Authorization: token myAccessToken" -X GET "http://node2host:8080/rest/ha/status"
BACKUP
curl -H "Authorization: token myAccessToken" -X POST "http://node2host:8080/rest/ha/status?change=MAIN"
curl -H "Authorization: token myAccessToken" -X GET "http://node2host:8080/rest/ha/status"
MAIN
3. Start consumer nodes
Backup and replica nodes can now connect to the main node.
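Step 2 can be scripted so that consumer nodes are started only once the promotion is confirmed. The sketch below assumes that the GET call returns the bare status string (as the examples on this page suggest) and that the node listens on port 8080; `ha_request`, `promote_to_main` and HA_TOKEN are hypothetical names.

```shell
#!/bin/sh
# Sketch: promote a node to MAIN and confirm that the change took effect.
# ha_request is a hypothetical wrapper; HA_TOKEN is an assumed variable name.
ha_request() {   # usage: ha_request VERB URL
    curl -fsS -H "Authorization: token ${HA_TOKEN:-}" -X "$1" "$2"
}

promote_to_main() {   # usage: promote_to_main HOST
    base="http://$1:8080/rest/ha/status"
    # Request the role change, then read the status back.
    ha_request POST "${base}?change=MAIN" >/dev/null || return 1
    status="$(ha_request GET "$base")" || return 1
    # Assumption: the response body is the bare status string.
    if [ "$status" = "MAIN" ]; then
        echo "$1 is now MAIN"
    else
        echo "$1 reports '$status', expected MAIN" >&2
        return 1
    fi
}
```

A typical call would be `promote_to_main node2host && echo "start consumer nodes"`, so that the next step only runs after a successful check.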
Steps to switch after failure
1. Passivate failing main
If the failing main node is still reachable, turn it into a backup node using the cluster manager REST API.
curl -H "Authorization: token myAccessToken" -X GET "http://node1host:8080/rest/ha/status"
MAIN
curl -H "Authorization: token myAccessToken" -X POST "http://node1host:8080/rest/ha/status?change=BACKUP"
curl -H "Authorization: token myAccessToken" -X GET "http://node1host:8080/rest/ha/status"
BACKUP
If it is not reachable, make sure the failing main node is shut down.
2. Turn backup into main
Turn your backup node into the main node with the cluster manager REST API.
curl -H "Authorization: token myAccessToken" -X GET "http://node2host:8080/rest/ha/status"
BACKUP
curl -H "Authorization: token myAccessToken" -X POST "http://node2host:8080/rest/ha/status?change=MAIN"
curl -H "Authorization: token myAccessToken" -X GET "http://node2host:8080/rest/ha/status"
MAIN
3. Redirect replicas to new main
The replica nodes should now synchronize with the new main node. For each replica node in the cluster, set the following property:
com.systar.electron.host | Host / IP of the new main node, acting as primary |
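If the property is stored in a plain key=value properties file, the change can be scripted for each replica. This is a sketch under that assumption: the file path is supplied by the caller, because the actual file name and location depend on the installation, and `set_main_host` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: point a replica at the new main node by rewriting one property.
# Assumes a key=value properties file; the path is supplied by the caller.
set_main_host() {   # usage: set_main_host PROPERTIES_FILE NEW_MAIN_HOST
    file="$1"; host="$2"
    key='com.systar.electron.host'
    key_re='com\.systar\.electron\.host'   # same key, dots escaped for regex use
    tmp="$(mktemp)" || return 1
    if grep -q "^${key_re}=" "$file"; then
        # Replace the existing value, leaving every other line untouched.
        sed "s|^${key_re}=.*|${key}=${host}|" "$file" > "$tmp"
    else
        # Property not present yet: append it.
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$host" >> "$tmp"
    fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_main_host /path/to/replica.properties node2host` (path illustrative) updates one replica. The host value is inserted verbatim into the sed expression, so it should be a plain host name or IP address.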
4. Restart consumer nodes
All replicas (and optionally the new backup) must be restarted to take the new configuration into account. If the nodes are configured to restart automatically, this happens as soon as they exit.
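Steps 1 and 2 of the failover can be combined into one script that refuses to promote the backup while the old main might still be active. This is a sketch only: `ha_request`, `failover`, HA_TOKEN, OLD_MAIN_IS_DOWN, port 8080 and the 5-second timeout are assumptions, and the GET response is assumed to be the bare status string. Redirecting and restarting the replicas (steps 3 and 4) is left to the operator.

```shell
#!/bin/sh
# Sketch of failover steps 1 and 2. ha_request is a hypothetical wrapper;
# HA_TOKEN and the 5-second timeout are assumptions, not product defaults.
ha_request() {   # usage: ha_request VERB URL
    curl -fsS --max-time 5 -H "Authorization: token ${HA_TOKEN:-}" -X "$1" "$2"
}

failover() {   # usage: failover OLD_MAIN_HOST BACKUP_HOST
    old="http://$1:8080/rest/ha/status"
    new="http://$2:8080/rest/ha/status"

    # Step 1: passivate the failing main if it still answers.
    if ha_request GET "$old" >/dev/null 2>&1; then
        ha_request POST "${old}?change=BACKUP" >/dev/null || return 1
    elif [ "${OLD_MAIN_IS_DOWN:-no}" != "yes" ]; then
        # Unreachable: the operator must confirm the shutdown (split-brain guard).
        echo "old main unreachable; set OLD_MAIN_IS_DOWN=yes once it is shut down" >&2
        return 2
    fi

    # Step 2: promote the backup and verify the new status.
    ha_request POST "${new}?change=MAIN" >/dev/null || return 1
    [ "$(ha_request GET "$new")" = "MAIN" ] || return 1
    echo "$2 is now MAIN"
}
```

Running `failover node1host node2host` passivates node1host when it is reachable. When it is not, the script stops with exit code 2 until the operator sets OLD_MAIN_IS_DOWN=yes, which mirrors the manual check described in step 1.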