Roll back from a failed node update/upgrade

This page describes how to roll back from:

  • A failed node update – You tried to update your Decision Insight application but it failed.
  • A failed node upgrade – You tried to upgrade your Decision Insight node to a newer version but it failed. 


Prerequisite: Have a locked backup checkpoint

A locked checkpoint cannot be deleted accidentally (for example, by an automatic process), so it remains available to revert your node to its previous stable state.

  • Application update – From release 20190418 onward, a locked checkpoint is created automatically whenever you start the process of updating your DI application.
  • Node upgrade – Before attempting a node upgrade, create a locked checkpoint yourself. Instructions for creating this checkpoint are provided as part of all Upgrade procedures.

What is a checkpoint?

A checkpoint is a saved state of storage. Use checkpoints to securely back up the data of a node. If the node crashes, it restarts from the state it was in when the checkpoint was created.

A checkpoint is represented by a transaction time. It contains all the data as of that time (a dump of the in-memory data and the list of files flushed to disk).

For a primary/replica cluster, you only need to create a checkpoint for the primary node, not for the replica nodes.

For more information, see Checkpoint.

Revert the node to the backup checkpoint

If your node is unstable after a node upgrade or update, you must roll back the node to the state it was in when the locked checkpoint was created. To do so, follow the instructions in How to rollback a node state to a specific checkpoint?

What happens when you revert your node to a backup checkpoint?

Keep in mind that when you revert your node to a previous state using a checkpoint, you lose all the data that entered DI after the checkpoint timestamp. For example, if your checkpoint is from March 5th at 10:55 AM and you roll back the node to this checkpoint on March 8th at 4:32 PM, your database returns to the state and contents it held on March 5th at 10:55 AM.
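To make the size of that gap concrete, the following minimal Python sketch computes the window of data that would need to be reabsorbed for the example above. It is only an illustration: the function name data_loss_window is hypothetical, it does not call any DI API, and the year 2024 is an arbitrary assumption because the example does not state one.

```python
from datetime import datetime, timedelta


def data_loss_window(checkpoint_time: datetime, rollback_time: datetime) -> timedelta:
    """Return the span of time whose data is lost by rolling back.

    Hypothetical helper for planning purposes only; it does not talk to DI.
    """
    if rollback_time < checkpoint_time:
        raise ValueError("rollback_time must not be earlier than checkpoint_time")
    return rollback_time - checkpoint_time


# Values from the example above; the year is an arbitrary assumption.
checkpoint = datetime(2024, 3, 5, 10, 55)
rollback = datetime(2024, 3, 8, 16, 32)

window = data_loss_window(checkpoint, rollback)
print(window)  # 3 days, 5:37:00
```

In this example, a little more than three days of data entered DI after the checkpoint and would have to be reabsorbed after the rollback.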

Important

When planning a rollback, make sure you have a mechanism in place to reabsorb the missing data into DI.

For example, if you read files from a folder, move the files that were already read from the absorbed folder back to the incoming folder. If you use Decision Insight Messaging System (DIMS) to absorb data, your node should automatically start catching up on data from DIMS. However, DIMS does not keep an infinite history: if your data absorption relies on DIMS, you will still lose data if the restoration checkpoint is older than the period set in the retention policy.
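As a rough illustration of such a mechanism, the Python sketch below moves files from an absorbed folder back to an incoming folder and checks whether a retention period covers the rollback gap. Everything in it is an assumption made for the example: the function names are hypothetical, the folder paths depend on how your routes are configured, and it assumes that a file's modification time is a usable stand-in for the moment it was absorbed. It does not use any DI or DIMS API.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path


def files_to_replay(absorbed_dir: Path, checkpoint_time: datetime) -> list[Path]:
    """List files in absorbed_dir last modified at or after checkpoint_time.

    Assumption: modification time approximates absorption time.
    """
    cutoff = checkpoint_time.timestamp()
    return sorted(
        p for p in absorbed_dir.iterdir()
        if p.is_file() and p.stat().st_mtime >= cutoff
    )


def replay_files(absorbed_dir: Path, incoming_dir: Path,
                 checkpoint_time: datetime) -> list[Path]:
    """Move the selected files back to incoming_dir and return their new paths."""
    incoming_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in files_to_replay(absorbed_dir, checkpoint_time):
        dest = incoming_dir / src.name
        if dest.exists():
            # Never overwrite a file that is already waiting to be absorbed.
            continue
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


def retention_covers_gap(checkpoint_time: datetime, rollback_time: datetime,
                         retention: timedelta) -> bool:
    """Return True if the retention period is at least as long as the rollback gap."""
    return rollback_time - checkpoint_time <= retention
```

For instance, with a checkpoint from March 5th at 10:55 AM and a rollback on March 8th at 4:32 PM, retention_covers_gap returns True for a hypothetical 7-day retention period and False for a 2-day one, which is the case where data loss is still expected.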

What to do if you did not create a locked backup checkpoint beforehand?

If you did not create a locked checkpoint before attempting a node upgrade or update, check whether you have another suitable checkpoint, even an unlocked one, that could serve as a restoration point. If you don't have any suitable checkpoint, we recommend that you contact the Axway Support team to help you solve your issue.

Do not reimport your old application into DI. This will not fix anything and may actually worsen the situation and lead to a node fail-fast.

How long should you keep your locked backup checkpoint?

You may keep your backup checkpoints as long as you deem useful.

If the node seems to work fine, you can delete the locked checkpoint. However, issues with your node may sometimes appear only a few days after your upgrade/update. Therefore, we recommend that you keep your locked checkpoint until you have confirmed that DI is working as expected, for example until:

  • Indicators that compute only once a day or once a week have produced the expected results
  • All your DI absorption routes have been triggered at least once









