Disaster Recovery (part 4 of 4)

By Ashwin Venugopal - October 22, 2021

To read part 1, please click here

To read part 2, please click here

To read part 3, please click here

Disaster recovery failover procedure

Failover to the DR site covers two cases:

You need the SAP HANA database to return to the latest state of the data. A self-service script lets you perform this failover without contacting Microsoft. For the failback, however, you have to work with Microsoft.
You want to restore to a storage snapshot other than the latest replicated snapshot. In this case you need help from Microsoft.

To test multiple SAP HANA instances, run the script once per instance. When prompted, enter the SAP HANA SID of the instance you want to test for failover.

Note: The following steps apply when you need to fail over to the DR site to rescue data that was deleted some time ago, so the DR volumes must be set to an earlier snapshot.

Shut down the non-production instance of HANA that is running on the disaster recovery unit of HANA Large Instances. A dormant HANA production instance is preinstalled on that unit.

Check that no SAP HANA processes are running with the following command:

/usr/sap/hostctrl/exe/sapcontrol -nr <HANA instance number> -function GetProcessList
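The check above can be scripted so that it returns a single number. The sketch below assumes that GetProcessList prints a comma-separated table with the display status (for example GREEN or GRAY) in the third column, which is how sapcontrol typically formats it. The function name count_active_hana_processes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: counts processes that are NOT reported as GRAY (stopped).
# Assumes GetProcessList output rows of the form:
#   name, description, dispstatus, textstatus, starttime, elapsedtime, pid
# Lines with fewer than 7 fields (timestamp, "GetProcessList", "OK") are ignored.
count_active_hana_processes() {
    awk -F', *' '
        NF >= 7 && $1 != "name" && $3 != "GRAY" { n++ }
        END { print n + 0 }
    '
}
```

Pipe the output of the sapcontrol command into count_active_hana_processes. A result of 0 suggests that no SAP HANA process is still active. Verify the column layout on your own system before relying on it.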

Select the snapshot name or SAP HANA backup ID to which the disaster recovery site should be restored. In a regular disaster recovery case, this is usually the latest snapshot. If you need to recover lost data, choose an earlier snapshot.
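Choosing an earlier snapshot usually means finding the newest one taken before the data was lost. The sketch below illustrates that selection. The input format (one line per snapshot, an ISO-8601 timestamp followed by the snapshot name) and the function name pick_snapshot_before are assumptions for this example, not the output of any SAP or Microsoft tool.

```shell
#!/bin/sh
# Hypothetical helper: reads lines of "<ISO-8601 timestamp> <snapshot name>"
# on stdin and prints the name of the newest snapshot taken at or before the
# cutoff passed as $1. Prints nothing if no snapshot qualifies.
# ISO-8601 timestamps in the same time zone sort correctly as plain strings,
# so the comparison is forced to string mode by appending "".
pick_snapshot_before() {
    awk -v cutoff="$1" '
        ($1 "") <= (cutoff "") && ($1 "") > (best "") { best = $1; name = $2 }
        END { if (name != "") print name }
    '
}
```

Pass the time just before the data loss as the cutoff, then quote the resulting snapshot name in the support request.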

Contact Azure Support through a high-priority support request. Ask for that snapshot to be restored on the DR site, and provide the name and date of the snapshot or the HANA backup ID.

Mount the disaster recovery volumes to the HANA Large Instance unit in the disaster recovery site.
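Before starting the instance, it can help to confirm that the expected volumes are really mounted. The sketch below is one way to do that, assuming a mount table in the /proc/mounts layout (device, mount point, file system type, and so on). The function name missing_mounts is made up for this example, and the paths to check depend on your own layout.

```shell
#!/bin/sh
# Hypothetical helper: prints every path given as an argument that does not
# appear as a mount point (second column) in the mount table read from stdin.
missing_mounts() {
    table=$(cat)
    for path in "$@"; do
        # awk exits 0 if the path is listed as a mount point, 1 otherwise.
        printf '%s\n' "$table" \
            | awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' \
            || printf '%s\n' "$path"
    done
}
```

On Linux, a call such as missing_mounts /hana/logbackups /hana/shared/PRD < /proc/mounts prints nothing when both paths are mounted.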

Start the dormant SAP HANA production instance.

If you chose to copy transaction log backups to reduce the RPO time, merge those transaction log backups into the newly mounted DR /hana/logbackups directory. Copy only the newer backups that weren't replicated with the latest replication of a storage snapshot.
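A merge that adds only missing files can be sketched as follows. This is a minimal illustration, not a tested procedure: the function name merge_log_backups is made up, it handles only the files directly inside the source directory (no subdirectories), and it treats a file as "already replicated" purely by its name.

```shell
#!/bin/sh
# Hypothetical helper: copies regular files from the directory in $1 into the
# directory in $2, skipping any file whose name already exists there, so that
# backups already present on the DR volume are never overwritten.
merge_log_backups() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and empty globs
        name=$(basename "$f")
        if [ ! -e "$dst/$name" ]; then
            cp -p "$f" "$dst/$name"      # -p keeps timestamps and permissions
            printf 'copied %s\n' "$name"
        fi
    done
}
```

Call it with the directory that holds the copied transaction log backups as the first argument and the mounted DR log backup directory as the second.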

Single files can also be restored from the snapshots that weren't replicated to the /hana/shared/PRD volume in the DR Azure region.

To recover the SAP HANA production instance from the restored storage snapshot and the available transaction log backups, follow these steps:

1. Change the backup location to /hana/logbackups by using SAP HANA Studio.
2. SAP HANA scans the backup file locations and suggests the most recent transaction log backup to restore to. The scan can take a few minutes before the next screen appears.
3. Adjust some of the default settings:

Clear Use Delta Backups
Select Initialize Log Area

4. Select Finish

If the restore seems to hang at the Finish screen and doesn't show the progress screen, confirm that all the SAP HANA instances on the worker nodes are running. If necessary, start the SAP HANA instances manually.

Failback from a DR to a production site

Follow these steps:

The SAP HANA on Azure operations team can trigger the synchronization of the production storage volumes from the disaster recovery storage volumes, which now represent the production state. During this time, the HANA Large Instance unit in the production site is shut down.

The SAP HANA on Azure operations team monitors the replication and makes sure that it's caught up before they inform you.

Shut down the applications that use the production HANA instance in the disaster recovery site, and perform a HANA transaction log backup. Then stop the HANA instance running on the HANA Large Instance units in the disaster recovery site.

After that, the operations team can manually synchronize the disk volumes again.

The SAP HANA on Azure operations team starts the HANA Large Instance unit in the production site again and hands it over to you. Before the handover, the team confirms that the SAP HANA instance is in a shutdown state when the HANA Large Instance unit starts up.

Repeat the same database restore steps that you did when you previously failed over to the disaster recovery site.
