SAP HANA on Azure (Large Instances) HA & DR Considerations

 




High availability (HA) and disaster recovery (DR) are critical for SAP HANA workloads. Work with SAP, your system integrator, or Microsoft to architect and implement the right high-availability and disaster recovery strategies. Also consider the recovery point objective (RPO) and the recovery time objective (RTO), which are specific to your environment.
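To make RPO and RTO concrete, here is a minimal Python sketch. The names `DrPlan` and `meets_objectives` and all the numbers are illustrative assumptions, not figures from SAP or Microsoft. The sketch also assumes that the worst-case data loss of interval-based replication is roughly the replication interval.

```python
from dataclasses import dataclass


@dataclass
class DrPlan:
    """Hypothetical description of a DR strategy (all values are assumptions)."""
    name: str
    replication_interval_min: float  # time between replication cycles
    estimated_recovery_min: float    # time to bring the system up at the DR site


def meets_objectives(plan: DrPlan, rpo_min: float, rto_min: float) -> bool:
    """Return True if the plan satisfies both objectives.

    Worst-case data loss is approximated by the replication interval,
    because anything written since the last cycle may be lost.
    """
    return (plan.replication_interval_min <= rpo_min
            and plan.estimated_recovery_min <= rto_min)


# Illustrative numbers only.
plan = DrPlan("storage replication",
              replication_interval_min=15,
              estimated_recovery_min=120)
print(meets_objectives(plan, rpo_min=30, rto_min=240))  # True
print(meets_objectives(plan, rpo_min=10, rto_min=240))  # False
```

A check like this is only a starting point for the discussion with SAP or your system integrator; real recovery times have to be measured in a DR test.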

Microsoft supports some SAP HANA high-availability capabilities with HANA Large Instances, including:

  • Storage Replication- The storage system's ability to replicate all data to another HANA Large Instances stamp in another Azure region. SAP HANA operates independently of this method, and it is the default disaster recovery mechanism offered for HANA Large Instances.

  • HANA System Replication- The replication of all data in SAP HANA to a separate SAP HANA system. The recovery time objective is minimized by replicating data at regular intervals. SAP HANA supports both asynchronous and synchronous modes.

  • Host Auto-Failover- A local fault-recovery solution for SAP HANA that is an alternative to HANA system replication. You configure one or more standby SAP HANA nodes in scale-out mode, and if the master node becomes unavailable, SAP HANA automatically fails over to a standby node.

You can also ask the Microsoft Service Management team to set up a STONITH device for your existing servers when you want to set up HANA Large Instances HSR with automatic failover.

There are two ways to support DR with HANA Large Instances:

  • Storage replication- The primary storage contents are constantly replicated to the remote DR storage systems that are available on the designated DR HANA Large Instances server. This option also allows point-in-time recovery. If a multipurpose DR setup is used, you need to buy extra storage of the same size at the DR location. Microsoft also offers self-service storage snapshot and failover scripts for HANA failover as part of the HANA Large Instances offering.

  • Multi-tier HSR with a third replica in a DR region- This option allows a faster recovery time, but it doesn't support point-in-time recovery. HSR needs a secondary system, and HANA system replication for the DR site is handled through proxies like nginx or IP tables.
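The trade-off between the two options can be sketched as a small lookup. This is a minimal illustration, assuming only the two traits named in the list above; the dictionary layout and the function name `select_dr_methods` are made up for this example.

```python
# Traits taken from the two options described above; the dictionary layout
# and the function name are assumptions made for this sketch.
DR_METHODS = {
    "storage replication": {"point_in_time_recovery"},
    "multi-tier HSR": {"faster_recovery"},
}


def select_dr_methods(required_traits):
    """Return the DR methods that offer every required trait, sorted by name."""
    required = set(required_traits)
    return sorted(name for name, traits in DR_METHODS.items()
                  if required <= traits)


print(select_dr_methods({"point_in_time_recovery"}))  # ['storage replication']
print(select_dr_methods({"faster_recovery"}))         # ['multi-tier HSR']
# Neither option offers both traits at once.
print(select_dr_methods({"point_in_time_recovery", "faster_recovery"}))  # []
```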

Storage replication depends on the use of storage snapshots for HANA Large Instances. It is not possible to choose an Azure region in a different geopolitical area as the DR region.

The following table shows the currently supported high-availability and disaster recovery methods and combinations:

| Scenario supported in HANA Large Instances | High availability option | Disaster recovery option | Comments |
|---|---|---|---|
| Single node | Not available | Dedicated DR setup. Multipurpose DR setup. | |
| Host auto-failover: Scale-out (with or without standby) including 1+1 | Possible with the standby taking the active role. HANA controls the role switch. | Dedicated DR setup. Multipurpose DR setup. DR synchronization by using storage replication. | HANA volume sets are attached to all the nodes. DR site must have the same number of nodes. |
| HANA system replication | Possible with primary or secondary setup. Secondary moves to primary role in a failover case. HANA system replication and OS control failover. | Dedicated DR setup. Multipurpose DR setup. DR synchronization by using storage replication. DR by using HANA system replication is not yet possible without third-party components. | A separate set of disk volumes is attached to each node. Only disk volumes of the secondary replica in the production site get replicated to the DR location. One set of volumes is required at the DR site. |
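The support matrix above can also be captured as data, which is handy for a quick planning check. The following is a minimal sketch; the structure `SUPPORT_MATRIX` and the helper names are assumptions for illustration, and the values are shortened from the table.

```python
# Shortened from the table above; the keys and helper names are assumptions.
SUPPORT_MATRIX = {
    "single node": {
        "ha": None,  # "Not available"
        "dr": ["dedicated DR setup", "multipurpose DR setup"],
    },
    "host auto-failover": {
        "ha": "standby takes the active role",
        "dr": ["dedicated DR setup", "multipurpose DR setup"],
    },
    "HANA system replication": {
        "ha": "secondary moves to primary role",
        "dr": ["dedicated DR setup", "multipurpose DR setup"],
    },
}


def has_ha(scenario: str) -> bool:
    """True if the table lists a high-availability option for the scenario."""
    return SUPPORT_MATRIX[scenario]["ha"] is not None


def dr_node_count_ok(scenario: str, production_nodes: int, dr_nodes: int) -> bool:
    """Check the one node-count rule stated in the table.

    For host auto-failover the DR site must have the same number of nodes.
    The table states no node-count rule for the other scenarios, so this
    sketch accepts any count for them.
    """
    if scenario == "host auto-failover":
        return production_nodes == dr_nodes
    return True


print(has_ha("single node"))                          # False
print(dr_node_count_ok("host auto-failover", 4, 3))   # False
print(dr_node_count_ok("host auto-failover", 4, 4))   # True
```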

A dedicated DR setup is one in which the HANA Large Instances unit in the DR site is not used for running any other workload or non-production system. The unit is passive and is deployed only when a disaster failover is executed. This setup is not a preferred choice for many customers.

In a multipurpose DR setup, a HANA Large Instances unit on the DR site runs a non-production workload. In a disaster, you shut down the non-production system, mount the storage-replicated (additional) volume sets, and then start the production HANA instance. Many customers who use the HANA Large Instances disaster recovery functionality use this configuration.
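The order of the multipurpose failover steps matters, so it can help to write it down as a small runbook. The sketch below is only an illustration: the function `multipurpose_dr_failover` and its three callables are hypothetical placeholders for whatever scripts or manual actions you actually use, and are not part of any Microsoft or SAP tooling.

```python
from typing import Callable, List


def multipurpose_dr_failover(
    stop_non_production: Callable[[], None],
    mount_replicated_volumes: Callable[[], None],
    start_production: Callable[[], None],
) -> List[str]:
    """Run the three failover steps in the order described above.

    Each argument is a hypothetical callable supplied by the caller.
    If a step raises, the later steps are not executed.
    """
    steps = [
        ("stop non-production system", stop_non_production),
        ("mount storage-replicated volume sets", mount_replicated_volumes),
        ("start production HANA instance", start_production),
    ]
    completed = []
    for name, step in steps:
        step()
        completed.append(name)
    return completed


# Usage with stand-in steps that only record that they were called.
calls = []
result = multipurpose_dr_failover(
    lambda: calls.append("stop"),
    lambda: calls.append("mount"),
    lambda: calls.append("start"),
)
print(calls)  # ['stop', 'mount', 'start']
```

Passing the steps in as callables keeps the ordering logic separate from the environment-specific commands, which makes the sequence easy to rehearse in a DR test.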






