Overview

The new capabilities and rapid innovation in AWS allow IT architects to fundamentally alter how they view Disaster Recovery for mission-critical workloads. AWS allows you to dramatically improve many of the underlying fundamentals of Disaster Recovery:

| Conventional Disaster Recovery | Disaster Recovery on AWS |
| --- | --- |
| High Cost | Unprecedented capabilities to build DR sites |
| Low ROI | Easily setup DR sites in different geographic regions |
| Implemented for critical systems only | Cut down DR site cost by 70% |
| Usually scaled down – 50% of production capacity | Substantial savings on software licenses |
| Systems in a remote region can be challenging | Ability to have multiple, global DR sites |
| Costly software licenses based on hardware usage | |

Most importantly, AWS allows you to focus on three important Disaster-related business parameters:

  1. RPO (Recovery Point Objective) – the acceptable amount of data loss, measured in time.
  2. RTO (Recovery Time Objective) – the duration of time, and the service level, within which a business process must be restored after a disaster (or disruption) to avoid unacceptable consequences associated with a break in business continuity.
  3. Cost
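
As a rough illustration of how the first two parameters are measured, the sketch below derives the achieved data-loss window and downtime from three timestamps and compares them with the targets. The class name, field names and sample times are hypothetical and are not taken from any iTMethods tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrEvent:
    """Timestamps from a disaster or DR rehearsal (all values hypothetical)."""
    last_recoverable_point: datetime  # newest data that survived at the DR site
    outage_start: datetime            # moment the primary site was lost
    service_restored: datetime        # moment the business process was usable again

    def data_loss_window(self) -> timedelta:
        # Achieved recovery point: work done after this point is lost.
        return self.outage_start - self.last_recoverable_point

    def downtime(self) -> timedelta:
        # Achieved recovery time: how long the business process was unavailable.
        return self.service_restored - self.outage_start

    def meets(self, rpo: timedelta, rto: timedelta) -> bool:
        # Both objectives must hold for the event to be within target.
        return self.data_loss_window() <= rpo and self.downtime() <= rto


event = DrEvent(
    last_recoverable_point=datetime(2024, 1, 10, 9, 57),
    outage_start=datetime(2024, 1, 10, 10, 0),
    service_restored=datetime(2024, 1, 10, 10, 12),
)
print(event.data_loss_window(), event.downtime())
```

In this sample the event loses three minutes of data and twelve minutes of service, so it would satisfy an RPO of 5 minutes and an RTO of 15 minutes but not an RTO of 5 minutes.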

In turn, this focused approach enables the packaging of Disaster Recovery as a Service (DRaaS) for mission critical workloads, in this case specifically for SAP.

SAP Production Landscapes

SAP products that are based on the SAP NetWeaver architecture (ECC, BW, BPC, CRM etc.) can have either a 3-tier application architecture, consisting of a client (web or smart client), application server and database, or a 2-tier architecture, where all application server and database functions reside on a single server. Moreover, all the application code and configuration information for NetWeaver-based products is stored in the database. A logical implication for the design of DRaaS for SAP is therefore to enable proper, efficient recovery of the database, as this is the most critical component. Other components such as application servers, web servers, load balancing and networking are mapped to AWS services such as EC2, EBS and VPC.

SAP customers are increasingly migrating to the SAP HANA in-memory database platform. HANA is certified on AWS, so DRaaS will need to accommodate HANA-based systems as a design pattern.

The two most common database types for SAP NetWeaver-based products are Oracle and SQL Server.

Some common profiles for SAP ECC customers are as follows:

| Customer Type | Small | Medium | Large |
| --- | --- | --- | --- |
| # of application servers | 1 | 2-4 | 4-8 |
| Database Size | < 1 TB | 2-3 TB | 5+ TB |
| Database growth rate | 5% | 10% | 15% |
| OS | Windows Server / Linux | Windows Server / Linux | Windows Server / Linux |

SAP DR Approach Options

There are three major approaches to implementing DRaaS on AWS:

| Approach / Method | RPO / RTO | Cost |
| --- | --- | --- |
| Backup / Recovery | RPO = 30 minutes, RTO = 1 hour | Low |
| Pilot Light | RPO = < 5 minutes, RTO = 15 minutes | Moderate |
| Warm Standby | RPO = < 5 minutes, RTO = 5 minutes | Higher |
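
One way to read the table above is as a lookup: given a customer's RPO and RTO targets, pick the lowest-cost approach that satisfies both. The following is a minimal sketch of that reasoning. The class and function names are hypothetical, the "< 5 minutes" RPO figures are treated as an upper bound of 5 minutes, and the Low / Moderate / Higher cost labels are mapped to ranks 1 to 3 purely for comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Approach:
    name: str
    rpo_minutes: int  # upper bound on data loss, from the table
    rto_minutes: int  # time to restore service, from the table
    cost_rank: int    # assumed ranking: 1 = Low, 2 = Moderate, 3 = Higher


APPROACHES = [
    Approach("Backup / Recovery", rpo_minutes=30, rto_minutes=60, cost_rank=1),
    Approach("Pilot Light", rpo_minutes=5, rto_minutes=15, cost_rank=2),
    Approach("Warm Standby", rpo_minutes=5, rto_minutes=5, cost_rank=3),
]


def cheapest_approach(max_rpo_minutes: int, max_rto_minutes: int) -> Optional[Approach]:
    """Return the lowest-cost approach meeting both targets, or None if none does."""
    candidates = [
        a for a in APPROACHES
        if a.rpo_minutes <= max_rpo_minutes and a.rto_minutes <= max_rto_minutes
    ]
    return min(candidates, key=lambda a: a.cost_rank, default=None)


choice = cheapest_approach(max_rpo_minutes=10, max_rto_minutes=20)
print(choice.name if choice else "no listed approach meets these targets")
```

Targets tighter than anything in the table return None, which corresponds to the cases where a customer would have to look beyond these three options.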

There is another, more expensive and complex approach – Hot Standby – where the recovery database is kept synchronously replicated with the source, resulting in zero data loss. Even in conventional DR environments this approach is rarely implemented, and it will not be considered as one of the primary options for iTMethods DRaaS, as it is not a mainstream technique and is inconsistent with our DRaaS principles of simplicity and cost effectiveness.

Let us consider each of these approaches by examining some common use cases.

Use Case #1 – SAP ECC 6 on Oracle 11g R2 Database, on premise data center

The on premise SAP landscape consists of Production, QA and DEV systems:

| System | Application Server(s) | Database | Server OS |
| --- | --- | --- | --- |
| ECC Production | 2 | Oracle 11g R2 | RHEL 6 |
| ECC QA | 2 | Oracle 11g R2 | RHEL 6 |
| ECC DEV | 1 | Oracle 11g R2 | RHEL 6 |

Replication Technique: Oracle Data Guard allows for setting up one or more standby databases. A standby database is a transactionally consistent copy of an Oracle production database that is initially created from a backup copy of the primary database. In this use case, Data Guard is configured for asynchronous replication to the physical standby database, i.e. it runs in Maximum Performance mode.
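
Because replication is asynchronous, the standby can trail the primary, and that lag is effectively the data that would be lost in a failover. The sketch below shows one way to compare a reported lag with the RPO target. It assumes the lag arrives as a day-to-second interval string such as "+00 00:03:12", which is how Oracle's V$DATAGUARD_STATS view typically presents its transport lag and apply lag values; fetching the value from the database is not shown, and the function names are hypothetical.

```python
import re
from datetime import timedelta

# Matches interval strings of the assumed form "+DD HH:MM:SS".
_LAG_PATTERN = re.compile(r"^\+(\d+) (\d{2}):(\d{2}):(\d{2})$")


def parse_lag(value: str) -> timedelta:
    """Convert an interval string such as '+00 00:03:12' into a timedelta."""
    match = _LAG_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised lag value: {value!r}")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def within_rpo(transport_lag: str, rpo: timedelta) -> bool:
    """True when the standby is no further behind than the RPO allows."""
    return parse_lag(transport_lag) <= rpo


print(within_rpo("+00 00:03:12", timedelta(minutes=5)))
```

A check like this could run on a schedule and raise an alert when the lag approaches the RPO, rather than only being discovered during a disaster.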

Use Case #2 – SAP ECC, BW on SQL Server 2008 R2, hosted colocation data center

The co-located SAP landscape consists of Production, QA and DEV systems:

| System | Application Server(s) | Database | Server OS |
| --- | --- | --- | --- |
| ECC Production | 1 | SQL 2K8 R2 | W2K8 R2 |
| ECC QA | 1 | SQL 2K8 R2 | W2K8 R2 |
| ECC DEV | 1 | SQL 2K8 R2 | W2K8 R2 |
| BW Production | | SQL 2K8 R2 | W2K8 R2 |
| BW QA | | SQL 2K8 R2 | W2K8 R2 |
| BW DEV | | SQL 2K8 R2 | W2K8 R2 |

In this case, the customer has decided to implement Pilot Light DRaaS for Production environments only. Should the DEV and QA systems for ECC and BW be required during a disaster, those environments can be recovered in AWS using the Backup / Recovery method. QA and DEV environments are securely backed up to AWS S3 storage using a 3rd party backup tool that supports direct backup to S3; new EBS volumes can then be created from S3 to restore individual systems with the QA / DEV AMIs.

Replication Technique: In this use case, asynchronous SQL Server mirroring is used. This technique is chosen because it is more tolerant of network latency, an important consideration in geographically distributed DRaaS environments. Asynchronous mirroring also enables better performance, as transactions on the principal database can be committed before the transaction records have been successfully copied to the mirror.

Backup & Recovery: for non-critical systems in the SAP landscape (QA / DEV), the SQL Server database can be backed up to Amazon S3 using a 3rd party tool, CloudBerry Lab Online Backup. Amazon S3 provides cost effective, highly durable backup capabilities while ensuring that these SAP systems can be recovered in any AWS Region.
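
The use case relies on a third-party tool for this step. Purely to illustrate what such a backup amounts to, the sketch below uploads a database backup file to S3 with the boto3 SDK instead; this is a swapped-in technique, not the tool's own mechanism. The key layout, system ID, bucket name and function names are assumptions made for the example.

```python
from datetime import datetime


def backup_key(system_id: str, taken_at: datetime, prefix: str = "sap-backups") -> str:
    """Build a date-partitioned object key for one backup file (layout is an assumption)."""
    stamp = taken_at.strftime("%Y%m%dT%H%M%S")
    return f"{prefix}/{system_id}/{taken_at:%Y/%m/%d}/{system_id}-{stamp}.bak"


def upload_backup(s3_client, local_path: str, bucket: str, key: str) -> None:
    """Upload one backup file, asking S3 to encrypt it at rest.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.upload_file(
        Filename=local_path,
        Bucket=bucket,
        Key=key,
        ExtraArgs={"ServerSideEncryption": "AES256"},
    )


# Hypothetical QA system ID and backup time.
key = backup_key("EQ1", datetime(2024, 1, 10, 2, 0, 0))
print(key)
```

Passing the client in as a parameter keeps the function easy to exercise in a DR rehearsal against a test bucket before it is pointed at the real one.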

SAP Disaster Recovery Security

Controlling access to SAP data in the AWS cloud is ensured by using the following capabilities:

AWS VPC

Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the Amazon Web Services (AWS) Cloud where you can launch AWS resources in a virtual network that you define. You can create your own subnets and you can configure routing tables, networking gateways and network Access Control Lists (ACL) similar to how you would secure an on premise data center.

VPN and Direct Connect

You can establish IPsec VPN connections between the VPC and your own data centers, which encrypt the network traffic between AWS and the customer premises, or you can use the AWS Direct Connect service. Both techniques allow your administrators and your client applications to securely access your databases from your internal network.

For SAP 3-tier environments, you should create separate security groups for the different tiers of the application infrastructure inside your VPC. As a best practice, give the Application Server tier and the Database tier a security group each. Tier-wise security groups increase infrastructure security inside Amazon VPC by limiting the traffic that each tier accepts.
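
The tier-wise idea can be sketched as two rule sets: the application tier accepts SAP client traffic from the corporate network, while the database tier accepts connections only from members of the application tier's security group. The example below builds those rules in the structure that the boto3 EC2 call authorize_security_group_ingress expects. The ports are assumptions: 32NN (NN being the SAP instance number) is the usual SAP dispatcher port, and 1521 is a common Oracle listener default (SQL Server commonly uses 1433); the group IDs and CIDR range are hypothetical.

```python
from typing import Dict, List


def tier_ingress_rules(
    app_sg_id: str,
    client_cidr: str,
    instance_number: str = "00",
    db_port: int = 1521,
) -> Dict[str, List[dict]]:
    """Build ingress permissions for the application and database tiers."""
    dispatcher_port = 3200 + int(instance_number)  # assumed SAP dispatcher port 32NN
    app_rules = [{
        "IpProtocol": "tcp",
        "FromPort": dispatcher_port,
        "ToPort": dispatcher_port,
        "IpRanges": [{"CidrIp": client_cidr}],  # SAP clients on the internal network
    }]
    db_rules = [{
        "IpProtocol": "tcp",
        "FromPort": db_port,
        "ToPort": db_port,
        # Only instances in the application tier's group may reach the database.
        "UserIdGroupPairs": [{"GroupId": app_sg_id}],
    }]
    return {"app": app_rules, "db": db_rules}


def apply_rules(ec2_client, group_id: str, permissions: List[dict]) -> None:
    """Attach the rules to a security group; ec2_client is a boto3 EC2 client."""
    ec2_client.authorize_security_group_ingress(GroupId=group_id, IpPermissions=permissions)


rules = tier_ingress_rules("sg-app-example", "10.0.0.0/16")
print(rules["db"])
```

Referencing the application group by ID, rather than by IP range, means the database rule keeps working when application servers are relaunched with new addresses during a recovery.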

Database Security Features

To encrypt SAP data at rest, Transparent Data Encryption (TDE) is recommended. TDE provides transparent encryption of stored data to support your privacy and compliance efforts and is supported by both Oracle and SQL Server databases. SAP applications do not have to be modified and will continue to work as before. Data is automatically encrypted before it is written to disk and automatically decrypted when it is read from storage. Key management is built in, which reduces the work of creating, managing, and securing encryption keys. You can choose to encrypt tablespaces or specific table columns using industry-standard encryption algorithms, including Advanced Encryption Standard (AES) and Triple Data Encryption Standard (Triple DES).

Pilot Light Scenario: AWS Cloud Advantages

There are a number of advantages that the AWS platform confers on iTMethods DRaaS, resulting in dramatically reduced secondary site costs:

  • Standby SAP database instances can be the smallest possible AWS instances; this dramatically reduces the recurring annual DR cost. If a disaster is declared, the database – or for that matter, any other SAP instance – can be dynamically resized in minutes to restore full production level capacity. All of these operations can be scripted and automated to ensure reliable Disaster Recovery operations.
  • Non-critical SAP application components like application servers, Solution Manager etc. can be cost effectively stored in AWS as AMIs. This effectively “freeze dries” SAP servers until they are required in the event of a disaster or a DR rehearsal, at which point they are activated as running instances. Best Practices for DRaaS on AWS strongly recommend that SAP AMIs be refreshed according to normal SAP patching cycles or on a regular timed basis, to ensure that the AWS DR site(s) are kept in sync with the primary application instances. AWS Best Practices also suggest the use of AWS CloudFormation to script the startup and shutdown of SAP landscapes on AWS.
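
The two points above can be sketched as a single activation routine: resize the small standby database instance to production capacity, then launch the application servers from their stored AMIs. The example uses direct boto3 EC2 calls rather than CloudFormation, simply to make the sequence visible. The function name, instance IDs, AMI IDs and instance types are hypothetical, and the database role change itself (for example a Data Guard or mirroring failover) is not shown.

```python
from typing import List


def activate_pilot_light(
    ec2,
    db_instance_id: str,
    production_db_type: str,
    app_server_amis: List[str],
    app_instance_type: str,
    subnet_id: str,
) -> List[str]:
    """Scale up the standby database and start application servers from AMIs.

    ec2 is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the IDs of the newly launched application server instances.
    """
    # 1. The instance type of an EBS-backed instance can only change while it is stopped.
    ec2.stop_instances(InstanceIds=[db_instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[db_instance_id])

    # 2. Resize the standby database to production capacity.
    ec2.modify_instance_attribute(
        InstanceId=db_instance_id,
        InstanceType={"Value": production_db_type},
    )

    # 3. Bring the database instance back up at its new size.
    ec2.start_instances(InstanceIds=[db_instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[db_instance_id])

    # 4. Launch one application server per stored AMI.
    launched = []
    for ami_id in app_server_amis:
        response = ec2.run_instances(
            ImageId=ami_id,
            InstanceType=app_instance_type,
            MinCount=1,
            MaxCount=1,
            SubnetId=subnet_id,
        )
        launched.append(response["Instances"][0]["InstanceId"])
    return launched
```

Running the same routine during DR rehearsals is a practical way to confirm that the AMIs are current and that the resize completes within the RTO.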

About iTMethods

iTMethods is a leading Amazon Web Services (AWS) Advanced Consulting Partner and an Atlassian Expert partner. The company helps clients move critical workloads to the AWS cloud and successfully manage them at any scale. Grounded in years of managed services success, iTMethods understands the technological and human complexities of deploying, automating, and securing workloads in AWS 24/7.

iTMethods team members are forward-thinking individuals, dedicated to learning and sharing successes with their local and global communities. Founded in 2005, they are headquartered in Toronto, Canada.
