FlashArray and Amazon AWS: Snap to AWS / CloudSnap to AWS
[ NOTE: machine translation with the help of DeepL translator without additional proofreading and spell checking ]
My post "FlashArray and FlashBlade: Snap to NFS / Snap to Flashblade" from August 17, 2019 introduced snapshot offloading to NFS storage. With Purity 5.2 the feature "Snap to AWS" was released, another feature based on Pure Storage's portable snapshot technology. The latest Purity 5.3 adds support for Microsoft Azure Blob as well.
CloudSnap works like Snap to NFS; the only difference is the target interface.
As is well known, snapshots are also used for test/dev scenarios and cloning operations. "CloudSnap" extends this functionality so that snapshot capacity does not "besiege" your systems but is offloaded to cheap cloud storage. The transferred snapshots are compressed (not deduplicated) and stored in a format that users and applications cannot read directly, so a FlashArray system is required to restore them. A restore is possible to any supported FlashArray system (including other models, but with at least Purity 5.2).
As always, there are no additional software licenses/costs to use this functionality.
CloudSnap also works agentless, meaning no Pure software is needed in AWS. Data is compressed during the transfer, which saves network bandwidth and increases the efficiency of the target. After the initial transfer of the baseline, only the deltas of subsequent volume snapshots (incremental snapshots) are transferred; this matching is done within Purity.
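The baseline-then-delta idea can be illustrated with a small toy model. This is only a sketch and not how Purity tracks blocks internally: modeling a snapshot as a dict from block address to content hash, and the function name `snapshot_delta`, are assumptions made purely for illustration.

```python
def snapshot_delta(previous, current):
    """Return the blocks of `current` that are new or changed relative to `previous`.

    Both snapshots are modeled as {block_address: content_hash} dicts,
    a simplification used only for this illustration.
    """
    return {
        addr: digest
        for addr, digest in current.items()
        if previous.get(addr) != digest
    }


# Baseline: nothing has been offloaded yet, so every block is transferred.
baseline = {0: "a1", 1: "b2", 2: "c3"}
first_transfer = snapshot_delta({}, baseline)

# Next snapshot: block 1 changed and block 3 was added.
second = {0: "a1", 1: "ff", 2: "c3", 3: "d4"}
second_transfer = snapshot_delta(baseline, second)

print(len(first_transfer), "blocks in the baseline transfer")
print(sorted(second_transfer), "are the block addresses in the delta")
```

In this model the first transfer carries all three blocks, while the second carries only the changed and the added block.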
During a restore, Purity already knows which data blocks are present on the FlashArray and only needs to transfer changed or missing blocks. Restores are also deduplication-optimized: data restored from the offload destination is deduplicated directly during the transfer and thus does not occupy valuable space.
"CloudSnap" is an app/integration that runs in the "heads" of the FlashArray controllers, on the platform known as PurityRUN. The overhead required for this is minimal and has no large impact on the primary storage traffic (max. 10% of performance).
A lower-priority reservation of 4-8 cores and 8-16 GB RAM is created. If the system load no longer allows proper operation of the front-end I/O traffic, the PurityRUN functions are throttled.
CloudSnap can be managed through the FlashArray GUI or CLI and monitored through Pure1. Of course, supported tools can also control operations on the arrays via the REST API.
Similar to the asynchronous Pure Storage replication, CloudSnap uses so-called "Protection Groups".
A dedicated Amazon AWS S3 bucket is required. Dedicated means that the bucket must not contain any data other than what CloudSnap stores there.
What annoyed me at this point is that Purity currently allows a maximum of one offload target. For me that meant I had to disconnect my Snap to NFS target. The limit of 100,000 offloaded volume snapshots should not be a problem. At most 4* FlashArrays can currently be attached to one offload target.
* 2 FlashArrays for backup and restore + 2 FlashArrays for restore only.
Base - Setup
The initial configuration must be done by Pure Storage Support. To do this, simply create a ticket (subject: "Pure Storage PurityRUN CloudSnap enablement"). In advance, you should prepare a free IP address that is reachable on the network. Support needs this IP address to set up the offload.
Once Remote Assist (RA) is enabled, the support staff can perform the configuration. The prepared IP address is configured as a virtual interface on top of a replication interface of both controllers (ct0-eth2, ct1-eth2). It is important to know that this has no influence on the operation of functionalities like ActiveCluster!
Finally, both controllers must be restarted one after the other (no downtime); after that, "CloudSnap" is fully usable.
INFO: since Purity 5.2.0 PurityRUN* is already active by default and contains prepared but deactivated apps. This means that no resources are wasted (when not in use).
In the Settings > Software > App Catalog tab, two prepared apps are displayed by default. You can install them, but not configure them.
PurityRUN* = a KVM virtualization platform for deploying integrations/apps on the Pure Storage system.
Setting up the AWS bucket is quick, and those who just want to test the feature can take advantage of the AWS Free Tier for 1 year.
Login to AWS
First, we log in to the AWS Management Console and switch to the AWS S3 Console Dashboard.
In the dashboard we click on "Create Bucket" and follow the wizard. The bucket must be given a name, and the region of the AWS data center must be selected. It is recommended to always choose the region with the shortest paths (unless geographic separation is required).
As always, I use a unique name for creation and identification: "offloadtoawsfrankfurt".
Next, encryption must be enabled for this bucket. This step is mandatory for the proper configuration of CloudSnap. The setting is made on the bucket itself under "Properties" > "Default Encryption". AES-256 encryption is free of charge, unlike AWS-KMS.
This completes the bucket configuration. The important point here: the more you leave at the default, the better. To ensure function and security, "Lifecycle rules" must never be configured and the bucket must never be made public.
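For those who prefer a script over the console wizard, the two bucket steps (create, then enable default AES-256 encryption) can be sketched as follows. This is a minimal sketch, assuming a client that behaves like a boto3 S3 client; `create_bucket` and `put_bucket_encryption` are boto3 calls, while the function name `create_offload_bucket`, the stub class and the region `eu-central-1` (Frankfurt, guessed from the bucket name) are assumptions. The stub stands in for a real client so that the example runs without AWS credentials.

```python
def create_offload_bucket(s3_client, bucket_name, region):
    """Create a bucket and enable default AES-256 (SSE-S3) encryption.

    `s3_client` is expected to behave like boto3.client("s3", region_name=region).
    Lifecycle rules and public access settings are deliberately left untouched.
    Note: for us-east-1, S3 expects create_bucket without a LocationConstraint.
    """
    s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
    s3_client.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )


class RecordingClient:
    """Hypothetical stand-in that only records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def create_bucket(self, **kwargs):
        self.calls.append(("create_bucket", kwargs))

    def put_bucket_encryption(self, **kwargs):
        self.calls.append(("put_bucket_encryption", kwargs))


client = RecordingClient()
create_offload_bucket(client, "offloadtoawsfrankfurt", "eu-central-1")
for name, kwargs in client.calls:
    print(name, kwargs["Bucket"])
```

With a real account, passing `boto3.client("s3", region_name="eu-central-1")` instead of the stub should perform the same two steps against AWS.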
AWS User Creation
Now, to complete the AWS configuration, a corresponding user with access to the bucket must be created. AWS user administration is handled by "IAM", which can be found via the search:
I created the user "purebucketuser-PURE-X50-2" with "Programmatic access" and assigned it the policy "AmazonS3FullAccess" from the default policies. For arrays that only restore, read-only permissions should be sufficient. I left all other settings at their default values.
With "Create user" the user is created and the access data is generated. You can export them as CSV (which I recommend for later use) and save them to a secure location.
Relevant for later use are the "Access key", the "Secret access key" and the bucket name.
Integration Offload Target / AWS S3
The prepared AWS bucket must now be connected to the FlashArray. This is done via Storage > Array > "+":
An alias must be specified: I am - as always - a fan of unique names here too, so I reused the bucket name "offloadtoawsfrankfurt". I take the Access Key and Secret Access Key from the previously exported CSV. The bucket name can be read from the AWS S3 Console.
If the bucket has not yet been used for CloudSnap, it can be prepared for this with the checkbox "initialize bucket as offload target". The following placement strategies are available:
aws-standard-class: with this option all offloads are stored on standard S3 storage.
retention-based: with this option all offloads are stored either on S3 standard storage or on S3 "cold" infrequent-access storage, also known as S3-IA. The choice depends on the retention values defined for the Protection Groups in Purity: all snapshots that are retained in the target for more than 30 days are placed in S3-IA storage.
unchanged: this option MUST be used when connecting to an existing S3 bucket that has already been used with Purity.
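The retention-based rule can be expressed as a small decision function. This is a sketch of the rule as described above, not of Purity's implementation; the function name `storage_class_for` and the returned labels are assumptions. The "unchanged" option is not modeled, because it simply keeps whatever strategy the existing bucket was initialized with.

```python
def storage_class_for(placement_strategy, retention_days):
    """Pick a storage class label for an offloaded snapshot.

    Mirrors the description above: "aws-standard-class" always uses standard
    S3 storage, "retention-based" uses S3-IA for snapshots that are retained
    for more than 30 days.
    """
    if placement_strategy == "aws-standard-class":
        return "S3 Standard"
    if placement_strategy == "retention-based":
        return "S3-IA" if retention_days > 30 else "S3 Standard"
    raise ValueError(f"strategy not modeled in this sketch: {placement_strategy!r}")


for days in (7, 30, 31, 365):
    print(days, "days ->", storage_class_for("retention-based", days))
```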
Clicking "Connect" establishes the connection between the FlashArray and the bucket. The status is displayed in the overview in Purity. In case of problems you should check the reachability between the systems and the permissions that were set.
The specified S3 bucket is scanned first; if snapshots are detected, they become visible in Purity.
As mentioned earlier, CloudSnap is based on Protection Groups. All volumes within a Protection Group (hereafter PGROUP) can be replicated to one or more defined targets. Within a PGROUP, volumes, snapshot schedules, replication targets/schedules/periods/windows can be defined.
Therefore, first we create a new PGROUP via Storage > Protection Groups > Create Protection Group (PGROUP must not be a member of a container) with a unique name: "PGROUP-offload-TO-FB-01".
Then we define the volumes to replicate, the CloudSnap target, snapshot plans and the replication plan.
Set up Protection Group
Customization Protection Group
The snapshot plan was created according to the following pattern:
A local snapshot is taken every 15 minutes (96 snapshots daily) and remains on the source FlashArray for one day. In addition, one daily snapshot remains on the local system for another 7 days.
So only a few "short-usage" snapshots remain on the FlashArray.
The replication schedule, on the other hand, is actually intended for "long-usage" snapshots. Because I am using an AWS Free account, I did not choose a longer retention:
Every 4 hours (the smallest possible value) a check is made whether new snapshots have to be transferred. The snapshots remain in the bucket for 1 day PLUS a daily snapshot for another 7 days.
After the snapshot plan has been defined, the baseline snapshot is taken automatically; once it is complete, only the incremental snapshots mentioned above are created.
At the time of writing (Purity 5.3.1), the replication interval cannot be set to less than 4 hours.
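To get a feeling for how many snapshots these schedules keep around, the numbers can be worked out with a few lines of arithmetic. This is a rough steady-state estimate only: it assumes that each replication run transfers one snapshot, and the exact counts at interval boundaries may differ in Purity. The function name `retained_snapshots` is made up for this sketch.

```python
from datetime import timedelta


def retained_snapshots(interval, keep_all_for, extra_daily_days):
    """Estimate the steady-state number of retained snapshots.

    interval         -- time between two snapshots
    keep_all_for     -- period during which every snapshot is kept
    extra_daily_days -- additional days for which one daily snapshot is kept
    """
    frequent = keep_all_for // interval  # timedelta // timedelta yields an int
    return frequent + extra_daily_days


# Local schedule: every 15 minutes, kept 1 day, plus 7 daily snapshots.
local = retained_snapshots(timedelta(minutes=15), timedelta(days=1), 7)

# Offload schedule: every 4 hours, kept 1 day, plus 7 daily snapshots.
offloaded = retained_snapshots(timedelta(hours=4), timedelta(days=1), 7)

print(local)      # 96 + 7 = 103
print(offloaded)  # 6 + 7 = 13
```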
Regardless of the planned snapshot and replication schedules, PGROUP snapshots can be created and replicated at any time as required. To do this, switch to the respective PGROUP (Storage > Protection Groups) and create a snapshot via "+". Optionally, you can specify a suffix and choose whether the snapshot should be included in the regular PGROUP snapshot schedule and whether it should be replicated.
Besides the FlashArray GUI you can easily monitor the snapshots and offloads via Pure1. In the Snapshots tab you can list all snapshots on the timeline and get all relevant information (size, target, creation time ...).
It is also possible to work with filters on multiple systems and break down to all available categories granularly.
Via the Protection > Protection Groups tab and the timeline of the respective PGROUP it is possible to view more information about a snapshot/PGROUP.
Statistics such as latencies, IOPS, and bandwidth can be viewed either through Pure1, or as usual on the Flash systems themselves.
(the screenshots in this "Monitoring" section are from the Snap to NFS post, but differ only in the labels and not how they work with CloudSnap).
AWS Snap Recovery
If you really need to restore a storage snapshot from the bucket, this can be done easily (for short-term restores). If the snapshot still exists locally on the FlashArray, the copy in AWS cannot be used (which would not make sense in most cases anyway). If you nevertheless need to restore from the AWS target, the local snapshot must be removed beforehand (with the "immediate" option).
The restore process looks like this:
1. Restore the snapshot from the bucket. In the background, a local copy of the snapshot is created on the FlashArray.
2. Copy the local snapshot to a new volume or overwrite an existing volume.
3. Connect the volume to the host and access the data.
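The restore flow, including the precondition about local snapshots, can be summarized as a small planning function. This is purely an illustrative model: it does not call any Purity API, and the function name `plan_restore` and the wording of the steps are assumptions.

```python
def plan_restore(snapshot_name, local_snapshots, overwrite_volume=None):
    """Return the ordered manual steps for restoring `snapshot_name` from the bucket.

    local_snapshots  -- names of snapshots that still exist on the FlashArray
    overwrite_volume -- name of an existing volume to overwrite, or None to
                        copy the restored snapshot to a new volume
    """
    steps = []
    if snapshot_name in local_snapshots:
        # The offloaded copy cannot be used while a local snapshot exists.
        steps.append(f"remove local snapshot {snapshot_name} (immediate option)")
    steps.append(f"restore {snapshot_name} from the offload target")
    if overwrite_volume is None:
        steps.append("copy the restored snapshot to a new volume")
    else:
        steps.append(f"overwrite volume {overwrite_volume} with the restored snapshot")
    steps.append("connect the volume to the host")
    return steps


for step in plan_restore("snap-1", local_snapshots={"snap-1"}):
    print(step)
```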
We go to Storage > Array > Offload Targets and select the relevant AWS target:
Inside the target we can see the snapshots contained in it and by clicking the Download button we can make a copy of the volume (optionally entering a suffix). With the automatic suffix, the volume is copied according to the following naming pattern: SourceArrayName:ProtectionGroupName.VolumeName.restore-of.SnapshotName.
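The automatic naming pattern can be reproduced with a one-line format string, which may help when searching for restored copies later. The sketch takes the pattern literally as quoted above; the function name `restored_copy_name` and all example values (array, volume and snapshot names) are hypothetical.

```python
def restored_copy_name(source_array, pgroup, volume, snapshot):
    """Build a name following the pattern
    SourceArrayName:ProtectionGroupName.VolumeName.restore-of.SnapshotName."""
    return f"{source_array}:{pgroup}.{volume}.restore-of.{snapshot}"


# Example values are made up for illustration.
print(restored_copy_name("array01", "pgroup01", "vol01", "snap01"))
```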
We can then find the backup volume under: Storage > Volumes.
Now we can copy the volume or directly overwrite the source volume.
First we create a copy of the snapshot; a volume name must be specified, and optionally a container. In addition, you can overwrite existing volumes with "overwrite". I chose "Restore-FROM-AWS" as the volume name.
Then you could connect the volume directly to a host.
When restoring a snapshot, there is no way to adjust settings; the volume is simply overwritten irrevocably.
Afterwards, this volume can be directly connected to a host.
The snapshot copy is NOT removed automatically. This operation must be performed manually. The copy is not removed from the bucket. The deletion takes place only on the local FlashArray, and the deleted copy is kept (as usual) in the trash for 24h.
Anyone who wants to view the contents of the bucket can get a simple overview via the AWS S3 console. The folder structure is managed by Purity itself and must not be modified under any circumstances, nor should space be reclaimed by manual deletions.
More info - Links
All officially published configuration options for the GUI and the CLI can be found in the "on-board" user guides of the Pure Storage systems.
Click on "Help" in the Purity main menu.
The User Guide is structured like the main menu and its sections can be expanded. A search function is also integrated, so you can search for keywords there as well.
WEB: Pure Storage (Pure1) support portal - Ticket system and support *(requires registered FlashSystems)
PHONE: Pure Storage phone support: GER - (+49) (0)800 7239467; INTERNATIONAL - (+1) 650 7294088
WEB: Pure Storage OFFICIAL blog