VMware has done an impressive job of applying the Internet development cycle of incremental releases twice a year to the traditionally much stodgier world of enterprise storage, where customers are used to waiting three to five years for the new features that come with an array’s generational update. The latest 6.6 release adds over 20 new features, and a few of them are pretty significant.
Early HCI solutions, including the initial version of vSAN, allowed virtualization administrators to use host storage as a shared pool but didn’t provide a highly sophisticated storage system. With each new release, the vSAN team added features that improved vSAN’s suitability as a general-purpose storage system for VMs and, more specifically, its support for the remote office/branch office (ROBO) and stretched/metro cluster use cases that HCI addresses so well.
By the time last year’s vSAN 6.2 hit the streets, it included all the major features needed to compete with external storage arrays: erasure coding, compression and data deduplication to boost efficiency, metadata-based snapshots, end-to-end checksums for data integrity, and support for two-node clusters for small remote offices.
Customers have responded to vSAN’s evolution by driving vSAN to a $300 million run rate with over 7,000 customers, more than 1,000 ahead of presumed market leader Nutanix.
vSAN 6.6 is the first vSAN release not bundled into a release or update of VMware’s flagship vSphere suite, a separation that should allow VMware to accelerate the pace of vSAN releases even further now that the vSAN team is freed from following the hypervisor team’s schedule.
If we’re just counting new features, vSAN 6.6 doesn’t disappoint, with 23 of them listed. While a few fall into the to-be-expected category, like support for next-gen workloads, there are significant improvements in management, security, and resiliency, in addition to what VMware claims is an up to 50% boost in performance.
Encryption at Rest
IT organizations are facing ever greater scrutiny from regulators, auditors and other things that go bump in the night over data governance. These authorities are recommending, if not requiring, that data be encrypted when at rest on SSDs and/or hard drives.
Encrypting data at rest eliminates failed, or stolen, drives as a data leak vector, allowing customers to simply throw a failed drive away rather than arranging for expensive certified disposal. Given that HCI systems in remote and branch offices are stolen more frequently than servers in more secure data centers, encrypting data at rest just seems like a good idea.
One option is to use the per-VM encryption built into vSphere 6.5. Unfortunately, encrypting VMs means those VMs will not compress or deduplicate when stored on vSAN or on an external array with data reduction.
For an HCI system to both reduce and encrypt data, it has to reduce the data before encrypting it. Some other HCI vendors provide data-at-rest encryption by using self-encrypting drives (SEDs) in their HCI appliances. At boot time the HCI appliance sends the encryption key to the SED, and the drive performs all the data encryption/decryption with its internal processor. Since SEDs encrypt data in hardware, customers can only add encryption during a hardware upgrade.
vSAN 6.6 encrypts data in software as it’s written to each host’s drives. Data is encrypted in the performance tier, then decrypted, reduced, and re-encrypted as it is demoted to the capacity tier. Both of VMware’s encryption solutions integrate with the leading KMIP key managers, so large enterprises can centralize and enforce their encryption standards.
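The ordering matters more than it might seem. Here’s a minimal Python sketch of why data must be reduced before it’s encrypted; a toy SHA-256 keystream stands in for AES, and zlib stands in for vSAN’s compression, so the sizes are illustrative only:

```python
import hashlib
import zlib

def keystream(key: bytes, n: int) -> bytes:
    # Toy counter-mode keystream built from SHA-256; a stand-in for a
    # real cipher like AES-XTS, not something to use in production.
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def encrypt(key: bytes, data: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = b"per-disk-key"
block = b"ACME sales record, Q3, region west. " * 200  # highly redundant data

# Reduce first, then encrypt: compression sees the redundancy.
reduced_then_encrypted = encrypt(key, zlib.compress(block))

# Encrypt first, then try to reduce: ciphertext looks random,
# so compression finds nothing to squeeze out.
encrypted_then_reduced = zlib.compress(encrypt(key, block))

print(len(block))                   # 7200 bytes of plaintext
print(len(reduced_then_encrypted))  # small
print(len(encrypted_then_reduced))  # roughly the full 7200 bytes, or more
```

The same logic explains the tiering dance described above: data has to be decrypted and reduced before it can be usefully re-encrypted on the capacity tier.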
They both also leverage the AES-NI encryption instructions that have been included in every Xeon since 2010’s Westmere (Xeon 5600). The last-generation Haswell-EP (E5 v3) chips boosted encryption performance even further, reaching almost 10Gbps, or 300,000 4K writes, per core.
Even a very heavily loaded all-flash vSAN node delivering 100,000 IOPS with a 60/40 read/write mix would consume only a small fraction of a single core encrypting data. With today’s most popular virtualization hosts sporting 20 or 24 compute cores, the fraction of a core required for encryption shouldn’t be significant enough to impact VM performance.
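To sanity-check that claim, here’s the back-of-the-envelope arithmetic in Python, counting only the write path and using the roughly 300,000 encrypted 4K writes per second per AES-NI core cited above:

```python
# Assumed figures: 100,000 IOPS node, 60/40 read/write mix, and
# ~300,000 4K encrypted writes/sec per AES-NI core (per the article).
node_iops = 100_000
write_fraction = 0.40
writes_per_sec = node_iops * write_fraction   # 40,000 encrypted writes/sec

per_core_capacity = 300_000                   # 4K writes one core can encrypt

cores_for_encryption = writes_per_sec / per_core_capacity
print(cores_for_encryption)                   # ~0.13 of one core

host_cores = 24
print(cores_for_encryption / host_cores * 100)  # ~0.56% of a 24-core host
```

Even before counting reads that must be decrypted, the overhead is a rounding error on a modern host.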
Smarter Failure Handling
Older versions of vSAN, like many external storage systems, treated the devices they managed as either operating or defective and therefore off-line. A drive that timed out on a single command because it was busy correcting an error that occurred during garbage collection would be declared dead and a rebuild begun.
vSAN 6.6 has gotten smarter about both declaring a device dead when it’s really just resting, or pining for the fjords, and about how it rebuilds data. Among other new tricks, vSAN collects trend data on drive health and proactively rebuilds while keeping the degraded drive available just in case it contains the last copy of some data object.
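VMware hasn’t published the heuristics behind this, but the idea of acting on a health trend rather than a single timeout can be sketched in a few lines of Python; the field name and thresholds here are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DriveHealth:
    # Hypothetical daily samples of a SMART-style error counter.
    reallocated_sectors: list[int] = field(default_factory=list)

    def is_degraded(self, window: int = 5, max_growth: int = 10) -> bool:
        # Degraded only if the error count grew faster than max_growth
        # across the last `window` samples -- a trend, not one timeout.
        s = self.reallocated_sectors[-window:]
        return len(s) == window and s[-1] - s[0] > max_growth

healthy = DriveHealth([3, 3, 3, 4, 4])
failing = DriveHealth([3, 9, 17, 30, 52])

for drive in (healthy, failing):
    if drive.is_degraded():
        # Proactive rebuild: evacuate data while the source stays
        # readable, rather than declaring it dead and reconstructing
        # everything from surviving replicas.
        print("start proactive evacuation; keep drive online as last-resort copy")
    else:
        print("drive OK")
```

The payoff is that the degraded drive remains available as a source during the rebuild, exactly the "just in case it contains the last copy" behavior described above.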
Stretch Cluster Site Protection
Prior to version 6.6, vSAN clusters could be stretched between two data centers over a low-latency link, and vSAN would mirror the VM data with one copy in each data center. Unfortunately, simple mirroring just couldn’t provide the level of availability customers build campus or metro clusters to achieve.
With vSAN 6.6, virtual machine storage policies can specify the data protection level, including erasure coding, to be used within each data center as well as whether the data should be mirrored between data centers. In addition to the primary (local) and secondary (remote mirror) protection levels, vSphere storage policies can also specify site affinity to limit a VM’s data to a single site.
Site affinity was designed for applications like Exchange DAGs that maintain their own data replication. Affinity allows system administrators to control which copies of a DAG or SQL Server Availability Group are stored at each site.
A vSAN system using double-parity erasure coding and mirrored across two data centers can transparently recover from multiple failures, up to the loss of an entire data center, without all the cost and complexity of a solution like Dell EMC’s VPLEX.
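As an illustration only, the policy combinations described above might look like the following sketch. The field names are invented; real vSAN policies are configured through vCenter’s Storage Policy Based Management or PowerCLI, not Python:

```python
# Hypothetical per-site policy sketch -- field names are invented.
policies = [
    {
        "name": "exchange-dag-copy-a",
        # The DAG replicates its own data, so vSAN keeps this copy's
        # objects at one site and applies only local protection.
        "site_disaster_tolerance": "none (site affinity: preferred site)",
        "failures_to_tolerate": 1,
        "fault_tolerance_method": "RAID-5 erasure coding",
    },
    {
        "name": "tier1-app",
        # Mirrored between sites, with double-parity erasure coding
        # inside each site: survives a site loss plus further failures.
        "site_disaster_tolerance": "dual-site mirroring",
        "failures_to_tolerate": 2,
        "fault_tolerance_method": "RAID-6 erasure coding",
    },
]

for p in policies:
    print(p["name"], "->", p["site_disaster_tolerance"])
```

The point of the sketch is the separation of concerns: cross-site protection (mirroring or affinity) and within-site protection (failures to tolerate, erasure coding) are set independently per VM.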
Improved Management and Analytics
Storage systems today collect several orders of magnitude more data about their health and the workloads they’re supporting than ever before. vSAN 6.6 joins this trend, sending detailed phone-home data to VMware. In the initial release, VMware will be using this data to send firmware and drive update alerts to targeted customers. This kind of customized recommendation is critical when you consider the range of hardware that vSAN supports and the limited communications some customers may have with their server OEM.
VMware’s also simplified the installation of vSAN with wizards and a simple solution to the “how do I install vCenter to run on vSAN when I need vCenter to install vSAN” chicken-and-egg problem. Installing vSAN is also simplified by the switch from multicast to unicast messages for internode communications.
More significantly, vSAN now includes 1-click controller hardware maintenance that automates the process of updating SAS controllers’ firmware, a process that used to require booting servers from thumb drives to update firmware from a FreeDOS console. I’m hoping VMware extends this to network cards and Fibre Channel HBAs soon.
For large-scale users where vCenter isn’t the center of the data center management universe, vSAN has improved programming interfaces, primarily through PowerCLI, and an improved management pack for vRealize Operations Manager. The vRealize plugin carries storage performance history over as VMs are migrated to vSAN, providing a simple response when the application owner blames the move to vSAN for performance problems.
We’re impressed that VMware can manage storage software on a web 2.0, six-month release schedule. Over the past three years, VMware’s vSAN has evolved beyond providing basic storage for ROBO and VDI islands to deliver all the features and resiliency we’ve come to expect from a midrange storage array. With 6.6, vSAN has advanced from providing all the expected features to adding advanced functions and analytics.
Stretch Cluster Site Protection puts vSAN 6.6 at the top of our list for campus and other metro-cluster use cases. Existing solutions for split clusters are expensive, complicated, or both, while vSAN 6.6 is relatively simple and cost-effective.