Back in 2014, VMware introduced VSAN, a virtualized storage platform that creates a distributed shared datastore from the local disks installed in the vSphere servers of the cluster. While it took a few years to become mainstream, nowadays VSAN can be found in most greenfield VMware environments. There is a plethora of benefits to running this storage stack, such as a simplified storage management lifecycle, no need for bulky and expensive storage arrays with their own support and maintenance lifecycle, consolidation of vendor contracts and support endpoints, simplified installation and management, and so on.

VSAN offers a number of architectures to choose from, such as dual node, dual node direct connect, stretched cluster and more. Note that in VSAN 8.0, VMware introduced a completely new storage architecture under the hood to improve the performance of VSAN on modern hardware; you will find more details in our dedicated blog on the topic.


You can usually gauge the maturity of a product or solution when the vendor starts implementing features to address edge cases that weren’t thought of at the beginning. It means the product is stable and all the core functionality is implemented to cover 95% of customer cases. This is the case with VSAN HCI Mesh Compute Clusters, a feature that was requested by customers and added to VSAN as a result.


“VSAN nodes are equipped with local drives that make up the virtualized shared datastore”


VSAN ESA limitations

Note that if vSAN Express Storage Architecture (ESA) is enabled in your cluster, you cannot use that cluster in any HCI Mesh setup.

VMware VSAN clusters resource distribution

VMware vSAN works at the cluster level, with all nodes participating. In VSAN OSA (Original Storage Architecture), the nodes have a cache tier and a capacity tier organized in disk groups, whether the cluster is all-flash or hybrid. In the new VSAN ESA, disk groups have been replaced with storage pools where all disks contribute to both cache and capacity in the VSAN datastore. As mentioned above, VSAN ESA clusters don’t support sharing the datastore with a Mesh Compute Cluster.

When building a VSAN cluster, it is recommended to go for vSAN Ready Nodes when it comes to hardware as these servers are configurations that are tested and verified by VMware and the server vendor.

Now, because the servers handle both compute and storage, those two resources are no longer decoupled as they are with a storage array. This can make scaling trickier if you want to keep using both resources efficiently:

  • If you add servers to increase compute, you also add storage capacity that you don’t need and that will remain unused
  • If you add servers to increase storage, you also add memory and CPU, which are even more expensive

I am obviously not talking about scenarios where you could simply add disks to the disk group or add more memory DIMMs to the servers.
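To make the coupling concrete, here is a rough back-of-the-envelope sketch. The node specifications below are made-up example figures, not a sizing recommendation:

```python
# Illustration of compute/storage coupling when scaling a homogeneous
# HCI cluster: every node added brings a fixed bundle of CPU, RAM and disk.
# All node specs are hypothetical example figures.
import math

NODE = {"cores": 32, "ram_gb": 512, "storage_tb": 20}

def nodes_needed(req_cores, req_ram_gb, req_storage_tb):
    """Smallest node count that satisfies every resource requirement."""
    return max(
        math.ceil(req_cores / NODE["cores"]),
        math.ceil(req_ram_gb / NODE["ram_gb"]),
        math.ceil(req_storage_tb / NODE["storage_tb"]),
    )

# A storage-heavy workload: lots of capacity, little compute.
n = nodes_needed(req_cores=64, req_ram_gb=1024, req_storage_tb=200)
unused_cores = n * NODE["cores"] - 64  # cores bought but never needed
print(n, unused_cores)  # 10 nodes, 256 spare cores
```

With HCI Mesh, a cluster short on storage could instead mount the remote VSAN datastore and consume that spare capacity, rather than overbuying compute just to reach a storage target.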

The need for VSAN HCI mesh clusters

To address this and add more flexibility to the solution, VMware introduced “HCI Mesh clusters” in vSphere 7 Update 1, a capability that lets a cluster mount a remote VSAN datastore from another VSAN cluster. As a result, you get to share your VSAN storage capacity with more nodes and improve efficiency.

HCI Mesh clusters use RDT (Reliable Datagram Transport), which runs over TCP/IP. It is a protocol optimized to move large amounts of data with high performance and solid reliability, which suits storage use cases.


“HCI Mesh lets you connect to remote VSAN datastores.”

What are VSAN HCI Mesh Compute Clusters?

While VSAN HCI Mesh Clusters is great and adds flexibility, not all environments run VSAN in every cluster. Clusters using regular shared storage (iSCSI, FC, SAS…) couldn’t use the VSAN capacity even if it had tons of available space. This is where VSAN HCI Mesh Compute Clusters come into play: they allow non-VSAN clusters to mount a remote VSAN datastore (also via the RDT protocol) and don’t require a VSAN license on the client cluster, which is a bonus.

After a remote vSAN datastore is mounted on the Mesh Compute Cluster, you can then migrate VMs between clusters with standard vMotion.

The benefit of consuming a vSAN datastore via an HCI Mesh Compute Cluster, rather than sharing it through iSCSI or NFS, is that you retain storage policy-based management (SPBM), native monitoring, simpler operations, and so on.


“vSAN datastore capacity can be leveraged by regular vSphere clusters”

vSAN HCI Mesh Compute Clusters prerequisites

There are a few prerequisites and recommendations to consider for HCI Mesh Compute clusters such as:

  • VSAN Enterprise license on the hosting VSAN cluster
  • VMCP enabled with “Datastore with APD – Power off and restart VMs”
  • 10 Gbps for VSAN traffic
  • VSAN cluster and Mesh Compute Cluster managed by the same vCenter server
  • Maximum of 5 clusters per datastore and 5 datastores per cluster
  • Clusters must be running 7.0 Update 1 or later
  • Maximum of 10 client clusters per VSAN cluster and up to 128 vSAN hosts, including hosts in the vSAN server cluster
  • Stretched clusters, data-in-transit encryption and 2-node architectures are unsupported
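A quick way to keep the scale limits in mind is to encode them. The sketch below checks a planned topology against the figures listed above (the constants come straight from that list; always verify them against the vSAN version you are actually running):

```python
# Sanity-check a planned HCI Mesh topology against the documented limits.
# Figures as listed above for vSphere 7.0 U1+; verify for your vSAN version.
MAX_REMOTE_DATASTORES_PER_CLIENT = 5
MAX_CLIENT_CLUSTERS_PER_DATASTORE = 5
MAX_TOTAL_HOSTS = 128  # client hosts plus hosts in the vSAN server cluster

def check_mesh(remote_datastores, client_clusters, total_hosts):
    """Return the list of violated limits (an empty list means the plan fits)."""
    issues = []
    if remote_datastores > MAX_REMOTE_DATASTORES_PER_CLIENT:
        issues.append("too many remote datastores mounted on the client cluster")
    if client_clusters > MAX_CLIENT_CLUSTERS_PER_DATASTORE:
        issues.append("too many client clusters mounting the datastore")
    if total_hosts > MAX_TOTAL_HOSTS:
        issues.append("too many hosts participating overall")
    return issues

print(check_mesh(remote_datastores=2, client_clusters=3, total_hosts=40))  # []
```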

How to use VSAN HCI mesh compute cluster

Let’s take a look at how to mount a vSAN datastore using the HCI Mesh Compute Cluster feature.

“LAB01-Cluster” is my long-lived lab VSAN cluster, and its datastore is called “LAB01-VSAN”. I created a dummy cluster named “LAB02-Cluster” with a single node and no local storage. However, I did add a vSAN-enabled VMkernel adapter in the same subnet as the VSAN traffic of LAB01-Cluster. My vExpert VSAN license is installed in LAB01-Cluster, but LAB02-Cluster only has a vSphere license.

Enable VSAN HCI mesh compute cluster

1. In the client cluster, go to Configure > vSAN services and click Configure vSAN.
2. In the vSAN configuration wizard, click vSAN HCI Mesh Compute Cluster > Next.


3. You can then click Finish.

Connect a remote VSAN datastore

1. In the vSAN services menu of the client cluster, click Mount Remote Datastores, then click Mount Remote Datastore in the Remote Datastores pane.


2. Select the remote VSAN datastore to connect to and click Next. If the list is empty, there is no compatible cluster; this could be due to connectivity issues, the absence of a VSAN cluster in the datacenter entity, older versions, and so on.


3. In the next pane, all compatibility checks should be green. If issues are detected, double-check the environment against the requirements listed above.


4. Once you hit Finish and the process completes, the remote datastore should appear in the list of remote datastores. It will also show up among the datastores available to the client cluster.

You can also find this information in the vSAN configuration pane of the host cluster (the cluster serving the VSAN datastore); there, however, the datastore will appear as Local as opposed to Remote.


5. At this point, you can migrate virtual machines stored on the VSAN datastore between the host cluster and the client cluster.

Because the VM is now stored on storage that is accessible by hosts in both clusters, a simple vMotion (without moving the virtual disks) is sufficient to migrate it across. To make this more obvious when initiating a vMotion, the placement field of the Migrate UI displays either Remote or Local, so you know what type of datastore you are migrating the virtual machine to.
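The migration logic can be reduced to a simple rule of thumb, sketched below with a toy model (the datastore names mirror my lab setup; this is an illustration, not how vCenter actually decides):

```python
# Which migration type is needed? If the destination cluster can already
# see the VM's datastore (as with a mounted HCI Mesh remote datastore),
# a compute-only vMotion is enough; otherwise the disks must move too.

def migration_type(vm_datastore, destination_accessible_datastores):
    """Return the migration type for a simplified shared-storage model."""
    if vm_datastore in destination_accessible_datastores:
        return "vMotion (compute only)"
    return "vMotion + Storage vMotion"

# LAB02-Cluster has mounted the remote datastore LAB01-VSAN:
print(migration_type("LAB01-VSAN", {"LAB01-VSAN", "LAB02-Local"}))
# vMotion (compute only)
```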


After the migration finishes, your virtual machine will be running on the host of the client cluster but the files making up the VM will be stored on the disks that are in the hosts of the host VSAN cluster.

How to remove a remote vSAN datastore from a Mesh Compute Cluster?

Removing a remote datastore from a Mesh Compute Cluster amounts to unmounting it. The procedure is quick and simple, but you must first check that no powered-on VM in the client cluster resides on the datastore (including vSphere Cluster Services – vCLS – VMs).

  • Go to Cluster > Configure > vSAN Services, select the remote datastore and click UNMOUNT


Inter-cluster link failure

Like in every infrastructure setup, testing failure scenarios should be part of the implementation phase. Whether you are deploying for a client, where you need to write an architecture document, or in your own organization, where you will be the one running the clusters, ensuring the environment is resilient to various failures (disk, node, network) and knowing how it behaves in such cases is critical. In this article we will only cover the failure of the link between the clusters, as any other issue is handled similarly to a traditional VSAN cluster.

As you will have gathered by now, the HCI Mesh Compute Cluster connects to the host VSAN cluster over the network through the VSAN VMkernel adapter, conceptually similar to how hosts connect to a storage array. As a result, you should consider how the environment would behave should the link to the host VSAN cluster fail.

This could happen, for instance, when the clusters are in different racks or different switch stacks.


“Loss of access to VSAN host cluster via VSAN vmkernel.”

As mentioned in the requirements, you should enable vSphere HA with the “Datastore with APD” response. Whether you set it to conservative or aggressive will depend on your needs.

Now, when the link between the clusters fails, APD is declared after 60 seconds and you will see events like “lost access to volume xxx”. After APD is declared, a 180-second timer starts before the APD response is triggered and the VMs are powered off. They will remain in an Inaccessible state until you restore access to the VSAN datastore.
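The timeline above can be summarised in a small sketch (the 60-second APD detection and 180-second response delay are the values mentioned above; a simplified model, not the actual HA state machine):

```python
# Timeline of an inter-cluster link failure with VMCP configured for
# "Datastore with APD - Power off and restart VMs".
APD_DETECTION_S = 60        # time until All Paths Down is declared
APD_RESPONSE_DELAY_S = 180  # delay before the APD response powers off VMs

def apd_state(seconds_since_link_failure):
    """State of the affected VMs at a given time after the link drops."""
    if seconds_since_link_failure < APD_DETECTION_S:
        return "I/O stalled, no APD declared yet"
    if seconds_since_link_failure < APD_DETECTION_S + APD_RESPONSE_DELAY_S:
        return "APD declared, response timer running"
    return "APD response triggered: VMs powered off"

for t in (30, 120, 300):
    print(t, apd_state(t))
```

In other words, the VMs are only powered off 240 seconds after the link drops, which gives a transient network blip some room to recover before HA takes action.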

Wrap-up

VSAN has grown at an incredible pace over the years and kept taking market share from the big storage vendors thanks to its versatility and ease of use (setup, configuration and management). On top of that, side products such as vRealize Operations and vRealize Automation (now VMware Aria) integrate natively with VSAN.

VSAN HCI Mesh Compute Clusters add even more flexibility and allow organizations to make better use of the storage capacity of their HCI environment, reducing TCO by cutting down on wasted capacity.

Follow our Twitter and Facebook feeds for new releases, updates, insightful posts and more.
