This article is about ESXi clusters for the Federation Enterprise Hybrid Cloud (FEHC) 3.1, which contains what is called an NEI Pod, but it is also broadly applicable to Edge Clusters in any NSX deployment (although VMware only recommends 3 hosts). I work on FEHC for EMC, so this post explains why 4 hosts are mandated for the NEI Pod.
Whether it is called an NEI Pod, an Edge Cluster, or any other name, we are referring to dedicated hardware for running VMware NSX Edge Services Gateways (ESGs) and Distributed Logical Router (DLR) Control VMs. In FEHC these look like this:
The issue is that many customers don’t want to waste a minimum of 4 blades just for the NEI Pod. The solution was a collapsed Management Pod that looked like this:
This saves valuable hardware resources, but some customers still want to keep the NEI Pod separate, perhaps to add extra NICs or to move from Cisco UCS B-Series to Cisco UCS C-Series to allow for greater bandwidth. This always leads to the same question: there are only 3 NSX Controllers, so why does the NEI Pod need 4 ESXi hosts?
Let’s look at how routing updates are sent from the DLR Control VMs to the DLR within each ESXi host.
The ESGs are peered with the DLR Control VMs, which send routing updates to the NSX Controllers, and from there the updates go to the ESXi hosts that contain the DLR.
Now consider the failure of an ESXi host that contains one of our ESGs:
Traffic that was flowing over the failed ESG must now be routed over a surviving ESG. This would happen as depicted:
All is fine here, but what would happen if an ESG and our active DLR Control VM were on the same host?
This is more serious, because the passive DLR Control VM first has to realise that the active DLR Control VM has failed, then become active, before it can send updates to the NSX Controllers. This adds to the time taken to route traffic over the remaining ESGs, see below:
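To make the difference between the two failure cases concrete, here is a minimal Python sketch that lists the control-plane steps each one needs. It is only an illustrative model of the sequence described above, not an NSX API: the function name, the role names (`esg_1`, `dlr_control_active`, and so on) and the host names are all hypothetical, and no timings are implied.

```python
# Illustrative model only: role and host names are hypothetical, not NSX objects.

def reconvergence_steps(placement, failed_host):
    """Return the ordered control-plane steps needed after a host failure.

    placement maps a VM role to the ESXi host it runs on.
    """
    lost = {role for role, host in placement.items() if host == failed_host}
    steps = []

    # Extra work when the active DLR Control VM went down with the host.
    if "dlr_control_active" in lost:
        steps.append("passive DLR Control VM detects loss of the active VM")
        steps.append("passive DLR Control VM becomes active")

    # Normal route update path when an ESG was lost.
    if any(role.startswith("esg") for role in lost):
        steps.append("DLR Control VM sends routing update to NSX Controllers")
        steps.append("NSX Controllers push the update to the DLR on each ESXi host")
        steps.append("traffic is routed over the surviving ESG")

    return steps


# ESGs and DLR Control VMs each on their own host.
separated = {
    "esg_1": "host1",
    "esg_2": "host2",
    "dlr_control_active": "host3",
    "dlr_control_passive": "host4",
}

# An ESG sharing a host with the active DLR Control VM.
colocated = {
    "esg_1": "host1",
    "dlr_control_active": "host1",
    "esg_2": "host2",
    "dlr_control_passive": "host3",
}

for name, layout in (("separated", separated), ("colocated", colocated)):
    print(name)
    for step in reconvergence_steps(layout, "host1"):
        print("  -", step)
```

In the separated layout, losing `host1` needs only the three route update steps. In the colocated layout the two failover steps have to complete first, which is the extra delay described above.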
So that explains why you need 4 ESXi hosts, but we also need DRS rules to separate:
- DLR Control VMs from our ESGs
- NSX Controllers
- NSX Load Balancers
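As a rough way to sanity-check a planned layout against these separation rules, the sketch below takes a VM-to-host mapping and reports any pair that breaks a rule. It is plain Python, not a call into vCenter or DRS, and all VM names, host names and rule names are made up for illustration. It also assumes that the second and third bullets mean "keep the NSX Controllers apart from each other" and "keep the NSX Load Balancers apart from each other".

```python
from itertools import combinations, product

# Hypothetical rule set mirroring the list above; names are illustrative.
RULES = [
    # Keep every VM in "vms" off any host used by a VM in "others".
    {"name": "dlr-vs-esg", "kind": "between",
     "vms": ["dlr_control_active", "dlr_control_passive"],
     "others": ["esg_1", "esg_2"]},
    # Keep the members of a group apart from each other (assumed reading).
    {"name": "controllers", "kind": "within",
     "vms": ["controller_1", "controller_2", "controller_3"]},
    {"name": "load-balancers", "kind": "within",
     "vms": ["lb_1", "lb_2"]},
]


def find_violations(placement, rules):
    """Return (rule, vm_a, vm_b, host) for each pair sharing a host against a rule."""
    found = []
    for rule in rules:
        if rule["kind"] == "within":
            pairs = combinations(rule["vms"], 2)
        else:
            pairs = product(rule["vms"], rule["others"])
        for vm_a, vm_b in pairs:
            if placement[vm_a] == placement[vm_b]:
                found.append((rule["name"], vm_a, vm_b, placement[vm_a]))
    return found


# One possible layout across 4 hosts that satisfies every rule.
good = {
    "esg_1": "host1", "esg_2": "host2",
    "dlr_control_active": "host3", "dlr_control_passive": "host4",
    "controller_1": "host1", "controller_2": "host2", "controller_3": "host3",
    "lb_1": "host3", "lb_2": "host4",
}

# The same layout with the active DLR Control VM moved next to an ESG.
bad = dict(good, dlr_control_active="host1")

print(find_violations(good, RULES))
print(find_violations(bad, RULES))
```

The first call reports no violations, while the second flags the active DLR Control VM sharing `host1` with `esg_1`, the situation the DRS rules are there to prevent.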
We can make optimal use of our hardware by reversing the layout for the Blue DLRs and ESGs and the Green DLRs and ESGs, as shown in the diagram below.
As always, if you have any comments or have spotted any mistakes, please leave a comment.