ESXi - vCenter connection behind NAT.

Problem

I recently needed to put a router performing NAT overload between an ESXi server and its vCenter server. According to VMware this is an unsupported configuration, but since when is anything I do supported anyway? If it's technically possible, go for it!

In my setup the vCenter server sat on the "LAN" (inside) side of the router and the ESXi host on the "WAN" (outside) side. You would not expect that to be a problem, since you add the host to vCenter by its ESXi IP address and vCenter makes the connection outbound.

This initially worked, as I'd expected. However, after approximately one minute the host simply dropped offline in vCenter. I could still ping it and could still manage it directly with the standalone vSphere Client. I could even reconnect it in vCenter, but it would only last another minute or so before dropping again. I suspected NAT was the problem.

Solution

Sure enough, NAT was the problem. When an ESXi host is connected to a vCenter server, heartbeats pass between the two. This wouldn't be a problem if the exchange were symmetric (i.e. if vCenter sent a request to ESXi from source port 1234 to destination port 902, and ESXi then replied from port 902 to port 1234), because the NAT router could match the reply to the connection it had already seen. But of course it's not that simple. The destination port was always 902 but the source port was random, so the heartbeats never match an entry in the router's state table and the router cannot pass them through. There is also another issue, which I will mention later.

To rectify this, UDP port 902 must be opened on the NAT router and forwarded to the vCenter server.
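How you create the forward depends entirely on the router, which is not specified here. As a rough sketch only, assuming a Linux-based router that uses iptables, the rules could look like the ones printed below. The function name, the interface name eth0 and the use of 10.0.0.1 as the vCenter inside address are illustrative assumptions. The script only prints the rules so they can be reviewed; it does not apply anything.

```shell
#!/bin/sh
# Sketch only: prints iptables rules that would forward UDP 902 to vCenter.
# Assumes a Linux/iptables router; nothing is applied by this script.
# print_heartbeat_forward_rules is a hypothetical helper name.
print_heartbeat_forward_rules() {
    wan_if="$1"       # router's outside (WAN) interface, assumed name
    vcenter_ip="$2"   # vCenter server's inside (LAN) address

    # Rewrite the destination of inbound UDP 902 to the vCenter server.
    echo "iptables -t nat -A PREROUTING -i $wan_if -p udp --dport 902 -j DNAT --to-destination $vcenter_ip:902"

    # Allow the rewritten packets through the forward chain.
    echo "iptables -A FORWARD -i $wan_if -p udp -d $vcenter_ip --dport 902 -j ACCEPT"
}

# Example with placeholder values:
print_heartbeat_forward_rules eth0 10.0.0.1
```

On other routers the equivalent is usually a static port-forward or "virtual server" entry for UDP 902 pointing at the vCenter address.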

There is still one other minor issue. The ESXi host believes the vCenter server is reachable at its inside address, because that is the address vCenter programs into the host when it makes a connection. This has to be overridden.

Within ESXi modify this file: /etc/vmware/vpxa/vpxa.cfg

You may need to change the permissions of the file using the following: chmod 744 /etc/vmware/vpxa/vpxa.cfg

Modify the <serverIp>10.0.0.1</serverIp> directive so that it contains the WAN (outside) address of the NAT router instead of the vCenter server IP.

Also add the following line: <preserveServerIp>true</preserveServerIp>, otherwise the IP you just entered will be overwritten.

Save the file and change the permissions back to their original settings with: chmod 444 /etc/vmware/vpxa/vpxa.cfg

Restart the management agents (including vpxa) on the host with: services.sh restart
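The file edit above can also be scripted. The following is a minimal sketch, not a tested procedure: patch_vpxa_cfg is a hypothetical helper name, 203.0.113.10 is a placeholder for the router's outside address, and it assumes awk is available in the host's shell and that <preserveServerIp> belongs directly after <serverIp> (the exact position is my assumption). It keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Sketch only: set <serverIp> to the NAT router's outside address and add
# <preserveServerIp>true</preserveServerIp> right after it.
# patch_vpxa_cfg is a hypothetical helper; placement of the new line is assumed.
patch_vpxa_cfg() {
    cfg="$1"      # path to vpxa.cfg
    nat_ip="$2"   # WAN (outside) address of the NAT router

    cp "$cfg" "$cfg.bak"   # keep a backup of the original

    awk -v ip="$nat_ip" '
        # Drop any existing preserveServerIp line; it is re-added below.
        /<preserveServerIp>/ { next }
        /<serverIp>/ {
            # Reuse the leading whitespace of the original line.
            indent = $0
            sub(/<serverIp>.*/, "", indent)
            print indent "<serverIp>" ip "</serverIp>"
            print indent "<preserveServerIp>true</preserveServerIp>"
            next
        }
        { print }
    ' "$cfg.bak" > "$cfg"
}

# On the host, the whole procedure would then be roughly:
#   chmod 744 /etc/vmware/vpxa/vpxa.cfg
#   patch_vpxa_cfg /etc/vmware/vpxa/vpxa.cfg 203.0.113.10
#   chmod 444 /etc/vmware/vpxa/vpxa.cfg
#   services.sh restart
```

Try it on a copy of the file first and compare the result against the .bak copy before restarting the agents.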

The host should now be online within vCenter and should stay online!