Tuesday, January 10, 2017

Creating an Active-Passive Pool on VMware NSX Load Balancer

In most of my NSX load balancing deployments at customer sites, there is a use case for Active-Standby (Active-Passive) load balancing, where the load balancer always forwards traffic to the primary member and forwards to the secondary member only if the primary member is down. The secondary member works as a standby.

For comparison, F5 provides this through its Priority Group feature, which assigns a priority number to each pool member. Within the pool, traffic is then load balanced according to those priority numbers: members with a higher priority receive all traffic until the load reaches a certain level or some number of members in the group become unavailable.

This feature is not available in the NSX load balancer, but we can use an NSX Application Rule to achieve a similar result. The NSX Layer 7 engine is based on HAProxy, so we can use an HAProxy ACL to implement Active-Standby load balancing. Application rules let NSX apply advanced load balancing logic that may not be possible with the application profiles or services natively available on the NSX Edge. The application rule is attached to the virtual server configuration.

We can use nbsrv in an ACL to check whether the primary pool is down (its number of usable members is 0) and, if so, switch to the secondary pool. To achieve this, we create two pools, create the application rule, and apply it to the virtual server.
Below is the Application Rule for an active/standby pool:
acl pool_is_down nbsrv(active_pool_name) eq 0
use_backend standby_pool_name if pool_is_down
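
To make the decision logic of the two-line rule easier to follow, here is a minimal Python sketch of it. This is only a model for illustration, not how the NSX Edge or HAProxy is implemented: the Pool class, the members_up field, and the select_backend function are hypothetical names, and members_up stands in for the value nbsrv would report for a pool.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    members_up: int  # stand-in for what nbsrv(<pool>) would report


def select_backend(active: Pool, standby: Pool) -> str:
    """Model of the rule: use the standby pool only when the active pool has no usable member."""
    pool_is_down = active.members_up == 0  # acl pool_is_down nbsrv(active_pool_name) eq 0
    if pool_is_down:                       # use_backend standby_pool_name if pool_is_down
        return standby.name
    return active.name                     # otherwise the virtual server's default pool is used


active = Pool("active_pool_name", members_up=2)
standby = Pool("standby_pool_name", members_up=2)

# Walk through a few health states of the active pool: 2 up, 1 up, 0 up, 1 up again.
chosen = []
for up in (2, 1, 0, 1):
    active.members_up = up
    chosen.append(select_backend(active, standby))

print(chosen)
```

In this model the standby pool is chosen only in the state where the active pool has zero members up, and the choice is re-evaluated on every state, so the active pool is selected again as soon as one of its members is back.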
Here is a step-by-step configuration in VMware HOL-1703-SDC-1-HOL:
1. Create two pools: pool-web-01a (active pool) and pool-web-02a (standby pool)

2. Create the Application Rule. To add a comment in the script, start the line with #. In the screenshot example, rows 1 and 3 are comments and the rule statements are in rows 2 and 4.

3. Create the Virtual Server, select the designated active pool as the Default Pool, and apply the Active-Standby Application Rule

The load balancer will now always forward to pool-web-01a and only use pool-web-02a when pool-web-01a is down or has no members. A caveat to note is that failback/preempt is enabled without delay, so the load balancer will instantly switch back and forward to pool-web-01a whenever it comes back online.
I haven't found a way to disable or delay the preempt. So if your active pool goes down and you do not want automatic failback, disable its members; traffic then stays on the standby pool until you enable them again.

Thursday, December 15, 2016

Troubleshoot NSX DFW (Distributed Firewall) dropping or blocking traffic

I often receive questions from friends and customers about NSX DFW (Distributed Firewall) issues and how to troubleshoot them.
Typically they have just installed NSX DFW and then find that the DFW rules do not work as expected and seem to be dropping legitimate traffic.

Scenario 1: The default rule (Rule ID 1001) is set to Deny and there are Allow rules above it, but traffic does not hit the Allow rules
Scenario 2: vRA (vRealize Automation) is integrated with NSX, the vRA Blueprint has the App Isolation policy enabled, and overlapping IP addresses are in use
Scenario 3: The default rule is set to Allow and the NSX version is 6.2.x, yet traffic is still blocked

There are several things we can check when troubleshooting NSX DFW. I normally start with the following steps:
1. Check logs
Check syslog (dfwpktlogs), Traceflow, or Flow Monitoring. See whether a DFW rule is blocking the traffic, whether that is the default rule (set to Deny or Allow) or another rule.

2. NSX SpoofGuard
From the SpoofGuard menu, verify whether NSX SpoofGuard is detecting the VM's IP address
 

3. Check VMware tools
DFW uses VMware Tools to associate a VM and its vNICs with IP addresses. If VMware Tools is not installed on a VM, its IP address is not learned.
If for some reason you cannot install VMware Tools, starting with NSX 6.2 you can use DHCP snooping or ARP snooping so that NSX can detect the VM's IP addresses
The IP detection method can be changed at the cluster level
 
http://pubs.vmware.com/NSX-62/index.jsp#com.vmware.nsx.admin.doc/GUID-1B5E78B7-F352-4F3F-B6F8-86184ED49391.html
If it is only a small number of VMs, you can also enter the IP address manually in the SpoofGuard menu and put it under Approved IP


4. Open VM Tools (OVT)
https://kb.vmware.com/kb/2073803
Some OS vendors and virtual appliance vendors use Open VM Tools, which ships together with the product.
Unfortunately, Open VM Tools has not been validated with NSX DFW, as per the NSX docs.
http://pubs.vmware.com/NSX-62/index.jsp#com.vmware.nsx.admin.doc/GUID-95600C1C-FE9A-4652-821B-5BCFE2FD8AFB.html
"Running open VMware Tools on guest or workload virtual machines has not been validated with distributed firewall."
So NSX may not be able to retrieve the IP address through Open VM Tools
https://www.reddit.com/r/vmware/comments/3ec0s7/vmware_nsx_and_openvmtools/

5. vRA Integration & NSX App Isolation
When you are integrating NSX DFW with vRA and using App Isolation on a vRA Blueprint, make sure to change the Service Composer "Applied To" setting from its default, applied to DFW, to applied to Policy's Security Groups



The "Applied To" behavior must be changed to Policy's Security Groups (SGs) to allow overlapping IPs and App Isolation, and to make sure that NSX applies rules only to the VMs that are part of the SGs. This option can be changed from the Service Composer settings in NSX 6.2 and later. Prior to NSX 6.2, an API call through vRO must be used. You can find the details in the NSX & vRA Micro-segmentation Tech Guide https://communities.vmware.com/docs/DOC-32774


6. NSX ALG
NSX introduced ALG (Application Level Gateway) support in NSX 6.2 http://blogs.vmware.com/networkvirtualization/2015/11/distributed-firewall-alg.html#.WFJW9FzHVwM
NSX DFW supports ALG for these protocols: FTP, CIFS, ORACLE TNS, MS-RPC, SUN-RPC, and TFTP
I have a customer with an application that runs on TCP 69, the same port number that TFTP uses (TFTP itself runs over UDP). As soon as we installed NSX DFW, it blocked the application even though the default rule was still Allow. After we opened a Support Request with VMware GSS, the support engineer confirmed that there is a known issue with the NSX ALG in NSX 6.2.3/6.2.4.
NSX 6.2.3 introduced the TFTP ALG, and the ALG engine somehow captures this TCP 69 traffic. After inspecting the traffic, it drops it because it is not TFTP traffic.
This issue is also mentioned on VMTN https://communities.vmware.com/message/2626001#2626001. Hopefully this will be fixed in the next release, NSX 6.2.5

7. KB 2125437 - Troubleshooting NSX
See the KB article Troubleshooting NSX for vSphere 6.x Distributed Firewall (DFW) (2125437) | VMware KB https://kb.vmware.com/kb/2125437

If everything above looks good but DFW is still blocking your legitimate traffic, you may want to open a Support Request with VMware GSS.

Sunday, September 13, 2015

Mini VMware vSphere 6 Homelab on VSAN with Shuttle DS81

Finally I have a homelab to support my study for VCAP-DCA and my job. My current work often requires me to test a complex upgrade or a complex setup. VMware Hands-on Labs and VMware Product Walkthroughs are great for learning how to configure specific features, but not all products and features are available, and the labs do not provide a walkthrough from scratch - for example installing an ESXi host, vCenter Server, etc.

It was not easy, because I had to submit and present the plan to my CFO (my wife). WAF (Wife Acceptance Factor) is one of the design factors that needs to be considered, and there are also some technical constraints. I can only submit a Purchase Order after the BoQ has been approved by the CFO.
- Total budget is $2,000 USD
- Low wattage. The apartment's maximum is 1,300W, shared with the fridge, aircons, TV, lights, etc. We are looking at a lab that draws up to 200W, 300W at most.
- Compact, small form factor. The apartment is only 33m2, so there is not much free space left.

Looking at the list above, I was not sure I could get a homelab that meets these requirements and constraints, so I looked for a cloud service that could provide a lab with a bunch of VMs running nested ESXi. I found Ravello Systems, but I'm not sure I can rely on online labs, and I still prefer a physical lab.
I was visiting Jason Langer's blog and read his post on replacing a homelab with Ravello Systems.
I'm interested in the physical lab designs based on Micro-ATX w/ a Lian-Li case and on the Intel NUC. The Micro-ATX based lab can provide 32GB RAM per host, but the power supply is 400W. So I need to go with the Intel NUC w/ 16GB RAM.

I was planning to get a small 4-bay storage unit for SOHO/SMB like the Synology DS415+, but it is quite expensive. The lowest-priced 4-bay alternative from Synology is the DS414slim, but it is still quite expensive and I'm not sure the total budget is sufficient. So I decided to go with VMware VSAN.

To build a VSAN with NUCs, I need a NUC that supports multiple disks (1 SSD and 1 HDD). There are a lot of NUC models, and I wanted something that can run vSphere 6. Florian Grehl has an article on an ESXi 6.0 image that works with the NUC. Most local electronics/IT shops here sell the NUC as a complete set, and in the end the bundled memory and disks would be useless since I would replace them with 16GB RAM and at least a 120GB SSD + 1TB HDD. A NUC with Intel vPro is cool because it provides KVM Remote Control. I agree with Mike Tabor, who pointed out in his article Intel NUC i5 5th Generation an ESXi lab improvement that the best fit would be the NUC5i5MYHE: 5th Gen processor, 2 internal disks, and Intel vPro. So I contacted a local Intel distributor to ask for a BoQ on the NUC5i5MYHE / D54250WYKH2 / NUC5i5RYH. They only sell Intel products, so I would need to buy memory, disks, etc. from different shops and build my own. The total cost is above $2K, and it is quite troublesome to purchase from different shops: one of the risks is that warranty and support are separate, and there could be compatibility issues if I do not pick the correct brand and model.

I was looking for alternatives to the Intel NUC, and the Shuttle DS81 looks promising.

Below are some links that feature the Shuttle DS81 for homelabs:
Ultrasmall computers for your VMware lab - Intel NUC and Shuttle DS81 preview: https://www.youtube.com/watch?v=ullzf3TqhoE
The Perfect vSphere 6 Home Lab | Ryan Birk – Virtual Insanity: http://www.ryanbirk.com/the-perfect-vsphere-6-home-lab/
Build a new home lab VMGuru: https://www.vmguru.com/2015/02/build-new-home-lab/
Building a ESXi 5.5 Server with the Shuttle DS81: https://globalconfig.net/building-esxi-5-5-server-shuttle-ds81/

It can run vSphere 6; the only requirement is to inject the Realtek 8111G VIB driver into the ESXi 6 image with PowerCLI 6 or with the v-Front ESXi-Customizer by Andreas Peetz. The driver can be found through one of the Shuttle DS81 homelab links above. Although it is slightly bigger than the Intel NUC, the advantages of the Shuttle DS81 are that it comes with dual Gigabit Ethernet NICs and that its processor speed is higher than the NUCs'. I decided to order 3x Shuttle DS81 from a local distributor with the following hardware specifications:
Intel® Core™ i3-4170 Processor (3M Cache, 3.70 GHz)
Kingston 8GB 1600MHz DDR3 (PC3-12800) SODIMM Memory
Samsung 850 EVO 120 GB mSATA 2-Inch SSD
HGST Travelstar 7K1000 2.5-Inch 1TB 7200 RPM SATA III 32MB Cache Internal Hard Drive
SanDisk 16GB Class 4 SDHC Memory Card

For the switch, I started with a TP-Link 8-Port Gigabit Easy Smart Switch TL-SG108E. It supports VLANs, IGMP Snooping, and Link Aggregation, and it draws little power. I also bought a small UPS/AVR, the CyberPower BU600E. I used an energy/watt meter to validate the whole configuration above, as the CFO very much wanted to validate it herself. For all 3 Shuttle DS81 nodes + the 8-port GbE TP-Link switch + the UPS, the total draw turns out to be only between ~65W and 160W :)
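
As a quick sanity check of the measured draw against the constraints from the list earlier in this post, the arithmetic can be sketched as below. The constant names are made up for this example, and the monthly energy figure assumes the lab runs at its measured peak 24 hours a day for a 30-day month, which is a worst-case assumption and not something I measured.

```python
# Figures taken from this post; the names are only for illustration.
APARTMENT_LIMIT_W = 1300   # apartment's maximum wattage
LAB_TARGET_W = 200         # preferred ceiling for the lab
LAB_HARD_CAP_W = 300       # absolute ceiling for the lab
MEASURED_IDLE_W = 65       # lowest reading on the watt meter
MEASURED_PEAK_W = 160      # highest reading on the watt meter

headroom_vs_target_w = LAB_TARGET_W - MEASURED_PEAK_W
share_of_apartment = MEASURED_PEAK_W / APARTMENT_LIMIT_W

# Worst case (assumption): running at peak all day, every day, for 30 days.
kwh_per_month_at_peak = MEASURED_PEAK_W * 24 * 30 / 1000

print(f"Headroom below the 200W target: {headroom_vs_target_w} W")
print(f"Share of the apartment's limit at peak: {share_of_apartment:.1%}")
print(f"Energy per 30-day month at constant peak: {kwh_per_month_at_peak:.1f} kWh")
```

So even at peak the lab stays 40W under the 200W target and uses roughly an eighth of the apartment's limit.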

To install ESXi 6 on a USB stick or SD card, we can either install directly to the USB/SD device as the destination (with Workstation/Fusion) or use it as the installer source. You can read the details in Vladan's post here. To create an ESXi 6 bootable ISO that automatically applies a static IP address when the custom ISO first boots, we can use a ks.cfg; read more about it in William Lam's post here, or you can also create a kickstart for VSAN. I'm using an SD card as the ESXi boot device, but unfortunately VMware Fusion cannot detect the Mac's internal SD card reader. So I need to create an ESXi installer on USB and then install to the SD card from that USB installer. After the ESXi hosts are ready, I cannot install vCenter because there is no VMFS datastore, and we need a vCenter Server to configure VSAN. There's an article on how to bootstrap a VCSA onto a single VSAN node. Since I'm using a Mac, I cannot use the Client Integration Plug-In to deploy the VCSA and need to use the vcsa-cli-installer. To install using vcsa-cli, read Romain Decker's article here.

At the moment, I have only installed VCSA 6.0u1, with Windows 2003 as AD, DNS, and DHCP. I used Win 2003 because of its small size and low hardware requirements. With 3 nodes of 120 GB SSD & 1TB HDD, I get a 2.7TB VSAN datastore in total, as in the screenshot below.
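
One plausible reading of the 2.7TB figure is sketched below. It rests on two assumptions of mine: that only the three HDDs count toward capacity (in a hybrid VSAN disk group the SSD serves as the cache tier), and that the datastore size is displayed in binary units while the drives are sold in decimal terabytes. The variable names are only for illustration, and the real datastore will show slightly less because of on-disk overhead.

```python
NODES = 3
HDD_BYTES = 1_000_000_000_000  # one "1TB" drive as marketed (decimal terabyte)

raw_capacity_bytes = NODES * HDD_BYTES

# Convert to binary terabytes (TiB), the unit assumed for the datastore display.
raw_capacity_tib = raw_capacity_bytes / 2**40

print(f"Raw capacity: {raw_capacity_tib:.2f} TiB")
```

This gives about 2.73, which is in line with the 2.7TB shown for the datastore.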

I'm planning to continue to install NSX 6.2 by following Thomas Beaumont's blog post here.