VMware NSX for vSphere 6.2.2 bug: lost network connection

We have been using VMware NSX in our production environment for a long time.
Recently we ran into a problem with NSX. The symptoms are:

Some VMs lose their network connection after being migrated to another host;
New firewall rules cannot be applied to some of the VMs.

After we engaged VMware, they confirmed that it’s a bug in NSX.

About 1.6 GB of heap memory is assigned to the NSX firewall on each ESX host. If you apply too many rules or have too many VMs, you will reach that memory limit and then hit this issue.

The current fix is to upgrade to 6.2.3.

Reading a memory.dmp or other .dmp file

This can be accomplished in 7 easy steps:

Step 1. Obtain and install the debugging tools.

Debugging Tools Windows

All you need to install is the “Install Debugging Tools for Windows as a Standalone Component (from Windows SDK)” and during the install only select “Debugging Tools for Windows”. Everything else is used for more advanced troubleshooting or development, and isn’t needed here. Today I followed the link to “Install Debugging Tools for Windows as a Standalone Component (from Windows SDK)” although for a different OS you may need to follow a different link.

Step 2. From an elevated command prompt navigate to the debugging folder. For me with the latest tools on Windows Server 2012 it was at C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64\. You can specify the path during the install.

Step 3. Type the following:

kd -z C:\Windows\memory.dmp (or the path to your .dmp file)

Step 4. Type the following:

.logopen c:\debuglog.txt

Step 5. Type the following:

.sympath srv*c:\symbols*http://msdl.microsoft.com/download/symbols

If your computer can’t connect to the internet, you can download the symbols from the link below:


Step 6. Type the following:

.reload;!analyze -v;r;kv;lmnt;.logclose;q

Step 7. Review the results by opening c:\debuglog.txt in your favorite text editor. Searching for PROCESS_NAME: will show which process had the fault. You can use the process name and other information from the dump to find clues and answers with a web search. Usually the fault lies with a hardware driver of some sort, but many things can cause crashes, so actually analyzing the dump may take some research.
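
As a rough aid for Step 7, the headline fields can be pulled out of the log with a small filter. This is only a sketch: it assumes a POSIX shell is at hand (for example after copying c:\debuglog.txt to another machine), and the function name summarize_dump_log and the list of fields are illustrative choices, not part of the debugging tools.

```sh
# summarize_dump_log FILE: print the headline fields from an "!analyze -v" log.
# The field list is a hand-picked subset; add or remove names as needed.
summarize_dump_log() {
    grep -E '^(BUGCHECK_STR|PROCESS_NAME|IMAGE_NAME|MODULE_NAME|FAILURE_BUCKET_ID):' "$1"
}
```

Calling summarize_dump_log debuglog.txt prints only the matching lines, which is usually enough to decide what to search for next.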

Oftentimes a driver update will fix the issue. If the summary doesn’t offer enough information, you’ll need to dig further into the debugging tools or open a CSS case with Microsoft. The steps above give you a mostly human-readable summary report from the dump. Much more information is available in the memory dump, although the details get much harder to track down the further you go into Windows debugging.

Hopefully these quick steps are helpful for you as you troubleshoot the unwelcome BSOD.

FreeBSD: use mrsas driver to replace mfi driver

My server has a Dell PERC H330 RAID card, which I run in HBA mode to make sure it gets the best performance under FreeBSD with ZFS.
But every time I boot the server, I get the error message below, and it takes a long time to get past the disk check stage.


After some research, it seems this timeout error is caused by the old mfi driver.
LSI has released a new mrsas driver for FreeBSD, so it’s better to switch the driver from mfi to mrsas.

First, add a line to /boot/loader.conf to enable the mrsas driver.

Then add a device hint to /boot/device.hints that tells FreeBSD to use mrsas for this card. This line is very important. Without this device hint, FreeBSD will keep using the old mfi driver for the RAID card even though you enabled the mrsas driver.

Then add another line to /boot/loader.conf to disable disk ident labels.

Without these lines, after you switch from mfi to mrsas all the disks will show up as diskid-*****************.
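
Put together, the three settings could be applied with a small script. Treat this as a sketch: the tunable names mrsas_load, hw.mfi.mrsas_enable and kern.geom.label.disk_ident.enable are assumed to be the stock FreeBSD ones and should be checked against mrsas(4) on your release, and the helper names add_line and apply_mrsas_settings are made up for illustration.

```sh
# add_line FILE LINE: append LINE to FILE unless that exact line is already present.
add_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# apply_mrsas_settings LOADER_CONF DEVICE_HINTS
# The three values below are assumptions, not copied from a verified config.
apply_mrsas_settings() {
    add_line "$1" 'mrsas_load="YES"'                       # enable the mrsas driver at boot
    add_line "$2" 'hw.mfi.mrsas_enable="1"'                # let mrsas claim the card instead of mfi
    add_line "$1" 'kern.geom.label.disk_ident.enable="0"'  # no diskid-* device labels
}
```

Run as root it would be called as apply_mrsas_settings /boot/loader.conf /boot/device.hints; because add_line checks for an existing line first, running it twice does not duplicate entries.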

And don’t forget to update /etc/fstab to change the swap partition from mfi*p* to da*p*. Otherwise you’ll lose your swap partition.
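
The fstab change can be previewed with sed before touching the real file. This is a sketch that assumes the old entries look like /dev/mfid0p3 and that each mfidN disk reappears as daN with the same number, which is worth confirming on your own system first; rewrite_fstab_devices is just an illustrative name.

```sh
# rewrite_fstab_devices: read fstab text on stdin and write it to stdout with
# /dev/mfidN... device names replaced by /dev/daN... (assumes 1:1 numbering).
rewrite_fstab_devices() {
    sed -E 's#^/dev/mfid([0-9]+)#/dev/da\1#'
}
```

For example, rewrite_fstab_devices < /etc/fstab > /tmp/fstab.new writes a candidate file that can be compared with diff before it replaces /etc/fstab.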

Then, reboot your server, and enjoy.

Fix VCSA 6.0 disk issue — “unknow command shell.set”

In my home lab I run ESX 6.0 on a Lenovo M900 Tiny, with a Synology DS1813+ providing the iSCSI LUN.
I use VCSA as my vCenter server and keep it on that iSCSI LUN.

Today I updated my DS1813+ to DSM 6.0 Update 1 and rebooted the Synology NAS during the update. ESX seems to have lost its connection to the iSCSI LUN, and my VCSA was dead.

I tried to restart VCSA, but it always failed and asked me to run fsck.

At first I wanted to run fsck in the VCSA shell. But the weird thing is that when I ran the command “shell.set --enabled True”, it told me the command doesn’t exist.

It seems that the volumes are read-only and the shell is dead.
Don’t worry, let’s fix it.
Follow the steps below:

  1. stop VCSA machine
  2. attach the RHEL 7 installation ISO as the CD/DVD drive of the machine and change the boot order to start from the CD
  3. boot from CD
  4. enter shell of LiveCD
  5. issue the following commands (to see that logical volumes are OK)
  6. repeat step 5 to check all the volumes
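
Steps 5 and 6 could look roughly like the following from the LiveCD shell. This is a sketch, not the exact commands of the original procedure: it assumes the standard LVM2 tools (vgchange, lvs) and fsck are available on the rescue image, and fsck_all_lvs is a hypothetical helper name. Swap volumes carry no filesystem, so fsck will typically report an error for those, which can be ignored.

```sh
# fsck_all_lvs: activate every volume group, then run fsck on each logical volume.
fsck_all_lvs() {
    vgchange -ay || return 1                  # make the logical volumes visible under /dev
    lvs --noheadings -o lv_path | while read -r lv; do
        echo "== $lv"
        fsck -y "$lv" </dev/null              # -y: answer yes to repair prompts
    done
}
```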


After it finishes, remove the ISO and reboot the VM.
You will find VCSA is back :)

FreeBSD + Nginx : Enable HTTP/2 and ALPN

More and more servers are starting to use HTTP/2, which is faster and more secure.
This post is about how to enable HTTP/2 on FreeBSD servers.

Nginx stable 1.8.* doesn’t support HTTP/2, so we need to install nginx-devel (version 1.9.*). If you have already installed Nginx stable, you need to uninstall it first.

Before you install nginx-devel, you need to install OpenSSL from ports. Otherwise nginx will use the system OpenSSL library and you can’t enable ALPN for HTTP/2, because ALPN requires OpenSSL 1.0.2* and the system OpenSSL is version 0.9.8.

So the first step is to install OpenSSL from ports.

Then add a line to your make.conf to make sure that the latest OpenSSL library is used to build nginx.

Then you can install nginx-devel.

Make sure that you select HTTP_SSL and HTTPV2. Please be aware that SPDY is no longer supported by nginx 1.9.*.
Then install it.
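
The build steps above, written out as a sketch. The port origins security/openssl and www/nginx-devel are the usual ones, but the make.conf knob is an assumption: ports trees of the nginx 1.9 era used WITH_OPENSSL_PORT=yes, and newer trees replaced it with DEFAULT_VERSIONS+=ssl=openssl, so check which one your tree expects. build_nginx_http2 is an illustrative name.

```sh
# build_nginx_http2 [MAKE_CONF]: build OpenSSL from ports, point ports at it,
# then build nginx-devel. Run as root; MAKE_CONF defaults to /etc/make.conf.
build_nginx_http2() {
    make_conf="${1:-/etc/make.conf}"
    # 1. OpenSSL 1.0.2 from ports (needed for ALPN)
    (cd /usr/ports/security/openssl && make install clean) || return 1
    # 2. Assumed knob: tell the ports framework to link against the ports OpenSSL
    grep -qxF 'WITH_OPENSSL_PORT=yes' "$make_conf" 2>/dev/null ||
        echo 'WITH_OPENSSL_PORT=yes' >> "$make_conf"
    # 3. nginx-devel; "make config" opens the options dialog (HTTP_SSL, HTTPV2)
    (cd /usr/ports/www/nginx-devel && make config install clean)
}
```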

Go back to your nginx.conf and modify it as follows:

I’d like to explain this configuration a little.

This part enables HTTP/2 and SSL.
You may notice that I’m using accept_filter=dataready here instead of accept_filter=httpready. FreeBSD currently has two accept filters, “dataready” and “httpready”, which need to be loaded at boot by adding accf_data_load="YES" and accf_http_load="YES" to /boot/loader.conf. dataready waits for the first properly formed packet to arrive from the client before passing the request to nginx. httpready waits not only for the packets but also for the end of the HTTP header before passing the request on to nginx. Keep in mind that the “httpready” filter breaks support for ancient HTTP/0.9, because v0.9 does not have any headers. HTTP/0.9 is so old that we are not going to worry about supporting it, since an HTTP/0.9 client would not have the newer SSL ciphers anyway.

To configure nginx to use the accept filters in FreeBSD, we add the arguments to the listen directive. Since HTTP (port 80) is unencrypted, we can use the “accept_filter=httpready” accept filter, because FreeBSD can look at the packets and parse the complete HTTP header. SSL (HTTPS, port 443) is encrypted, so FreeBSD cannot parse the packets and we need to use the “accept_filter=dataready” accept filter instead. Both accept filter examples can be found in the configuration below. To use FreeBSD accept filters you must enable them in /boot/loader.conf so that they load on boot.
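
A minimal sketch of listen directives that match this explanation. The server name is a placeholder and the certificate directives are left out; http2 and accept_filter are standard parameters of nginx’s listen directive (http2 needs nginx 1.9.5 or later).

```nginx
# /boot/loader.conf needs:  accf_data_load="YES"  and  accf_http_load="YES"

server {
    # Plain HTTP: the kernel can read the request, so wait for the full header
    listen 80 accept_filter=httpready;
    server_name example.com;          # placeholder
}

server {
    # HTTPS with HTTP/2: the payload is encrypted, so only wait for first data
    listen 443 ssl http2 accept_filter=dataready;
    server_name example.com;          # placeholder
}
```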

Some people may want to use reuseport, a new feature in nginx 1.9.*. Unfortunately, this feature doesn’t work properly on FreeBSD. I ran several tests on my server, and it seems that if you enable “reuseport”, all your traffic is handled by the first worker; the OS can’t balance the load across the workers. This means that on a very busy server, enabling reuseport on FreeBSD will slow it down significantly: you’ll find that the first worker is taking up 100% CPU while the rest are idle. So, at this moment, do not enable “reuseport” for nginx on FreeBSD.

This part just enables SSL, and it’s pretty easy. There is only one thing to take care of: there is no need to put the root certificate into yourcert.crt. Put only your server certificate and intermediate certificate into it, which keeps the certificate file small and reduces the connection time.
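
Building that file is just a concatenation in the right order. A sketch, with hypothetical file names (server.crt, intermediate.crt) standing in for whatever your CA issued; make_cert_bundle is an illustrative helper.

```sh
# make_cert_bundle OUT CERT...: concatenate PEM files into OUT, in the order given.
# For ssl_certificate the server certificate comes first, then the intermediates;
# the root certificate is left out.
make_cert_bundle() {
    out="$1"; shift
    cat "$@" > "$out"
}
```

For example: make_cert_bundle yourcert.crt server.crt intermediate.crt. Each input file should end with a newline so the PEM blocks do not run together.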

These SSL ciphers are recommended by Cloudflare. They support most browsers, but IE6 is not supported.

NGINX caches the session parameters used to create the SSL/TLS connection. This cache, shared among all workers when you include the shared parameter, drastically improves response time for subsequent requests because the connection setup information is already known. Assign a name to the cache and set its size (a 1-MB shared cache accommodates approximately 4,000 sessions). The ssl_session_timeout directive controls how long the session information remains in the cache. The default value is 5 minutes; increasing it to several hours (as in the following example) improves performance but requires a larger cache. Session tickets store information about specific SSL/TLS sessions. When a client resumes interaction with an application, the session ticket is used to resume the session without renegotiation. Session IDs are an alternative; an MD5 hash is used to map to a specific session stored in the cache created by the ssl_session_cache directive.
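
A sketch of those directives. The cache name SSL, the 20 MB size and the 4 hour timeout are example values, not taken from a verified config; at roughly 4,000 sessions per megabyte, 20 MB holds about 80,000 sessions.

```nginx
ssl_session_cache   shared:SSL:20m;   # named cache shared by all workers
ssl_session_timeout 4h;               # default is 5m
ssl_session_tickets on;               # allow ticket-based resumption
```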

OCSP stapling can decrease the time of the SSL/TLS handshake. Traditionally, when a user connects to your application or website via HTTPS, his or her browser validates the SSL certificate against a certificate revocation list (CRL) or uses an Online Certificate Status Protocol (OCSP) record from a certificate authority (CA). These requests add latency and the CAs can be unreliable. With NGINX you can cache the OCSP response on your server and eliminate this costly overhead.

At this step you need to create your ssl_trusted_certificate file. Please note that the OCSP certificate file (ssl_trusted_certificate /usr/local/etc/nginx/ca-certs.crt) must contain your root certificate and intermediate certificates, and no server certificate. The right order is root certificate first, then intermediate certificate, then second intermediate certificate.
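
A sketch of the stapling directives that go with that file. The path is the one mentioned above; the resolver address is only an example, and nginx needs some resolver so it can look up the CA’s OCSP responder.

```nginx
ssl_stapling            on;
ssl_stapling_verify     on;
ssl_trusted_certificate /usr/local/etc/nginx/ca-certs.crt;   # root + intermediates, no server cert
resolver                8.8.8.8 valid=300s;                   # example resolver
```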

Enable HSTS for your server. Please be aware that once you add this to your configuration, you can’t simply remove it and go back to HTTP again. Otherwise end users won’t be able to access your website, because their browsers will always redirect to the HTTPS site.
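
A sketch of the header. The max-age value and includeSubDomains are example choices; a short max-age while testing makes it easier to back out.

```nginx
# max-age is in seconds; 31536000 is one year (example value)
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
```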

Now you can go to http://www.ssllabs.com to test your website. You should be able to get an A+ score.
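
To confirm from the command line that ALPN really negotiates HTTP/2, openssl s_client can be used. This is a sketch assuming an OpenSSL 1.0.2 or newer client, with example.com as a placeholder host; alpn_protocol is an illustrative helper that only extracts the relevant line from the s_client output.

```sh
# alpn_protocol: read "openssl s_client" output on stdin and print the protocol
# chosen via ALPN (for example "h2"); prints nothing if none was negotiated.
alpn_protocol() {
    sed -n 's/^ALPN protocol: //p'
}

# Usage (placeholder host):
#   openssl s_client -alpn h2 -connect example.com:443 </dev/null 2>/dev/null | alpn_protocol
```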