VMworld US 2017

VMworld US is only a few weeks away. This year I have been lucky to get a blogger pass from VMware. Every year I go to Barcelona and, if I can, also to VMworld US. For me, VMworld US is the more interesting of the two. Why?

One of the main reasons to go to VMworld is the Solution Exchange (SOL). The SOL is a great place to meet a lot of different vendors who add value to the VMware portfolio. Talking with the people from the traditional vendors is always great. You get a good understanding of what they’re up to: what will be available in the near future, and their future roadmaps (nothing guaranteed, of course).


CLI install VMware vCenter server 6.5

On almost every project I have worked on, I make a kickstart script for installing VMware ESXi. I have created kickstart scripts since ESXi version 4. Although it seems like a lot of work at first, using a kickstart script for installing VMware ESXi has many advantages:

  • Every host is installed consistently.
  • Human installation errors are eliminated.
  • The install script can be used as documentation.
  • In case of a re-installation (hardware failure, for example), the host is installed quickly.
  • Every IT administrator can perform the installation; you don’t have to be a VMware expert.

While reading the VMware vCenter 6.5 installation documentation, I saw the option for a vCenter Server CLI installation or upgrade. Performing an unattended installation of vCenter Server has the same advantages as a kickstart script for ESXi.

As with everything, the first time around, reading, testing and adjusting your script takes time. But it’s well worth it!

First, something about the CLI installer.


Reboot or shut down vSAN node: Take your time!

While working at a customer’s site, I noticed that a system administrator wanted to reboot a vSAN node. The time it took to shut down the node was too long (30+ min) for the system administrator, so he used iDRAC to hard reset the host.
I asked him why he didn’t just wait until the host was completely shut down.
He answered that, according to him, this didn’t matter because ESXi was installed on an SD card. Before he hard reset the host, he made sure the host was in maintenance mode, so no virtual machine was running on that host and the vSAN data was safe.
The system administrator noted that this SD card is only used during boot time.
After ESXi has booted, you can pull out the SD card and ESXi will continue to operate as normal. So in his opinion, he could safely hard reset ESXi, as long as he ensured that no virtual machines were running on that host.

This statement is partially true, but before I dive into this statement, first a little background information.

During ESXi boot, a RAMdisk is created. This RAMdisk is among other things used to store the VMkernel, modules and the scratch partition. The scratch partition is used for storing log files.
To ensure that log files are available after a reboot, these log files are copied to persistent storage (in this case an SD card) during shutdown. During boot, the same log files are copied from persistent storage (the SD card) back to the scratch partition on the RAMdisk.
So, the statement that the SD card isn’t used after the VMkernel is up and running is not completely true. During a shutdown of ESXi, the SD card is also used!

In this case, the customer was using VMware vSAN. vSAN generates trace log files that are stored in the scratch partition.
Although the size of these log files is limited to 180 MB by default, on most SD cards copying them can take up to 30 minutes.
As you can see in the screenshot below, vmhba32 is 100% used but is copying data at only 0.02 MB per second, with an average latency of 66.25 ms.
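
As a rough back-of-the-envelope check on these numbers, the copy time is simply the amount of data divided by the sustained write throughput. The sketch below assumes a constant throughput, which is a simplification. The function name and the 0.1 MB/s figure are illustrative assumptions, not measurements from this host.

```python
def copy_time_minutes(size_mb: float, throughput_mb_per_s: float) -> float:
    """Return the minutes needed to copy size_mb at a constant throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_mb / throughput_mb_per_s / 60


# 180 MB is the default vSAN trace size limit mentioned above.
# 0.1 MB/s is an ASSUMED sustained write speed for a slow SD card.
print(round(copy_time_minutes(180, 0.1)), "minutes at 0.1 MB/s")

# 0.02 MB/s is the momentary rate reported for vmhba32.
print(round(copy_time_minutes(180, 0.02)), "minutes at 0.02 MB/s")
```

At a sustained 0.02 MB per second, a full 180 MB would need about 150 minutes, so a shutdown of roughly 30 minutes suggests that either the average rate was higher than this momentary reading or less than the full 180 MB had to be copied.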

So, what are your options to avoid long start-up or shutdown times?
1. Limit the size of your vSAN trace files. Do mind that, in case of troubleshooting, you might miss some important information!
2. Store your log files on a syslog server or a central VMFS datastore.
3. Don’t use an SD card for ESXi hosts running VMware vSAN. A SATADOM is not that expensive and is much faster than an SD card. Of course, you can also use a local (SSD) disk.

Final note: When shutting down your VMware vSAN node, take your time!

Migrating a datacenter with PowerCLI – Introduction

For the last few months, I have been working on a project to migrate 1000 virtual machines from one datacentre to another. The two datacentres are 50 km apart. The building that houses the source datacentre will be stripped and rebuilt to current standards.
This means that every window, door, wall and all cables will be removed.
You can imagine that moving all the physical hardware (3 blade chassis, 42U of storage hardware and physical switches) and the virtual machines from one datacentre to another is a huge operation. Because the virtual machines hosted in this datacentre are running production workloads, user impact should be minimal, and if user impact is expected, migrations should be done during maintenance windows.
Both datacentres have their own vCenter servers (6.0), multiple vSphere (6.0) clusters and storage (Fibre Channel), and are connected at L2 level. The vCenter servers are members of the same SSO domain.

Because of this, I can use long distance vMotion to move the virtual machines from one datacentre to the other datacentre.
Of course, this can be done using the vSphere Web Client, but because of the number of virtual machines, we decided to write a PowerShell script that does the job for us.
This script is scheduled to run during maintenance hours, and reads a text file (batchXX.txt) to determine which virtual machines should be migrated.
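
To give an idea of the batch-file step, here is a minimal sketch, written in Python rather than the PowerShell of the actual script. It assumes one virtual machine name per line and ignores blank lines and lines starting with `#`. The file layout, the function name `read_batch` and the comment convention are assumptions for illustration, not the format used in the project.

```python
from pathlib import Path


def read_batch(path: str) -> list[str]:
    """Return the VM names listed in a batch file, one name per line."""
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        # Skip blank lines and comment lines (assumed convention).
        if not name or name.startswith("#"):
            continue
        # Keep only the first occurrence, so a VM is not queued twice.
        if name not in names:
            names.append(name)
    return names
```

Calling `read_batch("batch01.txt")` (a hypothetical file name) would return the list of names that a migration loop could then work through one by one.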

In this blog series, I will explain the PowerCLI script I created, step by step, and eventually, how the migration went.

Table of contents
Chapter 1: Log function
Chapter 2: Notifications
Chapter 3: Reading VM attributes
Chapter 4: Storage space
Chapter 5: Move-VM
Chapter 6: Testing

Install VMware PSC fails: vdcsetupldu failed. Error [9234] – User invalid credentials

I was setting up a redundant VMware PSC configuration stretched across 2 datacenters. Every datacenter has 2 PSCs and a load balancer.

Eventually, the virtual machines that run the PSC services will run in a management cluster consisting of 3 nodes. These nodes use Virtual SAN (VSAN) for storage.

I first installed 1 node with VMware vSphere and created a datastore on one of the SAS disks. Later on, the VMs will be moved to a VSAN datastore.

The first datacenter went as expected. No problems. But in the second datacenter, the installation of the PSC software failed with an error: Encountered an internal error

PSC Error

Looking in the log file vmafd-firstboot-py-xxxx_stderr.log, I saw that vdcsetupldu had failed with Error [9234] – User invalid credential

PSC Error

I was sure that the password I provided was OK. Diving deeper into the log files, I found that after installation, the VMware Identity services start and the installer tries to make an LDAP connection on TCP 389, which failed. I created a PowerShell script that checked TCP 389 every 5 seconds. I found out that eventually I could make a connection on TCP 389, but by then the installer had already given up.
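
The port check described above can be sketched as follows. This is a Python illustration rather than the original PowerShell script; the function names, the 2-second connect timeout and the attempt limit are assumptions made for the example.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def wait_for_port(host: str, port: int = 389, interval: float = 5.0,
                  max_attempts: int = 60) -> bool:
    """Poll host:port every `interval` seconds until it accepts connections."""
    for attempt in range(1, max_attempts + 1):
        if port_open(host, port):
            print(f"attempt {attempt}: port {port} is open")
            return True
        print(f"attempt {attempt}: port {port} is closed")
        # No need to sleep after the final attempt.
        if attempt < max_attempts:
            time.sleep(interval)
    return False
```

Running `wait_for_port("psc.example.local")` (a hypothetical host name) while the installer is busy shows how long it takes before the LDAP port starts accepting connections.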

Ok, so the services will start, but too late.

Looking at ESXTOP (the best troubleshooting program for ESXi), I saw that when the VMware Identity services started, the disk latency went up to 100 ms. Could it be that the disk was slowing down the virtual machine so much that the installation failed? I moved the virtual machine to an SSD, restarted the installation, and guess what: install successful.

So, the PSC installation program probably does not check whether the service is available before trying to log in. It will fail, saying that the credentials are not valid, rather than saying that it cannot make a connection.