vSphere Performance troubleshooting Part1: CPU

Even though vSphere tells you that the overall CPU utilization is no more than 60%, this doesn’t mean that your VM isn’t running low on CPU resources.
In this part we’re going to take a deep dive into troubleshooting CPU performance issues in your vSphere environment. Although I’m using vSphere 5.0 for my screenshots, most of the options shown also apply to vSphere 3 and 4.

The key tool for CPU performance troubleshooting is esxtop or, if you are running it remotely, resxtop.
In my examples I’m using esxtop directly on the vSphere ESXi console, but all of these commands also work with resxtop. You just have to provide an extra parameter called --server to specify which server you want to run the command against. Of course, you also have to provide a username and password to get access to that server.
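As a minimal sketch of what such a remote call could look like, the snippet below assembles a resxtop command line in Python. The host name and user are made-up placeholders, and the batch-mode switches (-b, -d for the delay in seconds, -n for the number of samples) are an addition for illustration that is not used elsewhere in this post. The password is left out on purpose, on the assumption that resxtop prompts for it.

```python
import shlex


def build_resxtop_command(server, username, delay_s=5, iterations=12):
    """Return the argument list for a remote, batch-mode resxtop run."""
    return [
        "resxtop",
        "--server", server,      # the ESXi host to run against
        "--username", username,  # the password is expected to be prompted for
        "-b",                    # batch mode: CSV output instead of the live screen
        "-d", str(delay_s),      # seconds between samples
        "-n", str(iterations),   # number of samples to collect
    ]


# Hypothetical host and user, for illustration only.
cmd = build_resxtop_command("esxi01.example.local", "root")
print(shlex.join(cmd))
```

Redirecting the output of such a command to a file gives you a CSV you can inspect later instead of watching the live screen.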

If we start esxtop on our vSphere ESXi host we will get the following screen:


First let me explain what we see here.


The first three lines:

1:06:05pm	The current time of your ESXi server. Note that the time is in UTC.
up 18 days 3:35	How long your ESXi server has been up.
307 worlds	The number of worlds. A world is a process that is running in the VMkernel.
5 VMs	The number of VMs running on your ESXi host.
12 vCPUs	The number of vCPUs provided to VMs.
CPU load average	The average CPU load over the last 1, 5 and 15 minutes. A value of 1.00 means the physical CPUs are fully utilized; if the load average is higher than that, your system does not have enough CPU resources.
PCPU USED(%)	Real-time CPU usage per CPU core, as a percentage. As you can see, my system is an 8-core system (2 quad-core CPUs). AVG is the average of all pCPU cores: (4.3 + 2.4 + 0.0 + 0.3 + 5.1 + 1.7 + 2.7 + 3.3) / 8 ≈ 2.5
PCPU UTIL(%)	Real-time CPU utilization per CPU core, as a percentage. AVG is the average of all pCPU cores: (4.4 + 7.2 + 100 + 5.2 + 5.9 + 2.3 + 2.4 + 4.4) / 8 ≈ 16
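
The AVG values are plain arithmetic means over the per-core numbers. A small sketch that redoes the calculation with the values from my screenshot (the function name is only chosen for this example):

```python
def pcpu_average(per_core_pct):
    """Arithmetic mean of the per-core percentages, as shown in the AVG field."""
    return sum(per_core_pct) / len(per_core_pct)


# Per-core values as listed in the PCPU USED(%) and PCPU UTIL(%) lines.
used = [4.3, 2.4, 0.0, 0.3, 5.1, 1.7, 2.7, 3.3]
util = [4.4, 7.2, 100, 5.2, 5.9, 2.3, 2.4, 4.4]

print(round(pcpu_average(used), 1))
print(round(pcpu_average(util), 1))
```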

After the first three lines you will see a table like the following:


So let me explain what we see there:

ID

The resource pool ID or world ID. A world is an ESXi VMkernel schedulable entity, similar to a process or thread in other operating systems.

GID

The resource pool (group) ID of the world. A group contains one or more worlds. If you press e in esxtop and enter a GID, that group expands into the individual worlds that belong to it. Every VM consists of at least 4 worlds:

vmx: This world supports the vCPU worlds explained under vmx-vcpu-#.
vmast.#: This world is used for memory scanning.
vmx-mks: This world is used for mouse, keyboard and screen (MKS).
vmx-vcpu-#: There is one such world for every vCPU of the VM, so the number of vCPU worlds equals the number of vCPUs configured for the VM.

NAME

The name of the world or resource pool.

NWLD

The number of worlds in the resource pool.

%USED

The percentage of physical CPU core cycles used by the resource pool/world.

%USED = #vCPU * 100% indicates that the VM is consuming all the CPU cycles it can get; in other words, the VM is running at 100%.
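
To make the #vCPU * 100% rule concrete, here is a minimal sketch that expresses a VM's %USED as a fraction of its maximum. The function name and the sample numbers are made up for this example:

```python
def vcpu_saturation(used_pct, num_vcpus):
    """Return %USED as a fraction of the VM's ceiling of num_vcpus * 100%."""
    if num_vcpus < 1:
        raise ValueError("a VM has at least one vCPU")
    return used_pct / (num_vcpus * 100.0)


# A 2 vCPU VM showing %USED = 200 is using everything it can get.
print(vcpu_saturation(200.0, 2))
# A 4 vCPU VM showing %USED = 50 is far from its ceiling.
print(vcpu_saturation(50.0, 4))
```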

%RUN

The percentage of total time the resource pool/world was scheduled to run. This value can be twice as large as %USED.

If %RUN > %USED, the pCPU is not running at its rated clock frequency, probably due to power saving.

%SYS

The percentage of time spent in the ESXi VMkernel on behalf of the resource pool/world to process interrupts and to perform other system activities.

If this is higher than 25%, the VM is a high I/O VM. If you are aware of this, that is OK. If not, check the other statistics.

%WAIT

The total percentage of time the resource pool/world spent in a wait state.

%VMWAIT

%VMWAIT = %WAIT - %IDLE

%RDY

The percentage of time the resource pool/world was ready to run but was waiting to be scheduled on a pCPU.

A value above 20% indicates that the number of pCPU cores is too low.

%IDLE

The percentage of time the resource pool/world was idle.

%OVRLP

The percentage of system time that was spent on behalf of some other resource pool/world while this resource pool/world was scheduled.

%CSTP

The percentage of time the resource pool/world spent in a ready, co-deschedule state.

A value above 5% occurs when a VM has multiple vCPUs and one vCPU has to wait for another vCPU to catch up.

%MLMTD

The percentage of time the ESXi VMkernel deliberately did not run the resource pool/world because that would violate the resource pool/world’s limit setting.

%SWPWT

The percentage of time the resource pool/world spent waiting for the ESXi VMkernel to swap memory.

This picture (which I borrowed from a VMworld presentation) explains the overall relationship between the different variables.

[Image: esxtopcpu04]

OK, now we know what the different variables stand for and how they relate to each other. The next question is: which variables do I have to monitor, and what are their thresholds?

Variable	Threshold	Resolution
%RDY	>10%	If higher than 10% for a sustained period, add more CPU cores to your vSphere host.
%CSTP	>5%	This only occurs in a VM with more than one vCPU. Add more pCPUs to the host or decrease the number of vCPUs in the VM.
%MLMTD	>0%	If higher than 0%, the vCPU is being throttled because of a CPU limit.
%SYS	>20%	If higher than 20%, the VM is likely a high I/O VM. Check the guest OS for problems.
%RUN	>%USED	The pCPU is not running at its rated clock frequency, probably due to power saving.
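
The thresholds in the table above can be written down as a simple check. The following is only a sketch: the dictionary layout of a sample is an assumption made for this example, not an esxtop output format, and a single sample says nothing about whether a value stays high for a long time, so in practice you would evaluate several samples.

```python
def check_cpu_counters(sample):
    """Return (counter, advice) pairs for every threshold the sample exceeds."""
    findings = []
    if sample.get("%RDY", 0.0) > 10.0:
        findings.append(("%RDY", "add pCPU cores to the host"))
    if sample.get("%CSTP", 0.0) > 5.0:
        findings.append(("%CSTP", "add pCPUs or reduce the vCPUs of the VM"))
    if sample.get("%MLMTD", 0.0) > 0.0:
        findings.append(("%MLMTD", "vCPU is throttled by a CPU limit"))
    if sample.get("%SYS", 0.0) > 20.0:
        findings.append(("%SYS", "likely a high I/O VM, check the guest OS"))
    # %RUN is compared against %USED rather than against a fixed number.
    if "%RUN" in sample and "%USED" in sample and sample["%RUN"] > sample["%USED"]:
        findings.append(("%RUN", "pCPU below rated clock, check power saving"))
    return findings


# Made-up values for one VM, for illustration only.
sample = {"%RDY": 12.5, "%CSTP": 0.0, "%MLMTD": 0.0,
          "%SYS": 3.1, "%RUN": 40.0, "%USED": 40.0}
for counter, advice in check_cpu_counters(sample):
    print(f"{counter}: {advice}")
```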

So that’s what you need to know about monitoring your vSphere ESXi host with esxtop.

About Michael
Michael Wilmsen is an experienced VMware Architect with more than 20 years in the IT industry. His main focus is VMware vSphere, Horizon View and hyper-converged infrastructure, with a deep interest in performance and architecture. Michael is VCDX #210 certified, has been awarded the vExpert title since 2011, and is a Nutanix Tech Champion and a Nutanix Platform Professional.


VMware | Michael December 13, 2011
