VCAP5-DCA Objective 6.2 – Troubleshoot CPU and Memory Performance

Knowledge

  • Identify resxtop/esxtop metrics related to memory and CPU
  • Identify vCenter Server Performance Chart metrics related to memory and CPU

Skills and Abilities

  • Troubleshoot ESXi host and Virtual Machine CPU performance issues using appropriate metrics
  • Troubleshoot ESXi host and Virtual Machine memory performance issues using appropriate metrics
  • Use Hot-Add functionality to resolve identified Virtual Machine CPU and memory performance issues

Troubleshoot ESXi host and Virtual Machine CPU performance issues using appropriate metrics

Official Documentation:

vSphere Monitoring and Performance Guide

VMware provides several tools to help you monitor your virtual environment and to locate the source of potential issues and current problems.

  • Performance charts in the vSphere Client – Allow you to see performance data on a variety of system resources, including CPU, memory, storage, and so on.
  • Performance monitoring command-line utilities – Allow you to access detailed information on system performance through the command line.
  • Host health – Allows you to quickly identify which hosts are healthy and which are experiencing problems.
  • Storage maps and charts – Provide an in-depth look at your storage resources.
  • Events, alerts, and alarms in the vSphere Client – Allow you to configure alerts and alarms and to specify the actions the system should take when they are triggered.

Here are some key metrics to look at when troubleshooting host and virtual machine CPU performance problems with esxtop / resxtop. 

  • %RDY (higher than 5) – the percentage of time the world was ready to run but was stuck in a run queue, waiting for the CPU scheduler to give it time on a physical CPU
  • %MLMTD (higher than 0) – the world is being throttled by a CPU limit that has been set: it was ready to run, but wasn’t allowed to because of the limit
  • PCPU UTIL(%) (all over 90-95%) – the percentage of unhalted CPU cycles per PCPU, plus the average across all PCPUs
  • %SWPWT (higher than 3) – the percentage of time the world spends waiting on VMkernel memory swapping
  • %CSTP (higher than 3) – the percentage of time the world spends in a ready, co-deschedule state. Relevant for SMP virtual machines. The higher this metric, the further ahead one vCPU is of another vCPU.
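
The thresholds above are easy to apply mechanically once the values have been read off the esxtop CPU screen. The following is a minimal Python sketch, not a VMware tool: the function names and the dictionary layout are assumptions made for illustration, and the limits are simply the rules of thumb from the list above.

```python
# Rule-of-thumb limits taken from the bullet list above (percent values).
CPU_THRESHOLDS = {
    "%RDY": 5.0,
    "%MLMTD": 0.0,
    "%SWPWT": 3.0,
    "%CSTP": 3.0,
}


def flag_cpu_metrics(sample):
    """Return the names of the metrics in `sample` that exceed their limit.

    `sample` is a hypothetical dict of metric name -> value for one world,
    typed in by hand or extracted from esxtop output by other means.
    """
    return sorted(
        name
        for name, limit in CPU_THRESHOLDS.items()
        if sample.get(name, 0.0) > limit
    )


def host_cpu_saturated(pcpu_util, limit=90.0):
    """True when every PCPU UTIL(%) value is above `limit`."""
    return bool(pcpu_util) and all(value > limit for value in pcpu_util)


if __name__ == "__main__":
    world = {"%RDY": 7.2, "%MLMTD": 0.0, "%SWPWT": 0.4, "%CSTP": 3.5}
    print(flag_cpu_metrics(world))
    print(host_cpu_saturated([96.0, 93.5, 91.2, 97.8]))
```

A metric that is missing from the sample is treated as 0, so a partial sample never raises a false alarm.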

Solutions for Consistently High CPU Usage

Temporary spikes in CPU usage indicate that you are making the best use of CPU resources. Consistently high CPU usage might indicate a problem. You can use the vSphere Client CPU performance charts to monitor CPU usage for hosts, clusters, resource pools, virtual machines, and vApps.

Problem

Host CPU usage is constantly high. A high CPU usage value can lead to increased ready time and processor queuing of the virtual machines on the host.

Virtual machine CPU usage is above 90% and the CPU ready value is above 20%. Application performance is impacted.

Cause

The host probably is lacking the CPU resources required to meet the demand.

Solution

  • Verify that VMware Tools is installed on every virtual machine on the host.
  • Compare the CPU usage value of a virtual machine with the CPU usage of other virtual machines on the host or in the resource pool. The stacked bar chart on the host’s Virtual Machine view shows the CPU usage for all virtual machines on the host.
  • Determine whether the high ready time for the virtual machine resulted from its CPU usage time reaching the CPU limit setting. If so, increase the CPU limit on the virtual machine.
  • Increase the CPU shares to give the virtual machine more opportunities to run. The total ready time on the host might remain at the same level if the host system is constrained by CPU. If the host ready time doesn’t decrease, set the CPU reservations for high-priority virtual machines to guarantee that they receive the required CPU cycles.
  • Increase the amount of memory allocated to the virtual machine. This action decreases disk and/or network activity for applications that cache. This might lower disk I/O and reduce the need for the host to virtualize the hardware. Virtual machines with smaller resource allocations generally accumulate more CPU ready time.
  • Reduce the number of virtual CPUs on a virtual machine to only the number required to execute the workload. For example, a single-threaded application on a four-way virtual machine only benefits from a single vCPU. But the hypervisor’s maintenance of the three idle vCPUs takes CPU cycles that could be used for other work.
  • If the host is not already in a DRS cluster, add it to one. If the host is in a DRS cluster, increase the number of hosts and migrate one or more virtual machines onto the new host.
  • Upgrade the physical CPUs or cores on the host if necessary.
  • Use the newest version of hypervisor software, and enable CPU-saving features such as TCP Segmentation Offload, large memory pages, and jumbo frames.
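
The Problem statement in this section quotes CPU ready as a percentage, while the vCenter Server performance charts report CPU ready as a summation in milliseconds per sample interval. The sketch below shows the usual conversion, assuming the real-time chart interval of 20 seconds (300 seconds for the past-day chart). The function name is hypothetical, and dividing by the vCPU count is a convention for getting a per-vCPU average, not something the charts do for you.

```python
def ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation in ms to a percentage of the interval.

    ready_ms   -- CPU ready summation value read from the chart
    interval_s -- chart sample interval in seconds (20 for real-time)
    vcpus      -- divide by the vCPU count to get a per-vCPU average
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 4000 ms of ready time in a 20 s real-time sample on a 1-vCPU VM.
    print(ready_percent(4000))                   # 20.0
    # The same summation spread over 4 vCPUs.
    print(ready_percent(4000, vcpus=4))          # 5.0
    # Past-day chart: 15000 ms in a 300 s sample.
    print(ready_percent(15000, interval_s=300))  # 5.0
```

Always check which chart interval a value came from before comparing it with a threshold, because the same millisecond figure means very different things at 20 and 300 seconds.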

More information:

Interpreting esxtop 4.1 Statistics: http://communities.vmware.com/docs/DOC-11812

Yellow-Bricks Blog about ESXTOP: http://www.yellow-bricks.com/esxtop/

Troubleshoot ESXi host and Virtual Machine memory performance issues using appropriate metrics

Official Documentation:

vSphere Monitoring and Performance Guide

VMware provides several tools to help you monitor your virtual environment and to locate the source of potential issues and current problems.

  • Performance charts in the vSphere Client – Allow you to see performance data on a variety of system resources, including CPU, memory, storage, and so on.
  • Performance monitoring command-line utilities – Allow you to access detailed information on system performance through the command line.
  • Host health – Allows you to quickly identify which hosts are healthy and which are experiencing problems.
  • Storage maps and charts – Provide an in-depth look at your storage resources.
  • Events, alerts, and alarms in the vSphere Client – Allow you to configure alerts and alarms and to specify the actions the system should take when they are triggered.

Here are some key metrics to look at when troubleshooting host and virtual machine memory performance problems with esxtop / resxtop.

  • SWCUR (higher than 0) – indicates the host is swapping memory <- THIS IS BAD!
  • MCTLSZ (higher than 0) – indicates that memory is being ballooned; if memory pressure continues, memory compression may occur
  • ZIP/s (higher than 0) – indicates that memory is being compressed; if memory pressure continues, memory swapping may occur
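
The three metrics above describe an escalation: ballooning first, then compression, then swapping. A small Python sketch of that ordering follows. The function name and return values are made up for illustration, and the inputs are assumed to have been read off the esxtop memory screen by hand.

```python
def reclamation_stage(mctlsz_mb, zip_per_s, swcur_mb):
    """Name the most severe memory reclamation technique currently visible.

    mctlsz_mb -- MCTLSZ, memory reclaimed by the balloon driver
    zip_per_s -- ZIP/s, rate at which memory is being compressed
    swcur_mb  -- SWCUR, memory currently swapped by the host
    """
    if swcur_mb > 0:
        return "swapping"
    if zip_per_s > 0:
        return "compression"
    if mctlsz_mb > 0:
        return "ballooning"
    return "none"


if __name__ == "__main__":
    samples = {
        "vm-a": (0, 0.0, 0),
        "vm-b": (512, 0.0, 0),
        "vm-c": (512, 1.5, 0),
        "vm-d": (512, 1.5, 128),
    }
    for name, (mctlsz, zips, swcur) in samples.items():
        print(name, reclamation_stage(mctlsz, zips, swcur))
```

Checking the most severe condition first means a virtual machine that is ballooned and swapped at the same time is reported as swapping, which is the state that needs attention first.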

Solutions for Memory Performance Problems

Host machine memory is the hardware backing for guest virtual memory and guest physical memory. Host machine memory must be at least slightly larger than the combined active memory of the virtual machines on the host. A virtual machine’s memory size must be slightly larger than the average guest memory usage.

Increasing the virtual machine memory size results in more overhead memory usage.

Problem

  • Memory usage is constantly high (94% or greater) or constantly low (24% or less).
  • Free memory consistently is 6% or less and swapping frequently occurs.
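
The problem criteria above can be checked against a series of chart samples. This is a rough Python sketch under stated assumptions: "constantly" is interpreted as "in every sample", "swapping frequently occurs" is reduced to a simple yes/no flag, and the function and constant names are hypothetical.

```python
HIGH_USAGE_PCT = 94.0  # usage at or above this in every sample: constantly high
LOW_USAGE_PCT = 24.0   # usage at or below this in every sample: constantly low
LOW_FREE_PCT = 6.0     # free memory at or below this in every sample


def memory_findings(usage_samples, free_samples=(), swapping=False):
    """Return a list of problem descriptions matching the criteria above."""
    findings = []
    if usage_samples and min(usage_samples) >= HIGH_USAGE_PCT:
        findings.append("memory usage constantly high")
    elif usage_samples and max(usage_samples) <= LOW_USAGE_PCT:
        findings.append("memory usage constantly low")
    if free_samples and max(free_samples) <= LOW_FREE_PCT and swapping:
        findings.append("free memory low and swapping")
    return findings


if __name__ == "__main__":
    print(memory_findings([95.0, 97.5, 94.0]))
    print(memory_findings([12.0, 20.0, 24.0]))
    print(memory_findings([80.0, 85.0], free_samples=[5.0, 4.0], swapping=True))
    print(memory_findings([60.0, 96.0]))
```

A single spike does not count: the last call returns an empty list because usage was only briefly above 94%.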

Cause

  • The host probably is lacking the memory required to meet the demand. The active memory size is the same as the granted memory size, which results in memory resources that are not sufficient for the workload. Granted memory is too much if the active memory is constantly low.
  • Host machine memory resources are not enough to meet the demand, which leads to memory reclamation and degraded performance.

Solution

  • Verify that VMware Tools is installed on each virtual machine. The balloon driver is installed with VMware Tools and is critical to performance.
  • Verify that the balloon driver is enabled. The VMkernel regularly reclaims unused virtual machine memory by ballooning and swapping. Generally, this does not impact virtual machine performance.
  • Reduce the memory space on the virtual machine, and correct the cache size if it is too large. This frees up memory for other virtual machines.
  • If the memory reservation of the virtual machine is set to a value much higher than its active memory, decrease the reservation setting so that the VMkernel can reclaim the idle memory for other virtual machines on the host.
  • Migrate one or more virtual machines to a host in a DRS cluster.
  • Add physical memory to the host.

More information:

Interpreting esxtop 4.1 Statistics: http://communities.vmware.com/docs/DOC-11812

Yellow-Bricks Blog about ESXTOP: http://www.yellow-bricks.com/esxtop/

Use Hot-Add functionality to resolve identified Virtual Machine CPU and memory performance issues

Official Documentation:

vSphere Virtual Machine Administration, Chapter 8 “Configuring Virtual Machines”, Section “Change CPU Hot Plug Settings in the … Client”, page 94.
The CPU hot plug option lets you add CPU resources for a virtual machine while the machine is powered on.

The following conditions apply:

  • For best results, use hardware version 8 virtual machines.
  • Hot-adding multicore virtual CPUs is supported only with hardware version 8 virtual machines.
  • Not all guest operating systems support CPU hot add. You can disable these settings if the guest is not supported.
  • To use the CPU hot-add feature with hardware version 7 virtual machines, set Number of cores per socket to 1.
  • Adding CPU resources to a running virtual machine with CPU hot plug enabled disconnects and reconnects all USB passthrough devices connected to that virtual machine.

Prerequisites

Verify that the virtual machine meets the following conditions before you change the setting:

  • VMware Tools is installed. This condition is required for hot plug functionality with Linux guest operating systems.
  • The virtual machine has a guest operating system that supports CPU hot plug.
  • The virtual machine is using hardware version 7 or later.
  • The virtual machine is powered off.
  • Required privileges: Virtual Machine.Configuration.Settings on the virtual machine
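
The conditions and prerequisites above can be collected into one pre-flight check before you open Edit Settings. The Python sketch below only encodes the rules listed in this section. The function name and parameters are hypothetical, it does not talk to vCenter Server, and the privilege check is left out.

```python
def cpu_hot_plug_blockers(hw_version, powered_on, guest_supported,
                          cores_per_socket=1, linux_guest=False,
                          tools_installed=True):
    """Return the reasons why CPU hot plug cannot be enabled or used as-is."""
    blockers = []
    if hw_version < 7:
        blockers.append("hardware version must be 7 or later")
    if powered_on:
        blockers.append("power off the virtual machine to change the setting")
    if not guest_supported:
        blockers.append("guest operating system does not support CPU hot plug")
    if linux_guest and not tools_installed:
        blockers.append("VMware Tools is required for Linux guests")
    if hw_version == 7 and cores_per_socket != 1:
        blockers.append("hardware version 7 requires 1 core per socket")
    return blockers


if __name__ == "__main__":
    # Hardware version 8, powered off, supported guest: nothing in the way.
    print(cpu_hot_plug_blockers(8, powered_on=False, guest_supported=True))
    # Hardware version 7 with 2 cores per socket, still powered on.
    print(cpu_hot_plug_blockers(7, powered_on=True, guest_supported=True,
                                cores_per_socket=2))
```

An empty list means none of the listed conditions stands in the way. It says nothing about whether the guest operating system will actually bring the new CPU online.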

Procedure

  1. In the vSphere Client inventory, right-click the virtual machine and select Edit Settings.
  2. Click the Options tab and under Advanced, select Memory/CPU Hotplug.
  3. Change the CPU Hot Plug setting.
  4. Click OK to save your changes and close the dialog box.

What to do next

You can now add CPUs to the powered-on virtual machine.

More information:

Jason Boche blog: vSphere Memory Hot Add/CPU Hot Plug

Pete net blog: VMware vSphere Hot Add and Hot Plug

VMware vSphere official documentation

VMware vSphere Basics Guide
vSphere Installation and Setup Guide
vSphere Upgrade Guide
vCenter Server and Host Management Guide
vSphere Virtual Machine Administration Guide
vSphere Host Profiles Guide
vSphere Networking Guide
vSphere Storage Guide
vSphere Security Guide
vSphere Resource Management Guide
vSphere Availability Guide
vSphere Monitoring and Performance Guide
vSphere Troubleshooting
VMware vSphere Examples and Scenarios Guide

Disclaimer.
The information in this article is provided “AS IS” with no warranties, and confers no rights. This article does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my opinion.

Marco

Marco works for ViaData as a Senior Technical Consultant. He has over 15 years of experience as a system engineer and consultant, specializing in virtualization. VMware VCP4, VCP5-DC & VCP5-DT. VMware vExpert 2013, 2014, 2015 & 2016. Microsoft MCSE & MCITP Enterprise Administrator. Veeam VMSP, VMTSP & VMCE.