As I am doing a fairly deep dive into vSphere resource management, I keep bumping into other concepts which book authors simply assume the reader already knows. This is not always the case, and it certainly wasn’t for me.
For instance, OS memory management (virtual memory vs. physical memory) is something I just needed a refresher on. Another one is NUMA (Non-Uniform Memory Access), which I had heard of now and then but never really got to understand.
I will also provide a quick introduction to the vSphere concepts related to resource management.
So let’s crack on …
RESOURCE PROVIDERS & CONSUMERS
|Entity|Resource Provider|Resource Consumer|Notes|
|---|---|---|---|
|ESXi Host Processes|Yes|Yes|VMware describes the ESXi host as a resource consumer only, but I disagree. The resources are made available by the hardware, so it is fair to say that they are also consumed by the processes the ESXi host itself runs: hostd, vmx, vmm, vxd, etc.|
|ESXi Cluster|Yes|Yes|Once again, I disagree with VMware here: an ESXi cluster is a resource provider|
|Resource Pool|Yes|Yes|A resource pool is a resource consumer of its parent, which could be either the host itself or another resource pool. It is also a resource provider to its child resource pools|
|Virtual Machine(s)|No|Yes|A virtual machine is only ever a consumer of the resources provided by either the host or a resource pool|
SHARES, RESERVATIONS & LIMITS
Resource management is done by means of Shares, Reservations and Limits.
Shares define the relative weight of a specific VM or resource pool among its siblings. The following rules apply:
- Shares configured at the resource pool level vs. shares configured at the VM level – the VM level takes priority
- Shares propagation – ultimately, shares are propagated down to the child VMs or child resource pools, based on the relative weight defined by the actual share value
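To make the "relative weight" idea concrete, here is a minimal sketch of how a contended resource could be divided among siblings and then pushed down one level. The pool names, VM names, share values and the 12000 MHz capacity are all made up for illustration, and `split_by_shares` is a hypothetical helper, not something vSphere exposes. It also assumes every sibling wants more than it gets, since shares only matter under contention.

```python
def split_by_shares(capacity, shares):
    """Divide `capacity` among siblings in proportion to their share values."""
    total = sum(shares.values())
    return {name: capacity * value / total for name, value in shares.items()}


# Hypothetical cluster: 12000 MHz of CPU contended by two sibling resource pools.
pools = split_by_shares(12000, {"Production": 8000, "Test": 4000})

# Each pool then divides its own slice among its child VMs in the same way.
prod_vms = split_by_shares(
    pools["Production"], {"vm-a": 2000, "vm-b": 1000, "vm-c": 1000}
)

print(pools)
print(prod_vms)
```

Note that the absolute share numbers mean nothing on their own: only the ratio between siblings decides who gets what.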
A reservation defines a guaranteed minimum of a resource. The following rules apply:
- By default, no reservations are set
- Once reserved, the resources are taken away from the parent’s total available resources
- When using reservations, it is advisable to always leave a minimum of 10% of resources free
- When making reservations, it is important to always keep HA/DRS requirements in mind, since we are actually taking resources away permanently
Limits – while the system can always allocate more than the reservation, it will *never* allocate more than the limit. By setting a limit, we end up putting a cap on that resource.
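The way a reservation and a limit bracket what a VM can get may be easier to see as a simple clamp. This is only a sketch: `entitlement` is a hypothetical function, the numbers are invented, and it deliberately ignores shares and contention between VMs.

```python
def entitlement(demand, reservation=0, limit=None):
    """Clamp what a VM asks for between its reservation and its limit."""
    granted = max(demand, reservation)  # never below the guaranteed minimum
    if limit is not None:
        granted = min(granted, limit)   # never above the cap
    return granted


# Hypothetical VM with a 1000 MB reservation and a 4000 MB limit.
low = entitlement(500, reservation=1000, limit=4000)    # floor applies
mid = entitlement(2500, reservation=1000, limit=4000)   # demand is honoured
high = entitlement(6000, reservation=1000, limit=4000)  # cap applies
unlimited = entitlement(6000)                           # defaults: no reservation, no limit

print(low, mid, high, unlimited)
```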
RESOURCE POOLS
A resource pool is a logical entity which allows resources to be allocated and managed in a hierarchical fashion. Resource pools provide ease of management and scalability.
- The first resource pool created by the administrator will be a child of the root resource pool
- A resource pool could be a parent to another resource pool
VMOTION
The reason I’m including vMotion as an important concept in resource management is very simple:
vMotion is the feature which allows a running virtual machine to be migrated from one host to another. It is not a resource management feature in itself, but it is the mechanism DRS uses to move virtual machines between hosts based on resource availability. I ought to mention that HA (High Availability) does not rely on vMotion: it restarts virtual machines on another host after a failure.
DRS
DRS stands for Distributed Resource Scheduler. It is a process which constantly evaluates resource usage within a vSphere cluster. Depending on how it’s configured, it will either only provide recommendations or also act on them to balance resource usage within the cluster. For instance, it could automatically migrate a virtual machine between two ESXi hosts using vMotion.
NUMA
NUMA stands for Non-Uniform Memory Access (or Architecture, in other references). It is a memory architecture supported by selected Intel and AMD processors which allows a CPU much faster access to part of the memory. This is made possible by giving each CPU/core its own memory bank for more direct, local access.
There is quite a lot of information on NUMA out there and I do encourage you to read at least some of it.
Before getting into the next topic, let’s look at a quick example referencing the diagram below:
Above we have Application-1 running multiple threads. Each thread runs on a specific CPU/core. For example, the cyan thread is running on CORE-1, the purple thread on CORE-4, and so on.
Although each thread is running on a specific CPU/core, we can see that its data in RAM could be spread across memory banks linked to other CPUs/cores. For instance, the purple thread is using data located in MEM-2 (remote memory access) and MEM-4 (local memory access). The orange thread, however, is accessing memory located only in MEM-3 (local memory access).
We can draw two important conclusions:
- local memory access will be very fast
- remote memory access will be slower, because the request has to reach a memory bank attached to another CPU/core, which adds latency
It becomes clear that how much NUMA helps depends on the nature of the workload: the more of a thread’s data sits in local memory, the bigger the benefit.
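A bit of arithmetic shows why the mix of local and remote accesses matters. The sketch below uses made-up latencies (80 ns local, 140 ns remote) and assumes the purple thread splits its accesses evenly between MEM-4 and MEM-2; neither figure comes from a real measurement.

```python
def average_latency_ns(local_fraction, local_ns=80.0, remote_ns=140.0):
    """Expected memory latency when `local_fraction` of accesses hit local memory."""
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Orange thread: everything lives in MEM-3, so every access is local.
orange = average_latency_ns(1.0)

# Purple thread: assumed half local (MEM-4) and half remote (MEM-2).
purple = average_latency_ns(0.5)

print(f"orange: {orange} ns, purple: {purple} ns")
```

With these assumed numbers, the purple thread pays noticeably more per access on average than the orange one, purely because of where its data happens to live.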
In very generic terms, the OS manages two lists: a free-list and an allocated-list. When an application makes a call to allocate memory, the relevant blocks of addresses are removed from the free-list and added to the allocated-list. The inverse happens when the OS frees up memory. The exact algorithm varies from one operating system to another.
If you fancy a headache maybe you should read more here.
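The two-list idea can be sketched in a few lines. This is a toy model only: `BlockAllocator`, the application names and the eight-block memory are invented for illustration, and real kernels use far more elaborate structures.

```python
class BlockAllocator:
    """Toy allocator that tracks memory blocks with a free-list and an allocated-list."""

    def __init__(self, total_blocks):
        self.free_list = list(range(total_blocks))
        self.allocated = {}  # block number -> owning application

    def allocate(self, owner, count):
        if count > len(self.free_list):
            raise MemoryError("not enough free blocks")
        # Move the first `count` blocks from the free-list to the allocated-list.
        blocks, self.free_list = self.free_list[:count], self.free_list[count:]
        for block in blocks:
            self.allocated[block] = owner
        return blocks

    def free(self, blocks):
        # The inverse: blocks leave the allocated-list and rejoin the free-list.
        for block in blocks:
            del self.allocated[block]
            self.free_list.append(block)


mem = BlockAllocator(8)
a = mem.allocate("app-1", 3)
b = mem.allocate("app-2", 2)
mem.free(a)
c = mem.allocate("app-2", 4)

print(a, b, c, mem.free_list)
```

Even in this tiny run the last allocation ends up with blocks that are not next to each other, which is a small preview of the fragmentation discussed further down.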
In order to understand how ESXi manages the memory allocated to all virtual machines, I thought it would be a good idea to revisit how memory management is done within the OS.
Of course there will be slight variations between the different OS kernels (MS Windows, Linux, Mac), which is why I have focused mainly on the generic concepts. Having said that, I do believe the information that follows will give me (and you) more than enough of a foundation to, at a later stage, take vSphere memory management to that next level!
I have tried to sum it up in one diagram, and I explain the concepts further:
- There are two applications running within the OS – Application Green & Application Blue
- Physical Memory is the total amount of memory available to the OS and the applications running within it; white squares represent free memory
- Application Memory represents the *contiguous* memory space given to each application – in our case, Blue and Green
- The swapping device is represented by the orange hard-disk icon. Once specific memory utilisation thresholds are reached, the OS will start copying idle memory content to disk in order to free up physical RAM for other, more active, applications – we call this OS Level Swapping
- Memory fragmentation is a phenomenon which, sooner or later, will occur in any system. Notice how Application Blue gets the physical blocks labelled 0, 7, 10. Similarly, Application Green gets the blocks labelled 1, 2, 4, 6, 8, 9, 11, 12, 14. The OS will always try to keep these blocks as contiguous as possible, though in most cases they will *not* be contiguous!
- Although the allocated memory blocks in RAM (physical memory) are not contiguous, they are presented to the requesting application as contiguous. In the diagram above you can see how each application “sees” its own contiguous address space: [Green: 0-8] and [Blue: 0-2]. This is why the Application Memory blocks are also called Virtual Memory blocks
- To “translate” between the virtual and physical memory space, the OS creates mappings between the two – these mappings are held in page-tables
- To speed up access to the page-tables, a dedicated cache is used – this is the Translation Lookaside Buffer (TLB). The Informit.com site provides more info on this here.
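The translation step and the role of the TLB can be sketched as follows. The block numbers are the ones from the diagram; everything else is an assumption made for illustration: the 4096-byte page size, the idea that virtual page N maps to the N-th block in each list, and modelling the TLB as a plain dictionary with hit/miss counters.

```python
PAGE_SIZE = 4096  # assumed page size, in bytes

# Page-tables: index = virtual page number, value = physical block number.
# Block numbers come from the diagram; the ordering is an assumption.
PAGE_TABLES = {
    "Blue": [0, 7, 10],
    "Green": [1, 2, 4, 6, 8, 9, 11, 12, 14],
}

tlb = {}  # cache: (application, virtual page) -> physical block
stats = {"hits": 0, "misses": 0}


def translate(app, virtual_address):
    """Translate an application's virtual address into a physical address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    key = (app, page)
    if key in tlb:
        stats["hits"] += 1
    else:
        stats["misses"] += 1
        tlb[key] = PAGE_TABLES[app][page]  # slow path: look up the page-table
    return tlb[key] * PAGE_SIZE + offset


first = translate("Blue", 1 * PAGE_SIZE + 20)   # Blue's virtual page 1 -> block 7
second = translate("Blue", 1 * PAGE_SIZE + 99)  # same page again, served by the TLB
third = translate("Green", 8 * PAGE_SIZE)       # Green's virtual page 8 -> block 14

print(first, second, third, stats)
```

Both applications use virtual page numbers starting at 0, yet they never collide, because each one is translated through its own page-table.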