Most performance issues I have worked on turned out to be basic issues with HANA / Linux parameters and the configuration of the hypervisor. Virtualization is a frequently chosen architecture in HANA environments, regardless of whether the systems are big or small. If you want to ensure good performance and learn how to check it in your environment, keep on reading.
Most systems run on VMware, but more and more systems are planned for or already running on IBM Power. I am only talking about on-premise installations here, because you cannot really influence the cloud offerings of hyperscalers like Azure (Hyper-V), AWS (own KVM) or GCP (own KVM). For the biggest instances there are bare-metal installations, which make the NUMA configuration pretty easy. The HANA application itself is NUMA aware.
NUMA is a good keyword to start with, because it is one of the most ignored and least transparent performance issues. What is NUMA, and why should you pay attention to it when you install HANA on a hypervisor?
NUMA – Non-uniform Memory Access
“NUMA is a method of configuring a cluster of microprocessors in a multiprocessing system so that they can share memory locally, improving performance and the ability of the system to be expanded.”
=> OK, that doesn't really sound self-explanatory, does it?
Let’s take an example with a picture:
The performance impact depends on the CPU type, the vendor (topology) and the number of sockets.
This means that a local memory access is typically 2-3 times faster than a remote one. But how can you influence the placement of a VM (virtual machine)?
The hypervisor should normally take care of this. But in special cases, such as big HANA VMs or wrong default settings of the VM, you have to adjust it manually. This should be done for all productive HANA servers. Normally the person who installed the HANA should be aware of this, but experience shows that in about 90% of the installations nobody cares about it.
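If you want to see the NUMA topology and the node distances from inside the Linux guest yourself, here is a small sketch (assuming the numactl/numastat tools are installed; the indexserver process name is just an illustration and depends on your installation):
# on the Linux shell of the HANA server
# show the NUMA nodes, their memory sizes and the relative access distances
$ numactl --hardware
# show the per-node memory allocation of the HANA indexserver process
$ numastat -p hdbindexserver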
IBM Power (PPC)
On IBM Power an optimization is pretty easy with the latest HMC versions:
# on ssh shell of the HMC
# Listing of all servers
$ lssyscfg -r sys -F name
# dynamic platform optimizer (DPO) => NUMA optimization
$ lsmemopt -m <pServer name> -r lpar -o currscore
$ lsmemopt -m pserv1 -r lpar -o currscore
lpar_name=hana1,lpar_id=1,curr_lpar_score=100
lpar_name=hana2,lpar_id=2,curr_lpar_score=100
lpar_name=hana3,lpar_id=3,curr_lpar_score=none
lpar_name=hana4,lpar_id=4,curr_lpar_score=100
lpar_name=hana5,lpar_id=5,curr_lpar_score=none
lpar_name=hana6,lpar_id=6,curr_lpar_score=100
lpar_name=hana8,lpar_id=8,curr_lpar_score=32 << improvable LPAR
# on ssh shell of the HMC
# use DPO for optimization
$ optmem -m <Power Server Name> -o start -t affinity -p <name(s) of improvable LPAR(s)>
$ optmem -m pserv1 -o start -t affinity -p hana8
# check running background activities
$ lsmemopt -m <Power Server Name>
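Once DPO has finished, it is worth re-checking the score. A small sketch, assuming the calcscore option is available on your HMC / firmware level (it predicts the score DPO could reach before you actually start it):
# on ssh shell of the HMC
# predicted score that DPO could reach (option availability depends on HMC / firmware level)
$ lsmemopt -m pserv1 -r lpar -o calcscore
# re-check the current score after the optimization has finished
$ lsmemopt -m pserv1 -r lpar -o currscore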
VMware
On VMware this is trickier than on IBM Power, because the sizing rules also differ.
With VMware you can use half-socket sharing, but if your VM is bigger than one NUMA node / socket, you have to round up and allocate full sockets. This leads to some wasted resources.
Here is a picture from ©VMware:
Every VM that is bigger than one socket is called a 'wide VM'.
Here is one example, which you can also check in your own environment by using the shell on your ESXi host.
Alternatively, I am sure you will find a way to contact me.
Example – remote Memory access / Overprovisioning
####################
ESX host
Intel Xeon E5-2695 v4
18 cores per socket
2 sockets
72 logical CPUs (hyper-threading)
1 TB RAM
####################
HANA sizing:
600 GB RAM
36 vCPUs
Current setup:
768 GB RAM
36 vCPUs
Sizing rules:
1 TB RAM (=> 2 sockets, because one NUMA node has 512 GB and we need more than that)
72 vCPUs
This is currently one of the most common mistakes, which I see in about 60% of all environments: the VM admin is not aware of the SAP HANA sizing rules, and most of them are not aware of the influence their VM settings can have on the topology and the resulting performance. So, pay attention to placement and overprovisioning.
ESX view
groupName groupID clientID homeNode affinity nWorlds vmmWorlds localMem remoteMem currLocal% cummLocal%
vm.78924 58029 0 0 0x3 16 16 73177088 0 100 99
vm.78924 58029 1 1 0x3 16 16 72204288 0 100 100
vm.1237962 76880487 0 0 0x3 16 16 18254012 250242884 6 53
vm.1237962 76880487 1 0 0x3 16 16 267603968 831488 99 66
vm.1237962 76880487 2 0 0x3 4 4 145781060 121605820 54 56
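The output above comes from the ESXi shell. A sketch of how to collect it, assuming the sched-stats tool is available on your ESXi version:
# on the ESXi shell
# per-VM NUMA client view: home node, local vs. remote memory, locality in %
$ sched-stats -t numa-clients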
Here we see an ESX host with two VMs. VM 1237962 is our hdb01 HANA DB, which has 16+16+4 vCPUs (3 sockets), and we can see that it consumes remote memory. Wait a moment – 3 sockets? Our physical server has only 2. Yes, this is possible with VMware, but it creates additional overhead and costs performance. You could even create an 8-socket server within a 2-socket ESX host, but it doesn't make sense in the context of HANA. There are other applications where this feature is useful.
But all of these "virtual sockets" are located on physical socket 0. This leads to overprovisioning of this node, because the other VM additionally uses some of its resources.
nodeID used idle entitled owed loadAvgPct nVcpu freeMem totalMem
0 5408 30591 5356 0 14 52 26703288 536736256
1 1574 34426 926 0 3 16 85939588 536870912
Socket 0 is running 52 vCPUs and socket 1 only 16? It seems this ESX host is a little unbalanced and overprovisioned: node 0 alone has more vCPUs placed on it (52) than it has logical CPUs (36).
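The per-node view above can be collected the same way; again a sketch assuming sched-stats is available:
# on the ESXi shell
# per physical NUMA node: load, assigned vCPUs, free and total memory
$ sched-stats -t numa-pnode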
# on the ESXi shell: extract the NUMA-relevant settings of each VM from its vmware.log
vmdumper -l | cut -d \/ -f 2-5 | while read path; do egrep -oi "DICT.*(displayname.*|numa.*|cores.*|vcpu.*|memsize.*|affinity.*)= .*|numa:.*|numaHost:.*" "/$path/vmware.log"; echo -e; done
DICT numvcpus = "36"
DICT memSize = "786432"
DICT displayName = "hdb01"
DICT sched.cpu.affinity = "all"
DICT sched.mem.affinity = "all"
DICT cpuid.coresPerSocket = "4"
DICT numa.autosize.cookie = "360001"
DICT numa.autosize.vcpu.maxPerVirtualNode = "16"
DICT numa.vcpu.preferHT = "TRUE"
numaHost: NUMA config: consolidation= 1 preferHT= 1
numaHost: 36 VCPUs 3 VPDs 3 PPDs
numaHost: VCPU 0 VPD 0 PPD 0
numaHost: VCPU 1 VPD 0 PPD 0
numaHost: VCPU 2 VPD 0 PPD 0
numaHost: VCPU 3 VPD 0 PPD 0
numaHost: VCPU 4 VPD 0 PPD 0
numaHost: VCPU 5 VPD 0 PPD 0
numaHost: VCPU 6 VPD 0 PPD 0
numaHost: VCPU 7 VPD 0 PPD 0
numaHost: VCPU 8 VPD 0 PPD 0
numaHost: VCPU 9 VPD 0 PPD 0
numaHost: VCPU 10 VPD 0 PPD 0
numaHost: VCPU 11 VPD 0 PPD 0
numaHost: VCPU 12 VPD 0 PPD 0
numaHost: VCPU 13 VPD 0 PPD 0
numaHost: VCPU 14 VPD 0 PPD 0
numaHost: VCPU 15 VPD 0 PPD 0
numaHost: VCPU 16 VPD 1 PPD 1
numaHost: VCPU 17 VPD 1 PPD 1
numaHost: VCPU 18 VPD 1 PPD 1
numaHost: VCPU 19 VPD 1 PPD 1
numaHost: VCPU 20 VPD 1 PPD 1
numaHost: VCPU 21 VPD 1 PPD 1
numaHost: VCPU 22 VPD 1 PPD 1
numaHost: VCPU 23 VPD 1 PPD 1
numaHost: VCPU 24 VPD 1 PPD 1
numaHost: VCPU 25 VPD 1 PPD 1
numaHost: VCPU 26 VPD 1 PPD 1
numaHost: VCPU 27 VPD 1 PPD 1
numaHost: VCPU 28 VPD 1 PPD 1
numaHost: VCPU 29 VPD 1 PPD 1
numaHost: VCPU 30 VPD 1 PPD 1
numaHost: VCPU 31 VPD 1 PPD 1
numaHost: VCPU 32 VPD 2 PPD 2
numaHost: VCPU 33 VPD 2 PPD 2
numaHost: VCPU 34 VPD 2 PPD 2
numaHost: VCPU 35 VPD 2 PPD 2
Here we can see that the mapping of VPDs (virtual proximity domains) to PPDs (physical proximity domains) is 1:1, but there is no physical third socket in this two-socket E5 server.
First, we have a wide VM. This means preferHT should be disabled. Another point is the limit of 16 vCPUs per virtual NUMA node (numa.autosize.vcpu.maxPerVirtualNode = 16), which leads to this 3-socket setup: 36/16 = 2.25, rounded up => 3 virtual nodes.
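A sketch of how the advanced settings of this VM could be adjusted so that the 36 vCPUs end up on two virtual NUMA nodes instead of three. The parameter names are the ones visible in the vmware.log above; the concrete values are assumptions for this example host, have to match your own sizing, and should only be changed with the VM powered off:
numa.vcpu.preferHT = "FALSE"
numa.autosize.vcpu.maxPerVirtualNode = "18"
cpuid.coresPerSocket = "18"
With 18 vCPUs per virtual node, 36 / 18 = 2 virtual nodes, which matches the two physical sockets of this host.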
OK, such numbers are fine, but here are some pictures to illustrate what this means exactly:
In the last picture you can see that 768 GB doesn't fit into 512 GB, so remote memory access is used to satisfy the demand. The other VM should not be spread over two NUMA nodes either. This has negative effects on the HANA performance.
So, in the end you have two options:
◈ Reduce the size of your HANA and resize the VM so that it fits into one NUMA node
◈ Move the second VM away, so that the whole ESX can be used by the HANA VM
It is not allowed to share a socket between a productive HANA VM and another VM (regardless of whether it is an SAP application or not). This also means that overprovisioning is not allowed.
The example shown here is unsupported in several ways. SAP could discontinue support; I haven't heard from customers or colleagues that this has ever happened, but what does often happen is that VMware support gets involved, and you can be pretty sure they will find such a configuration and will only process your issue once you have a supported setup.