Monday, 11 January 2021

HANA – hdbcons first aid hacks

A tool which got more and more power since HANA 2.0 was released is hdbcons. It should only be used by experienced admins and should be tested in non-prod systems, because it can produce system crashes. There are several known issues documented in notes which describe scenarios you should avoid the usage of hdbcons in prod systems.

This blog won’t describe the full functional scope (the central hdbcons note 2222218). I will show some general use cases which are not well known, but should be kept in mind for scenarios I call ‘HANA first aid’. At the end of the blog you will also get an example script for such a troubleshooting situation.

1. General stuff about hdbcons

2. HANA 2.0 SPS04 vs. SPS05 features

3. Disk / Container / Log Management

4. Memory / NUMA

5. HSR

6. Troubleshooting

General stuff about hdbcons

At first just use it if you are sure you don’t get this information by using the SQL way over e.g. HANA Cockpit. It remains as an expert tool. Also if you already used it 100 times. Every installation has its own topology and parameter set which can be different. Please try every command in the non-prod system before you use it in production. However, this just minimizes the risk. It doesn’t eliminate it!

When I should use hdbcons?

◉ good if you don’t have SQL access or the needed roles

◉ secondary site of a HSR

◉ system is unresponsible

All 3 aspects have in common that there is no way to connect via SQL. In HSR scenario (besides active/active = read enabled) this is the case for the secondary site.

As already explained I won’t explain the full scope of the tool, but I will show some hacks how I use it:

On Linux level you can start hdbcons as sidadm:

#shows normal help

hdbcons "help"

#shows all sub commands

hdbcons "all help"

#is a quick search if, you do not know the exact command

hdbcons "all help" | grep -i <search term> -B 10

hdbcons [-p <pid>] [-e <process_name>]

Per default hdbcons will connect to the indexserver process, but you can specify the process ID of another SAP HANA process via "-p <pid>" or the name of another SAP HANA process via "-e <process_name>" (e.g. <process_name> = 'hdbnameserver').

HANA 2.0 SPS04 vs. SPS05 features

As there are no big new features in SPS05 the version are pretty identically. There is only one difference in context of CPU control regarding the LLC called cpuresctrl.

Here is the section overview of SPS05:

ae_tableload – Handle loading of column store tables and columns

all – Print help and other info for all hdbcons commands

authentication – Authentication management.

binarysemaphore – BinarySemaphore management

bye – Exit console client

cd – ContainerDirectory management

cfgreg – Basis Configurator

checktopic – CheckTopic management

cnd – ContainerNameDirectory management

conditionalvariable – ConditionalVariable management

connection – Connection management

context – Execution context management (i.e., threads)

cpuresctrl – Manage cpu resources such as last-level cache allocation

converter – Converter management

crash – Crash management

crypto – Cryptography management (SSL/SAML/X509/Encryption).

csaccessor – Display diagnostics related to the CSAccessor library

ddlcontextstore – Get DdlContextStore information

deadlockdetector – Deadlock detector.

debug – Debug management

distribute – Handling distributed systems

dvol – DataVolume management

ELF – ELF symbol resolution management

encryption – Persistence encryption management

eslog – Manipulate logger on extended storage

event – Event management

exit – Exit console client

flightrecorder – Flight Recorder

hananet – HANA-Net command interface

help – Display help for a command or command list

hkt – HANA Kernal Tracer (HKT) management

indexmanager – Get IndexManager information, especially for IndexHandles

itab – Internaltable diagnostics

jexec – Information and actions for Job Executor/Scheduler

licensing – Licensing management.

log – Show information about logger and manipulate logger

machine – Information about the machine topology

mm – Memory management

monitor – Monitor view command

mproxy – Malloc proxy management

msl – Mid size LOB management

mutex – Mutex management

numa – Provides NUMA statistics for all columns of a given table, broken down by column constituents like dictionary, data vector and index.

nvmprovider – NVM Provider

output – Command for managing output from the hdbcons

page – Page management

pageaccess – PageAccess management

profiler – Profiler

quit – Exit console client

readwritelock – ReadWriteLock management

replication – Monitor data and log replication

resman – ResourceManager management

rowstore – Row Store

runtimedump – Generate a runtime dump.

savepoint – Savepoint management

semaphore – Semaphore management

servicethreads – Thread information M_SERVICE_THREADS

snapshot – Snapshot management

stat – Statistics management

statisticsservercontroller – StatisticsServer internals

statreg – Statistics registry command

syncprimi – Syncprimitive management (Mutex, CondVariable, Semaphore, BinarySemaphore,

ReadWriteLock)

table – Table Management

tablepreload – Manage and monitor table preload

trace – Trace management

tracetopic – TraceTopic management

transaction – Transaction management

ut – UnifiedTable Management

version – Version management

vf – VirtualFile management

x2 – get X2 info

Disk / Container / Log Management

One of the most ask questions is about the reorg of the persistence layer. Here we have an easy command to get the information we need:

hdbcons "dvol info"

## Start command at: 2021-01-09 13:08:29.730

DataVolumePartition #0 (/hana/data/D02/mnt00001/hdb00004.00004/)

  size= 2593019396096

  used= 2424372674560

NOTE: We DO NOT recommend reclaiming this DataVolume. Reclaiming may take considerable time and free up space that will have to be reallocated in subsequent savepoint cycles anyway.

Here is also the recommendation if you should reorg/reclaim or not.

If we need more details how the data volume divided, we need information about the structure. The persistence is structured in container. We have several container types like “Unified Table”, “VirtualFileLOB” or “MidSizeContainerLOB”. Too many LOB files can be a performance overhead for the startup procedure, the size of the PageCache (Memory overhead) or issues regarding the garbage collection. This means you should always be aware of the size.

#global statistics to get an overview of non-temp and temp containers

hdbcons "cd gstat"

## Start command at: 2021-01-08 14:54:52.570

Sum of non-temporary containers:

FreeUndoFile            :     7862

VirtualFileLOB          : 49879170

VirtualFileCLOB         :       25

VirtualFileNCLOB        :   377153

VirtualFileMain         :   287842

VirtualFileMetaData     :        3

liveCache               :        3

UnifiedTable            :    89115

MidSizeLobContainerLOB  :       22

MidSizeLobContainerCLOB :        1

MidSizeLobContainerNCLOB:       27

FixedSizeEntry          :        1

Sum of temporary containers:

raw size of ContainerDirectory      :  92 MB, page count:   370

raw size of LOB ContainerDirectory  :  19 GB, page count: 79951

raw size of MS LOBContainerDirectory: 256 kB, page count:     1

raw size of UNDO ContainerDirectory : 313 MB, page count:  1252

raw size of temp ContainerDirectory :   0  B, page count:     0

raw size of FileIDMap container     :   4 GB, page count: 17885

raw size of LOB FileIDMap container :  96 MB, page count:   384

#query all containers

hdbcons "cd stat -a"

## Start command at: 2021-01-09 14:02:10.059

UnifiedTable            : count=    89115, sum page count=   1156561, sum raw size= 1533 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

VirtualFileLOB          : count= 49879170, sum page count= 138155123, sum raw size=  593 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

MidSizeLobContainerLOB  : count=       22, sum page count=    150284, sum raw size=    9 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

VirtualFileNCLOB        : count=   378285, sum page count=    553743, sum raw size=    4 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

VirtualFileMain         : count=   287842, sum page count=    309886, sum raw size=    3 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

MidSizeLobContainerNCLOB: count=       27, sum page count=     13859, sum raw size=  866 MB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

FreeUndoFile            : count=     7954, sum page count=      7952, sum raw size=  108 MB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

VirtualFileMetaData     : count=        3, sum page count=        17, sum raw size=    4 MB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

liveCache               : count=        3, sum page count=         2, sum raw size=  512 kB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

VirtualFileCLOB         : count=       26, sum page count=        52, sum raw size=  208 kB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

MidSizeLobContainerCLOB : count=        1, sum page count=         1, sum raw size=   64 kB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

FixedSizeEntry          : count=        1, sum page count=         0, sum raw size=    0  B, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

time needed to collect data from container directories: 4.78 min

[OK]

## Finish command at: 2021-01-09 14:06:57.154 command took: 4.785 min

#query a specific type

hdbcons "cd count -a -t <type>"

hdbcons "cd count -a -t VirtualFileLOB​"

## Start command at: 2021-01-09 13:24:13.367

LOBContainerDirectory       : count= 49879170, sum page count= 138155123, sum raw size= 593 GB, deleted count= 0, sum deleted page count= 0, sum deleted raw size= 0  B

#find the top <n> objects

hdbcons "cd top -l <n>"

hdbcons "cd top -l 10"

## Start command at: 2021-01-09 13:45:55.874

Top Raw Size:

  1: size: 47 GB, cID: 0xfe03235f24, I\1\SAPREQ\1\EDID4\1\\1\D\1\ || $container$

  2: size: 42 GB, cID: 0xfe03235f11, I\1\SAPREQ\1\CDPOS\1\\1\D\1\ || $container$

  3: size: 41 GB, cID: 0xfe03235f18, I\1\SAPREQ\1\VBFA\1\\1\D\1\ || $container$

  4: size: 39 GB, cID: 0xfe03235f13, I\1\SAPREQ\1\VRPMA\1\\1\D\1\ || $container$

  5: size: 37 GB, cID: 0xfe03235f15, I\1\SAPREQ\1\BALDAT\1\\1\D\1\ || $container$

  6: size: 36 GB, cID: 0xfe03235f14, I\1\SAPREQ\1\VBRP\1\\1\D\1\ || $container$

  7: size: 32 GB, cID: 0xfe03235f0d, I\1\SAPREQ\1\CE40010_ACCT\1\\1\D\1\ || $container$

  8: size: 30 GB, cID: 0xfe03235f0e, I\1\SAPREQ\1\BSIM\1\\1\D\1\ || $container$

  9: size: 26 GB, cID: 0xfe7cd37a90, I\1\SAPREQ\1\_SYS_SPLIT_BSEG~7\1\\1\D\1\ || $container$

 10: size: 26 GB, cID: 0xfe7cd37a91, I\1\SAPREQ\1\_SYS_SPLIT_BSEG~8\1\\1\D\1\ || $container$

With this commands you will get all details you need for a detailed analysis

From data volume you will go further to the log volume.

There are min. 2 log segments. One in state free and one in state writing. There can be more till the log volume is full.

#show state of each log segments

hdbcons "log info" | grep STATE

#to show / count all log segments with state free

hdbcons "log info" | grep STATE | grep Free | wc -l

#to show / count all log segments NOT in state free

hdbcons "log info" | grep STATE | grep -v Free | wc -l

Memory / NUMA

One of the important function is to get an overview over all tenants. The easiest way – in my opinion – is the inter process memory management (IPMM). This is also an important part of every RTE dump. IPMM includes PAL, TPAL, AB and U. What do these acronyms mean?

#create IPMM report

hdbcons "mm ipmm"

Inter Process Memory Management

IPMM shared memory segment: SAP_IPMM_O048_D02

SId=        Process secondary identifier

AB=         Allocated bytes by allocators for heap and stacks (inclusive allocator cache)

UA=         Unaccounted memory (by unlimited allocators for exceptions etc.), maybe overcommit

U=          Used bytes for heap and stacks (without allocator cache)

FSL=        Free segment list size

FMC=        Free precharge memory cache size

SMAS/SMDS=  System memory traffic information (accumulation of allocations/deallocations)

PAL=        Process allocation limit

TPAL=       Temp process allocation limit

Blocked=    Memory (TPAL!) blocked/reserved for this process

W=          Weight of process

[0] PID=250869 (state=S), SId=640483232, compactors active, alive, process name: hdbnameserver

    AB=166.63gb (178927042560b), UA=0b, U=7.08gb (7612028948b), FSL=6gb (6442450944b), FMC=0b, SMAS=1.52tb (1678410842112b), SMDS=1.36tb (1499367014400b), PAL=7.74tb (8516251680768b), TPAL=168.45gb (180875427840b), W=100

[1] PID=251734 (state=S), SId=640483233, compactors active, alive, process name: hdbcompileserver

    AB=4.57gb (4913692672b), UA=0b, U=677.12mb (710020245b), FSL=0b, FMC=0b, SMAS=214.09gb (229882134528b), SMDS=209.38gb (224827015168b), PAL=7.74tb (8516251680768b), TPAL=4.57gb (4913692672b), Blocked=1.02gb (1102446592b) (perm=1102446592B, temp=0), W=100

[2] PID=251737 (state=S), SId=640483234, compactors active, alive, process name: hdbpreprocessor

    AB=3.45gb (3715432448b), UA=0b, U=670.25mb (702814346b), FSL=0b, FMC=0b, SMAS=185.04gb (198688833536b), SMDS=181.46gb (194850193408b), PAL=7.74tb (8516251680768b), TPAL=3.45gb (3715432448b), W=100

[3] PID=280438 (state=S), SId=640483247, compactors active, alive, process name: hdbxsengine

    AB=25.36gb (27232108544b), UA=0b, U=1.96gb (2105583430b), FSL=0b, FMC=0b, SMAS=304.03gb (326460309504b), SMDS=278.56gb (299110891520b), PAL=7.74tb (8516251680768b), TPAL=25.60gb (27491827712b), W=100

[4] PID=279916 (state=R), SId=640483246, compactors active, alive, process name: hdbindexserver

    AB=2.14tb (2361270599680b), UA=0b, U=1005.04gb (1079160726009b), FSL=198.25gb (212869316608b), FMC=0b, SMAS=9.21tb (10130337824768b), SMDS=7.06tb (7768950571008b), PAL=7.74tb (8516251680768b), TPAL=2.17tb (2394008911872b), W=100

[5] PID=266322 (state=S), SId=640483237, compactors active, alive, process name: hdbwebdispatcher

    AB=4.48gb (4814077952b), UA=0b, U=934.67mb (980079007b), FSL=0b, FMC=0b, SMAS=232.21gb (249344753664b), SMDS=227.61gb (244402880512b), PAL=7.74tb (8516251680768b), TPAL=4.48gb (4814077952b), W=100

[6] PID=266887 (state=S), SId=640483238, compactors active, alive, process name: hdbindexserver

    AB=4.52tb (4977854775296b), UA=0b, U=3.24tb (3569697000331b), FSL=167.25gb (179583320064b), FMC=0b, SMAS=119.61tb (131515736981504b), SMDS=115.08tb (126537765879808b), PAL=7.74tb (8516251680768b), TPAL=4.63tb (5092921180160b), W=100

[7] PID=266890 (state=S), SId=640483239, compactors active, alive, process name: hdbxsengine

    AB=29.97gb (32184336384b), UA=0b, U=2.08gb (2239677798b), FSL=0b, FMC=0b, SMAS=355.90gb (382152015872b), SMDS=325.82gb (349850238976b), PAL=7.74tb (8516251680768b), TPAL=30.01gb (32224116736b), W=100

 AB=64.69gb (69462654976b), UA=0b, U=800.17mb (839044495b), FSL=5.75gb (6174015488b), FMC=2.50gb (2684420096b), SMAS=211.13gb (226702262272b), SMDS=146.30gb (157099687936b), PAL=7.74tb (8516251680768b), TPAL=64.74gb (69514493952b), W=100

GLOBAL_MAX_ALLOCATION_LIMIT=7.74tb (8516251680768b)

SHARED_MEMORY=139.59gb (149889613824b)

OOM_RESERVATION_ALLOCATORS=96.18mb (100859904b)

CODE_SIZE=2.44gb (2623602688b)

IPMM_RESERVATION=0b

Sum TPAL (including shm, code size and IPMM reservation but not IPMM OOM reservation)=7.74tb (8516117463040b) (cached sum=7.74tb (8516117463040b)), emergency reservation=128mb (134217728b)

#processes=18, weight sum=1800, cached weight sum=1800, #checks for cached allocation limit sum=1457242, #terminated processes since last reset=9

Early exits are currently allowed.

Last reclaim memory started to wait for IPMM lock since 2021-01-09 15:49:28.852.

Last reclaim memory got IPMM lock at 2021-01-09 15:49:28.852.

User ids respected in shm calculation: [17286]

Parallel memory reclaim tasks are activated.

This is an opt build.

Peak used memory: 5.50tb (6057423072499b) with corresponding values (Heap=5.37tb (5906923121907b), Shared=137.71gb (147876347904b), Code=2.44gb (2623602688b)).

Peak heap memory: 5.37tb (5906923121907b). Peak shared memory: 221.46gb (237802225664b). Peak code size: 2.44gb (2623602688b)

Important is that the sum of all PAL are lower than the GAL. With SAP HANA 2 SPS03, the free memory cache (FMC) was introduced in SAP HANA. This feature enables HANA to efficiently reuse the already allocated memory.

Some rules:

AB > U

AB >= TPAL

PAL > AB

PAL = <service>.ini -> [memorymanager] -> allocationlimit

GAL = global.ini -> [memorymanager] -> global_allocation_limit

You want some details about the NUMA topology? No problem:

#Virtual Memory allocations by HANA Memory Manager

hdbcons "mm numa -v"

## Start command at: 2021-01-09 15:46:34.490

*************************************************

               Virtual Memory allocations by HANA Memory Manager

*************************************************

Total number of NUMA Nodes = 8

 Target NODE   #Allocs   #Deallocs   Alloc size (MB)  Fallback size (MB)   Dealloc size (MB)

------------   -------   ---------   ---------------  ------------------   -----------------

           0      1243         579         929401.88                0.00           696546.06

           1      1569         723         878001.88                0.00           611223.69

           2      1804         671         981302.19                0.00           604844.62

           3      1980         827        1152106.75                0.00           766656.25

           4      1362         881         894711.38                0.00           715093.94

           5      2645        1541        2286683.88                0.00          1855888.88

           6      1814        1268        1080738.38                0.00           889302.94

           7      1716         920        1344334.12                0.00          1008572.25

 INTERLEAVED         0           0              0.00                  --                0.00

NOTE: Fallback schemes are used if 'mbind' to target node fails!

#Physical Memory allocations by OS (resident pages)

hdbcons "mm numa -p"

## Start command at: 2021-01-09 15:47:26.270

***********************************************

              Physical Memory allocations by OS (resident pages)

***********************************************

Total number of NUMA Nodes = 8

 Target NODE   Local allocation (MB)  Remote allocation (MB)  NUMA HIT Rate(%)

 -----------   ---------------------  ----------------------  ----------------

           0               131055.12                75951.12             63.31

           1               138521.44               100013.12             58.07

           2               193621.62               149436.81             56.44

           3               268244.00                77401.75             77.61

           4                89977.88                69812.50             56.31

           5               320315.00                80008.69             80.01

           6               121304.75                69729.06             63.50

           7               199923.38               134092.31             59.85

 INTERLEAVED                    0.00                      --                --

*********************************************

My system is a test system with remote allocation. In a production environment this musn’t be the case!

HSR

On the secondary site are some details which you can only see there. If you monitor the HSR on the primary site with the python script systemReplicationStatus.py or via HANA Cockpit you only see the data shipping. If this is complete the HSR is running with log shipping. This is correct, but are the systems at this point in time in sync? No, there is a backlog and some steps missing which can be monitored via hdbcons.

#show replay backlog section

hdbcons "replication info" | grep replayBacklog​

 - replayBacklog                   : 0 microseconds

 - replayBacklogSize               : 0 bytes

 - replayBacklogMax                : 13065250401 microseconds

 - replayBacklogSizeMax            : 46759845888 bytes

The backlog should be close to 0 (depending on the sync mode). Only then is it guaranteed that the systems are in the sync!

#Details about the replay steps

hdbcons "replication info" | grep replayStep

 - replayStepStatus                : ReplayStep_None

 - replayStepFinishLogPos          : 0x5254392ff80

 - replayStepFinishLogPosTimestamp : 09.01.2021-16.59.59 (1610211599646137)

 - replayStepStartLogPos           : 0x5254392ff80

 - replayStepStartLogPosTimestamp  : 09.01.2021-16.59.59 (1610211599646137)

 - replayStepCount                 : 338611163

ReplayStep_None means currently the systems are in sync and no step is open

#Full backup size

hdbcons "replication info" | grep FullBackupSize

 - shippedFullBackupSize           : 5829827788800 bytes

 - shippedFullBackupSizeNet        : 5829827788800 bytes (100.00%)

 - shippedLastFullBackupSize       : 5829827788800 bytes

 - shippedLastFullBackupSizeNet    : 5829827788800 bytes (100.00%)

Troubleshooting

The first step when you forcing an issue, create a runtime dump. 

#Generate call stacks for active threads with sql

hdbcons "context l -s -q" | grep -v suppressed

#wait analysis to indentify the activity responsible for the locking

indexmanager waitgraph -w

#Job execution log

jexec logstart -c <events_in_million>

jexec logstop

jexec logprint <file_name>.jexlog

If you want create your own HANA first aid script, just use the hdbcons commands. Here an example (which is not adapted for a multi tenant system):

 #!/bin/sh

c_date=$(date +%F-%H%M%S)

function runtimedump

{

        echo "Create RuntimeDump"

        hdbcons "runtimedump dump"

}

function job_execs

{

        echo "Create Job Execution Log"

        hdbcons "jexec logstart -c 30"

        sleep 10

        hdbcons "jexec logstop"

        hdbcons "jexec logprint "/tmp/${c_date}.jexlog""

}

function waitgraph

{

        echo "Create Waitgraph"

        hdbcons "indexmanager waitgraph -w -o /tmp/waitgraph_${c_date}.dot"

        dot -Tpdf -o /tmp/waitgraph_${c_date}.pdf /tmp/waitgraph_${c_date}.dot

}

function active_threads

{

        echo "Create Active Threads Log"

        hdbcons "context l -s -q" | grep -v suppressed > /tmp/active_threads_${c_date}.log

}

runtimedump

job_execs

waitgraph

active_threads

runtimedump

echo "####################################"

echo "created files:"

echo "/tmp/${c_date}.jexlog"

echo "/tmp/waitgraph_${c_date}.pdf"

echo "/tmp/active_threads_${c_date}.log"

echo "####################################"

With this details the SAP support or every other HANA expert can create a first analysis for the situation. The experts normally need the callstacks, active threads, memory details and more of the point in time situation of the issue. With such an first aid script you can create this information by executing one script. This saves a lot of time and is uniform. Every admin knows it and unexpected system behaviour and you do not know where to start => easy, execute the first aid script. At XLC we have developed a comprehensive HDB first aid tool that saves our customers a lot of time and trouble.

You can read out some views of the passive HSR site via SQL over the primary. If you don’t have SQL access you can read out some (not all!)

#list all available monitor views

hdbcons "monitor list"

#search for a specific term

hdbcons "monitor list" | grep <search_term>

#print content of monitoring views

monitor print -h -n <monitor_view_name>

At the end you should know some hacks which allow you to troubleshoot your HANA system via hdbcons without any SQL access. With this knowledge you can create your own HANA first aid script to collect important information during time of the issue.

No comments:

Post a Comment