1. What are savepoints?
◉ Savepoints are required to synchronize in-memory changes with the persistence layer on disk. All modified pages of the row store and column store are written to disk during a savepoint.
◉ Each SAP HANA host and service has its own savepoints.
◉ The data belonging to a savepoint represents a consistent state of the data on disk and remains untouched until the next savepoint operation has been completed.
2. When is a savepoint triggered?
Savepoint interval (automatic)
During normal operation, savepoints are triggered automatically when a predefined time since the last savepoint has passed. The length of the time interval between two consecutive savepoints can be controlled with the following parameter:
global.ini -> [persistence] -> savepoint_interval_s
Its default value is 300, so savepoints are taken in intervals of 300 seconds (5 minutes).
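For example, the current value can be checked and adjusted as follows (a minimal sketch; the 'SYSTEM' layer and the value of 300 are just examples):
-- check the current setting (assumes the standard M_INIFILE_CONTENTS monitoring view)
SELECT FILE_NAME, SECTION, KEY, VALUE FROM M_INIFILE_CONTENTS WHERE SECTION = 'persistence' AND KEY = 'savepoint_interval_s';
-- change the interval on SYSTEM layer and apply it immediately
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('persistence', 'savepoint_interval_s') = '300' WITH RECONFIGURE;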
System command (manual)
The following command can be used to execute a savepoint manually:
ALTER SYSTEM SAVEPOINT
Soft shutdown
A soft shutdown invokes a savepoint before the services are stopped.
A hard shutdown doesn’t trigger a savepoint. This can increase the subsequent restart time.
Backup
A global savepoint is performed before a data backup is started.
A savepoint is written after the backup of a specific service is finished.
Startup
After a consistent database state is reached during startup, a savepoint is performed.
Snapshots
Snapshots are savepoints that are preserved for longer-term use, so they are not overwritten by the next savepoint.
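As a sketch, a snapshot can be created and later closed with the following statements (the comment text and external backup ID are placeholders; the BACKUP_ID has to be taken from the backup catalog):
-- prepare a database snapshot
BACKUP DATA CREATE SNAPSHOT COMMENT 'example storage snapshot';
-- after the storage copy is done, confirm the snapshot ...
BACKUP DATA CLOSE SNAPSHOT BACKUP_ID <backup_id> SUCCESSFUL 'external-backup-id';
-- ... or abandon it
BACKUP DATA CLOSE SNAPSHOT BACKUP_ID <backup_id> UNSUCCESSFUL 'reason';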
3. Helpful Views
View | Details
M_SAVEPOINT_STATISTICS | Global savepoint information per host and service
M_SAVEPOINTS | Detailed information for individual savepoints
M_SERVICE_THREADS, M_SERVICE_THREAD_SAMPLES, HOST_SERVICE_THREAD_SAMPLES | As of SAP HANA SPS 10, savepoint details are logged for THREAD_TYPE = 'PeriodicSavepoint'
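For example, the most recent savepoints of a service can be listed like this (a sketch based on the M_SAVEPOINTS columns referenced in this post; further columns are available):
-- last savepoints with their blocking / critical phase times
SELECT TOP 20 HOST, PORT, START_TIME, DURATION, CRITICAL_PHASE_WAIT_TIME, CRITICAL_PHASE_DURATION
FROM M_SAVEPOINTS
ORDER BY START_TIME DESC;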
4. Helpful SQL Scripts
SAP Note 1969700 – SQL statement collection for SAP HANA
SQL statement | Details
SQL: "HANA_IO_Savepoints" | Detailed information for individual savepoints
SQL: "HANA_IO_Snapshots" | Snapshot information
5. Blocking Phase
The majority of the savepoint is performed online without holding a lock, but the finalization of the savepoint requires a lock. This step is called the blocking phase of the savepoint. It consists of two major subphases:
Sub phase | Thread detail | Description
WaitForLock | enterCriticalPhase(waitForLock) | Before the critical phase is entered, a ConsistentChangeLock needs to be acquired by the savepoint. If this lock is held by other threads / transactions, the duration of this phase increases. At the same time, all other modifications on the underlying table such as INSERT, UPDATE or DELETE are blocked by the savepoint via the ConsistentChangeLock.
Critical | processCriticalPhase | Once the ConsistentChangeLock is acquired, the actual critical phase is entered and the remaining I/O writes are performed in order to guarantee a consistent set of data on disk. During this time other transactions are not allowed to perform changes on the underlying table and are blocked by the ConsistentChangeLock.
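The phase a running savepoint is currently in can, for example, be observed via the thread views listed above (a sketch; THREAD_DETAIL shows values such as enterCriticalPhase(waitForLock) or processCriticalPhase):
-- active savepoint threads and the phase they are currently in
SELECT HOST, PORT, THREAD_ID, THREAD_STATE, THREAD_DETAIL, DURATION
FROM M_SERVICE_THREADS
WHERE THREAD_TYPE = 'PeriodicSavepoint';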
6. Typical savepoint issues analysis
Symptoms | Thread detail | Description
Long waitForLock phase | enterCriticalPhase(waitForLock) | Long durations of the blocking phase (outside of the critical phase) are typically caused by SAP HANA internal lock contention, e.g. on the ConsistentChangeLock. Starting with Rev. 102 you can configure the following parameter to trigger a runtime dump (SAP Note 2400007) in case waiting to enter the critical phase takes longer than <seconds> seconds: indexserver.ini -> [persistence] -> runtimedump_for_blocked_savepoint_timeout = '<seconds>'
Long critical phase | processCriticalPhase | Delays during the critical phase are often caused by problems in the disk I/O area.
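A sketch of setting this parameter on the indexserver (the threshold of 600 seconds is only an example value):
-- write a runtime dump if entering the critical phase is blocked for more than 600 seconds
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('persistence', 'runtimedump_for_blocked_savepoint_timeout') = '600' WITH RECONFIGURE;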
7. Analyze the runtime dump
The runtime dump file indexserver_<hostname>.30003.rtedump.<timestamp>.savepoint_blocked.trc is triggered by the parameter runtimedump_for_blocked_savepoint_timeout.
You can check the runtime dump from the following aspects.
Find the savepoint thread: its Savepoint Callstack contains "DataAccess::SavepointLock::lockExclusive".
Find the other threads (SQL threads) waiting for the lock: their callstack contains "DataAccess::SavepointSPI::lockSavepoint".
Runtime dump: section [SAVEPOINT_SHAREDLOCK_OWNERS]
In most cases the savepoint hangs because the exclusive lock is blocked by another thread holding a shared lock. This section helps to find which thread is occupying the lock.
SAVEPOINT_SHAREDLOCK_OWNERS | Owners of shared ConsistentChangeLock locks | In case a savepoint is blocked in the waitForLock phase (SAP Note 2100009), the blocking activities can be found in this section.
Example: In the following output, you can see that thread id 298995 holds the shared lock, which blocks the exclusive lock and hangs the savepoint.
[SAVEPOINT_SHAREDLOCK_OWNERS] Owners of shared SavepointLocks: (2017-10-10 11:18:13 112 Local)
96034[thr=298995]: JobWrk0145, TID: 4856, UTID: 1588661641, CID: -1, LCID: 0, parent: 299143, SQLUserName: “”, AppUserName: “”, AppName: “”, ConnCtx: —, StmtCtx: —, type: “JobWorker”, method: “”, detail: “”, command: “” at 0x00007efe63342e88 in ltt::string_base<char, ltt::char_traits<char> >::trim_(unsigned long)+0xb8 at string.hpp:683 (libhdbcs.so)
[OK]
After you have the thread id of the shared lock owner, search for that thread id to find its parent thread. In this example, the parent thread is the following:
107423[thr=299143]: MergedogMerger, TID: 4856, UTID: 1588661641, CID: -1, LCID: 0, parent: 299445, SQLUserName: “”, AppUserName: “”, AppName: “”, ConnCtx: —, StmtCtx: —, type: “MergedogMerger“, method: “”, detail: “3 of 3 table(s): SAPERP:/1LT/VF00094506“, command: “” at 0x00007efe4e645f59 in syscall+0x19 (libc.so.6)
We can conclude that the delta merge of the table /1LT/VF00094506 is holding the shared lock. The next step is to check whether there is an issue with the merge of this table.
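To follow up on the merge, the recent delta merge history of that table can be checked, for example via M_DELTA_MERGE_STATISTICS (a sketch; the schema and table name are taken from the example above):
-- recent delta merges of the affected table
SELECT START_TIME, TYPE, MOTIVATION, EXECUTION_TIME, SUCCESS
FROM M_DELTA_MERGE_STATISTICS
WHERE SCHEMA_NAME = 'SAPERP' AND TABLE_NAME = '/1LT/VF00094506'
ORDER BY START_TIME DESC;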
Runtime dump: section [STATISTICS] M_SAVEPOINTS_
Import the data of this view into Excel and sort by the columns "CRITICAL_PHASE_WAIT_TIME" and "CRITICAL_PHASE_DURATION". In this example the CRITICAL_PHASE_WAIT_TIME is over 10 seconds, which is quite slow. This indicates a problem with the savepoint, specifically with acquiring the exclusive lock. If you also find long durations in "CRITICAL_PHASE_DURATION", this points to a problem in the disk I/O area.
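On a running system, the same sorting can be done directly in SQL instead of Excel (a sketch using the time columns shown in the dump section):
-- savepoints with the longest blocking and critical phases
SELECT TOP 20 HOST, PORT, START_TIME, CRITICAL_PHASE_WAIT_TIME, CRITICAL_PHASE_DURATION
FROM M_SAVEPOINTS
ORDER BY CRITICAL_PHASE_WAIT_TIME DESC, CRITICAL_PHASE_DURATION DESC;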