Have you ever wondered what SAP HANA is doing from the moment you trigger the start or restart until it’s finally available to be used by your application layer?
Or perhaps you have experienced a startup where your system took way longer to become available than you had planned for, and you are trying to understand what it was doing all that time, so that you can avoid it, improve it, or plan for it next time?
Let me try to shed some light on this: the better you understand which factors influence the startup time, the better you can judge and influence it, or at least take it into account.
When starting SAP HANA, as a very first step, the daemon service is started.
The daemon is a lightweight service that sets the environment variables required by the remaining services (nameserver, indexserver, scriptserver, xsengine, …) and then starts them in the correct order.
All services started by the daemon start up mostly independently of each other, following the 9 steps explained in detail below, and synchronize with each other at a certain point:
1. Opening the Volumes
Services which have a persistence (e.g. nameserver, indexserver, …) need to open their volumes.
Before the volumes can be opened, the corresponding filesystems need to be made available to the right servers at the right mountpoints.
Taking care of this task is not the individual services’ job. Instead, if you are using the SAP HANA Fiber Channel Storage Connector, it is the nameserver service that mounts the data and log volumes for all services. If you don’t use it, you need to ensure that this task is accomplished before the individual services attempt to open their volumes.
Opening a volume basically means obtaining a file handle for it and hence we are talking about milliseconds or – at most – seconds. Any delay beyond this magnitude is usually caused by reservation conflicts or other storage-related issues.
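If the mounts are your responsibility, a quick pre-start check can save you from delayed or failed volume opens. Here is a minimal Python sketch, with hypothetical mountpoints modeled after the paths in the trace entries below; adapt the list to your actual landscape:

import os
import sys

# Hypothetical mountpoints, modeled after the paths in the trace entries
# below - adapt to your actual landscape before use.
EXPECTED_MOUNTPOINTS = [
    "/hana/data/SID/mnt00001",
    "/hana/log/SID/mnt00001",
]

def missing_mountpoints(paths):
    """Return all paths that are not currently active mountpoints."""
    return [p for p in paths if not os.path.ismount(p)]

if __name__ == "__main__":
    missing = missing_mountpoints(EXPECTED_MOUNTPOINTS)
    if missing:
        print("Not mounted:", ", ".join(missing))
        sys.exit(1)  # storage not ready - do not start HANA yet
    print("All expected filesystems are mounted.")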
Relevant trace entries:
i ha_fcClient fcClient.py : trying to attach for partition 1, usage type DATA on path /hana/data/SID/mnt00001
i PhysicalPageAcce DataVolumePartitionImpl.cpp : DVolPart: opened datavolume partition #0 with 1 files in '/hana/data/SID/mnt00001/hdb00003/'
or
i PhysicalPageAcce DataVolumeImpl.cpp : Open volume file(s) in "/hana/data/SID/mnt00001/hdb00003/"
2. Loading and Initializing Persistence Structures
Now that the volumes are open, basic low-level persistence structures are loaded from disk.
As these are kept in memory in a structure very similar to the one stored on disk, no time- or resource-consuming conversions are required. Instead, the duration of this step is mainly I/O bound.
More precisely, the following tasks are accomplished in this step:
1. The anchor page and the restart page are loaded. To HANA, these are something like what the master boot record is to an operating system.
2. Converters and container directories are loaded.
3. Undo files are searched for information on transactions that had not been committed before the system shut down. Don’t be fooled by the fact that the traces refer to sessions rather than transactions – it is uncommitted transactions that are meant.
4. HANA persistence statistics are loaded and initialized.
What has been said so far implies that the duration of this step should not vary substantially. However, it can, if one of the contributing factors varies significantly:
◈ The size of the persistence, specifically of the volumes
◈ The number of undo files
◈ The number of disk-based LOBs
◈ The read performance for the DATA volume(s)
Relevant trace entries:
i Savepoint SavepointImpl.cpp : AnchorPage with SPV=123 loaded. Persistence created at 2019-01-01 01:23:34.567890
i Savepoint SavepointImpl.cpp : restartPage loaded with spv=123
i PersistenceManag PersistenceManagerImpl.cpp : loadConverters and markSnapshotFBM – done
i PersistenceManag PersistentSpaceImpl.cpp : loadContainerDirectories - done.
i PMRestart PersistenceSessionRegistry.cpp : Loading 123 open session(s) and 123 history cleanup file(s) finished in 12.3456 seconds;
i PMRestart VirtualFileStatsProxy.cpp : LOB owner statistics initialized from VirtualFile: 45678 CD entries in 123.345 sec
i PMRestart PersistenceSessionRegistry.cpp : Initialized transient structures in 123.456 seconds;
Improvements in HANA 2:
As of HANA 2, packed LOBs are available.
Among many other optimizations, packed LOBs use an optimized persistence structure which allows faster LOB handling during startup and thereby decreases the time spent in this second step.
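If you want to quantify this step for a concrete startup, the durations are right there in the trace entries quoted above. The following Python sketch is one way to sum them up; it assumes the trace has been saved as trace.txt and that the wording matches your revision:

import re

# Patterns modeled after the step-2 trace entries quoted above; the exact
# wording may differ between revisions, so verify against your trace.
PATTERNS = [
    re.compile(r"open session\(s\).*finished in ([\d.]+) seconds"),
    re.compile(r"LOB owner statistics initialized.*in ([\d.]+) sec"),
    re.compile(r"Initialized transient structures in ([\d.]+) seconds"),
]

def step2_seconds(trace_path):
    """Sum the durations reported by the step-2 trace entries."""
    total = 0.0
    with open(trace_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            for pattern in PATTERNS:
                match = pattern.search(line)
                if match:
                    total += float(match.group(1))
    return total

print(f"Step 2 reported roughly {step2_seconds('trace.txt'):.1f} s of work")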
3. Loading or Reattaching the Row Store
While the system is up and running, the row store is kept in shared memory, owned by the operating system process of the corresponding HANA service. During shutdown, this shared memory would normally be released back to the OS, and during the next startup a significant part of the overall startup time would have to be spent on loading the row store into memory again.
How long it takes to load the row store into memory again depends – of course – mainly on the volume of row store tables in the system, but quite often this step turns out to be a main contributor to the overall startup time.
To save this time, there is an optimization which, if certain conditions are met, keeps the shared memory containing the row store allocated even though the corresponding service is restarted.
Long story short: given a handful of conditions are met, it is possible to keep the row store in memory across a service or system shutdown and to reattach it to the owner service at restart, making a reload of the row store from disk during startup redundant.
If any of the conditions for keeping the row store at shutdown or reattaching it at startup isn’t met, it is simply loaded from the DATA volume in this step.
Once the row store has been reloaded or reattached, all changes that had not been committed before shutdown need to be rolled back.
Unlike the time spent on loading and initializing persistence structures, the time spent on loading or reattaching the row store might very well vary significantly from one startup to another, depending on:
◈ The size of the row store
◈ Whether all conditions are met for keeping and reattaching the row store
◈ The read performance of the DATA volume, if the row store cannot be kept over restart
◈ The amount of changes in row store that need to be rolled back
Relevant trace entries:
i Service_Startup SmFastRestart.cc : Loading RowStore segments from Persistency
i Service_Startup CheckpointMgr.cc: 123 RS segments loaded from Persistency in 12.3456 sec (321.12345 MB/s)
or
i Service_Startup SmFastRestart.cc : Reattaching RS shared memory for RowStore segments
i Service_Startup SmFastRestart.cc: 123 RS segments loaded in 0.321 sec (1234.23 MB/s)
i Service_Startup Starter.cc(00422) : RS Undo log collection finished in 1.23 sec.
i Service_Startup Starter.cc(00428) : Consistent RS data image has been prepared.
i Service_Startup Starter.cc(00656) : RS uncommitted version construction finished in 1.23 sec.
i Service_Startup Starter.cc(00548) : All Page lists rebuild finished in 0.123 sec.
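Whether a startup reloaded the row store from disk or reattached the shared memory, and at which throughput, can be read straight from these entries. A minimal Python sketch under the same assumptions as before (trace saved as trace.txt, wording matching your revision):

import re

# Matches both variants quoted above:
#   "... RS segments loaded from Persistency in ... sec (... MB/s)"
#   "... RS segments loaded in ... sec (... MB/s)"
SEGMENTS = re.compile(
    r"(\d+) RS segments loaded(?P<from_disk> from Persistency)?"
    r" in ([\d.]+) sec \(([\d.]+) MB/s\)"
)

def row_store_load(trace_path):
    """Report whether the row store was reloaded or reattached, and how fast."""
    with open(trace_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = SEGMENTS.search(line)
            if m:
                source = "DATA volume" if m.group("from_disk") else "shared memory (reattach)"
                return int(m.group(1)), float(m.group(3)), float(m.group(4)), source
    return None

result = row_store_load("trace.txt")
if result:
    segments, seconds, mb_per_s, source = result
    print(f"{segments} RS segments from {source} in {seconds} s ({mb_per_s} MB/s)")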
4. Garbage-Collecting Versions
Just like the row store had to roll back all changes that have not been committed before shutdown in the previously explained step, HANA needs to do some cleanup for the column store as well.
In this step, the garbage collector cleans up all versions except the most recent one for each column store table. As you know, the garbage collector continuously cleans up versions that are no longer required while the system is up and running. Still, if the system is shut down at a point in time when a transaction is still open and has not been committed for a long time (hours, days, weeks, seen it all, …), there might be some versions left to clean up during startup.
In HANA 1, this step needs to finish completely before HANA can continue with replaying the logs.
The time consumed by this step may vary a lot from one startup to another as it highly depends on:
◈ The existence of transactions that blocked the garbage collection before shutdown
◈ The read performance of the DATA volume for processing the history cleanup files
◈ The HANA release, as this defines whether this step blocks the next one or not
Relevant trace entries:
i Logger PersistenceManagerImpl.cpp : Termination of rollback(s) open in restart/backup savepoint finished in 0 seconds;
i Logger PersistenceManagerImpl.cpp : Garbage collection of history files finished: 12 cleanup files in 12.345 seconds;
Improvements in HANA 2:
With HANA 2, garbage-collecting obsolete versions is executed asynchronously, which allows the next step, replaying the logs, to proceed in parallel.
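The summary line above already carries the number of cleanup files and the time spent, so a small parser gives you a quick comparison between startups. A Python sketch, with the usual caveat that the wording may differ between revisions:

import re

# "Garbage collection of history files finished: 12 cleanup files in 12.345 seconds"
GC_LINE = re.compile(
    r"Garbage collection of history files finished: "
    r"(\d+) cleanup files in ([\d.]+) seconds"
)

def gc_summary(trace_path):
    """Extract the number of cleanup files and the time the GC step took."""
    with open(trace_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = GC_LINE.search(line)
            if m:
                return int(m.group(1)), float(m.group(2))
    return None

summary = gc_summary("trace.txt")
if summary:
    files, seconds = summary
    print(f"Garbage collection: {files} cleanup files in {seconds} s")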
5. Replaying the Logs
Now that the row store is available in memory, HANA replays the logs to redo all changes that were performed after the last savepoint. There’s no need to load all column store tables before replaying the logs – if any of the changes to be replayed in this step affect column store tables, the affected ones are loaded into memory on the fly.
After the replay has finished successfully, all uncommitted transactions are rolled back.
The time spent in this step can vary a lot and it mainly depends on:
◈ The volume of logs to be replayed
◈ The volume of changes by uncommitted transactions to be rolled back
◈ The read performance of the LOG volume
◈ The read and write performance of the DATA volume(s)
Relevant trace entries:
i Logger PersistenceManagerImpl.cpp : Expecting savepoint 123 log record as first log record at 0x0000000001
i Logger PersistenceManagerImpl.cpp : Starting log replay at position 0x0000000001
i Logger RecoveryHandlerImpl.cpp : Recovery finished at log position 0x0000001234
i Logger RecoveryHandlerImpl.cpp : Termination of 1 indoubt transactions was executed in 0.123 seconds; 0 remaining sessions.
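The distance between the start and end positions in these entries is a useful relative measure of the replay volume when comparing startups. A minimal Python sketch to extract it; note that log positions are abstract units, not bytes:

import re

START = re.compile(r"Starting log replay at position 0x([0-9a-fA-F]+)")
END = re.compile(r"Recovery finished at log position 0x([0-9a-fA-F]+)")

def replay_span(trace_path):
    """Return the replayed log position span, or None if not found."""
    start = end = None
    with open(trace_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = START.search(line)
            if m:
                start = int(m.group(1), 16)
            m = END.search(line)
            if m:
                end = int(m.group(1), 16)
    if start is not None and end is not None:
        return end - start
    return None

span = replay_span("trace.txt")
if span is not None:
    # Log positions are abstract units - treat the span only as a relative
    # measure when comparing one startup with another.
    print(f"Replayed log position span: {span}")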
6. Transaction Management
Now, it’s time that all services of a database synchronize with each other to ensure transactional consistency.
This is a quick thing to do, so very little time is spent by this step itself. If it does take longer, it is usually because one of the services involved in the sync is still busy with steps 1-5 and not ready to sync yet.
The main influence factors are:
◈ The startup progress of other services of the same database
◈ The network connectivity between the services
Relevant trace entries:
i LogReplay RowStoreTransactionCallback.cc : Slave volumeIDs: 4 5
i Service_Startup transmgmt.cc : Checked the restart informations by volume [(4, 0x23456) (5, 0x13579)]
i LogReplay RowStoreTransactionCallback.cc : Finished master-slave DTX consistency check
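Because a slow sync usually means one service is lagging behind, comparing when each service logged the sync marker can point you to the culprit. The Python sketch below scans the trace files of all services; the trace directory and the timestamp format of trace lines are assumptions to verify on your system:

import glob
import re

# Assumptions to verify: trace files live in the usual
# /usr/sap/<SID>/HDB<nr>/<host>/trace directory, and trace lines carry a
# timestamp like "2019-01-01 01:23:45.678901".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+")
MARKER = "Finished master-slave DTX consistency check"

def sync_times(trace_dir):
    """Print when each service's trace logged the DTX sync marker."""
    for path in sorted(glob.glob(f"{trace_dir}/*.trc")):
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                if MARKER in line:
                    ts = TIMESTAMP.search(line)
                    print(path, "->", ts.group(0) if ts else "no timestamp found")
                    break

sync_times("/usr/sap/SID/HDB00/hostname/trace")  # adapt SID, instance, host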
7. Savepoint
All changes that have been performed in steps 3-5 are now persisted to the DATA volume by a savepoint.
This step shouldn’t take too long, though its duration can vary depending on the following factors:
◈ The volume of changes during the log replay phase
◈ The write performance of the DATA volume(s)
Relevant trace entries:
i PersistenceManag PersistenceManagerImpl.cpp : writing restart savepoint
i Savepoint SavepointImpl.cpp : Savepoint current savepoint version: 124, restart redo log position: 0x0000001236, next savepoint version: 125, last snapshot SP version: 0
8. Checking the Row Store Consistency
If you haven’t set parameter
indexserver.ini -> [row_engine] -> consistency_check_at_startup
to none, a row store consistency check is performed as the last step during startup.
Usually, this check finishes quite fast and it’s limited to a maximum runtime of 10 minutes.
If you are confident that you take good care of consistency checks while the system is up and running and feel safe disabling this quick check during startup to save up to 10 minutes, refer to SAP Note 2222217 for how to disable it.
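For illustration, such a parameter change is usually done via ALTER SYSTEM ALTER CONFIGURATION. The following sketch shows what this could look like from the SAP HANA Python client (hdbcli), with placeholder connection data; treat SAP Note 2222217 as the authoritative source before disabling the check:

from hdbcli import dbapi  # the SAP HANA Python client

# Placeholder connection data - adapt to your system.
conn = dbapi.connect(address="hanahost", port=30015, user="SYSTEM", password="***")
cursor = conn.cursor()

# Disable the row store consistency check at startup; see SAP Note 2222217
# for the implications before doing this in a production system.
cursor.execute(
    "ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') "
    "SET ('row_engine', 'consistency_check_at_startup') = 'none' "
    "WITH RECONFIGURE"
)
cursor.close()
conn.close()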
The time spent by this step should not vary a lot from startup to startup and depends on:
◈ The HANA version
◈ The size of the row store
◈ The value of parameter consistency_check_at_startup
Relevant trace entries:
i Service_Startup IntegrityChecker.cc : Checking rowstore consistency.
i Service_Startup IntegrityCheckerTimer.h : Startup Row Table Consistency Check timeout is 600
i Service_Startup IntegrityCheckerTimer.h : Startup Row Table Consistency Check is finished before timeout
i Service_Startup IntegrityChecker.cc : Consistency check time: 537.69 seconds
9. Open SQL Port
Now that all the work is done, the SQL port is opened. Just like the sync among the services, this is a very quick thing to do.
If you do see issues that delay opening the SQL port or cause it to fail, it is usually because another operating system process already occupies the port.
To ensure that all HANA-related ports are reserved correctly, you can use the SAP Host Agent as described in SAP Note 2382421.
As soon as the SQL port is open, your applications are, technically speaking, able to access the HANA system. Still, remember that not all column store tables have been loaded yet and that, depending on your experience with your system, you might want to wait for the column store preload to finish (or not) before restarting the applications that use your HANA system.
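A simple socket probe before starting HANA tells you whether the SQL port is still free. A minimal Python sketch (30015 is the indexserver SQL port for instance 00, as in the trace entries below):

import socket

def port_in_use(host, port):
    """Return True if something already listens on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2)
        return s.connect_ex((host, port)) == 0

# 3<instance>15 is the SQL port of the indexserver, e.g. 30015 for instance 00.
if port_in_use("localhost", 30015):
    print("Port 30015 is occupied - find the process holding it before starting HANA")
else:
    print("Port 30015 is free")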
Relevant trace entries:
i Service_Startup tcp_listener_callback.cc : Configure channel send timeout to SQL listener: 300000 ms
i Service_Startup tcp_listener_callback.cc : Configure channel receive timeout to SQL listener: 300000 ms
i Service_Startup tcp_listener_callback.cc : start the SQL listening port: 30015 with backlog size 128
i assign TREXIndexServer.cpp : assign to volume 3 finished
i TableReload TRexApiSystem.cpp : Starting reloading column store tables based on previous load information
That’s it for now.