Altering HANA behavior at peak load using Admission Control in HANA 2.0
The details below explain, in a simple question-and-answer format, how HANA 2.0 uses admission control to handle high system load. Hope it helps!
NOTE: You might not find some of the system views (M_ADMISSION_CONTROL_*) mentioned here if you are not on HANA 2.0 SP05, although HANA 2.0 has admission control enabled by default.
1. What is admission control?
The admission control feature in HANA applies processing limits and decides how to handle new requests when the system is close to the point of saturation.
2. How does HANA manage peak load using admission control?
HANA applies thresholds using configuration parameters to define an acceptable limit of activity in terms of the percentage of memory usage or percentage of CPU capacity.
Limits can be applied at two levels.
1. QUEUING REQUESTS: New requests will be queued until adequate processing capacity is available or a timeout is reached.
indexserver.ini->admission_control->
2. REJECTING REQUESTS: A higher threshold can be defined to set the maximum workload level above which new requests are rejected. If requests have been queued, items in the queue are processed when the load on the system drops below the threshold levels. If the queue exceeds a specified size, or if items are queued for longer than a specified period of time, they are rejected. For rejected requests, an error message is returned to the client stating that the server is temporarily overloaded: 1038, 'ERR_SES_SERVER_BUSY', 'rejected as server is temporarily overloaded'.
indexserver.ini->admission_control->
Queued requests are rejected not only when the threshold values specified above are reached, but also when they have waited longer than the queue timeout, which is maintained with the parameter below.
indexserver.ini->admission_control->
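Because a rejected request comes back to the client as error 1038, a client can treat it as a temporary condition and try again a little later. The following is only a minimal sketch of that idea: `ServerBusyError` and `run_with_retry` are hypothetical names invented for illustration (a real database driver raises its own error type), and the retry counts and delays are arbitrary assumptions.

```python
import time

SERVER_BUSY_CODE = 1038  # error code quoted in the text above


class ServerBusyError(Exception):
    """Hypothetical exception; a real driver raises its own error type."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def run_with_retry(execute, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call execute(); on error 1038, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return execute()
        except ServerBusyError as err:
            # Give up on other error codes or once the attempts are used up.
            if err.code != SERVER_BUSY_CODE or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable only so that the waiting can be replaced in tests.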
3. How does HANA know when admission control should be applied?
The load on the system is measured by background processes which gather a set of performance statistics covering available capacity for memory and CPU usage. The statistics are moderated by a configurable averaging factor (exponentially weighted moving average) to minimize volatility, and the moderated value is used in comparison with the threshold settings.
Steps:
A. Statistics are collected at a regular interval.
System view that holds the system usage details: M_ADMISSION_CONTROL_STATISTICS.
Statistics collection interval: based on the parameter below.
statistics_collection_interval | 1000 | Unit: milliseconds. The statistics collection interval is set by default to 1000ms (1 second), which has a negligible effect on performance. Values from 100ms are supported. Statistics details are visible in the view M_ADMISSION_CONTROL_STATISTICS. |
VIEW NAME: M_ADMISSION_CONTROL_STATISTICS: Provides the overall statistics values of the Session-Wise Admission Control feature.
B. The statistics are moderated by a configurable averaging factor (exponentially weighted moving average) to minimize volatility.
These parameters, in the admission_control section of the ini file, are summarized in the following table:
Parameter | Default | Detail |
averaging_factor | 70 | This percentage value gives a weighting to the statistic averaging process: a low value has a strong moderating effect (but may not adequately reflect real CPU usage) and a value of 100% means that no averaging is performed, that is, only the current value for memory and CPU consumption is considered. |
averaging_factor_cpu | 0 | This parameter can be used specifically to smooth statistics on CPU usage. It is not set by default (value zero) but if it is set then it overrides the averaging_factor parameter and this CPU value is applied as a factor to average CPU statistics only. |
averaging_factor_memory | 0 | This parameter can be used specifically to smooth statistics on memory usage. It is not set by default (value zero) but if it is set then it overrides the averaging_factor parameter and this memory value is applied as a factor to average memory statistics only. |
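To make the averaging factor more concrete, here is a minimal sketch of an exponentially weighted moving average in which the factor is the weight given to the newest sample. The exact formula HANA uses internally is not given here, so the formula and the function names `effective_factor` and `smooth` are assumptions; they are chosen only to match the description that a low value smooths strongly and 100 means no averaging.

```python
def effective_factor(averaging_factor, specific_factor):
    """Use the CPU- or memory-specific factor if it is set (non-zero)."""
    return specific_factor if specific_factor != 0 else averaging_factor


def smooth(previous, current, factor_percent):
    """Assumed EWMA: factor_percent is the weight of the current sample."""
    weight = factor_percent / 100.0
    return weight * current + (1.0 - weight) * previous


# Example: CPU usage jumps from 50% to 100% and stays there.
cpu_samples = [50.0, 100.0, 100.0]
factor = effective_factor(averaging_factor=70, specific_factor=0)

smoothed = cpu_samples[0]
history = [smoothed]
for sample in cpu_samples[1:]:
    smoothed = smooth(smoothed, sample, factor)
    history.append(smoothed)
```

With the default factor of 70, the moderated value in this example moves from 50 to about 85 and then to about 95.5, so a single spike does not immediately count as the full 100%.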
C. The moderated value from step B above is used in comparison with the threshold settings
Parameter | Default | Detail |
enable | True | Enables or disables the admission control feature. |
queue_cpu_threshold | 90 | The percentage of CPU usage above which requests will be queued. Queue details are available in the view M_ADMISSION_CONTROL_QUEUES. |
queue_memory_threshold | 90 | The percentage of memory usage above which requests will be queued. |
reject_cpu_threshold | 0 | The percentage of CPU usage above which requests will be rejected. The default value 0 means that no requests are rejected, but may be queued. |
reject_memory_threshold | 0 | The percentage of memory usage above which requests will be rejected. The default value 0 means that no requests are rejected, but may be queued. |
M_ADMISSION_CONTROL_QUEUES: Provides detailed information regarding queued session requests by Session-Wise Admission Control.
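The comparison in step C can be pictured as a small decision function. This is a sketch under assumptions, not HANA's actual logic: the function name `decide` is hypothetical, and it assumes that exceeding either the CPU or the memory threshold is enough to trigger queuing or rejection, and that "above" means strictly greater than.

```python
def decide(cpu, memory, queue_cpu_threshold=90, queue_memory_threshold=90,
           reject_cpu_threshold=0, reject_memory_threshold=0, enable=True):
    """Return 'admit', 'queue' or 'reject' for moderated usage values in percent."""
    if not enable:
        return "admit"

    # A reject threshold of 0 means rejection by threshold is not active.
    reject_on_cpu = reject_cpu_threshold > 0 and cpu > reject_cpu_threshold
    reject_on_memory = reject_memory_threshold > 0 and memory > reject_memory_threshold
    if reject_on_cpu or reject_on_memory:
        return "reject"

    if cpu > queue_cpu_threshold or memory > queue_memory_threshold:
        return "queue"
    return "admit"
```

With the defaults, a moderated CPU value of 95% leads to queuing, and nothing is rejected by threshold until a non-zero reject threshold is configured.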
4. How can we find out which connections have been queued for later processing?
If requests have been queued, items in the queue are processed when capacity becomes available. A background job continues to evaluate the load on the system in comparison to the thresholds and when the load is reduced enough queued requests are submitted in batches on an oldest-first basis.
The queue status of a request is visible in the M_CONNECTIONS view; the connection status value is set to queuing in column M_CONNECTIONS.CONNECTION_STATUS.
5. How can admission control queues be controlled, and when are queued requests released for execution?
There are several configuration parameters (in the admission_control section of the ini file) to manage the queue and how the requests in the queue are released. We can apply a maximum queue size or a queue timeout value; if either of these limits is exceeded, requests which would otherwise be queued are rejected. An interval parameter determines how frequently the server load is checked so that de-queueing can start, and a de-queue batch size setting is also available.
Parameter | Default | Detail |
max_queue_size | 10000 | The maximum number of requests which can be queued. Requests above this number will be rejected. |
dequeue_interval | 1000 | Unit: milliseconds. Use this parameter to set the frequency of the check to reevaluate the load in comparison to the thresholds. The default is 1000ms (1 second). This value is recommended to avoid overloading the system, though values from 100ms are supported. |
dequeue_size | 50 | Use this parameter to set the de-queue batch size, that is, the number of queued items which are released together once the load is sufficiently reduced. This value can be between 1 and 9999 queued requests. |
queue_timeout | 600 | Unit: seconds. Use this parameter to set the maximum length of time for which items can be queued. The default is 10 minutes. The minimum value which can be applied is 60 seconds; there is no maximum limit. Requests queued for this length of time will be rejected. Note that the timeout value applies to all entries in the queue. Any changes made to this configuration value will be applied to all entries in the existing queue. |
queue_timeout_check_interval | 10000 | Unit: milliseconds. Use this parameter to determine how frequently to check if items have exceeded the queue timeout limit. The default is 10 seconds. The minimum value which can be applied is 100 milliseconds; there is no maximum limit. |
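How the queue parameters work together can be sketched with a small in-memory model. The class `AdmissionQueue` and its methods are hypothetical and exist only for illustration; the model also assumes that requests arrive in time order and that a request is rejected once its waiting time reaches queue_timeout.

```python
from collections import deque


class AdmissionQueue:
    """Toy model of the queue settings; not HANA's implementation."""

    def __init__(self, max_queue_size=10000, dequeue_size=50, queue_timeout=600):
        self.max_queue_size = max_queue_size
        self.dequeue_size = dequeue_size
        self.queue_timeout = queue_timeout  # seconds
        self._items = deque()  # (request_id, enqueue_time), oldest first

    def enqueue(self, request_id, now):
        """Queue a request, or return False (rejected) if the queue is full."""
        if len(self._items) >= self.max_queue_size:
            return False
        self._items.append((request_id, now))
        return True

    def expire(self, now):
        """Remove and return requests that have waited queue_timeout or longer."""
        rejected = []
        while self._items and now - self._items[0][1] >= self.queue_timeout:
            rejected.append(self._items.popleft()[0])
        return rejected

    def dequeue_batch(self, load_below_threshold):
        """Release up to dequeue_size requests, oldest first, if load permits."""
        if not load_below_threshold:
            return []
        count = min(self.dequeue_size, len(self._items))
        return [self._items.popleft()[0] for _ in range(count)]
```

In this model, `expire` would be called every queue_timeout_check_interval and `dequeue_batch` every dequeue_interval, with the load check supplying the `load_below_threshold` flag.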
6. In which situations should admission control not be used?
There are some situations where it is not recommended to enable admission control, for example, during planned maintenance events such as an upgrade or the migration of an application. In these cases it is expected that the load level is likely to be saturated for a long time and admission control could therefore result in the failure of important query executions.
7. Is admission control in HANA applied to all requests? Can we bypass admission control?
The admission control filtering process does not apply to all requests. In particular, requests that release resources will always be executed, for example, Commit, Rollback, Disconnect and so on.
The filtering also depends on user privileges, as described under the next question.
8. During peak load, when HANA has already started rejecting requests, how can the HANA DB administrator log on to the database and check the system status?
During peak load, administration requests from users with SESSION_ADMIN or WORKLOAD_ADMIN privileges are always executed. However, this is limited: the check on user privileges is only made for the first request when a connection is made. For subsequent requests no check is made and all users, regardless of privileges, are subject to the workload controls in place. A user with administrator privileges would have to reconnect to bypass the controls again. This functionality was implemented to reduce the overhead incurred in checking privileges for routed connections. No privilege check takes place when admission control is disabled.
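The "first request only" behaviour is easy to overlook, so here is a minimal sketch of it. The `Connection` class and its method are hypothetical and model only what is described above: the privilege check happens once per connection, on its first request.

```python
BYPASS_PRIVILEGES = {"SESSION_ADMIN", "WORKLOAD_ADMIN"}


class Connection:
    """Toy model of a client connection; not a real driver object."""

    def __init__(self, privileges):
        self.privileges = set(privileges)
        self.requests_sent = 0

    def next_request_bypasses_controls(self):
        """True only for the first request of a privileged connection."""
        is_first = self.requests_sent == 0
        self.requests_sent += 1
        return is_first and bool(self.privileges & BYPASS_PRIVILEGES)
```

In this model an administrator's second request on the same connection is treated like any other request, and a fresh `Connection` (a reconnect) is needed to bypass the controls again.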
9. How does admission control react to queries that are to be directed to the secondary system in an ACTIVE/ACTIVE setup?
There is no change in admission control behavior. Admission control evaluates requests at the session layer as part of statement preparation, before decoding the request packet from the client, whereas the decision about routing the request is made later in the SQL engine. This means that in an active-active setup, if the admission control load threshold is exceeded on the primary, the incoming request is queued on the primary system. Statements that have already been prepared, and for which the decision to route to the secondary has already been made, connect directly to the secondary.
10. How do we enable admission control in HANA?
Admission control is enabled by default, with default parameter values, and has been available since HANA 2.0.
11. Which admission control parameters can we configure, and how?
Threshold values for admission control to determine when requests are queued or rejected are defined as configuration parameters.
The admission control feature is enabled by default, and the related threshold values and configurable parameters are available in the indexserver.ini file. A pair of settings is available for each of memory and CPU, defining:
a. Queuing level (default value is 90%)
b. Rejection level (not active by default).
12. How do we find statistics about requests queued and rejected by admission control? At what interval is this data collected?
A high-level alert is raised if any session requests are queued or rejected by admission control. The statistics are available in the system view M_ADMISSION_CONTROL_STATISTICS, which is populated at the interval set by the parameter statistics_collection_interval.
13. How do we evaluate historic admission control events and the reason for a rejection?
If statements are being rejected we may need to investigate why this is happening. Events related to admission control are logged and can be reviewed in the M_ADMISSION_CONTROL_EVENTS view. The key information items here are the event type (such as a statement was rejected or a statement was queued or dequeued) and the event reason which gives an explanatory text related to the type. Other details in this view include the length of time the statement was queued and the measured values for memory and CPU usage.
Two parameters are available to manage the event log in the admission_control_events section of the ini file:
Parameter | Default | Detail |
queue_wait_time_threshold | 5000000 | Unit: microseconds. The queue wait time above which a request is included in the event log (default is 5 seconds). If the parameter is set to 0 then events are not logged. |
record_limit | 1000000 | The maximum record count permitted in the monitor of historical events. |
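The effect of these two event log parameters can be sketched as follows. The class `AdmissionEventLog` is hypothetical, and the behaviour when record_limit is reached (dropping the oldest record) is an assumption made for the sketch; it is not stated above. The event type strings are placeholders as well.

```python
from collections import deque


class AdmissionEventLog:
    """Toy model of the event log settings; not HANA's implementation."""

    def __init__(self, queue_wait_time_threshold=5_000_000, record_limit=1_000_000):
        self.queue_wait_time_threshold = queue_wait_time_threshold  # microseconds
        # Assumption: once record_limit is reached, the oldest record is dropped.
        self.records = deque(maxlen=record_limit)

    def record(self, event_type, wait_time_us):
        """Store the event if logging is on and the wait exceeds the threshold."""
        if self.queue_wait_time_threshold == 0:
            return False  # a threshold of 0 switches event logging off
        if wait_time_us <= self.queue_wait_time_threshold:
            return False
        self.records.append((event_type, wait_time_us))
        return True
```

With the default threshold, a request that waited 1 second in the queue leaves no trace in this model, while one that waited 6 seconds is recorded.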
14. How does admission control affect the other timeout parameters in HANA DB?
If admission control has been configured and is active, it takes precedence over any other timeout value which might have been applied. This means other timeouts which apply to a query (such as a query timeout) are not effective until the query has been dequeued or rejected by the queue timeout.