Background
SAP HANA uses a sophisticated algorithm to decide on the degree of parallelism with which queries are executed. In some situations, however, you may want to influence the degree of parallelization on top of the general HANA mechanism and control the parallelization of individual parts of your models based on business considerations.
This post shows how this can be achieved with SAP HANA Calculation Views. While this can be a helpful option for individual, highly time-critical models, it should not be used in an uncontrolled way, as the overall parallelization in HANA might otherwise be negatively affected.
Since HANA 2.0 SPS04, it is possible to control by modeling the number of logical threads that will be used to execute specific parts of the model. We refer in the following to “logical” threads because we describe the parallelization from the model perspective. Each logical thread might be further parallelized based on HANA parallelization optimizations. For example, an aggregation node can be parallelized into 6 logical threads based on the model definition; within each logical thread, additional parallelization might occur independently of the model settings.
In a nutshell, a flag called “Partition Local Execution” marks the start and stop nodes between which the degree of logical parallelization is defined. The start flag needs to be set on a node that is based on a table. Typically, this will be a Projection or Aggregation node. This node marks the beginning of the parallelization. The end of parallelization is marked by a Union node in which this flag is set again.
The degree of parallelization is defined at the node where the start flag is set.
Parallelization based on table partitions
The degree of parallelization can be determined by the number of partitions of the table that is used as the data source of the start node. Parallelizing based on table partitions can be especially helpful in scale-out solutions where data needs to be transferred between different hosts. Imagine a table whose partitions are distributed over several hosts. Without parallelization based on table partitioning, data from the individual partitions would be sent over the network in order to be processed according to the model on one host. With parallelization based on table partitioning, the logic that is modeled in the parallelization block is processed on the local host on which the table partitions are located. This local processing not only distributes the processing load but probably also reduces the number of records that need to be sent over the network. It is therefore a good idea to include an aggregation node at the end of your parallelization block to reduce the data before it is sent over the network.
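For illustration, the start node’s data source could be a hash-partitioned table like the following. Table and column names are hypothetical; the DDL uses the standard HANA partitioning syntax:

```sql
-- Hypothetical example: a column table split into 6 hash partitions.
-- A parallelization block whose start node reads this table would be
-- executed in 6 logical threads, one per partition.
CREATE COLUMN TABLE "partitionedTable" (
  "region"  NVARCHAR(10),
  "product" NVARCHAR(20),
  "amount"  DECIMAL(15, 2)
)
PARTITION BY HASH ("region") PARTITIONS 6;
```

In a scale-out landscape, these partitions could additionally be distributed across hosts so that each logical thread runs where its partition resides.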
Parallelization based on distinct entries in column
Alternatively, the degree of parallelization can be determined by the number of distinct values in a column of the table that serves as data source of the start node.
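Before choosing this option, it can be useful to check how many distinct values the candidate column actually contains, since that number becomes the degree of parallelization. Table and column names below are hypothetical:

```sql
-- The number of distinct values in the chosen column equals the number
-- of logical threads that will process the parallelization block.
SELECT COUNT(DISTINCT "region") AS "degreeOfParallelization"
FROM "partitionedTable";
```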
Verify Degree of Parallelization
Three methods for checking the degree of parallelization at runtime are shown below. All methods work for both parallelization based on table partitions and parallelization based on distinct entries:
a) add a calculated column that contains the logical partition id
b) use SQL Analyzer
c) use tracing of query execution
a) add a calculated column that contains the logical partition id
To see the id of the logical thread that processed a certain record, you can add the Column Engine expression “partitionid()” as a calculated column and display this column in the results.
calculated column that evaluates the partition identifier
Assuming that parallelization is based on a table that consists of 6 partitions, you will see an integer between 1 and 6 next to each record. The number corresponds to the logical thread that processed the record. This means that records with the same partition id were processed by the same thread:
calculated column “partitionIndicator” (filled by column engine expression “partitionid()”) indicates the logical thread that processed the record
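To see how many records each logical thread processed, you could, for example, group by this calculated column. The view name below is hypothetical; “partitionIndicator” is the calculated column from above:

```sql
-- Records per logical thread, based on the calculated column
-- "partitionIndicator" that evaluates partitionid().
SELECT "partitionIndicator", COUNT(*) AS "recordCount"
FROM "exampleParallelizationView"
GROUP BY "partitionIndicator"
ORDER BY "partitionIndicator";
```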
b) use SQL Analyzer
Run SQL Analyzer for your statement and navigate to the nodes that should be parallelized. If, for example, your parallelization is based on 6 partitions of a table, you will find your node 6 times. In the screenshot below, the aggregation node appears as 5 identical nodes plus one additional aggregation node (6) at which the processing is merged:
SQL Analyzer showing 6 parallel processing threads for the aggregation
If you drill down into the Aggregation “ceAggregationPop”, you will find that the calculated columns are parallelized as well:
calculated column is evaluated in each parallel thread
c) use tracing of query execution
Finally, you can trace your individual query, e.g., by adding “WITH PARAMETERS ('PLACEHOLDER' = ('$$CE_SUPPORT$$', ''))” to your query. When you execute the query, a trace file with details on Calculation View execution is created and stored as an indexserver trace file ending in “_cesupport.trc”. If you open this trace file, you will find something similar to:
“Partitioned execution rule: 6 partitions found for table ‘LD0::EXAMPLEPARALLELIZATION_HDI_DB:tables::partitionedTable (t -1)’. The partitioned execution branch will be cloned.”
The screenshot below illustrates this.
“indexserver<…>_cesupport.trc” trace file showing parallelization of degree 6
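Putting the pieces together, such a traced query could look as follows. The view name and selected columns are hypothetical; the CE_SUPPORT placeholder is the one described above:

```sql
-- Executing the query with the CE_SUPPORT placeholder writes an
-- indexserver trace file ending in "_cesupport.trc" that documents
-- how the Calculation View was executed, including cloned branches.
SELECT "region", SUM("amount") AS "totalAmount"
FROM "exampleParallelizationView"
GROUP BY "region"
WITH PARAMETERS ('PLACEHOLDER' = ('$$CE_SUPPORT$$', ''));
```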
Constraints
It is not allowed to define a parallelization block across stacked views, or to stack one parallelization block on top of another in a query. These constraints are illustrated graphically below.
constraints for using parallelization setting
Setting the parallelization flag blocks unfolding, because blocked unfolding is a prerequisite for this functionality. In addition, due to a bug, it is not possible before HANA revision 42 to parallelize on source tables that are referred to by synonyms.
In sum, the flag “Partition Local Execution” gives you fine-grained control over the degree of parallelization for certain parts of your models, based on your business logic and its relevance. Parallelization based on table partitioning can be especially helpful in scenarios in which tables are distributed across several hosts. However, influencing parallelization by modeling should be used wisely, to avoid over-parallelizing all models at the cost of other processes.