Wednesday, 26 September 2018

Difference in query results when run from HANA MDX and HANA SQL

SAP HANA offers multiple access options to query SAP HANA views available in database:

◈ The HANA SQL access, the primary HANA access, allows you to query HANA views using the SQL query language. This access is optimized for on-premise solutions.

◈ The HANA MDX access, that offers multi-dimensional concepts to HANA models, allows you to query HANA views as cubes (including HANA hierarchies) through the Multi-Dimensional eXpressions (MDX) query language.

◈ The HANA InA access, designed for Web applications or solutions, allows you to provide real time access to information stored in HANA database using the SAP HANA Info Access (InA) HTTP Service. The SAP HANA info access HTTP service wraps SQL queries and exposes them through an HTTP interface.
This document describes the difference in query results when you run the same query specification on a same HANA view from HANA SQL/InA and HANA MDX.

Use case description


To illustrate this difference, let’s consider a HANA view (named FINANCE here) containing the following dimensions and measures:

Type Name  Label 
Dimension DIMACCOUNT Account
Dimension  DIMDEPARTMENT  Department 
Dimension  DIMORGANIZATION   Organization 
Dimension  DIMSCENARIO  Scenario 
Dimension  DIMTIME  Time 
Measure  AMOUNT  Amount 

In this HANA view, you also create and associate the following HANA hierarchies to the dimensions:

Dimension Hierarchy Name Hierarchy Type
DIMACCOUNT ACCOUNTS Parent-child hierarchy
DIMDEPARTMENT DEPARTMENTS Parent-child hierarchy
DIMTIME DATE_CALENDAR Level-based hierarchy
DIMTIME FISCAL_CALENDAR Level-based hierarchy

In HANA SQL, to get the query results for Amount per Organization, you write the following script:

SELECT

    View__1.”ORGANIZATION_KEY”,

    View__1.”ORGANIZATION”,

    SUM(View__1.”AMOUNT”)

FROM

    “_SYS_BIC”.”adventure-works/FINANCE” View__1

GROUP BY

    View__1.”ORGANIZATION_KEY”,

    View__1.”ORGANIZATION”

ORDER BY

    1

And get the following result:

Organization key Organization Amount
Northeast Division  134003346
Northwest Division 138922889
Central Division  149032082
Southeast Division  279284664
Southwest Division  232365971
Canadian Division  292174769
11  France  66963785
12  Germany  30339796
13  Australia  35553070


In HANA MDX, you write the following MDX script to express the same query specification:

SELECT

    NON EMPTY {[Measures].[AMOUNT]} ON COLUMNS,

    NON EMPTY [DIMORGANIZATION].[DIMORGANIZATION].[ORGANIZATION_KEY].Members

DIMENSION PROPERTIES [DIMORGANIZATION].[DIMORGANIZATION].[ORGANIZATION_KEY].[ORGANIZATION] ON ROWS

FROM [FINANCE]

As a result, you receive:

Organization key Organization Amount
Northeast Division  106108814
Northwest Division 108873980 
Central Division  117558826 
Southeast Division  252761627 
Southwest Division  183205286 
Canadian Division  231313761 
11  France  54658581 
12  Germany  24280245 
13  Australia  28249786 

The two queries return different results in the Amount values (Note that the HANA InA access provides the same query results as HANA SQL).

Why this difference between the SQL and MDX queries?

Paradigm difference between SQL and MDX


The query results displayed above are both corrects for HANA SQL and HANA MDX. However, SQL and MDX have different paradigms that need to be understood.

In the query specification above, you only want to query the Amount per Organization.

In the SQL paradigm, this means to ignore all other dimensions (i.e. Account, Department, Scenario and Time) and to just aggregate implicitly the Amount over them.

In the MDX world, the paradigm is different. Indeed, the MDX paradigm is like the cartesian (x, y) axis coordinates. Suppose you have a function f such as Amount = f (x, y) (i.e. Amount is a function of x and y). What would be the Amount for x = 10? It cannot be defined because Amount is a function of x and y and y is unknown. Amount cannot be determined since values for x and y are both required. The MDX standard works in this mode.

Thus, in MDX, for the HANA view defined above, the Amount measure depends on the dimensions Account, Department, Organization, Scenario and Time. Because your MDX query above includes explicitly only the Organization dimension to get Amount, the query result is therefore indeterminate.

For this use case, in MDX, when a dependent dimension is not explicitly included in a query axis to evaluate a measure, the default member of that dimension is implicitly included in the slicer axis to complete the coordinates.

Determining the default member in SAP HANA


In HANA MDX, each dimension has a default hierarchy associated to it. These default hierarchies can be implicitly generated by HANA MDX; or authored in the HANA view by the designer. And each of these hierarchies may have a default member.

In HANA, the default member of a hierarchy is, by default, the member that has the property MEMBER_ORDINAL = 0.

For hierarchies implicitly generated by HANA MDX, the default member corresponds to the hierarchy root node i.e. the “ALL” member. However, for hierarchies authored in the HANA view, the default member depends on hierarchy settings.

To know the default member associated to hierarchies in your HANA view in MDX, you can run the following query:

MDX SELECT “DIMENSION_UNIQUE_NAME”, “HIERARCHY_UNIQUE_NAME”, “DEFAULT_MEMBER” FROM BIMC_HIERARCHIES WHERE “CATALOG_NAME”=’adventure-works’ AND “CUBE_NAME”=’FINANCE’
That returns the following result:

DIMENSION_UNIQUE_NAME HIERARCHY_UNIQUE_NAME   DEFAULT_MEMBER 
[DIMACCOUNT]  [DIMACCOUNT].[ACCOUNTS] [DIMACCOUNT].[ACCOUNTS].&[1]
[DIMDEPARTMENT]  [DIMDEPARTMENT].[DEPARTMENTS]   [DIMDEPARTMENT].[DEPARTMENTS].&[1] 
[DIMORGANIZATION]  [DIMORGANIZATION].[DIMORGANIZATION]  [DIMORGANIZATION].[DIMORGANIZATION].[All].[(all)] 
[DIMSCENARIO]  [DIMSCENARIO].[DIMSCENARIO]   [DIMSCENARIO].[DIMSCENARIO].[All].[(all)] 
[DIMTIME] [DIMTIME].[DATE_CALENDAR]   [DIMTIME].[DATE_CALENDAR].[All].[(all)] 
[DIMTIME]  [DIMTIME].[FISCAL_CALENDAR]   [DIMTIME].[FISCAL_CALENDAR].[All].[(all)] 
[Measures] [Measures].[Measures] [Measures].[AMOUNT] 

In the query result above, you can observe that all hierarchies have a root node (i.e. the ALL member) as default member; excepted for the Accounts and Departments hierarchies that have respectively ACCOUNTS.&[1] and DEPARTMENTS.&[1] as default members.

This means that your MDX query above corresponds to the Amount for ACCOUNTS.&[1] and DEPARTMENT.&[1] i.e. the Amount will be aggregated over all SCENARIO and TIME dimension members (as in SQL) but it will be NOT aggregated over all ACCOUNTS or DEPARTMENTS.

That explains why the MDX query result for Amount is less than the one you observe in SQL – but this is correct.

Solution


The HANA modelers (HANA Studio or Web IDE For SAP HANA) offer various ways to set explicitly or implicitly default members for HANA hierarchies authored in HANA views.

To produce the same query results when run from HANA SQL and HANA MDX, the solution consists of having authored hierarchies with default members that place all other hierarchy members as their descendants.

The simplest workaround is to add a root node as default member for your authored hierarchies:

1. Edit your hierarchy
2. Go to Advanced settings
3. Clear the Default Member field (if non-empty)
4. Set “Add Root Node” for the Root Node Visibility option

SAP HANA Tutorial and Materials, SAP HANA Guides, SAP HANA Study Materials, SAP HANA Learning

Indeed, when you set Add Root Node for the Root Node Visibility option, then an ALL member is added as default member if the Default Member field is empty. For parent-child hierarchies, if you set Add Root Node If Defined for the Root Node Visibility option, then a root node is added only if you have defined a root node value while creating the parent child hierarchy.

Thus, if you set these options for ACCOUNTS and DEPARTMENTS hierarchies, then your MDX query returns the same result as your SQL query.

No comments:

Post a Comment