Recently I was asked what the difference between a Hana Calculation View and a SQL View is. To provide reading material, I did a quick search via google and was shocked by the answers I found. Some wrong, some missing the point, many outdated.
Hence let me give you my twist…
The difference between a CalcView and a SQL View is that the CalcView has the option to contain more semantic. In practical terms that means, a CalcView does the aggregation automatically, returning the summary always. A SQL View does return the individual records as defined inside the View.
Quite simple, it provides a certain view on the database tables. So it does not contain data, just logic.
In case of SQL Views, the logic is expressed in SQL. So a SQL View is nothing else than a stored SQL select statement. This can be seen when looking up the SQL View in the data dictionary.
How does the user get the sum of NETAMOUNT per product? Simply select the two columns from the CalcView.
Hence let me give you my twist…
The difference between a CalcView and a SQL View is that the CalcView has the option to contain more semantic. In practical terms that means, a CalcView does the aggregation automatically, returning the summary always. A SQL View does return the individual records as defined inside the View.
What is a Database View in general?
Quite simple, it provides a certain view on the database tables. So it does not contain data, just logic.
What is a SQL View?
In case of SQL Views, the logic is expressed in SQL. So a SQL View is nothing else than a stored SQL select statement. This can be seen when looking up the SQL View in the data dictionary.
For the user this View looks and feels like another database object he selects from. By executing a SQL query like this, maybe?
The important part is how this is executed. The database does not simply execute the View SQL and then throws away all records except the ones matching the condition. The database rather tries to merge the the View-SQL and the select-SQL into one.
The explain plan is the proof. It consists of a filter, the join and a project. The same execution plan as if merging the statements manually.
By knowing these internals it becomes obvious, that certain things in Views are bad for performance – everything that prevents merging the select’s where clause deeply into the view definition. But that is a separate topic to explore.
What is a CalcView?
At the looks of it, the same thing. It is defined using a graphical UI.
The user selects from it using SQL…
And the execution plan is similar. As this CalcView is joining more tables, the execution plan contains more tables but the key point is the filter condition – it is still executed first, well inside the calculation view.
What is the difference?
The most important difference is that the CalcView has more information about the structure. What is a measure? How is a measure aggregated? All of this can be seen in the CalcView editor, e.g. NETAMOUNT has a default aggregation of sum() assigned to it – see above screenshot from the CalcView editor again.
What happens when only the NETAMOUNT is selected in case of the SQL View with no conditions? Millions of rows with only the NETAMOUNT are shown.
What about the equivalent query against the CalcView? :drumroll:
The sum of NETAMOUNT is shown. One single line, because the corresponding aggregation was added automatically.
That looks like a neat trick but actually, it is a game changer.
What is CalcView good for?
In the good old Business Objects days, the relational tables had been imported into the Universe Designer and all logic was maintained there. How do the tables relate to each other? What is a measure, what an attribute? How are the measures aggregated, sum, count, …? What are the hierarchies, e.g. from country to region to city to customer name? Which measures can be used together, which are unrelated? All of that was defined in the Universe.
By moving that information one level deeper, into the database layer, everybody can utilize these definitions. By executing SQL statements against the CalcView as done here. An oData Service is using this information to allow expanding the data along hierarchy trees. All Business Intelligence tools can use it.
And if the CalcViews are not added as an after thought later but are already part of the CDS definition, all is in once place.
What is behind a CalcView?
In the SQL View case a SQL statement was the backing object. For a CalcView it is a Calculation scenario, an pretty much unreadable structure of how nodes are connected and more.
No comments:
Post a Comment