In this blog, I want to focus on SAP Data Warehouse Cloud (DWC), a scalable, persona-driven data warehouse as a service (DWaaS) tailored for both IT and business users.
I am excited to announce the availability of DWC in AWS Sydney (in addition to North Virginia and Frankfurt). I also want to share some of my learnings and show you how easy it is to get started with DWC. The best part is that everything required, from data ingestion and data modeling to creating dashboards, is included within DWC. Yes, DWC has an embedded SAP Analytics Cloud (SAC) to enable business users to quickly create dashboards. With DWC, it has become much easier for business users to perform real-time analytics and reporting on live data to obtain insights and take the necessary action.
I will walk you through the different capabilities of DWC and show how you can connect to a data source like HANA (on-premise or a service on SAP Cloud Platform) and model views. These views in DWC can be consumed by the embedded SAC or other third-party tools via ODBC; in this example, I will show how to consume the modeled views in PowerBI. It's important to note that data from the source HANA system is virtualized and not replicated. I will also show how you can use your own ETL tools or SQL client tools to create tables directly within DWC.
Spaces:
DWC has an interesting concept of Spaces. When you log in to DWC, you can go to Space Management and create your own spaces.
Spaces can be created for different departments within your organization to ensure object isolation and data access control; you can also assign storage capacity, prioritization and so on. For this blog, I have created a space called “Sales Forecast”.
Within the space, you can allocate storage, assign members to the space and, most importantly, create connections to the source systems.
As of today, there are connectivity options for SAP HANA, SAP ABAP systems, MS SQL, Oracle and generic OData. In the example below, I am using SAP HANA as the connection type and providing the connection details of a HANA Cloud instance on SAP Cloud Platform. Below is the HANA Cloud instance I am using for this demo.
I can launch the HANA Cockpit and Database Explorer to view the schemas and DB objects within this instance. As you can see below, I have already loaded a few tables with data using a script available on GitHub. We are going to see how these tables can be virtually consumed in DWC and modeled along with tables that exist in DWC.
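To give you an idea of the source data, the script essentially creates and loads column tables along these lines (the table names, columns and values here are purely illustrative, not the exact ones from the GitHub script):

-- Illustrative sketch of the kind of tables loaded into the HANA Cloud instance.
CREATE COLUMN TABLE SALES_ORDERS (
    SALESORDERID  NVARCHAR(10) PRIMARY KEY,
    CREATEDAT     DATE,
    CURRENCY      NVARCHAR(5),
    GROSSAMOUNT   DECIMAL(15,2)
);

CREATE COLUMN TABLE SALES_ORDER_ITEMS (
    SALESORDERID   NVARCHAR(10),
    SALESORDERITEM NVARCHAR(10),
    PRODUCTID      NVARCHAR(10),
    QUANTITY       DECIMAL(13,3),
    NETAMOUNT      DECIMAL(15,2),
    PRIMARY KEY (SALESORDERID, SALESORDERITEM)
);

-- Sample rows so the views later in this blog have something to show.
INSERT INTO SALES_ORDERS VALUES ('0500000001', '2020-01-15', 'AUD', 1500.00);
INSERT INTO SALES_ORDER_ITEMS VALUES ('0500000001', '0000000010', 'HT-1000', 2, 1500.00);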
When creating a connection in DWC, we need to provide the host/port and user/password of this HANA system. If this HANA system were on-premise, we would need to configure a Data Provisioning Agent (DPAgent) on-premise and connect it to DWC; there is an administration screen for admins to set up the DPAgent configuration. In the example below, this is not required as the HANA system is on SAP Cloud Platform.
Once a connection is added, you can test the connectivity. In DWC, you also have the option to use your own ETL tools or SQL clients to create SQL objects directly in DWC. Use the “Create Open SQL Schema” option for this.
The popup window will provide you with the schema name and host/port, along with the user/password created for this schema. This information will be required later.
Connecting via SQL Client
To connect to this schema, I will be using DBeaver as the SQL client tool. When creating a new connection, I need to provide the details captured above.
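For reference, the details from the Open SQL Schema popup map onto the DBeaver connection roughly as follows (the host, port and credentials below are placeholders; use the values from your own popup):

Host:      <host from the popup>
Port:      <port from the popup> (typically 443 for cloud instances)
User:      SALES_FORECAST#PRODUCTS   -- example generated schema user name
Password:  <generated password>
JDBC URL:  jdbc:sap://<host from the popup>:<port>/?encrypt=true&validateCertificate=true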
Once the connection is successful, you can run your SQL scripts here. In the example below, I have created product-related master data tables and loaded them with data.
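The script I ran is along these lines (a simplified sketch; the table definitions and values are illustrative):

-- Simplified sketch of the product master data script run in the Open SQL schema.
CREATE COLUMN TABLE PRODUCTS (
    PRODUCTID  NVARCHAR(10) PRIMARY KEY,
    CATEGORYID NVARCHAR(4),
    PRICE      DECIMAL(15,2)
);

CREATE COLUMN TABLE PRODUCT_TEXTS (
    PRODUCTID   NVARCHAR(10) PRIMARY KEY,
    PRODUCTNAME NVARCHAR(80)
);

INSERT INTO PRODUCTS VALUES ('HT-1000', 'NB', 956.00);
INSERT INTO PRODUCT_TEXTS VALUES ('HT-1000', 'Notebook Basic 15');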
Modeling the views using Data Builder
Now, I am going to switch back to DWC and navigate to the Data Builder to begin modeling the views. The goal is to combine data from the source HANA system with the product master data tables and visualize it in a dashboard.
In the Data Builder, you can create new tables and upload data into them via CSV, or create views based on local or remote tables. There is also an option to create new Entity Relationship (ER) models. This helps when you are modeling the views, as the system can easily identify the relationships between the objects you drag and drop onto the canvas.
It's very easy to create an ER model. From the Sources menu, find the connection you created in the space and navigate to the respective tables. Drag and drop them onto the canvas and connect them. This is a simple ER diagram, but in reality many tables will be involved, and it's good to have an ER model in place as it will help with modeling the views in the subsequent steps.
Next, let's create a Graphical View. Navigate to your connection and drag and drop the required tables. In this case, I am bringing in the sales order and sales order items tables. The system automatically identifies the join condition.
In the projection, you can remove all the duplicate columns and select only those that are required for output.
You can preview the data and, when ready, set the type of the view to “Analytical Dataset”. You will need to add all the relevant columns as “Measures” and make sure you set the “Allow consumption” flag. This ensures that you can access this view in the reporting tools.
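Conceptually, this graphical view corresponds to a SQL join like the one below (a sketch with illustrative names; the measure and “Allow consumption” settings are semantic properties maintained in DWC, not part of the SQL itself):

-- Rough SQL equivalent of the graphical view joining sales orders and items.
CREATE VIEW SALES_ANALYTICAL AS
SELECT
    o.SALESORDERID,
    o.CREATEDAT,
    o.CURRENCY,
    i.PRODUCTID,
    i.QUANTITY,    -- exposed as a measure in the Analytical Dataset
    i.NETAMOUNT    -- exposed as a measure in the Analytical Dataset
FROM SALES_ORDERS AS o
INNER JOIN SALES_ORDER_ITEMS AS i
    ON o.SALESORDERID = i.SALESORDERID;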
Now that our view is ready with all the measures, we need to ensure that there is master data to help identify the details of each product in the sales orders. For this, we create another Graphical View. Notice that in the “Source” panel, I navigate to the DWC schema “Sales_Forecast#Products” to view the product master data tables. I have placed these tables on the canvas and connected them to achieve the appropriate output. Notice that the type of this view is “Dimension”.
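As a sketch, this dimension view is essentially a join of the two product tables from the “Sales_Forecast#Products” schema (again with illustrative names):

-- Rough SQL equivalent of the Product_Data dimension view.
CREATE VIEW PRODUCT_DATA AS
SELECT
    p.PRODUCTID,   -- key used to associate the dimension with the sales measures
    p.CATEGORYID,
    p.PRICE,
    t.PRODUCTNAME
FROM PRODUCTS AS p
LEFT OUTER JOIN PRODUCT_TEXTS AS t
    ON p.PRODUCTID = t.PRODUCTID;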
Once I have saved and deployed this view, I can see it in the “Repository” section as shown below. In the example below, I am editing the initial sales order view and adding the “Product_Data” dimension view to the canvas.
You can preview the data and deploy the view once you are ready. This view shows all the measures and master data related to the sales orders and is now ready for consumption in a dashboard.
Now let's examine all the objects which have been created in the Data Builder for this space. Notice the difference between local tables, Relational Datasets (remote tables), Analytical Datasets and Dimensions.