Introduction
JSON is a widely spread format for exchanging and storing structured data. It enjoys great popularity in the age of microservices, e.g. for exchanging data between servers and clients via RESTful APIs. In this blog post, I’ll show you how to persist data in JSON format to an SAP HANA table or an SAP Data Warehouse Cloud table using SAP Data Intelligence.
For this purpose, the “SAP HANA Client” operator in SAP Data Intelligence is used. In addition to other options, such as executing SQL commands and inserting csv. data, it offers the possibility to persist JSON files in SAP HANA and SAP Data Warehouse Cloud. It is crucial that the file has the correct format. You can see examples of the correct implementation of the JSON file in this blog post.
The remainder of this post is organized as follows: First, two suitable JSON formats are presented in which the source data can be transformed to be utilized in the “SAP HANA Client” operator. Next, an example scenario for inserting data with these two formats is illustrated. Finally, a summary is provided.
Prerequisite
Your SAP HANA database or SAP Data Warehouse Cloud tenant must be connected to your SAP Data Intelligence system via the connection management of Data Intelligence.
Option 1 – JSON object with keys
To use the “SAP HANA Client” operator in this scenario, the input data is sent as a JSON array to the “data” port of the “SAP HANA Client” operator. In Option 1, each element that is written into the database as a record is a JSON object whose keys match the column names of the target table.
Below, you find an example of the JSON file. The table has the three columns “ID“, “COUNTRY” and “CAPITAL“:
[
{
"COUNTRY": "France",
"ID": 1,
"CAPITAL": "Paris"
},
{
"ID": 2,
"CAPITAL": "Berlin",
"COUNTRY": "Germany",
"SparseKey": "Example"
}
]
As you can see in this example, the JSON object can also contain other keys that are not supposed to be inserted into the database table. However, none of the columns from the SAP HANA table can be missing within the JSON file. Furthermore, you can see that the keys can be contained in a different order.
Option 2 – JSON array representing a table row
To use the “SAP HANA Client” operator in this scenario, the input data is sent as a JSON array to the “data” port of the “SAP HANA Client” operator. In Option 2, each element that is written into the database as a record is a JSON array whose elements match the column names of the target table in the correct order.
An example JSON could look like this. The table has the three columns “ID“, “COUNTRY” and “CAPITAL“:
[
[
3,
"Austria",
"Vienna"
],
[
4,
"Greece",
"Athens"
]
]
If the JSON array contains fewer elements than the number of columns in the target table, “NULL” values are inserted for the remaining fields.
Example:
In the example illustrated here, the JSON files shown in Option 1 and Option 2 are written into a table (table with three columns “ID“, “COUNTRY” and “CAPITAL“) in SAP Data Warehouse Cloud.
The example graph looks as follows. It consists of (1) a Python 3.6 operator (“Send JSON data“) which sends the JSON file, (2) an “SAP HANA Client” operator which is used for persistence and (3) a “Graph Terminator” operator which is used to terminate the pipeline.
No comments:
Post a Comment