Friday, 9 July 2021

SAP Data Intelligence: Insert JSON Data into SAP HANA or SAP Data Warehouse Cloud

Introduction

JSON is a widely spread format for exchanging and storing structured data. It enjoys great popularity in the age of microservices, e.g. for exchanging data between servers and clients via RESTful APIs. In this blog post, I’ll show you how to persist data in JSON format to an SAP HANA table or an SAP Data Warehouse Cloud table using SAP Data Intelligence.

For this purpose, the “SAP HANA Client” operator in SAP Data Intelligence is used. In addition to other options, such as executing SQL commands and inserting csv. data, it offers the possibility to persist JSON files in SAP HANA and SAP Data Warehouse Cloud. It is crucial that the file has the correct format. You can see examples of the correct implementation of the JSON file in this blog post.

The remainder of this post is organized as follows: First, two suitable JSON formats are presented in which the source data can be transformed to be utilized in the “SAP HANA Client” operator. Next, an example scenario for inserting data with these two formats is illustrated. Finally, a summary is provided.

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation

Prerequisite

Your SAP HANA database or SAP Data Warehouse Cloud tenant must be connected to your SAP Data Intelligence system via the connection management of Data Intelligence.

Option 1 – JSON object with keys

To use the “SAP HANA Client” operator in this scenario, the input data is sent as a JSON array to the “data” port of the “SAP HANA Client” operator. In Option 1, each element that is written into the database as a record is a JSON object whose keys match the column names of the target table.

Below, you find an example of the JSON file. The table has the three columns “ID“, “COUNTRY” and “CAPITAL“:

 [

    {

        "COUNTRY": "France",

        "ID": 1,

        "CAPITAL": "Paris"

    },

    {

        "ID": 2,

        "CAPITAL": "Berlin",

        "COUNTRY": "Germany",

        "SparseKey": "Example"

    }

]

As you can see in this example, the JSON object can also contain other keys that are not supposed to be inserted into the database table. However, none of the columns from the SAP HANA table can be missing within the JSON file. Furthermore, you can see that the keys can be contained in a different order.

Option 2 – JSON array representing a table row

To use the “SAP HANA Client” operator in this scenario, the input data is sent as a JSON array to the “data” port of the “SAP HANA Client” operator.  In Option 2, each element that is written into the database as a record is a JSON array whose elements match the column names of the target table in the correct order.

An example JSON could look like this. The table has the three columns “ID“, “COUNTRY” and “CAPITAL“:

 [

    [

        3,

        "Austria",

        "Vienna"

    ],

    [

        4,

        "Greece",

        "Athens"

    ]

]

If the JSON array contains fewer elements than the number of columns in the target table, “NULL” values are inserted for the remaining fields.

Example:

In the example illustrated here, the JSON files shown in Option 1 and Option 2 are written into a table (table with three columns “ID“, “COUNTRY” and “CAPITAL“) in SAP Data Warehouse Cloud.

The example graph looks as follows. It consists of (1) a Python 3.6 operator (“Send JSON data“) which sends the JSON file, (2) an “SAP HANA Client” operator which is used for persistence and (3) a “Graph Terminator” operator which is used to terminate the pipeline.

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Example Graph

The picture below shows how the “SAP HANA Client” operator must be configured. In the “Connection” configuration, the connection to the SAP Data Warehouse Cloud is set up. In the “Table Name” configuration, the table in which the data from the JSON file is to be inserted is specified (this table can also be created at runtime by the operator, if the configuration of “Table Initialization” is set to “Create“). In the “Table Columns” configuration, the table columns are defined (see also the following figure with the columns of the example table). The configuration of the field “Input Format” is especially important. Here, “JSON” must be selected for the scenario to work.

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Configuration SAP HANA Client

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Configuration SAP HANA Client

Example of Option 1:

Below is the code of the “Send JSON data” operator, which can be used to send the JSON file from Option 1 to the “SAP HANA Client” operator.

import json

attributes = {}
example_json = [{"COUNTRY": "France", "ID": 1, "CAPITAL": "Paris"}, {"ID": 2, "CAPITAL": "Berlin", "COUNTRY": "Germany", "SparseKey": "Example"}]

body = json.dumps(example_json)

api.send("output", api.Message(body, attributes))

After successfully running the graph, the table can be viewed in SAP Data Warehouse Cloud.

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Result in SAP Data Warehouse Cloud

The table was created and the two records were successfully inserted into the table.

Example of Option 2:

Below is the code of the “Send JSON data” operator, which can be used to send the JSON file from option 2 to the “SAP HANA Client” operator.

import json

attributes = {}
example_json = [[3, "Austria", "Vienna"], [4, "Greece", "Athens"]]

body = json.dumps(example_json)

api.send("output", api.Message(body, attributes))

After successfully executing the graph, the table can be viewed in SAP Data Warehouse Cloud.

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Result in SAP Data Warehouse Cloud

Great! The two data records have been successfully persisted.

As another example, the following JSON in the format from option 2 is sent to the “SAP HANA Client” operator via the “Send JSON data” operator.

[[5, "China"]]

SAP Data Intelligence, SAP HANA, SAP Data Warehouse Cloud, SAP HANA Exam Prep, SAP HANA Certification, SAP HANA Career, SAP HANA Preparation
Result in SAP Data Warehouse Cloud

Since the JSON array contained only 2 values, a “NULL” value was inserted in the “CAPITAL” column.

Another reference to the operator documentation to keep in mind: “When using JSON, each column value will be treated in its JSON-native type, unconverted.” If the data type of the value within your JSON file differs from the data type of the target table column in SAP HANA/ SAP Data Warehouse Cloud, your pipeline will crash (for both, Option 1 and Option 2).

No comments:

Post a Comment