Introduction
SAP HANA Cloud can integrate data virtually, which positions it as a "gateway for enterprise data" within the enterprise. Specifically, various data sources can be integrated through virtual tables created on SAP HANA Cloud (data federation).
In actual operation, however, there may be cases where you want to keep the data in SAP HANA Cloud itself, because of network speed or because of performance and load on the source system.
With on-premises SAP HANA, you have to replicate the data in this case using ETL features such as SDI. In SAP HANA Cloud, you can hold replicated data or snapshot data simply by setting an option on the virtual table. (No need to create an ETL job!)
In this blog, I will test the following virtual table functions of SAP HANA Cloud.
◉ Data virtualization (federation)
◉ Data replication (replication)
◉ Data snapshot (cache)
These correspond to the shaded parts (online switching between federation, caching, and replication) in the SAP HANA Cloud overview material.
1. Data Virtualization (Federation)
First, create a virtual table that references the on-premises SAP HANA table.
create virtual table "FVT_LINEITEM" at "OPHANA"."<NULL>"."TPCH"."LINEITEM_2";
The virtual table has been created.
Check the number of records.
Issue a SELECT statement. Because it is a virtual table, no data is held in SAP HANA Cloud; the query is executed against on-premises SAP HANA and only the result is returned to SAP HANA Cloud. This takes some time. (On-premises SAP HANA is in AWS us-east, SAP HANA Cloud is in AWS Frankfurt.)
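As a rough sketch, the two checks above might look like the following. The query actually used in this test is not shown, so the aggregation is a hypothetical example, and its column names are assumed from the standard TPC-H LINEITEM layout.
-- count the records through the virtual table (the data is read from the remote source)
select count(*) from "FVT_LINEITEM";
-- hypothetical aggregation; L_RETURNFLAG, L_LINESTATUS and L_QUANTITY are assumed TPC-H column names
select "L_RETURNFLAG", "L_LINESTATUS", sum("L_QUANTITY") as "SUM_QTY", count(*) as "CNT"
from "FVT_LINEITEM"
group by "L_RETURNFLAG", "L_LINESTATUS"
order by "L_RETURNFLAG", "L_LINESTATUS";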
2. Data replication
Next, execute an ALTER VIRTUAL TABLE statement on the virtual table just created, so that the data is replicated instead of federated.
alter virtual table "FVT_LINEITEM" add shared replica;
As a result, a table holding the replicated data is automatically generated in the schema "_SYS_TABLE_REPLICA_DATA".
If you check the information of this automatically generated table, you can see that its record count matches the source side and, from its size, that the data is physically held.
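One way to make this check in SQL is the M_TABLES monitoring view, which reports the record count and size of each table. This is a minimal sketch, assuming your user is allowed to see the "_SYS_TABLE_REPLICA_DATA" schema; the replica table name is generated by the system, so it is not filtered here.
-- list the generated replica tables with their record count and size in bytes
select "SCHEMA_NAME", "TABLE_NAME", "RECORD_COUNT", "TABLE_SIZE"
from "M_TABLES"
where "SCHEMA_NAME" = '_SYS_TABLE_REPLICA_DATA';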
The record count of the virtual table is the same as before the ALTER VIRTUAL TABLE statement.
Execute the same SELECT statement as before. The result now comes back much faster.
Next, try updating the data on the source side. This time, 2 records have been added.
If you check the record count of the virtual table, you can see that it has increased by 2.
The automatically generated table has also grown by 2 records. In other words, the data updated on the source side was replicated to SAP HANA Cloud in near real time.
Finally, drop the replica.
alter virtual table "FVT_LINEITEM" drop replica;
The automatically generated table has been deleted.
3. Snapshot replication
Next, create a snapshot replica. Issue the following ALTER VIRTUAL TABLE statement for the virtual table.
alter virtual table "FVT_LINEITEM" add shared snapshot replica;
Similarly, a table is automatically generated in the “_SYS_TABLE_REPLICA_DATA” schema.
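To see which virtual tables currently have a replica and of which kind, SAP HANA Cloud also provides a system view for virtual table replicas. The sketch below assumes that the view is named VIRTUAL_TABLE_REPLICAS and that your user can read it; check the column list in your own system before adding filters.
-- list the replicas of virtual tables (the view name is an assumption to verify in your system)
select * from "VIRTUAL_TABLE_REPLICAS";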
By checking the information of this table, you can confirm that the data is physically held.
A SELECT statement returns results as quickly as in the replication pattern above.
Next, try updating the data on the source system side. This time, I added one record.
The source data has been updated, but the snapshot table in SAP HANA Cloud has not.
The record count of the virtual table has not changed either.
Update the snapshot with the following command.
alter virtual table "FVT_LINEITEM" refresh snapshot replica;
When I checked the virtual table again, the data had been updated.
If you check the automatically generated snapshot table, the data is updated there as well.
Finally, delete the snapshot.
alter virtual table "FVT_LINEITEM" drop replica;
The snapshot has been deleted.
Issuing a SELECT statement against the virtual table now takes a few seconds again, as before, because the data is fetched from on-premises SAP HANA.