In this blog I describe how to set up real-time replication, and in this case for a Microsoft SQL, or MSSQL, database. The tool of choice is SAP Smart Data Integration (SDI), which comes with SAP HANA Cloud or the on-prem version.
The whole process is also documented in the readme file of this Github repo. That repo also contains the Web IDE project to built on top of the replication setup. How you can use the Web IDE to create real-time replication tasks, I’ll describe as a follow-up blog.
Trigger or log-based replication?
When you set up real-time replication with SDI, you have the choice between trigger and log-based replication. I won’t go into all the pro’s and con’s of the two techniques, but in my case the choice is simple. My source database is a cloud-based MSSQL database, which does not allow access to the log files. So, trigger-based replication it is.
Pre-requisites
◉ A MSSQL database (this tutorial uses an AWS Cloud DB);
◉ An SAP HANA database (this tutorial uses SAP HANA Cloud);
◉ The SAP SDI Data Provisioning Agent (this tutorial uses version 2.5.1.2);
◉ Connection set up between HANA and the SDI Data Provisioning Agent;
◉ A driver for MSSQL, for SDI to connect (this tutorial uses mssql-jdbc-8.4.1.jre8.jar) which should be stored in the ./lib directory of the data provisioning agent;
◉ Registration of the MSSQL adapter using the DP Agent client;
◉ A database client for MSSQL, I used DBeaver.
Prep MSSQL database
For this tutorial, a Microsoft SQL Server Express Edition database was created using the AWS Relational Database Service. The database version is the latest available 14.* version and is sized as a db.t3.small. Anything smaller would even make the creation of a simple table a slow process.
Below, a rundown of how you CAN configure your MSSQL database for trigger-based replication with SDI. I’m saying “CAN”, not “MUST” as the way that I’m creating a user and a schema might not be an MSSQL best practise. I do refer to the SAP Help pages with the requirements for the SDI technical user, so at the least follow those instructions.
Configure MSSQL database
Using the admin user that is provided by AWS, a “technical user” is created which will later be used for SDI to logon with to this source database.
USE master;
CREATE LOGIN HC_TECHNICAL_USER
WITH PASSWORD = '<password>',
CHECK_POLICY = OFF,
CHECK_EXPIRATION = OFF;
A database is created specifically for the replication tests
USE master;
CREATE DATABASE hc;
The technical user is granted the required privileges according to the SAP Help. The instructions below differ a bit from those, and are not all mandatory. There’s a trade-ff between security concern and usability. For example, I grant access to the entire schema and not individual tables.
Also, *disclaimer*, this is one way you can configure your source database, but it might not be the best way for you. I’m assuming you know what you’re doing.
In the new database, a new schema is created and for the technical user a user is created to login to this database
USE hc;
CREATE SCHEMA rep;
create user HC_TECHNICAL_USER for login HC_TECHNICAL_USER;
The user should also be allowed to create tables, to run our tests later
--Allow the user to create tables
USE hc;
GRANT CREATE TABLE TO HC_TECHNICAL_USER;
It’s up to you if you want to replicate DDL changes as well, I chose both DML/DDL.
--Creating a DML trigger requires ALTER permission on the table or schema on which the trigger is being created.
USE hc;
GRANT ALTER ON SCHEMA::rep TO HC_TECHNICAL_USER;
--Creating a DDL trigger with database scopes (ON DATABASE) requires ALTER ANY DATABASE DDL TRIGGER permission in the current database.
--Not needed if you don't want to replicate DDL changes
USE hc;
GRANT ALTER ANY DATABASE DDL TRIGGER TO HC_TECHNICAL_USER;
--GRANT CREATE PROCEDURE needed for SDI to create a generic procedure needed for replication
USE hc;
GRANT CREATE PROCEDURE TO HC_TECHNICAL_USER;
--GRANT VIEW SERVER STATE permission to view data processing state, such as transaction ID. This must be granted on the master database.
USE master;
GRANT VIEW SERVER STATE TO HC_TECHNICAL_USER;
Select access is needed to see data in the table, for example when viewing the content of the virtual table.
USE hc;
GRANT SELECT ON SCHEMA::rep TO HC_TECHNICAL_USER;
Besides, we want to also use the technical user to insert, update or delete data so we can run some DML tests with that user
USE hc;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA::rep TO HC_TECHNICAL_USER;
The following is not in the SDI documentation, but was needed to allow replication (I’ll save you the error solving process)
--Create schema for SDI data to be stored
USE hc;
CREATE SCHEMA HC_TECHNICAL_USER;
--Grant privileges to technical user on the created schema
USE hc;
ALTER AUTHORIZATION ON SCHEMA::HC_TECHNICAL_USER TO HC_TECHNICAL_USER;
Create source table and insert a few records of initial data
DROP TABLE REP.SALES;
CREATE TABLE REP.SALES (ID INTEGER, CREATION_DATE DATE, CUSTOMER_NAME NVARCHAR(100), PRODUCT_NAME NVARCHAR (100), QUANTITY INTEGER, PRICE DECIMAL, POS_COUNTRY NVARCHAR(100), PRIMARY KEY (ID));
INSERT INTO REP.SALES VALUES (1,'20200908','Cas Jensen','Toothbrush 747','6','261.54','United States of America');
INSERT INTO REP.SALES VALUES (2,'20201018','Barry French','Shampoo F100','2','199.99','Germany');
Create remote source on HANA
Now it’s time to connect HANA with the MSSQL database, and that’s done using a so-called remote source. The SDI Data Provisioning agent is already setup, and that setup falls outside the scope of this blog.
Below, we set up using the remote source in the Database Explorer, with a graphical UI. The remote source can also be created using a SQL create statement, which you can find in the Github repo readme. I’m using the DBADMIN user for the setup.
No comments:
Post a Comment