Sunday, 30 July 2017

SAP HANA 2.0 SPS02 new feature: updated python driver

Introduction


As background: there have been two HANA python drivers available for some time:
  • The python driver included with the SAP HANA Client package (hdbcli). The client package included a distribution of python 2.7 and the driver works only with that distribution.
  • The python driver available on github (pyHDB). This driver is implemented in pure python, and worked with both python 2 and 3, but is not fully supported by SAP.
In SAP HANA SPS02, the hdbcli driver has been updated and now provides the following new features:
  • support for python 2.7 and python 3.4+.
  • installable into local python environments using the pip installer
  • the driver is built on the same C++ native library used by ODBC, ADO.NET, and other interfaces. As a result it provides all the protocol support built into that library, including support for Active/Active database configurations, network compression, and more.
  • improved support for calling stored procedures.
  • fully supported as part of the SAP HANA Client distribution.
This blog post walks you through the usual Getting Started tasks, including install, establishing a connection, executing SQL DDL commands, and handling result sets.

Installing the python driver


There are two steps to installing the python driver:
  1. Install the SAP HANA Client package on your computer. 
  2. Install the python driver from the HANA Client install into a local python environment.

Installing the python driver into a local python environment


You must have a python distribution installed on your local machine. Some python distributions allow you to create multiple independent virtual environments, each with its own version of python and its own set of packages. You can install the python driver into any local python environment.

From a command line:
  1. If you have multiple python environments, activate the python environment into which you wish to install the python driver. If you don’t have multiple python environments, just ignore this step.
  2. Confirm you have a supported python version (2.7, or 3.4+) by typing the following at a command prompt to show the python version:
    • python -V
  3. Install the HANA Client package into your python environments using the pip installer.
    1. Here is a command for the default installation on Windows. The particular version of the zip file will probably be different:
      • pip install “C:\Program Files\SAP\hdbclient\hdbcli-2.N.N.zip”
    2. Here is a command for the default installation on Linux.
      • pip install /full/path/to/file/hdbcli-2.N.N.tar.gz
Now confirm you have the software properly installed. From a command prompt, type python to start a python interpreter. Then enter the following to confirm the package is properly installed:

>>> from hdbcli import dbapi
>>> dbapi.__name__
'hdbcli.dbapi'

All the code samples here are shown as if executed from a python interpreter.

Connecting to a database


Here is sample python code showing a database connection:

>>>from hdbcli import dbapi
>>>conn = dbapi.connect(
    address="<hostname>",
    port=<3NNMM>,
    user="<username>",
    password="<password>"
)
  • For HANA tenant databases, you can use the port number 3NN13 (where NN is the SAP instance number).
  • For HANA system databases in a multitenant system, the port number is 3NN13.
  • For HANA single-tenant databases, the port number is 3NN15.
For details of connection class methods, see <SAP Documentation Link>

Executing SQL statements


All SQL statements are executed using the cursor object. Here is how to execute a single statement that does not return a result set:

>>> cursor = conn.cursor()
>>> cursor.execute("CREATE TABLE T1 (ID INTEGER PRIMARY KEY, C2 VARCHAR(255))")
>>> cursor.close()

Binding parameters for SQL statements


For statements that take a parameter, it is good practice to bind the parameter to a variable rather than to concatenate a string and execute it. The HANA client driver supports both the question mark and named styles of parameter binding as described in the DBAPI specification.

Question mark parameter binding


Here is an insert statement using the question mark parameter style. The statement inserts values into the T1 table created above. The parameter values are supplied as a python sequence, and may be literal values or variable names. This example uses literal values:

>>> sql = 'INSERT INTO T1 (ID, C2) VALUES (?, ?)'
>>> cursor = conn.cursor()
>>> cursor.execute(sql, (1, 'hello'))
True
>>> cursor.execute(sql, (2, 'hello again'))
True
>>> cursor.close()

Named parameter binding


Here is another insert using the named parameter style. This time the values are supplied as a python dictionary, and in this case the example uses variable names.

>>> sql = 'INSERT INTO T1 (ID, C2) VALUES (:id, :c2)'
>>> cursor = conn.cursor()
>>> id = 3
>>> c2 = "goodbye"
>>> cursor.execute(sql, {"id": id, "c2": c2})
True
>>> cursor.close()

Querying data and handling result sets


There are several ways of handling result sets, useful in different circumstances. Here is a simple way to loop over the rows of the result set.

>>> sql = 'SELECT * FROM T1'
>>> cursor = conn.cursor()
>>> cursor.execute(sql)
True
>>> for row in cursor:
...    print(row)
(1, 'hello')
(2, 'hello again')

If you know the result will have a single row, then cursor.fetchone() is useful

>>> id = 2
>>> sql = 'SELECT * FROM T1 WHERE ID = :id'
>>> cursor = conn.cursor()
>>> cursor.execute(sql, {"id": id})
True
>>> row = cursor.fetchone()
>>> print(row)
(2, 'hello again')

If the result set is not too large, you may want to read it into a python variable in one step using cursor.fetchall().

>>> sql = 'SELECT * FROM T1'
>>> cursor = conn.cursor()
>>> cursor.execute(sql)
True
>>> result_set = cursor.fetchall()
>>> print(result_set)
[(1, 'hello'), (2, 'hello again'), (3, 'goodbye')]

Calling stored procedures


You can get results from stored procedures in two ways: through output (or inout) parameters, or through returned results. In both cases you can use the cursor.callproc() method.

Here is an example of how to return an output parameter.

>>> cursor = conn.cursor()
>>> # if the procedure exists, drop it
>>> try:
...     cursor.execute("DROP PROCEDURE ADD2")
... except:
...     pass

>>> # create the procedure
>>> sql_proc = """
... create procedure ADD2 (in a int, in b int, out c int)
... language sqlscript
... reads sql data as
... begin
...     c := :a + :b;
... end
... """
>>> cursor.execute(sql_proc)
True
>>> # call the procedure
>>> params_in = (2, 5, None)
>>> params_out = cursor.callproc("ADD2", params_in)
>>> # close the cursor
>>> cursor.close()
>>> print(params_out)
(2, 5, 7)

… and here is an example of handling returned result sets. The cursor.callproc() method acts just like cursor.execute("SELECT...").

>>> cursor = conn.cursor()
>>> # if the procedure exists, drop it
>>> try:
...     cursor.execute("DROP PROCEDURE PROC_T1")
... except:
...     pass

>>> # create the procedure
>>> sql_proc = """
... CREATE PROCEDURE PROC_T1(IN substring varchar(10))
... LANGUAGE SQLSCRIPT
... AS
... BEGIN
...     SELECT ID, C2 FROM T1
...     WHERE C2 LIKE '%' || :substring || '%';
... END
... """
>>> cursor.execute(sql_proc)
True
>>> # call the procedure
>>> cursor.callproc("PROC_T1", ("hello",))
>>> for row in cursor:
...     print(row)
(1, 'hello')
(2, 'hello again')

Streaming data


There is a separate class for streaming data to or from SAP HANA. You only need to use this class if you wish to process data in a piecewise fashion. Typically this may be used for video or speech data, but here is a sample that uses an image.

# get an image from the file system
with open('image.png', 'rb') as f:
     img = f.read()
# write to the database
in_blob = dbapi.LOB()
id = 2
conn.setautocommit(False)
cursor.execute("update T1 set PHOTO = :img WHERE id =:id", {"img": in_blob, "id": id})
in_blob.write(data=img)
in_blob.close()
conn.commit()
cursor.close()

And here is an example of how to read the image back out of the database. The important part is the use of the read() method to fetch a portion of the blob at a time.

# read it into a stream
conn.setautocommit(True)
cursor = conn.cursor()
cursor.execute("select PHOTO from T1 where id = :id", {"id": id})
out_blob = cursor.fetchone(True)[0]
out_img = bytearray()
while True:
    data = out_blob.read(size=200)
    if data is None:
        print("Done.")
        break
    else:
        print("Reading more data...")
        out_img.extend(data.tobytes())

This completes the description of the new HANA python driver.

No comments:

Post a Comment