German areas with high Biergarten density? Spoiler: The result won’t surprise…
SAP HANA Spatial offers in-database capabilities for handling and processing geospatial data. Furthermore, by leveraging the SAP HANA embedded machine learning capabilities developers are equipped with the ability to infuse spatial analytical models into their applications without the explicit need to actually visualize the geo-referenced input data.
Especially for those scenarios, but also for the data preparation of complex visualizations, it is handy to have a tool to do quick intermediate visualizations in order to check partial results or quickly do a feasibility check of the data. SQL and GIS (database) developers may be familiar with an open source client tool called DBeaver. DBeaver is a free universal database client with excellent built-in spatial datatype support. Since version 6.2.2 it offers support for SAP HANA and since version 6.3.1 the support of SAP HANA Spatial has been contributed.
This article should give a quick overview of how to use DBeaver with SAP HANA Spatial to get some quick map visualizations of database objects without using a full-blown GIS client software. As an example we are going to do a DBSCAN clustering on a beer garden dataset.
Yes, you caught me red-handed: I want to fool your mind and create positive associations!
If you do not have a SAP HANA instance at hand and would like to setup a free SAP HANA Express to load some spatial data, I recommend reviewing my previous blog post first:
Open Source GIS with SAP HANA
Installing DBeaver
Since DBeaver is actually an Eclipse plugin it runs on Windows, Mac and Linux (…and probably anywhere else where you manage to run Eclipse). The easiest way to install DBeaver is to download and execute the ready-made binaries.
If you downloaded and installed the latest version, you have everything you need. SAP HANA (Spatial) support comes out of the box and there is no need to install a separate plugin.
Connecting to SAP HANA
When you open up DBeaver the first time, you will be presented with a dialogue to connect a new database.
You can also manually open this dialogue by clicking the plug sign on the upper left corner.
So, now let’s configure the connection to your SAP HANA instance. Switch to the ‘All’ tab on the left side and enter “HANA” as a search keyword. Select SAP HANA and click ‘Next’.
Enter the connection details for your SAP HANA instance and click ‘Test Connection’:
When you first use a connection to SAP HANA the respective driver needs to be download. Don’t worry! DBeaver will do the job for you. Simply confirm the automatically suggested download by clicking ‘Download’.
After the download you should receive a message that the connection has been established. Confirm this message with ‘OK’ and click ‘Finish’ to add your instance to the DBeaver Database Navigator.
Now your are all set to use DBeaver with SAP HANA and – of course – SAP HANA Spatial in particular!
Previewing a Spatial Dataset
As an example, I am going to use Open Street Map POI (Point of Interest) data from Germany. The respective shape files can be downloaded from Geofabrik and imported using the built-in SAP HANA Shape File Import.
In my case, I imported all POI data into a table called GERMANY in schema GEOFABRIK. You can navigate to your spatial table in the Database Navigator by expanding your instance, the schema and the directory ‘Tables’.
For starters, let’s do a data preview on the table. Right-click your table and choose “Read data in SQL console”.
DBeaver will generate and execute an SQL statement selecting all columns and rows of the respective table. Of course, for performance reasons only the first 200 rows (depending on configuration) will be fetched by the client.
On the right side you will notice a tiled view with so called ‘Panels’. In the example above the cell ‘OSM_ID’ in row 1 is highlighted and the panel shows the details for this cell. Which in this case is simply the numerical ID. Now let’s get to the fun part and see what happens, when you highlight a cell of a spatial column (column ‘SHAPE’ in our example).
You will see the point on a map! Right in your database studio. On the map, you can navigate and zoom in and out. Furthermore, you have the possibility of choosing different map layers and switching between MapBox and OpenStreetMap. Take some time to play around and configure your favorite map. The point that I have chosen above is actually a ‘bench’ (column FCLASS) located at the Wandlitzsee. Sounds cozy, does it?
Benches? Right, we are interested in beer gardens. So we copy’n’paste the following statement into the SQL console and execute it.
SELECT *
FROM GEOFABRIK.GERMANY
WHERE FCLASS = 'biergarten'
Before executing I did increase the line limit to 5000 (instead of 200) at the bottom of the window to show all my 3313 entries.
On the left side of your screen you may have noticed a tab called ‘Spatial’. If we click this tab all entries of the result set will be visualized on a map. So here we go: Let’s have a look at all the OSM beer gardens in Germany.
Wait – here is little limitation, you will have to overcome. Whenever working on full result sets or calculated columns, you will have to explicitly define the Spatial Reference System you are using to get the data mapped on a map. To do that click on the dropdown item ‘Simple’ at the bottom of the screen and define the Spatial Reference System you are using. In my case this is SRS 3857.
(To be fair this is not a limitation of DBeaver but rather the SAP HANA integration piece)
After confirming the dialogue all entries will be shown on map. You can even select a single entry and see the details with a layover.
Even from looking at the picture above we can tell that there is a north/south slope for the density of beer gardens. Without being a Data Scientist, I can tell that this must be correlated to the slope in temperature and brewery density.
To confirm this visual impression with some more facts, we can do a density based DBSCAN clustering. To identify all the clusters, where there are more than 10 Biergartens in a 10km distance, we execute the following statement.
SELECT
ST_CLUSTERID(),
ST_CONCAVEHULLAGGR(shape),
COUNT(*)
FROM GEOFABRIK.GERMANY
WHERE FCLASS = 'biergarten'
GROUP CLUSTER BY SHAPE USING DBSCAN EPS 10000 MINPTS 10
HAVING ST_CLUSTERID() <> 0
ORDER BY COUNT(*) DESC
Please note that the cluster with ID 0 is the outlier cluster, which contains all points, which cannot be associated to a spatial cluster.
Highlighting the first three entries of the result set (ordered by number of items per cluster) with SHIFT + Mouseclick, we can quickly identify the clusters for Munich, Mannheim and Leipzig.
If the window appears too small, we may open up the same view in a browser by clicking the browser icon on the lower left of the map window.
Finally and not surprisingly we can see that Munich is essentially a dense cluster of beer gardens.
No comments:
Post a Comment