The number one question I have heard from folks over the past few weeks is: why Esri + HANA together? The short answer is increased performance, lower total cost of ownership (TCO) and seamless integration. But how did we get here? And what have our customers experienced along this journey?
After SAP HANA was first released around 2011, our customers saw that roll-up or summary data could be calculated on the fly from base data at speed and scale, and they saw the impact this had on building and maintaining data warehouses. They obtained insight as soon as the data was loaded, not after 4, 12 or 20 hours of creating summary tables. Because HANA also doesn’t require user-created indices, the data warehouse had a much smaller footprint, which lowered TCO. Enterprises obtained answers much sooner and were able to ask many more questions without having to rebuild the data warehouse. On top of this, HANA can mash up data, virtually or physically, from many different data sources, and its different engines, such as predictive, text analytics and graph, can be used without moving data around.
Fast forward to January 2018, when Esri announced geodatabase support for HANA. This means HANA can be the system of record for geometries and related spatial metadata. HANA has also supported ArcGIS as the system of engagement since 2014, through query layers that access spatial data residing in HANA.
Late last year, Esri built a 311 demo on HANA to show the advantages that HANA brings to ArcGIS. This demo uses HANA as the system of record and the system of engagement. Before diving into what the demo shows, let’s look at some best practices taught to ArcGIS content creators. Following them ensures reasonable performance regardless of the underlying DBMS:
◈ When the number of rows affected by a query exceeds around 10 million, you should create summary tables
◈ You should create indices
◈ Write narrow queries for webmaps so that they return only a handful of columns
With the 311 demo, Esri found these best practices simply aren’t needed when the underlying DBMS is HANA. Not having to create indices or build summary tables is among the benefits our customers observed when HANA was first released.
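As a rough illustration of what the summary-table practice costs, the sketch below uses Python's built-in sqlite3 module as a stand-in database. It is not HANA and says nothing about HANA's performance; the table name `calls`, its columns, and the sample rows are all hypothetical. It only shows the general difference between a narrow aggregate query run against base data and a precomputed summary table that goes stale once new rows arrive.

```python
import sqlite3

# Stand-in database: in-memory SQLite, NOT HANA.
# Table names, column names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls ("
    "id INTEGER PRIMARY KEY, borough TEXT, complaint TEXT, created TEXT)"
)
conn.executemany(
    "INSERT INTO calls (borough, complaint, created) VALUES (?, ?, ?)",
    [
        ("BROOKLYN", "Noise", "2017-07-04"),
        ("BROOKLYN", "Heating", "2017-01-15"),
        ("QUEENS", "Noise", "2017-08-20"),
    ],
)

# Traditional practice: materialize a summary table once, then query it.
conn.execute(
    "CREATE TABLE calls_by_borough AS "
    "SELECT borough, COUNT(*) AS n FROM calls GROUP BY borough"
)

# On-the-fly practice: a narrow aggregate query against the base table.
LIVE_QUERY = "SELECT borough, COUNT(*) AS n FROM calls GROUP BY borough"
SUMMARY_QUERY = "SELECT borough, n FROM calls_by_borough"


def live_counts(connection):
    """Aggregate directly from the base data each time it is called."""
    return dict(connection.execute(LIVE_QUERY).fetchall())


def summary_counts(connection):
    """Read the precomputed summary table, which is never refreshed here."""
    return dict(connection.execute(SUMMARY_QUERY).fetchall())


# A new call is loaded after the summary table was built.
conn.execute(
    "INSERT INTO calls (borough, complaint, created) VALUES (?, ?, ?)",
    ("QUEENS", "Noise", "2017-12-31"),
)

print(live_counts(conn))     # includes the new QUEENS row
print(summary_counts(conn))  # still shows the old QUEENS count
```

In this toy setup the live query reports two calls for QUEENS while the summary table still reports one, which is the staleness that rebuilding summary tables is meant to fix.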
Let’s dive into the demo and look at the details. Here is a summary map by borough of the number of calls:
When you view the demo online, it’s important to note the map is live. It displays instantaneously and it doesn’t run against summary tables. Here is the query used to obtain the data displayed on the map:
Using query layers, ArcGIS pushes the aggregation into HANA, where it executes against the base data. The query is also an example of a narrow query: it returns just 4 columns. In the next part of the demo, Esri chose to write one large query using SQL and SQLScript for use by all webmaps. Typically, each webmap would have its own narrow query like the one above. Below is the large query, which could be extended with no attendant loss of performance:
The above query returns in 750 milliseconds against the base data. In other DBMSs, it would take 2 to 3 minutes or more to execute. This is GIS acceleration. In addition, instead of 30 separate queries, there’s only this one query. Here is a webmap that utilizes that query – it shows the count by year for each ZIP Code that’s clicked or tapped on. Each tap or click causes the query to execute against HANA.
There are nine additional webmaps that use bivariate comparisons to show the impact of two different factors. This one shows summer vs. winter calls:
ZIP Codes in yellow are those where noise complaints are high in the summer, those in medium blue are where they are high in the winter, and those in dark blue are where they are high year-round.
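The color scheme just described can be sketched as a small classification step. This is only an illustration, not the logic Esri's webmap uses: the month groupings, the `threshold` that defines "high", the sample ZIP Codes, and the shortcut of treating "high in both summer and winter" as "high year-round" are all assumptions made for the example.

```python
from collections import Counter

# Assumed season definitions (calendar months); not taken from the demo.
SUMMER_MONTHS = {6, 7, 8}
WINTER_MONTHS = {12, 1, 2}


def classify_zip_codes(calls, threshold):
    """Assign each ZIP Code a bivariate class from (zip_code, month) pairs.

    `threshold` is a hypothetical cutoff: a season counts as "high" when it
    has at least this many noise complaints.
    """
    summer = Counter()
    winter = Counter()
    for zip_code, month in calls:
        if month in SUMMER_MONTHS:
            summer[zip_code] += 1
        elif month in WINTER_MONTHS:
            winter[zip_code] += 1

    classes = {}
    for zip_code in set(summer) | set(winter):
        high_summer = summer[zip_code] >= threshold
        high_winter = winter[zip_code] >= threshold
        if high_summer and high_winter:
            classes[zip_code] = "dark blue"    # stands in for high year-round
        elif high_summer:
            classes[zip_code] = "yellow"       # high in the summer
        elif high_winter:
            classes[zip_code] = "medium blue"  # high in the winter
        else:
            classes[zip_code] = "unclassified"
    return classes


# Hypothetical sample: (ZIP Code, month of the noise complaint).
sample_calls = [
    ("10001", 7), ("10001", 7), ("10001", 8),
    ("11201", 1), ("11201", 2), ("11201", 12),
    ("10458", 6), ("10458", 7), ("10458", 1), ("10458", 2),
]

result = classify_zip_codes(sample_calls, threshold=2)
for zip_code in sorted(result):
    print(zip_code, result[zip_code])
```

With this sample data and a threshold of 2, ZIP Code 10001 lands in yellow, 11201 in medium blue and 10458 in dark blue.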
To recap, for an enterprise that uses ArcGIS as the system of engagement, these advantages mean:
1. The data displayed by webmaps isn’t stale because it is from the base data, not summary tables
2. The ability to create new webmaps to answer new questions doesn’t rely on an existing summary table. The questions can drive the creation of a new webmap immediately
3. HANA can process aggregate queries against data at speed and scale, which means data for the whole enterprise can underpin any webmap. An organization can now create an organization-wide atlas of maps to show its KPIs across the entire operation
4. Increased agility and reduced query governance. Where there would have been 30 queries, there’s now one – resulting in faster innovation, reduced maintenance and governance and lower TCO
5. Because there are no summary tables and no user-created indices, the data footprint is smaller
But what if your production GIS is running on another DBMS or you’re on an older version of ArcGIS Enterprise? You can still leverage HANA’s advantages by creating a publication geodatabase (or sidecar) with ArcGIS Enterprise 10.6 or greater and HANA 2 SP2 or greater. Any ArcGIS administrator can use the tools they already know to copy in the desired feature classes. The bottom line is that there is no need to wait to gain the GIS acceleration and agility that HANA and ArcGIS together offer.