Introduction:
SAP HANA Native Storage Extension (NSE) is a general-purpose, built-in warm data storage solution, available as of HANA 2.0 SP04. The value of data is enormous from an organization's standpoint, so how data is stored and accessed matters a lot. Data is broadly categorized into three temperature tiers: hot, warm, and cold. (A frozen tier also exists, but it is normally not considered among the HANA data tiering options (DTO).)
From the above image, we can see that SAP HANA Native Storage Extension (NSE) is one of the warm data solutions; it complements, rather than replaces, the existing DTO solutions (extension nodes and dynamic tiering). There are, however, some key differences. First, NSE is a built-in, disk- or flash-based database technology that is deeply integrated into SAP HANA: it uses the HANA in-memory buffer cache for operations and the persistence layer for storing warm data, and full HANA functionality such as the query optimizer, query execution engine, and column store is supported. Second, NSE is based on paging, meaning data is read in the form of pages (4 KB to 16 MB), whereas with the other DTO solutions the minimum unit of data read is a whole table or partition. If a partition contains 100 million records, those solutions pull all 100 million records into HANA memory, while NSE pulls only the pages containing the records required by the query's selection criteria, which might be, say, 10 million records. Now let's deep dive into the NSE high-level architecture and implementation details.
The below image shows the difference between standard SAP HANA and SAP HANA with Native Storage Extension (NSE) (HANA 2.0 SP04).
◈ Data capacity: In standard SAP HANA, the data capacity equals the amount of hot data held in memory, while in SAP HANA with NSE it is the amount of hot data in memory plus the amount of (built-in) warm data. Warm data is stored in the existing persistence layer. The recommended ratio of main memory (hot data) to NSE (warm data) is 1:4, so if the hot data in-memory size is 2 TB, the warm data size can be up to 8 TB. In the example below, additional HANA memory is added for the buffer cache; alternatively, the buffer cache can be carved out of the existing hot data memory, as long as the recommended ratios are still met.
◈ HANA in-memory: Normally, 50% is used to store hot data and 50% is used as working space.
◈ Hot Data “Column loadable”: Resides completely in main memory; all data is loaded into memory for fast processing and analytics.
◈ Warm Data “Page loadable”: With NSE, warm data can be specified as “page loadable”. As mentioned earlier, data is loaded into memory in granular units of pages for query processing. NSE reduces the memory footprint because the data is partially in memory and partially on disk.
◈ Performance: With NSE, data is distributed between main memory and disk storage, so don't expect data access to be as fast as pure in-memory processing; a typical range is 2-10 times slower.
◈ Buffer Cache: It is used to load page units from disk storage (warm data) into main memory. With the current NSE feature, 10% of the main memory is reserved for the buffer cache by default, but it is not pre-allocated.
Queries to view and update buffer cache size:
1. View the current buffer cache size (shown here with the default on a 10 TB HANA DB server):
SELECT VOLUME_ID as "Volume ID",
ROUND(MAX_SIZE/1024/1024/1024,3) AS "Max Buffer Size in GB"
FROM SYS.M_BUFFER_CACHE_STATISTICS;
2. Update the buffer cache size using one of two parameters:
‘max_size_rel’: a relative value, e.g. 10% or 15% of total HANA memory.
‘max_size’: an absolute size in MB, e.g. '512000' for a cache of roughly 500 GB (a sketch using this parameter follows the statement below).
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'system')
SET ('buffer_cache_cs','max_size_rel') = '10' WITH RECONFIGURE;
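The same ALTER CONFIGURATION pattern works with the absolute 'max_size' parameter described above; this is only an illustrative sketch, assuming the value is given in MB:
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'system')
SET ('buffer_cache_cs','max_size') = '512000' WITH RECONFIGURE;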
The buffer cache ratio with NSE storage is normally 1:8, i.e. the buffer cache should be at least 12.5% of the total size of the NSE storage space; for example, 8 TB of NSE data calls for a buffer cache of at least 1 TB.
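To see how much of the buffer cache is actually allocated and used, the same statistics view can be queried; this sketch assumes the ALLOCATED_SIZE, USED_SIZE, and HIT_RATIO columns of M_BUFFER_CACHE_STATISTICS are available on your revision:
SELECT HOST,
ROUND(MAX_SIZE/1024/1024/1024, 2) AS "Max Size GB",
ROUND(ALLOCATED_SIZE/1024/1024/1024, 2) AS "Allocated GB",
ROUND(USED_SIZE/1024/1024/1024, 2) AS "Used GB",
HIT_RATIO AS "Hit Ratio"
FROM SYS.M_BUFFER_CACHE_STATISTICS;
A consistently low hit ratio is usually a sign that the buffer cache is undersized relative to the NSE data it has to serve.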
NSE implementation:
◈ NSE is enabled by default in HANA 2.0 SP04; no separate activation is required beyond executing the correct SQL/DDL statements.
◈ “Page loadable” means data in NSE (Warm store) and “Column loadable” means data in main memory (Hot store).
◈ “Page loadable” can be specified at the table, index, partition, and column level.
◈ Data can be converted from “page loadable” to “column loadable” and vice versa.
◈ Load units: Loading behavior is determined by the load unit set for a column, index, partition, or table:
◈ PAGE (Warm store): On NSE storage disk.
◈ COLUMN (Hot store): In main memory
◈ DEFAULT(Depends): If you do not specify a load unit for a given granularity, the LOAD_UNIT column in the system views shows the load unit as DEFAULT.
Standard HANA system views to view load unit set:
1. TABLES
2. M_CS_TABLES
3. TABLE_PARTITIONS
4. TABLE_COLUMNS
5. INDEXES – shows the index load unit for supported index types.
Example:
SELECT LOAD_UNIT FROM M_CS_TABLES
WHERE TABLE_NAME = 'VBRK' AND
SCHEMA_NAME = '<SchemaName>';
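The other views listed above can be queried in the same way to check the load unit at column and partition level; a sketch, reusing the VBRK example and the <SchemaName> placeholder:
-- Column-level load units (LOAD_UNIT as exposed by TABLE_COLUMNS)
SELECT COLUMN_NAME, LOAD_UNIT FROM TABLE_COLUMNS
WHERE TABLE_NAME = 'VBRK' AND SCHEMA_NAME = '<SchemaName>';
-- Partition-level load units (check the LOAD_UNIT column in the result)
SELECT * FROM TABLE_PARTITIONS
WHERE TABLE_NAME = 'VBRK' AND SCHEMA_NAME = '<SchemaName>';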
The behavior of the load unit setting:
If the load unit is set to: | Then the effective load unit is:
Column-level load unit configured as PAGE LOADABLE or COLUMN LOADABLE | The configured load unit.
Column-level load unit configured as DEFAULT LOADABLE | SAP HANA checks, and uses, the partition-level load unit.
Partition-level load unit configured as PAGE LOADABLE or COLUMN LOADABLE | The configured load unit.
Partition-level load unit configured as DEFAULT LOADABLE | SAP HANA checks, and uses, the table-level load unit.
Table-level load unit configured as PAGE LOADABLE or COLUMN LOADABLE | The configured load unit.
Table-level load unit configured as DEFAULT LOADABLE | The COLUMN LOADABLE load unit.
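As a small illustration of this precedence (table name tab_jeet_demo is hypothetical): the table-level load unit is set to PAGE, the columns are left at DEFAULT and therefore inherit PAGE, and one column is then explicitly overridden:
-- Table level: PAGE LOADABLE; both columns stay DEFAULT and inherit PAGE
CREATE COLUMN TABLE tab_jeet_demo (emp_id INT, name VARCHAR(100)) PAGE LOADABLE;
-- emp_id now has an explicit column-level load unit that overrides the table setting;
-- name remains DEFAULT and keeps the effective PAGE load unit
ALTER TABLE tab_jeet_demo ALTER (emp_id INT COLUMN LOADABLE);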
◈ Table Partitioning support:
◈ Unbalanced RANGE
◈ Unbalanced RANGE-RANGE
◈ HASH (Note: with hash partitioning, the load unit must be set for the entire table, column, or index as “page loadable” or “column loadable”; see the sketch below)
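A minimal sketch of the HASH case (table name tab_jeet_hash is hypothetical): the table is hash-partitioned first, and the load unit is then set for the table as a whole:
-- Hash-partitioned table; the load unit is not set per partition
CREATE COLUMN TABLE tab_jeet_hash (emp_id INT, name VARCHAR(100))
PARTITION BY HASH (emp_id) PARTITIONS 4;
-- Move the entire table to NSE (page loadable); CASCADE clears any lower-level preferences
ALTER TABLE tab_jeet_hash PAGE LOADABLE CASCADE;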
◈ DDL statements:
◈ Creating a new table with SAP HANA Native Storage Extension (NSE) enabled:
– Sets the load unit to page loadable for the entire table
CREATE COLUMN TABLE tab_jeet (emp_id INT, name VARCHAR (100)) PAGE LOADABLE;
– Sets the load unit to page loadable for the column name
CREATE COLUMN TABLE tab_jeet (emp_id INT, name VARCHAR (100) PAGE LOADABLE);
– Sets the load unit for table and Column.
CREATE COLUMN TABLE tab_jeet
(emp_id INT COLUMN LOADABLE, name VARCHAR(100)) PAGE LOADABLE;
– Single-level heterogeneous partitioned table. Note that the double parentheses surrounding the partition declaration are syntactically required by CREATE TABLE.
CREATE COLUMN TABLE tab_jeet (emp_id INT, name VARCHAR (100) )
PARTITION BY RANGE (emp_id)
((PARTITION 600000 <= VALUES < 999999 PAGE LOADABLE,
PARTITION 19999 <= VALUES < 600000 COLUMN LOADABLE,
PARTITION OTHERS COLUMN LOADABLE));
Please note: The double parentheses surrounding the partition declaration are syntactically required for NSE. If a table is partitioned, make sure it follows the correct syntax, because it matters: if the partitioning was created using the old (PARTITION syntax, NSE is not supported; the ((PARTITION syntax is required to enable NSE.
– If the table is already range partitioned, you can use the query below to identify whether it follows the correct syntax. If it does not, you will have to ALTER the partitioning to reorganize it into the new format/structure.
SELECT PARTITION_SPEC FROM TABLES WHERE TABLE_NAME = '<table_name>';
– Converting an existing partitioned table to the NSE-enabled format with a simple ALTER statement; no need to DROP and CREATE
ALTER TABLE tab_jeet PARTITION BY RANGE (emp_id)
((PARTITION 600000 <= VALUES < 999999,
PARTITION 19999 <= VALUES < 600000, PARTITION OTHERS));
– Create Second-Level NSE-Enabled Partitions
CREATE COLUMN TABLE tab_jeet ( emp_id INT, zipcode INT, name VARCHAR (100) )
PARTITION BY RANGE (emp_id)((PARTITION 100000 <= VALUES < 199999,
PARTITION 199999 <= VALUES < 299999 PAGE LOADABLE, PARTITION OTHERS)
SUBPARTITION BY RANGE (zipcode)
(PARTITION 55100 <= VALUES < 55200, PARTITION 55200 <= VALUES < 55700,
PARTITION 55700 <= VALUES < 55900 , PARTITION OTHERS));
◈ Changing the load unit of an existing table, column, or partition.
— Changing the Load Unit for a Table; using CASCADE removes any existing partition- or column-level load unit preferences
ALTER TABLE tab_jeet PAGE LOADABLE CASCADE;
— Changing the Load Unit for a Column
ALTER TABLE tab_jeet ALTER (emp_id INT DEFAULT LOADABLE);
— Changing the Load Unit for a Partition
ALTER TABLE tab_jeet ALTER PARTITION 2 PAGE LOADABLE;
— Changing the Load Unit by specifying the RANGE clause
ALTER TABLE tab_jeet ALTER PARTITION RANGE(emp_id)
((PARTITION 19999 <= VALUES < 600000)) PAGE LOADABLE ;
– Alter First-Level NSE-Enabled Partitions
ALTER TABLE tab_jeet ALTER PARTITION RANGE (emp_id)
((PARTITION 100000 <= VALUES < 199999)) COLUMN LOADABLE;
– Alter Second-Level NSE-Enabled Partitions
ALTER TABLE tab_jeet ALTER PARTITION RANGE (emp_id)
((PARTITION 100000 <= VALUES < 199999)
SUBPARTITION BY RANGE(zipcode) (PARTITION 55200 <= VALUES < 55700))
PAGE LOADABLE;
◈ Table Indexes (Inverted index):
Indexes are created on one or more columns of a table and contain the same values as those columns. Normally, if an index is based on only one or two columns, it can be kept in main memory without a large impact on memory consumption; in some cases, however, you do not need to keep all indexes in main memory to get the desired performance. With NSE, an inverted index can be moved to disk storage (warm data) and the memory freed.
An inverted index is a special type of HANA index that is created for column-store tables; a creation sketch follows the list of advantages below.
Advantages:
◈ Significantly reduced memory footprint due to the absence of concat attribute
◈ Less I/O in terms of data and redo logs
◈ Reduced delta merge efforts
◈ DML change operations can be faster due to reduced concat attribute maintenance
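Before the ALTER statement below, the index itself has to exist; a minimal sketch, assuming the inverted individual index type introduced with HANA 2.0 SP04 and reusing the index name idx from the next statement:
-- Create an inverted individual index on the name column of tab_jeet
CREATE INVERTED INDIVIDUAL INDEX idx ON tab_jeet (name);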
– Here index idx is moved to NSE storage
ALTER INDEX idx PAGE LOADABLE;
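To confirm the change, the INDEXES view mentioned earlier exposes the load unit for supported index types; a sketch, assuming the table was created with an unquoted (hence uppercase) name:
SELECT INDEX_NAME, LOAD_UNIT FROM INDEXES
WHERE TABLE_NAME = 'TAB_JEET' AND SCHEMA_NAME = '<SchemaName>';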