Introduction
During BI implementation of a project, there could be a reporting requirement where row data need to be transposed as column name. In various BI landscapes, usually data is ingested from sources on which data cleansing/massaging is performed and then data get loaded into table which feeds to reporting layer.
Traditionally in reporting, we transpose row values as column value or vice versa, but here rows need to be transposed as column name (instead as column value). For better understanding, let me demonstrate this requirement by a use case where machinery data get loaded into table (in figure 1). In this data, for any ChassisNo value, EngineCapacity and Horsepower have 3 distinct value set such as 1100,2200,3500 and 74,163,270 respectively.
(Figure 1 shows different variant of Chassis number)
Let’s first list down various facts related above data as follows:
◈ For any ChassisNo, combination of EngineCapacity and HorsePower represents different variant. (for example, CIN100001 have 3 different variants with combination of EngineCapacity and Horsepower)
◈ For all Chassis, there exist 3 similar variants (which will grow in future to N variants)
◈ Column RPM, BodyType and Mileage represent attributes across different variants
Each chassis has 3 variants, thus for any chassis we can see 3 rows but in future this data is expected to grow with new variants launching across all chassis.
Requirement
Target representation is expected as shown in below figure2. Here all Chassis, data is represented into single row and column RPM/BodyType/Mileage get merged with transposed value of variants.
(Figure 2 – Target requirement)
Following transformation can be understood as follows:
Merged value of column EngineCapacity and HorsePower like (1100_74, 2200_163 and 3500_270) is transposed and append into column name such as RPM_1100_74 to Mileage_3500_270
Attribute RPM/BodyType/Mileage to displayed across these target columns
Solution
As we can see here target columns generating primarily based on variant data, and this data is expected to grow in future due to new variants, thus to handle this dynamic data we opt creating SQLScript write procedure which creates structure of target table, where column name will be built dynamically based on number of variants and its value.
PSEUDO – CODE
Step 1
Extract column name and its datatype (like RPM,BodyType and Mileage) from base table using system table TABLES_COLUMNS in a variable. For example, RPM and Nvarchar(50).
Step 2
Extract variant data that is (EngineCapacity and HorsePower) value for anyone ChassisNo, which can be used to create part of target column names of table. For example, 1100_74.
Step 3
Loop all column extracted in Step 1 and store its column name concatenated by ‘_’ (like RPM_) and its datatypes in respective variable like concatenated by ‘,’ (like Nvarchar(50), ).
Step 4
Inner loop for variant data extracted in Step 2 followed by concat operation as follows:
Concat (Column, Variant Data) that is RPM_1100_74
Concat (i_weight_tmp, DATATYPE) that is RPM_1100_74 Nvarchar(50),
This inner loop iterates for all variant for column RPM extracted in step –1 which then pass control to outer loop and similar iterations continues for other column like BodyType and Mileage.
Step 5
Hold this dynamically generated string (comprising target column names) into a variable and perform below operation.
v_weight_f := Concat (‘ChassisNo Nvarchar(50),’ variable holding string of target column)
v_weight:= (‘CREATE table TargetChasis (‘||:v_weight_f||’)’)
EXEC v_weight
With usage of EXEC clause (of Dynamic SQL), table can be created on-the-fly with all its metadata defined as per data in base table.
Detailed code used for creating table dynamically can be referred in figure 3 as follows:
(Figure 3. Sample code for creating table dynamically)
Upon execution of above procedure, column table is created successfully.
Note: According to existing system limitation mentioned in SAP Note 2154870 , a column table in SAP HANA database can accommodate maximum 64,000 column in column store versus 1000 in row store table, thus this dynamic creation of table will be valid for column count with in this range.
At this point, dynamically built table is ready now and can be inserted with data from RPM, BodyType and Mileage column.
No comments:
Post a Comment