- Joins are well optimized in the join engine. They are normally faster than Joins in the Calculation Engine. JOIN should be performed in Analytic/Attribute view instead of Calculation View (though in some cases that’s the only way).
- CE_JOIN is limited to join only two tables. Cascaded Joins are very expensive and not always lead to right results.
- Left outer join is normally faster than inner join.
- Left outer join is almost equally fast as Referential Join.
- Currently one single join operation is executed by 1 core at maximum (Multiple Join are parallelized across multiple cores).
- Avoid joins on calculated fields, e.g. concatenating two fields into one and execute a join on it. Materialize the concatenated field (if not done by the engine)
- Avoid non-equijoin predicates on column table as column engine does not natively support join predicates other than equality condition.
Best Practices for Join Operations
Subscribe to:
Comments (Atom)
No comments:
Post a Comment