SSAS Related Enhancements in SQL Server 2016
As has seemingly been the case for a while, the majority of SSAS-related enhancements slated for release with SQL Server 2016 fall mostly on the side of Tabular. And while I totally understand this from a product management perspective (Tabular is still much less mature than Multidimensional so there’s more ground to make up from a feature parity perspective), I just can’t help but feel bad for many of my clients who have made *substantial* investments in Multidimensional over the years. Yes they made the right choice at the time (most of them) and it has served them well as it should being a best-in-class product…but its hard to watch all these new exciting developments being released with a focus on SSAS Tabular…especially from the perspective of the reporting layer where at first it was PowerView, now its Power BI.
Then again, perhaps its just me who is feeling left out as none of my clients seem to want to move forward with Tabular in any capacity other than basic Proof-of-Concepts…
Such is life…now, on with the new features!
Let’s start with Tabular, after all that seems to be MSFT’s favorite SSAS child…
Advanced Modeling: 2 new “advanced” modeling features, both of which are already available in Power BI Designer desktop app which Chris Webb (b | t) did a nice job discussing them back in January 2015…so it really isn’t too surprising that they appear to have made the cut for SQL Server 2016.
Many-2-Many: This is a big one and something I was really expecting to see much sooner (e.g. in the SQL 2014 release which was pretty much a complete let down for Enterprise-level BI developers). Yes, finally, Many-2-Many relationships will be natively available in SSAS Tabular 2016. While most of us owe a great deal of thanks to Marco and Alberto for exposing us to a nice DAX workaround it will be nice to not have to use DAX-workarounds to support this modeling technique. I just hope the native version of this feature is as fast as their workaround
BI Directional Cross Filtering: Admittedly I’ve not seen much of a need for this functionality and have some concerns as to whether it will do more good than harm (preying on unsuspecting business users with little data modeling experience)…but it is certainly an interesting one. Basically, the idea is that you can define the direction of the filter (by defining the direction of the relationship) between 2 tables. In previous versions, the relationship was always 1:Many (i.e. dimension to fact) and the filtering was always applied from the 1-side to the many-side (e.g. filter many transactions by 1 product-category) which is pretty intuitive for folks like myself who are well-versed in the principles of dimensional modeling. But soon you’ll be able to change the direction of the relationship and, as a result, the direction of the filtering which may (or may not) open some interesting doors.
Notice anything missing from this list? What about native Parent-Child (PC) hierarchies? Tabular has been out for a while, it is consistently touted as a silver-bullet to sluggish analytical queries (especially those involving detail-level calculations)…and yet native functionality for PC hierarchies, one of the more common performance hurdles in multidimensional cubes (due to the complexities of optimizing before query-time), still hasn’t made it up the priority list?
Time Intelligence: now some of you are probably thinking to yourselves, “wait, this is already available in Tabular models” and you’re absolutely correct. DAX functions related to time-intelligence (e.g. SAMEPERIODLASTYEAR) have existed for as long as I can remember. However, sometimes they don’t work as expected…for example, when the developer leaves out 1 or more key steps (e.g. not having a separate date-table, not marking it as the date-table, and/or not specifying a date-column).
Note: This is actually a very common issue with new Tabular/PowerPivot developers who aren’t familiar with dimensional models and don’t understand (yet) how helpful a separate date can be for analytical queries. And so they either don’t create one at all, or they create one but base it on an existing table that doesn’t include a continuous gap-free range of dates (which can be a much more difficult problem to diagnose).
Now in v2016, SSAS Tabular will have built-in time intelligence that “automagically” works…at least that’s how it was described at MS Ignite. Accordingly, this feature will automatically create a separate date table (with continuous dates based on the min/max date values in the other tables) in the model and (I suspect) create relationships with other tables in the model that contain date columns. Furthermore, it can be customized to suit a variety of needs.
New DAX Functions: Just as I commented in the “Advanced Modeling” section, this is something to be expected as we’ve already seen several new DAX functions available with Power BI Designer and Excel 2016 (already in preview). Here’s a link to the MSDN page. Of special note, MEDIAN and MEDIANX are very welcomed additions.
Update 20150513: haven’t found official confirmation yet, but pretty confident DAX variables will make it into 2016 (which IMO is more important than any single new DAX-function being provided)…see here and here for more details.
Performance: Such a generic entry here and details are still pretty unclear, but this likely involves enhancements to the Tabular query-engine. One example provided was Query-based performance improvement related to reducing the “chattiness” of queries. “Chattiness” is a term used to describe certain query plans (typically involving “callbacks”) where many calls are made from the formula to storage engine in order to resolve the query – where as we’d rather see a calculation pushed down into the storage engine once (similar to block computation mode vs cell-by-cell for the Multidimensional folks out there). I also suspect similar improvements geared towards scenarios where DAX being generated by a front end reporting tool (e.g. Power View) which previously was being evaluated in a less than optimal way will now be more efficiently reduced and resolved…this last comment is, of course, simply a hunch/hope.
Parallel Partition Processing: good deal – this is a pretty important improvement especially for those working with larger models. And again, I was totally expecting to see this in the SQL2014 release. Previously only a single partition for a table could be processed at a time. This wasn’t a big deal for those with data models that easily fit an incremental loading framework…but that certainly isn’t everyone and those who needed to process large tables suffered from the lack of a seemingly obvious feature (at least obvious to those who were already familiar with Multidimensional).
New Scripting Language For Tabular Models: this one may have gone unnoticed as it was included during the discussion of “developer tools” and specifically when talking about enhancements to Visual Studio (for SQL Server developers). However, it certainly applies to Tabular developers so I’ve included it here.
At first glance it looks a little scary…new scripting language?…my initial reaction was along the lines of “f$#^, we just had to learn DAX less than 5 years ago and now you’re making us do it again!?!?!”. Fortunately that’s not the case. All this is referring to is that a native Tabular object-model will be exposed…which should be very VERY helpful for those who’ve had the “opportunity” to develop new (or port existing) SSAS focused tools/utilities using managed code.
Chris Webb recently had a post that sparked some very interesting discussion amongst SSAS professionals regarding the most desired enhancements to SSAS Multidimensional…so I was really eager to see what made the cut here.
Netezza Data Source: Got data in a Netezza system you want to to slice and dice, great! If not, move along.
Performance Improvements: again, similar to Tabular, not many details so I’d assume some query engine improvements. Below are the only 2 I’ve seen explicitly referenced so far…
Un-natural Hierarchies: It is (or should be) widely known by now that natural hierarchies perform much better than unnatural ones (see Mosha’s very detailed explanation of why). Therefore, when faced with an unnatural hierarchies in the wild, developers can either “naturalize” the hierarchy (which requires changes to the model) or hand-craft the MDX (which doesn’t require changing the model but does require a reporting tool that allows custom MDX). I’m very curious to see what this looks like under the covers…and expect it will save some development time down the road.
Distinct Count: there have been some very creative (and complex) optimizations over the years in order to overcome this hurdle (via partitioning, data modeling, partition/disk layout, etc). In some cases, I’ve heard of folks completely converting to Tabular to reach better distinct count performance.
Allegedly there are a few more, and I’m certainly hoping Parent-Child hierarchies is on the list.
DBCC Support: essentially going to be the same as the relational database “DBCC” commands for consistency checks. Personally, I’ve never had a need for this until recently when working on a large scale-out query server solution involving a complex cube and robocopy scripts. At one point, we ran into an issue where the cube on several (but not all) of the read-only query servers was throwing a physical-file error for some (but not all) queries. This was a difficult problem to diagnose and we could only trace it back to a transient error in the robocopy logs …so we simply blamed it on the SAN admins and moved on (jk…sort of). As a result, we temporarily lost some trust in the process and needed to monitor it for a while until confidence was regained. If we had the capability to (programmatically) run a consistency check against the database, we could have built a more robust process and not wasted as much time (and grey hairs).
According to MSFT, this list is based on feedback from top customers…I would really like to know how top customers are defined in this scenario…Netezza Data Source, really?
Finally, after a huge let down with SQL 2014, we are about to see some progress (from a BI-perspective) with SQL Server 2016…which is great news!
Though most of the SSAS-related enhancements are geared towards Tabular, I’m definitely happy to at least see some love for Multidimensional.