…brew your own Business Intelligence

Real Time OLAP = TRUE

While testing the performance of ROLAP (with a ColumnStore index on the fact table) for another post (I’m still working on), I noticed something that made me stop and think “wtf?”

confused

What I noticed was that, while sending an MDX query to the ROLAP cube multiple times, the initial query would show up in both the SSAS profiler trace and the SQL Server profiler trace…

click to zoom

click to zoom

…while for all subsequent instances activity would only show up in the SSAS profiler trace.  But this makes sense, because the results from the first instance of the MDX query are cached in the SSAS storage engine…so when the second instance of the MDX query comes in, the results are retrieved from cache and there’s no need to dive into the underlying SQL Server database to re-retrieve the results…

click to zoom

click to zoom

So what’s the big deal?

The big deal is that the primary use-case for ROLAP storage is (near) zero-latency…meaning if the data changes in the underlying source, these changes should be immediately reflected in the results of queries sent in by all client reporting tools (Excel, SSRS, etc).

So what happens when:

  1. a user runs a report against the ROLAP cube in the morning
  2. chanages occur in the underlying SQL Server database (inserts/updates/deletes)
  3. the user re-runs the same report (w/ same parameters) an hour later

According to the results above, the second run of the report will return the same results as the first run of the report (which are now wrong!!!) because the data is simply retrieved from the SSAS storage engine cache.

Note: the conclusion above is based on 2 assumptions: 1) the SSAS server is not very busy and therefore the results of the queries generated by the first report run aren’t forced/expired from cache… and  2) the reporting tool isn’t leveraging its own form of cache in which case queries would never even make it to the SSAS database (SSRS will do this)

Real Time OLAP = TRUE; to the rescue!!!!

Clearly you want the second run of the same report (w/ same parameters) to reflect the changes made to the underlying SQL Server database, otherwise you wouldn’t have configured ROLAP storage to begin with…right? In order to force MDX queries to always pass through the SSAS engine, ignore all cache options for ROLAP partitions and go straight to the source database, the following connection parameter can be specified:

ssms additional connection parameters

To confirm this behavior I’ve used SSMS + Profiler to trace the execution activity of a simple query against a cube with ROLAP partitions.  Below is a screenshot of the profiler traces during the first execution of the MDX query (SSAS server up top; source SQL server down below):

click to zoom

click to zoom

Notice that both the Query-Dimension and Query-Subcube events are retrieving non-cached data.  Now when I re-run the same query:

click to zoom

click to zoom

…we see that again the Query-Subcube event is retrieving non-cached data…which confirms its going back to the source SQL server database to get the latest and greatest data.  On the other hand, the Query-Dimension event is now retrieving data from cache…and that’s because our dimensions were not configured for ROLAP storage:

Dimension Storage Modes

If the storage mode in the screenshot above had been set to ROLAP, then we would have seen the Query-Dimension event retrieving non-cache data in the profiler trace for the second run of the same query. And don’t get too excited about the InMemory option you see listed…that doesn’t apply to Multidimensional 🙁

Before wrapping this post up I just want to bring you attention to one more small detail.  When the Real Time OLAP = TRUE; connection parameter is specified, not only do MDX queries against ROLAP partitions bypass the SSAS cache, but they also fail to cache the intermediate results of the data retrieved from the source SQL server database.

To illustrate this point, the screenshot below is a side-by-side trace of 2 MDX querys against the same cube.  The trace on the left is a second run of the MDX query using a connection without the Real Time OLAP = TRUE; parameter specified.   The one on the right is the same query over a connection with the Real Time OLAP = TRUE; parameter specified.

click to zoom

click to zoom

Notice the differences in the activity during the Query-Subcube phase.  The trace on the left retrieves the data from the source database once and caches it.  That cached result is then hit 3 additional times to satisfy the original MDX query.  The trace on the right (Real-Time OLAP = TRUE;) simply hits the source database 3 separate times to satisfy the original MDX query…which is much less efficient.  Notice the duration in each instance…112 ms vs 1161ms…just over a 10x difference in performance.

This is definitely something to keep in mind.  There may be special cases where some reports don’t really need to reflect the latest and greatest data.  In these cases, this connection string parameter can simply be omitted and the report will attempt to retrieve the results from the SSAS storage engine cache – if they exist 😉

Update 1/29/2015: Tested this property yesterday while trying weird stuff with a client (lol)…and we discovered that this connection string property, when used with a MOLAP cube, will go directly to disk for all data and nothing (from this session) will be cached…so unless you hate your storage admin, i wouldn’t recommend using this with MOLAP cubes.

Bottom line

Absolute zero-latency comes at an incredible performance penalty.  If zero latency is not an absolute requirement, and latency of a few seconds/minutes is acceptable, then there are (in my opinion) much better/cheaper alternatives such as:

  • MOLAP Switching – where you have 2 cubes, one for querying, the other for processing and you swap continuously which should bring latency down to minutes.
  • MOLAP+ROLAP design pattern – where you are able to place the most recent data in a leading ROLAP partition and older/historical data in MOLAP partitions which can bring latency down to seconds.

 

 

References: Analysis Services 2008 R2 Performance Guide

 

2 thoughts on “Real Time OLAP = TRUE

  1. Scott Barrett says:

    Can you give us more information about MOLAP switching?

    1. Bill says:

      Hi Scott,

      Thanks for reading. There are a few ways to achieve low latency w/ Molap. In the post above, I mentioned “molap switching” where I was basically preferring to a method where you setup 2 identical SSAS instances (NodeA and NodeB). User queries are directed to NodeA while NodeB is processing new data. Once processing of NodeB completes, user queries are redirected to Node B and Node A begins processing. The mechanism for directing user queries to the active query-node depends on your environment.

      Another option is to do continuous processing and synchronization from NodeA to NodeB. In this case, user queries are always directed at NodeB and NodeA is always processing new data. After each processing cycle, a synchronization from NodeA (source) to NodeB (target) is executed and then NodeA goes back to processing.

      Here’s a good post to get you started with more details:
      http://anoopsihag.wordpress.com/2012/01/21/analysis-services-scale-out-processing-architecture/

Leave a Reply