ADLC Step 3: Capture Query Workload
This is the fourth post in a series of posts covering the Aggregation Design Life-Cycle (ADLC).
- Create Initial Aggregation Designs
- Remove Ineffective Aggregations
- Capture Query Workload
- Run Usage-Based Optimization (UBO) Wizard
- Create Custom Aggregations
- Final Thoughts
In the previous post, we discussed the characteristics of effective/ineffective aggregations and why it is important to remove ineffective aggregations for our cubes. We then went on to discuss a few options (via BIDS Helper) for detecting ineffective aggregations. One of those options, Delete Unused Aggregations, allowed us to determine which aggregations are used by actual queries based on the information captured in a SQL Profiler trace. In that section, I expressed the importance of capturing a query workload that accurately represents the full range queries that the cube will face in production…otherwise we run the risk of mistakenly classifying effective aggregations as ineffective and removing them.
Now, in this post we’ll continue along that tangent and start to drilldown into the methods for capturing a query workload. As we will see, capturing a workload that fully represents the production activity (query-wise) is the key to maximizing the effectiveness of the Usage-Based Optimization process…and in turn creating Aggregation Designs that are customized to the demands of the end users.
In order to explain the purpose/importance of capturing a fully-representative query workload, we need to first discuss the Usage-Based Optimization (UBO) process. Simply put, UBO is a process where live queries hitting the cube (from users via reports, dashboards, etc) are captured in a relational table and fed into a wizard. The UBO wizard uses this information to design aggregations that specifically target these queries.
So this step (Capture Query Workload) is the input to the next step (Run UBO Wizard). And while it may seem logical to combine these 2 steps…I have made a conscious choice to keep them separate to stress the importance of this initial process. The success of the next step is absolutely dependent on getting this step correct!
You can read more about the Usage-Based Optimization (UBO) Wizard here…or you can just wait for the following post 😉
Configuring the Query Log
By default, the SSAS instance is not configured to capture incoming queries…so you’ll need to set it up via the following instance-level properties listed below:
Log \ QueryLog \ CreateQueryLogTable Log \ QueryLog \ QueryLogConnectionString Log \ QueryLog \ QueryLogSampling Log \ QueryLog \ QueryLogTableName
Note: Here’s a link to the MSDN page which provides descriptions for each (under the Query Log section).
Here is the configuration on my lab system – which is storing every single query in a table called SSAS_OlapQueryLog:
Once the QueryLog properties are configured…and the SSAS windows service is restarted…you will start to see records appearing in this table as MDX queries are executed against the cube:
Note: the default setting for QueryLogSampling is 10…which means to capture every 10th query. This, I believe, is intended to keep the table from growing out of control on busy systems. However, a good recommendation is to change this value to 1 and capturing every single query which makes it easier to capture a fully representative query load. And to prevent the table from growing out of control, we can simply create a SQL Agent job that deletes all records from this table with a duration under 100ms…after all, the purpose of this table is to capture queries so we can design aggregation to improve the performance…but queries that are already running <300ms don’t really need any optimization.
Each record in the query log table identifies the SSAS database, partition, and dimension attributes needed to satisfy (part of) a query. Additional metadata attributes are included as well (the user who executed the query, the time it was executed, and the duration of the query) to help narrow the scope when running through the UBO Wizard in the next step.
If we pay close attention, we might some strange behavior with regards to the number of records inserted for each MDX query. Depending on the size/complexity and cache’ability, a single query can result in 0, 1, or multiple records being inserted into the query log table. That is because the table doesn’t actually log queries…instead it logs requests made to the storage engine…which makes perfect sense because it is the call to the storage engine that can be improved via the addition of an aggregation. This is a key point to understand. So for a big complex query that makes multiple calls to the storage engine, we will see a record for each separate call. While on the other hand, for a query that gets resolved via the cache, we won’t see any records inserted because no call was ever made to the storage engine.
Collecting a Fully-Representative Query Workload
Now that we have everything setup, we can sit back and watch the query log table start to collect information as our users proceed with hammering the database via execution of reports, dashboards, scorecards, etc. Life is good 🙂 So how long do we wait and collect queries before moving on? That really depends on the environment.
Since the goal is to capture a fully-representative workload, we need to capture the queries from:
- daily reports
- weekly reports
- monthly reports
- ad-hoc analysis
So ideally we’d turn up the query log and let things go for at least a month (maybe 2 – just to be sure) and then move on. Unfortunately, that’s not always a possibility…business people are impatient (bless their hearts) and want faster queries as soon as possible…also logging query details to the query log table places an additional load on the server
Therefore, we often need to condense the cycle so that we capture a fully representative query load in less time. There are a number of ways to do this. The easiest is to simply run the weekly/monthly reports outside of schedule (such as at night during the week). Another, slightly more involved, method is to capture the queries generated by each report and simulate the load via tools like AS Performance Workbench (discussed here) or ASCMD.
Either of those methods will suffice for the standard reports (daily/weekly/monthly), dashboards, and scorecards…but what about ad-hoc analysis? By that I mean, the queries generated via self-service analytic tools such as Excel, Power View, PerformancePoint (decomposition tree), etc. This activity, by definition, is not very predictable and therefore difficult to simulate/condense. Don’t sweat it. If we leave the query log collecting activity for a week or 2 (with the sample rate set to collect every query) that should give us plenty to go on for the UBO wizard in the next step of the ADLC. Anything that we miss can be addressed in last step: Create Custom Aggregations.
- Give the OlapQueryLog table a better more descriptive name (Ex. OlapQueryLog_SSAS-QS01, OlapQueryLog_SSAS-QS02, etc) and place it somewhere that makes sense such as the Data Warehouse or ETL-Staging area. Depending on the complexity of the SSAS solution, you may already have some SSAS-related metadata tables defined in the ETL-Staging area for things like managing measure group partitioning via SSIS.Note: it would be really REALLY nice if you could control the schema in which this table is created…but unfortunately, that is not an option so you’re stuck with the dbo schema 🙁
- Remember to restart the SSAS windows service after setting the CreateQueryLogTable property to true…otherwise you’ll waste time wondering why the table was never created and your queries aren’t being captured.
- Consider changing the default value for the Log\QueryLog\QueryLogSampling property from 10 (or every 10th query) to 1 and adding a scheduled agent job to trim the table of queries with <100ms durations.
- In scale-out scenarios with multiple query servers, you only need to capture queries from a single SSAS query server…unless the workload is explicitly balanced by activity (Ex. 1 query server for ad-hoc analysis, 1 query server for operational reporting, etc)
- Be sure and turn off query logging (set QueryLogSampling property to zero) after your done collecting queries to alleviate unnecessary resource consumption.