Why You Need to Routinely ProcessFull
Earlier this week, one of the client’s routine SQL Agent jobs failed during an Analysis Services processing task with the following error message:
Error: 2013-07-18 10:27:36.18
Source: Process Dimensions Analysis Services Execute DDL Task
Description: File system error: A FileStore error from WriteFile occurred. Physical file: \\?\E:\OLAP\Data\Sales.18.db\Product Review.0.dim\342.Comments.bsstore. Logical file: . .
This is a recurring job – running every 2 hours – that simply kicks off an SSIS package. The SSIS package loads new data into several dimensions and fact tables before refreshing the associated cube. The cube is refreshed using the following pattern:
- Dimensions: ProcessUpdate
- Cube: ProcessFull
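For context, the processing step issued by the Execute DDL Task follows a pattern along the lines of the XMLA batch below. This is a sketch, not the actual job definition – the DatabaseID, DimensionID, and CubeID values are illustrative guesses based on the file path in the error message:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <!-- Incrementally update each dimension (object IDs are illustrative) -->
  <Process>
    <Object>
      <DatabaseID>Sales</DatabaseID>
      <DimensionID>Product Review</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
  <!-- Fully reprocess the cube -->
  <Process>
    <Object>
      <DatabaseID>Sales</DatabaseID>
      <CubeID>Sales</CubeID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>
```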
Having never seen this exact error message before, my first thought was to check the free space on the SSAS server to make sure there was room to write new data. But a quick check showed other SSAS processing tasks completing successfully…if the disk were full, everything would be failing – so drive space couldn’t be the issue.
Next stop – the file itself. Since none of the developers have access to the production server (a good idea), we alerted the DBA and asked her to check the size of the *.bsstore file on the production SSAS server. In the meantime, mostly out of curiosity, I ran the following query against the production DW to get an idea of the potential size of this attribute:
-- f.[column] is a placeholder for the source column behind the attribute
SELECT SUM(DATALENGTH(ISNULL(f.[column], 1))) / 1024.0 / 1024.0 AS Size_In_MB
FROM dbo.fact f;
It may seem odd that an attribute with a size of ~10 MB in the database could have ballooned to ~4 GB on the SSAS server. But recall that this dimension has been getting reprocessed every 2 hours (via ProcessUpdate) for the past few months without so much as an occasional ProcessFull. So if the underlying data is changing somewhat frequently (which it was, as I later discovered), then it is definitely possible for the file size to balloon.
This is because of how ProcessUpdate works…essentially, it tries to do as little work as possible to accommodate underlying changes in the source table while avoiding the need to reprocess the related partitions:
- Inserts: data is added as new members in the dimension
- Deletes: corresponding members are logically marked as inactive in the SSAS files, but not physically removed
- Updates: the existing member is logically marked as inactive and a new member (with the changed values) is added to the SSAS files
So as you can see, the files grow with every insert and update that occurs in the source table. And when a record is deleted from the source table, the corresponding member(s) remain in the SSAS physical file. And so the SSAS files grow and grow and grow.
This is why it is good practice to occasionally run a ProcessFull on the dimensions. ProcessFull wipes out everything and rebuilds it from scratch…so all those “logically marked as inactive” attribute members get removed from the files, resulting in smaller files (and smaller files = less I/O). The downside is that you then need to go back and reprocess all the related partitions, which can take considerable time.
Once the DBA confirmed the *.bsstore file on the production SSAS server was ~4 GB on disk, we executed a ProcessFull on the associated dimension, followed by a ProcessDefault on the related partitions. This shrank the file size back down to ~14 MB, which is much more reasonable given the size of the data in the source.
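As a sketch, the remediation batch looked something like the XMLA below (again, the object IDs are illustrative assumptions inferred from the file path in the error, not copied from the actual solution):

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <!-- Rebuild the bloated dimension from scratch, dropping the
       logically-inactive members and shrinking the files on disk -->
  <Process>
    <Object>
      <DatabaseID>Sales</DatabaseID>
      <DimensionID>Product Review</DimensionID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
  <!-- Bring the now-unprocessed cube and its partitions
       back to a fully processed state -->
  <Process>
    <Object>
      <DatabaseID>Sales</DatabaseID>
      <CubeID>Sales</CubeID>
    </Object>
    <Type>ProcessDefault</Type>
  </Process>
</Batch>
```

Note that ProcessDefault only does whatever work is needed to reach a processed state, which is why it is the natural follow-up after the ProcessFull on the dimension invalidates the related partitions.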
In the future, we plan to implement a regularly scheduled ProcessFull to prevent a recurrence of this issue.