…brew your own Business Intelligence

Power Query = Mind Blown

The Power Query engine was first introduced as an Excel plugin (back when it was referred to as “Data Explorer”)… then we saw it again in Power BI Desktop… then again earlier this year in SSAS Tabular… and now (or very soon) as a cloud service.

Quite frankly, I’ve been ignoring the whole Power Query story for a while, as I only viewed it as a tool for the Information-Worker/Savvy-Business-Analyst to build better excel workbook reports. In my mind, IT-developers/BI-professionals would continue using more formal ETL tools (e.g. SSIS) for enterprise solutions. But then I saw the Power Query “get-data” interface added to SSAS-tabular along w/ an interesting PoC by Kay Unkroth… and it gave me pause.

This is happening, isn’t it? Guess I better brush up. So I bought a book and attended some PowerQuery/M sessions at SQL Saturday Atlanta… and I’m now convinced that the PowerQuery engine is going to revolutionize the Business Analytics space in the coming years because it brings a very crucial that’s been missing for years: “data exploration” for the masses!

Remember 5 years ago…

…when folks outside of the hyper-scale web startup demographic started talking about Hadoop and Big Data?

Traditional database conferences (read: PASS Summit) were being infiltrated by an army of open-source java-coding mercenaries trying to convince you & me that relational databases were going away and that we better come to their session, learn how to build a hadoop-based recommendation engine, or get left behind in the dust.

Fortunately/unfortunately (depending on your perspective and incentives) it didn’t really go that way, did it? Admittedly, I bought into the hype for a few months (a year?) and rewrote my entire continuing education plan centering it around these new technologies. During that period of time I learned a few things…

  1. Hadoop/Hive is slow and never going to work as a backend for adhoc analytical queries that need to occur at “speed of thought” (really don’t like that term but it flows better than “at a speed that doesn’t make me want to claw my eyes out of my skull”)
  2. Data Lake is a very appealing architecture (and strategy)
  3. schema on-demand is NOT a feature

The picture you were painted was this:

  1. Put all the data in the (data) lake
  2. Unleash the invincible army of data analysts to start exploring all this data
  3. Discover opportunities/business-value
  4. build (or incorporate it into) your traditional DW/BI solution
  5. ???
  6. Profit

Invincible Invisible Armies

Who is this army of data analysts that understand hadoop/hive let alone have the skills needed needed to analyze all this unstructured/semi-structured data sitting in the lake?

In my opinion, they didn’t exist. From what I could tell, there were only a handful of folks with the skills needed to be effective in this type of role and make this type of framework truly viable. Even now, while the number of skilled/enabled resources has grown and the tools have gotten a bit easier to use and deal with… I still think this number is not nearly large enough support a massive surge in big data architectures such as the data lake.

But I think that will change rather quickly w/ Power Query.

Game Changer

With only a brief intro to Power Query and the M language, I’ve found the ease at which it can be leveraged to wrangle unstructured/semi-structured data is nothing short of magic!

Having to interface w/ REST APIs, ODATA, XML/JSON files, and other non-traditional data sources is no longer a task to fear. Power Query is Microsoft’s answer to the Invisible Armies issue… as it solves the problem of data exploration and lack of skilled resources.

  • Connectivity to wide-range of data sources and interfaces
  • Interactive and intuitive UX for cleaning & mashing up disparate data sources
  • Built on extensible M-language

There’s still a long road ahead for Power Query and the M language, but there’s no doubt in my mind that it’s going to shape the data wrangling landscape.

 

Additional Resources:

4 thoughts on “Power Query = Mind Blown

  1. That’s crazy talk… or is it?

    1. Bill says:

      haha – right? wouldn’t shock me if PQ/M becomes the PaaS ETL offering from MSFT in Azure (aka Azure ETL)

  2. mim says:

    if only, we have PQ that load directly to SQL server, it will open a lot of new opportunities for us,

    1. Bill says:

      agreed! should only be a matter of time now that it’s being used to load CDS in Azure… that said, it wouldn’t surprise me to see it being released for Azure DB before SQL Server on-prem :-/

Leave a Reply to thesmilingdba Cancel reply