…brew your own Business Intelligence

Building an OLAP Farm in the Cloud with Azure Virtual Machines – Part 1

FYI: the features needed to implement the OLAP Farm architecture (remote partitions, remote linked measure groups, linked dimensions) have been deprecated as of SQL Server 2014.

The “OLAP Farm” is an Analysis Services architecture that makes use of remote partitions on the backend and linked dimensions and measure groups on the front end in order to support solutions where datasets and latency requirements simply can’t be handled by a single processing server. It was discussed in more detail in the previous blog post – so if you’re not familiar, stop now and check it out before continuing.

[Image: OLAP Farm architecture diagram]

Because of the number of CPU cores and the amount of memory required for this architecture, I decided to finally make good use of the $150/month Windows Azure credit and build it out in the cloud using Azure VMs. The process of getting all of the VMs created and configured was quite interesting and will be the focus of this post. In the following blog posts, I’ll cover the details of creating the actual “OLAP Farm” SSAS solution in Visual Studio, deploying it across the SSAS servers, and then doing some performance testing.

Getting Started

First things first – you need a Windows Azure subscription. If you have an MSDN subscription, you get one for free which includes $150/month in credit – which actually goes a long way if you remember to shut down the VMs when you’re not using them.

Next, you will definitely want to get Azure PowerShell installed on your local machine and configure your Windows Azure subscription with the Windows Azure PowerShell cmdlets. Using Azure PowerShell will save you an insane amount of time when working with Windows Azure. In fact, Azure PowerShell is the only way to complete certain tasks (e.g. setting static IP addresses for VMs).

In the following sections, we’ll be covering the steps for creating the lab environment, which includes setting up the VMs (listed below), installing SQL Server and SSAS instances, and configuring firewall rules for connectivity (VM-to-VM and OnPrem-to-VM)…basically everything that needs to be in place before we can start building the SSAS solution in Visual Studio.

  • AZVM-DC01: domain controller for lab environment
  • AZVM-SQL01: SQL Server containing the data source for the SSAS distributed database
  • AZVM-SSAS-MSTR: master SSAS instance
  • AZVM-SSAS-BE01: backend server for storing, processing, and scanning remote partitions
  • AZVM-SSAS-BE02: backend server for storing, processing, and scanning remote partitions
  • AZVM-SSAS-QS01: query server and target for user queries

Create the Virtual Network (VNET)

A Virtual Network is a must when creating an environment in Windows Azure where multiple VMs will need to communicate with each other (i.e. SSAS server sourcing data from a SQL Server instance running on a separate VM). There’s actually way more to it than that (e.g. extending an existing on-premise network to the cloud) so I suggest at least reading this MSDN page to get a basic idea of the scope.

From the Azure management portal, simply navigate to the NETWORKS section and click the NEW button to begin creating the Virtual Network. If you choose the CUSTOM CREATE option, a wizard will walk you through the process. Basically you just need to specify a name (e.g. AZVM-VNet), a location (e.g. East US), and then Next (right arrow button), Next, Submit (check mark button). A notification will show up at the bottom of the browser window informing you that the virtual network is being created.
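If you’d rather skip the portal, the same VNet can be created with Azure PowerShell. Classic VNets are managed through a single subscription-wide network configuration XML, so the sketch below writes a minimal config and applies it. The address space and schema details here are assumptions – compare against what Get-AzureVNetConfig exports for your subscription before running it:

```powershell
# minimal network configuration for a single VNet -- adjust the address space as needed
$cfg = @"
<NetworkConfiguration xmlns="http://schemas.microsoft.com/ServiceHosting/2011/07/NetworkConfiguration">
  <VirtualNetworkConfiguration>
    <VirtualNetworkSites>
      <VirtualNetworkSite name="AZVM-VNet" Location="East US">
        <AddressSpace>
          <AddressPrefix>10.0.0.0/24</AddressPrefix>
        </AddressSpace>
      </VirtualNetworkSite>
    </VirtualNetworkSites>
  </VirtualNetworkConfiguration>
</NetworkConfiguration>
"@
$cfg | Out-File "$env:TEMP\AZVM-VNet.netcfg" -Encoding UTF8

# WARNING: Set-AzureVNetConfig replaces the subscription's ENTIRE network config --
# if you already have other VNets defined, merge this into the Get-AzureVNetConfig output first
Set-AzureVNetConfig -ConfigurationPath "$env:TEMP\AZVM-VNet.netcfg"
```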

[Image: Virtual Network creation wizard in the Azure management portal]

Create the Domain Controller

The first VM created will be the Domain Controller (AZVM-DC01) on which we’ll enable Active Directory. This system will also serve as the DNS server for the lab environment.

There are several options for creating Azure VMs – uploading an existing image, PowerShell script, or manually via the Azure management portal.

From the Windows Azure management portal, navigate to the VIRTUAL MACHINES tab and click the NEW button (bottom left) to start the process of creating a VM based on an image from the gallery. In this case, I simply chose to use the Windows Server 2012 R2 Datacenter image and used the following values while moving through the wizard:

  • VIRTUAL MACHINE NAME: AZVM-DC01
  • TIER: STANDARD
  • SIZE: A1 (1 core, 1.75 GB memory)
  • NEW USER NAME: <username>
  • PASSWORD: <password>
  • CLOUD SERVICE: AZVM-LAB
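For repeatability, the same VM can also be provisioned from Azure PowerShell. A rough sketch is below – the image-name filter and credential placeholders are assumptions on my part (gallery image names change over time, so verify with Get-AzureVMImage before relying on the filter):

```powershell
# find a Windows Server 2012 R2 Datacenter image in the gallery
$img = (Get-AzureVMImage |
    Where-Object { $_.Label -like "Windows Server 2012 R2 Datacenter*" } |
    Sort-Object PublishedDate -Descending | Select-Object -First 1).ImageName

# Small = A1 (1 core, 1.75 GB); the VM lands in the VNet created earlier
New-AzureVMConfig -Name "AZVM-DC01" -InstanceSize "Small" -ImageName $img |
    Add-AzureProvisioningConfig -Windows -AdminUsername "<username>" -Password "<password>" |
    New-AzureVM -ServiceName "AZVM-LAB" -Location "East US" -VNetName "AZVM-VNet"
```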

It takes a few minutes for the VM to be created so now is a good time to grab another cup of coffee. Once the VM is up and running, you can navigate to the DASHBOARD and see a whole bunch of good info…

[Image: AZVM-DC01 dashboard in the Azure management portal]

…including a button at the bottom that downloads an RDP file with everything needed to connect to the VM via remote desktop. Alternatively, you can simply fire up Remote Desktop and connect using the DNS NAME (in this case azvm-lab.cloudapp.net) and the port for RDP which is listed on the ENDPOINTS tab.

Note: using Remote Desktop Connection Manager makes life super easy when working with a lot of remote systems, as will be the case in this setup – learn it, love it, live it!

Before connecting to the VM to start configuring it as a Domain Controller, you will want to configure the VM to use a static IP address. This must be completed using Azure PowerShell cmdlets. Below is a sample script:

$vnet = "AZVM-VNet"
$cloud_svc = "AZVM-LAB"
$vm_name = "AZVM-DC01"
$ip = "10.0.0.99"
$ip_test = Test-AzureStaticVNetIP -VNetName $vnet -IPAddress $ip

# if IP is available, configure VM to use it
if($ip_test.IsAvailable) {
    $staticVM = Get-AzureVM -ServiceName $cloud_svc -Name $vm_name
    Set-AzureStaticVNetIP -VM $staticVM -IPAddress $ip | Update-AzureVM
    Get-AzureStaticVNetIP -VM $staticVM
} else {        
    "IP Address Not Avail (" + $vm_name + ", " + $ip + ")"

    # $staticVM = Get-AzureVM -ServiceName $cloud_svc -Name $vm_name
    # Set-AzureStaticVNetIP -VM $staticVM -IPAddress $ip_test.AvailableAddresses[0] | Update-AzureVM
    # Get-AzureStaticVNetIP -VM $staticVM
}

The sample script above checks to see if the specified IP address is available and, if so, assigns it to the VM. If the IP address is not available, a message is returned stating so. Alternatively, you could uncomment the lines in the else-branch to pluck the first item from the “AvailableAddresses” list returned by the Test-AzureStaticVNetIP command – though personally I’m too much of a control freak for that.

[Image: output of the static IP PowerShell script]

Note: this doesn’t actually set the IP address of the VM’s network adapter to a static IP address. It’s better thought of as a DHCP reservation via the VNet – the VM will always get the same IP address even though the network adapter remains configured for DHCP.

Once that’s out of the way, you can proceed with connecting to the VM via remote desktop, enabling Active Directory Domain Services, and creating a domain, just like you would with any other domain system (instructions). Finally, before moving on to creating the rest of the VMs, you will want to update the configuration of the VNet we created earlier to specify AZVM-DC01 as the DNS server for the VNet.
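For what it’s worth, the Active Directory piece can also be scripted from inside AZVM-DC01 using the Server 2012 AD DS cmdlets. A sketch – the domain name matches this lab, but the DSRM password prompt is my own addition:

```powershell
# run on AZVM-DC01: install the AD DS role, then promote to a new forest
Install-WindowsFeature AD-Domain-Services -IncludeManagementTools

Install-ADDSForest -DomainName "byobi.local" `
    -SafeModeAdministratorPassword (Read-Host -AsSecureString "DSRM password") `
    -InstallDns -Force
```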

[Image: VNet configuration specifying AZVM-DC01 as the DNS server]

Creating Additional VMs

At this point, we can go ahead and create the rest of the VMs using the same process we did for the Domain Controller. The only difference is that I specified 2 cores and 3.5 GB of memory for these VMs…but we’ll probably adjust them further once we get into the performance testing section.

VM Name          IP Address
AZVM-SQL01       10.0.0.21
AZVM-SSAS-MSTR   10.0.0.31
AZVM-SSAS-BE01   10.0.0.32
AZVM-SSAS-BE02   10.0.0.33
AZVM-SSAS-QS01   10.0.0.34

Join the VM to the Domain

Once the VMs are online, connect via remote desktop and join each machine to the domain. Below is the command you can use at the command prompt to speed things up:

NETDOM join <vm-name> /Domain:<domain> /userd:<username> /passwordd:<passwd>

e.g.
NETDOM join AZVM-SQL01 /Domain:byobi.local /userd:<username> /passwordd:<passwd>
NETDOM join AZVM-SSAS-MSTR /Domain:byobi.local /userd:<username> /passwordd:<passwd>
...

When prompted to restart the VM, hold off until you complete the next step (which also requires a reboot). Rebooting Azure VMs takes a bit of time so minimizing the number of them is ideal.
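If you’d rather stay in PowerShell, Add-Computer does the same job as NETDOM – and since -Restart is omitted in the sketch below, the reboot is deferred just like the note above suggests:

```powershell
# run inside each VM: join the domain without rebooting
Add-Computer -DomainName "byobi.local" -Credential (Get-Credential "byobi\<username>")
```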

Assign Static IP Addresses

Below is a PowerShell script that iterates through a list of VM-Name/IP-Address combinations and sets the static IP address for each…

$vnet = "AZVM-VNet"
$cloud_svc = "AZVM-LAB"

# create matrix containing vm/ip pairs
$vm_list = @(
    ("AZVM-SQL01","10.0.0.21"),
    ("AZVM-SSAS-MSTR","10.0.0.31"),
    ("AZVM-SSAS-BE01","10.0.0.32"),
    ("AZVM-SSAS-BE02","10.0.0.33"),
    ("AZVM-SSAS-QS01","10.0.0.34")
)

# loop through list of vm/ip pairs and set static IP addresses
for($i=0; $i -le ($vm_list.Length-1); $i++) {
    # get current VM name and desired IP
    $vm_name = $vm_list[$i][0]
    $ip = $vm_list[$i][1]

    # get status of IP
    $ip_test = Test-AzureStaticVNetIP -VNetName $vnet -IPAddress $ip

    # if IP is available, configure VM to use it
    if($ip_test.IsAvailable) {
        $staticVM = Get-AzureVM -ServiceName $cloud_svc -Name $vm_name
        Set-AzureStaticVNetIP -VM $staticVM -IPAddress $ip | Update-AzureVM
        Get-AzureStaticVNetIP -VM $staticVM
    } else {
        "IP Address Not Avail (" + $vm_name + ", " + $ip + ")"

        # $staticVM = Get-AzureVM -ServiceName $cloud_svc -Name $vm_name
        # Set-AzureStaticVNetIP -VM $staticVM -IPAddress $ip_test.AvailableAddresses[0] | Update-AzureVM
        # Get-AzureStaticVNetIP -VM $staticVM
    }
}

Running this script will automatically reboot the VMs. Don’t forget to update the login info to use domain credentials (e.g. byobi\anton) when reconnecting via remote desktop.

Add Additional Data Disks

Now that we have all of our VMs created and joined to the domain, we can go ahead and create additional data disks which will be used to store the SQL/SSAS server data files. This is one of a number of best practices for working with SQL Server on Azure VMs.

And just like most every other task when working with Windows Azure, you can use PowerShell to script it out…

# --------------------------------------------------------------------------------
# Create Data Disks
# --------------------------------------------------------------------------------
# Requires a default storage account
# Still need to go into the VMs, bring the disks "Online", and create volumes

$cloud_svc = "AZVM-LAB"

# list of VMs that get a data disk
$vm_list = @("AZVM-SQL01","AZVM-SSAS-MSTR","AZVM-SSAS-BE01","AZVM-SSAS-BE02")

foreach($vm_name in $vm_list) {
    $vm = Get-AzureVM -ServiceName $cloud_svc -Name $vm_name
    $disk_lbl = $vm_name + "-Data01"
    Add-AzureDataDisk -CreateNew -DiskLabel $disk_lbl -DiskSizeInGB 999 -LUN 0 -VM $vm | Update-AzureVM
    Get-AzureDataDisk -VM $vm
}

The script above requires you to have a default storage account associated with your Azure subscription. You can use the Get-AzureSubscription command to check whether a default storage account is already specified (or you can just wait and see if the script errors out). If not, you can use the command below to set it:

 Set-AzureSubscription -SubscriptionName <azure-subscription-name> -CurrentStorageAccount <storage-account> 

The values for <azure-subscription-name> and <storage-account> in the command above can be found in the Windows Azure management portal under STORAGE:

[Image: STORAGE section of the Azure management portal]

 

Once the disks are created, you will want to connect to each VM, bring the disk online, and create a volume. Don’t worry too much about estimating the necessary disk space – you only pay for what you use, so just go ahead and max it out at 999 GB:

[Image: bringing the new data disk online in Disk Management]
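Bringing the disk online doesn’t have to be a manual Disk Management exercise either – on Server 2012+ the Storage cmdlets can do it. A sketch, assuming the new data disk is the only RAW disk on the VM (the 64 KB allocation unit follows the usual SQL Server guidance):

```powershell
# run inside the VM: initialize, partition, and format the new data disk as F:
Get-Disk | Where-Object PartitionStyle -eq "RAW" |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -DriveLetter F -UseMaximumSize |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 `
        -NewFileSystemLabel "Data01" -Confirm:$false
```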

Install SQL Server, SSAS Server, and Tools on various VMs

Now we’re ready to start installing SQL Server and SSAS on the VMs. I highly recommend using a Configuration File so that you can simply kick off the entire installation process from the command line with all the options already specified, instead of having to go through each step one by one. This frees you up to grab a buttery scone w/ cherry and chocolate and another coffee from your favorite neighborhood coffee shop 😉

Below is a table breaking down the features to specify in the Configuration File for each of our VMs:

VM Name          Features to Install
AZVM-SQL01       SQL, Tools
AZVM-SSAS-MSTR   AS, Tools
AZVM-SSAS-BE01   AS, Tools
AZVM-SSAS-BE02   AS, Tools
AZVM-SSAS-QS01   AS, Tools

Also, don’t forget to specify the newly provisioned data disk volume(s) for the data files for SQL/AS.

Note: for detailed information on using a ConfigurationFile and installing SQL Server from the command line, start here.
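To give a feel for it, here’s what a stripped-down configuration file for one of the SSAS-only VMs might look like. The values below are illustrative only – the easiest way to get a real file is to run setup once through the “Ready to Install” page, which writes out a ConfigurationFile.ini you can reuse:

```ini
; illustrative excerpt -- ConfigurationFile.ini for an SSAS-only VM
[OPTIONS]
ACTION="Install"
FEATURES=AS,Tools
INSTANCENAME="MSSQLSERVER"
ASDATADIR="F:\OLAP\Data"
QUIET="True"
```

Then kick the install off from an elevated prompt with something like: Setup.exe /ConfigurationFile=C:\Temp\ConfigurationFile.ini /IAcceptSQLServerLicenseTerms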

Connectivity

At this point, we have a fully functioning domain with multiple servers and instances of SQL/SSAS. However, before we can dive in and start building, deploying, and processing SQL Server and Analysis Services databases, we need to poke a few holes in the firewall.

Below are the two scenarios we want to support:

  1. VM-to-VM: this is pretty obvious. The SSAS servers (i.e. AZVM-SSAS-MSTR, AZVM-SSAS-BE01, AZVM-SSAS-BE02) need to be able to connect to the SQL Server (i.e. AZVM-SQL01) during cube processing.
  2. Local-to-VM: this scenario is somewhat optional. We could technically do all of the development on one of the Azure VMs and deploy to the targets using VM-to-VM connectivity. However, if we want to do our development on an on-premise workstation and deploy to Windows Azure, there are a few additional steps.
VM-to-VM

In order to enable VM-to-VM connectivity, we must open up the necessary ports in the firewall – this is nothing new compared to a standard on-premise scenario where SQL Server and Analysis Services instances are spread across multiple servers. As a time-saver, we’re going to use Group Policy to set up the firewall rules – instead of going from server to server.

Personally, I like to separate SQL Server VMs (i.e. AZVM-SQL01) from SSAS Server VMs (i.e. AZVM-SSAS-MSTR, AZVM-SSAS-BE01, etc) in Active Directory by creating 2 separate organizational units (e.g. SQL-Servers, SSAS-Servers) and moving the computer objects out of the default bucket (i.e. Computers) and into their respective OU.

[Image: SQL-Servers and SSAS-Servers organizational units in Active Directory]

This allows us to create separate GPOs (group policy objects) and only open up the ports necessary for each type of server:

[Image: separate GPOs linked to each organizational unit]

SQL Servers: there’s only one SQL Server instance in this lab and it was installed as the default instance, so I can either specify the port (TCP 1433) or grant access to the program (sqlservr.exe). Initially I chose to grant access to the program, but later realized I needed to open the port as well in order to accommodate external access via endpoints (discussed later)…so technically, I don’t need the second rule shown below and could have gotten by with just the TCP 1433 rule. If you decide to use a named instance, don’t forget to open up access for SQL Browser.

[Image: inbound firewall rules for the SQL Server GPO]
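For reference, each GPO rule is equivalent to what you’d otherwise create locally on each server – e.g. the TCP 1433 rule boils down to something like this (the rule name is arbitrary):

```shell
netsh advfirewall firewall add rule name="SQL Server (TCP-In)" dir=in action=allow protocol=TCP localport=1433
```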

SSAS Servers: using the default configuration, SSAS named instances will take on a new port after each restart. As a control freak, I hate this, so I prefer to override the default configuration in the msmdsrv.ini file (i.e. <Port>0</Port>) and set it to a specific port (e.g. <Port>50001</Port>). I believe this is also a requirement for working with SSAS instances in Windows Azure according to this article:

Named Instance

For a named instance of Analysis Services, the SQL Server Browser service is required to manage port access. The SQL Server Browser default configuration is port 2382.

In the virtual machines firewall, open port 2382 and create a static Analysis Services named instance port.
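For reference, setting the static port is a one-line change to msmdsrv.ini (found under the instance’s Config folder – stop the SSAS service before editing). The fragment below is a rough sketch of the relevant section:

```xml
<ConfigurationSettings>
  ...
  <Port>50001</Port>  <!-- 0 = dynamic port; set a fixed value instead -->
  ...
</ConfigurationSettings>
```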

At the same time, this is a lab, and I’m eager to get on to the juicy bits of the project, so I decided to use the same port for the SSAS instance on each VM…just as I chose to use the same Active Directory service account for all instances. It saves time at the expense of security. Don’t do this in production!

Below is a screenshot showing the inbound firewall rules for the SSAS group policy object. As you can see, I’ve opened access to the static port (i.e. 50001) as well as the SQL Browser program.

[Image: inbound firewall rules for the SSAS GPO]

Note: for more details about port access for SSAS instances, read this.

Once the firewall rules have been added to the GPOs, you can either wait for them to get pushed out to the VMs, or you can quickly toggle between each VM (via Remote Desktop Connection Manager) and force a group policy update from the command prompt via the GPUpdate command (i.e. gpupdate /force)…

[Image: forcing a group policy update via GPUpdate]

At this point, we should be able to connect to any instance (SQL or SSAS) from any of the VMs in the domain.

Local-to-VM

Note: if you plan to work entirely in the Windows Azure cloud, then this section is unnecessary. However, if you’d prefer to do development on your local workstation, then you’ll need to configure public-facing endpoints (or point-to-site connectivity).

There are two primary options for connecting to the SQL Server and SSAS instances from an on-premise workstation. The easiest option is to create endpoints which expose internal ports via the Virtual Network – this is basically the same as setting up port-forwarding. The other option is to enable point-to-site connectivity, which is a bit more involved but necessary in order to connect to SSAS instances via Windows Authentication – so that’s the one we’re going to use 😉

To enable point-to-site connectivity, you just need to configure the Virtual Network, create and distribute some certificates, and create the VPN configuration. A good walk-through of the configuration steps can be found here. Also, this post by Scott Gu (b | t) may prove helpful in explaining what’s actually happening under the covers, and it covers some other more advanced scenarios that you might be interested in.

Once everything is set up, we can connect to the VPN connection that was created as part of the point-to-site connectivity instructions.

[Image: connecting to the point-to-site VPN]

We can now access all of the VMs in the Azure Virtual Network from our local system:

[Image: accessing the Azure VMs from the local workstation]

However, in order to connect via Windows Authentication (a requirement for SSAS), we will need to “fake” our credentials so that it appears we are logged into our local workstation with an Azure domain user account (i.e. byobi.local\anton).

Note: in a production scenario, your on-prem and Windows Azure environments are likely to use the same domain, so this isn’t necessary.

To do this, we can simply create a shortcut using the following command as the target:

C:\Windows\System32\runas.exe /noprofile /netonly /user:byobi.local\anton ssms.exe

[Image: shortcut target using runas.exe]

When I double-click the shortcut, I am prompted for the password associated with the domain account listed…

[Image: runas password prompt]

After entering the password, SSMS will open and we can now connect to the SQL Server and SSAS instances on our Azure VMs via Windows Authentication:

[Image: SSMS connected to the Azure instances via Windows Authentication]

We also need to do the same for Visual Studio so that we can build our SSAS database project on our on-premise workstation and deploy to the Azure VMs.
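The Visual Studio shortcut follows the exact same pattern (assuming devenv.exe resolves via the PATH; otherwise point at the full path under the Visual Studio IDE folder):

```shell
C:\Windows\System32\runas.exe /noprofile /netonly /user:byobi.local\anton devenv.exe
```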

Lessons Learned

  • Use PowerShell to script tasks (e.g. startup/shutdown of VMs) and/or use Runbooks for automation. For info on the latter, check out this post by Devin Knight (b | t).
  • Use Remote Desktop Connection Manager – you will be switching between VMs frequently and this makes it easy to keep track of the RDP sessions.
  • $150/month in Windows Azure credit goes a long way when you’re only using 11 cores and ~20 GB of memory across 6 VMs.
  • If you’re going to set up point-to-site connectivity (i.e. VPN access to the Azure VMs), don’t bother creating endpoints.

This is by no means a comprehensive step-by-step guide for standing up the infrastructure using Windows Azure. Think of it more as a high-level guide for standing up a lab environment. There are many important considerations that should be taken into account for production situations that were not touched on. So if you have any questions please feel free to reach out through the comments or via twitter and I’ll do my best to answer/respond.

In the next post, we’ll set up our source database and begin building our OLAP Farm in Visual Studio.

12 thoughts on “Building an OLAP Farm in the Cloud with Azure Virtual Machines – Part 1”

  1. gbrueckl says:

    great post!
    I did something similar and concluded my findings here:
    http://files.gbrueckl.at/blog/Sessions/ScalingAnalysisServicesInTheCloud.pdf
    it may be worth taking a look at it

    it also includes info about IO statistics that may be valuable for your researches

    cant wait for your part 2 🙂

    -gerhard

    1. Bill says:

      Gerhard – WHOAH! that’s some seriously good information – can’t tell you how much time I spent tweaking drive configurations trying to maximize IO throughput.

      Would love to see the actual presentation…is there a recording?

      1. gbrueckl says:

        No, i’m afraid it was not recorded, sorry
        but we could do a short call if you are interested in details

  2. sqlchicken says:

    Absolutely brilliant post, thank you! Looking forward to part 2 and beyond.

    1. Bill says:

      Jorge – thanks, glad you liked it.

      unfortunately, there’s been a bit of a delay in getting part 2 out (client deadline and working around the clock). It will probably be late August at the earliest.

  3. pmdci says:

    Hi there,

    Very interesting article. Thank you a lot for sharing this. I was wondering if there is a way to make HA with an additional query server (I only see one query server in your farm). Is this possible?

    Regards,
    P.

    1. Bill says:

      Thanks – glad you found the post interesting.

      With regards to your question of adding additional query servers – absolutely possible! In fact, it would be UNcommon to see this architecture with only a single query-server.
      The reason there’s only one in this post is because I was trying to conserve CPU cores (and time)…MSDN subscriptions limit you to 20 cores for Azure VMs.

      1. gbrueckl says:

        the 20 core limit is only a theoretical limitation. you can write a mail to the support and increase it to any number you want without any additional costs.
        I did it recently and it works just fine

        -gerhard

        1. Bill says:

          that’s definitely a good thing to know…email sent! thanks Gerhard!
