Setup Azure Data Lake Analytics federated U-SQL queries to Azure SQL Database

Posted by

One of the major value propositions of U-SQL is that it allows to query data where it lives. For external systems, such as Microsoft Azure SQL Database, this is achieved with federated queries against data sources.

image

 

 

In order to query these external data sources, a data source object has to be created and referenced that abstracts the connection information as well as information about its capabilities to execute query expressions passed and translated from U-SQL to the its local query engine.

 

Prerequisites

·         An Azure Subscription

·         Azure Data Lake Store (ADLS)

·         Azure Data Lake Analytics (ADLA)

·         Azure SQL Database (ASQLDB) or Azure SQL Data warehouse (ASQLDWH) with SQL login/password

·         Visual Studio 2015. Optional, to create and execute U-SQL queries, this can also be done in the Azure portal.

https://www.visualstudio.com/products/free-developer-offers-vs

·         Azure Data Lake Tools for Visual Studio 2015

https://www.microsoft.com/en-us/download/details.aspx?id=49504

·         Azure PowerShell

http://aka.ms/webpi-azps

·         RX Permissions on your ADLS (data level) to be able to create the Catalog Secret

o   Navigate to ADLS in Azure Portal

o   Click Data Explorer

o   Click Access

o   Validate you have Read/Execute permissions.

·         Allow IP range in the ASQLDB server firewall for the ADLA services that fire the U-SQL queries

o   Navigate to ASQLDB server in Azure Portal

o   Click Settings

o   Click Firewall

o   Create new rule with range 25.66.0.0 to 25.66.255.255

 

 

Steps:

1.       Create ADLA Database using a U-SQL query:

DROP DATABASE IF EXISTS YourADLADatabase;

CREATE DATABASE YourADLADatabase;

 

2.       Create Catalog Secret in ADLA Database which contains the password for the SQL login and connection string for the ASQLDB database using the Azure PowerShell script below:

#Login (login pop up appears)
Login-AzureRmAccount
#Show your available Azure Subscriptions
Get-AzureRmSubscription
#Connect to the Azure Subscription in which your ADLA Database exists 
Set-AzureRMContext -SubscriptionId 00000000-0000-0000-0000-000000000000

#$passwd: password for ASQLDB / ASQLDWH which you want to federate
#-Account: ADLA account name
#-DatabaseName: Data Lake Analytics Database name
#-Host: Host of ASQLDB / ASQLDWH 
$passwd = ConvertTo-SecureString "YourPassword" -AsPlainText -Force
$mysecret = New-Object System.Management.Automation.PSCredential("YourASQLDB_Secret", $passwd)
New-AzureRmDataLakeAnalyticsCatalogSecret -Account "youradlaaccount" -DatabaseName "YourADLADatabaseName" -Secret $mysecret -Host "yourasqldb.database.windows.net" -Port 1433

 

3.       Create CREDENTIAL with IDENTITY that matches the AzureRmDataLakeAnalyticsCatalogSecret name as used in the PowerShell script (YourASQLDB_Secret) in ADLA Database using the U-SQL query below:

//Connect to ADLA Database

USE DATABASE YourADLADatabaseName;

 

//Create CREDENTIAL

//USER_NAME: ASQLDB Username

//IDENTITY: ADLA Catalog Secret, must match name chosen in PowerShell script

CREATE CREDENTIAL IF NOT EXISTS YourASQLDB_Secret WITH USER_NAME = “YourASQLDB_Username”, IDENTITY = “YourASQLDB_Secret”;

 

4.       Create Data Source in ADLA Database with a reference to the ASQLDB using the U-SQL query below:

// Create External Data source on AZURESQLDB

CREATE DATA SOURCE IF NOT EXISTS ASQL_YOURDB FROM AZURESQLDB WITH

       ( PROVIDER_STRING = “Database=YourASQLDB;Trusted_Connection=False;Encrypt=True”

       , CREDENTIAL = YourASQLDB_Secret

       , REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)

       );

 

5.       Create an External Table in ADLA Database based on the Data Source using the U-SQL query below:

// CREATE EXTERNAL TABLE

// External tables are optional. You can refer to data source tables in queries directly (Lazy Metadata loading):

// FROM EXTERNAL ASQL_PATIENTS LOCATION “dbo.sometable”

CREATE EXTERNAL TABLE someExternalTable (

       [someColumn] string

) FROM ASQL_YOURDB LOCATION “dbo.someTable”;

 

6.       Query the federated external ASQLDB table and output result to file using the U-SQL query below:

@query =

    SELECT someColumn

    FROM someExternalTable;

 

 OUTPUT @query TO “/Output/file.csv”

 USING Outputters.Csv();

16 comments

  1. Creation of the secret can also be performed using the Azure Command Line Interface instead of PowerShell. Instead of using the PS command “New-AzureRmDataLakeAnalyticsCatalogSecret” you can use the CLI command “azure datalake analytics catalog secret create”.
    CLI syntax: azure datalake analytics catalog secret create “myADLAAccount” “myADLADatabase” “tcp://mysqldbserver.database.windows.net:1433”

    Like

  2. I’ve followed your steps and it’s working like a charm until last week everything stopped working and I can’t create an external data source anymore I got login failed for user contosoadmin, nothing changed from our end. Are you aware of any changes to the backend.

    Like

  3. New-AzureRmDataLakeAnalyticsCatalogSecret : Operation returned an invalid status code ‘NotFound’
    Anyone got this?

    Like

  4. This post just saved me. Got me unblocked for trying to run CLI command “az dla catalog credential create”. The big help was you made it clear the expected database to name is a Data Lake DB, and not the source Azure SQL database. All other articles/docs I found led me to think otherwise. So, a big THANK YOU to you!

    Liked by 1 person

  5. Hi Jorg,
    for some reason I cannot seem to create a catalog secret. You should enter the azure data analytics account name right? It keeps on telling me to re login. Any ideas what this can be?

    Like

  6. Hi guys, I saw that someone is getting errors, and don’t know if they might be related, but the CREATE CREDENTIAL has been deprecated and it doesn’t work anymore, therefore you should use the New-AzureRmDataLakeAnalyticsCatalogCredential cmdlet to do that. The rest of the U-SQL script is fine.
    HTH, bye, Francesco

    Liked by 1 person

Leave a comment