One of the major value propositions of U-SQL is that it allows to query data where it lives. For external systems, such as Microsoft Azure SQL Database, this is achieved with federated queries against data sources.
In order to query these external data sources, a data source object has to be created and referenced that abstracts the connection information as well as information about its capabilities to execute query expressions passed and translated from U-SQL to the its local query engine.
Prerequisites
· An Azure Subscription
· Azure Data Lake Store (ADLS)
· Azure Data Lake Analytics (ADLA)
· Azure SQL Database (ASQLDB) or Azure SQL Data warehouse (ASQLDWH) with SQL login/password
· Visual Studio 2015. Optional, to create and execute U-SQL queries, this can also be done in the Azure portal.
https://www.visualstudio.com/products/free-developer-offers-vs
· Azure Data Lake Tools for Visual Studio 2015
https://www.microsoft.com/en-us/download/details.aspx?id=49504
· Azure PowerShell
· RX Permissions on your ADLS (data level) to be able to create the Catalog Secret
o Navigate to ADLS in Azure Portal
o Click Data Explorer
o Click Access
o Validate you have Read/Execute permissions.
· Allow IP range in the ASQLDB server firewall for the ADLA services that fire the U-SQL queries
o Navigate to ASQLDB server in Azure Portal
o Click Settings
o Click Firewall
o Create new rule with range 25.66.0.0 to 25.66.255.255
Steps:
1. Create ADLA Database using a U-SQL query:
DROP DATABASE IF EXISTS YourADLADatabase;
CREATE DATABASE YourADLADatabase;
2. Create Catalog Secret in ADLA Database which contains the password for the SQL login and connection string for the ASQLDB database using the Azure PowerShell script below:
#Login (login pop up appears) Login-AzureRmAccount #Show your available Azure Subscriptions Get-AzureRmSubscription #Connect to the Azure Subscription in which your ADLA Database exists Set-AzureRMContext -SubscriptionId 00000000-0000-0000-0000-000000000000 #$passwd: password for ASQLDB / ASQLDWH which you want to federate #-Account: ADLA account name #-DatabaseName: Data Lake Analytics Database name #-Host: Host of ASQLDB / ASQLDWH $passwd = ConvertTo-SecureString "YourPassword" -AsPlainText -Force $mysecret = New-Object System.Management.Automation.PSCredential("YourASQLDB_Secret", $passwd) New-AzureRmDataLakeAnalyticsCatalogSecret -Account "youradlaaccount" -DatabaseName "YourADLADatabaseName" -Secret $mysecret -Host "yourasqldb.database.windows.net" -Port 1433
3. Create CREDENTIAL with IDENTITY that matches the AzureRmDataLakeAnalyticsCatalogSecret name as used in the PowerShell script (YourASQLDB_Secret) in ADLA Database using the U-SQL query below:
//Connect to ADLA Database
USE DATABASE YourADLADatabaseName;
//Create CREDENTIAL
//USER_NAME: ASQLDB Username
//IDENTITY: ADLA Catalog Secret, must match name chosen in PowerShell script
CREATE CREDENTIAL IF NOT EXISTS YourASQLDB_Secret WITH USER_NAME = “YourASQLDB_Username”, IDENTITY = “YourASQLDB_Secret”;
4. Create Data Source in ADLA Database with a reference to the ASQLDB using the U-SQL query below:
// Create External Data source on AZURESQLDB
CREATE DATA SOURCE IF NOT EXISTS ASQL_YOURDB FROM AZURESQLDB WITH
( PROVIDER_STRING = “Database=YourASQLDB;Trusted_Connection=False;Encrypt=True”
, CREDENTIAL = YourASQLDB_Secret
, REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);
5. Create an External Table in ADLA Database based on the Data Source using the U-SQL query below:
// CREATE EXTERNAL TABLE
// External tables are optional. You can refer to data source tables in queries directly (Lazy Metadata loading):
// FROM EXTERNAL ASQL_PATIENTS LOCATION “dbo.sometable”
CREATE EXTERNAL TABLE someExternalTable (
[someColumn] string
) FROM ASQL_YOURDB LOCATION “dbo.someTable”;
6. Query the federated external ASQLDB table and output result to file using the U-SQL query below:
@query =
SELECT someColumn
FROM someExternalTable;
OUTPUT @query TO “/Output/file.csv”
USING Outputters.Csv();
Creation of the secret can also be performed using the Azure Command Line Interface instead of PowerShell. Instead of using the PS command “New-AzureRmDataLakeAnalyticsCatalogSecret” you can use the CLI command “azure datalake analytics catalog secret create”.
CLI syntax: azure datalake analytics catalog secret create “myADLAAccount” “myADLADatabase” “tcp://mysqldbserver.database.windows.net:1433”
LikeLike
But what… I feel the Azure CLI is completely useless. We have PowerShell… why reinvent the wheel?
LikeLike
Why that is…
LikeLike
Hi Graham, CLI gives you PowerShell capabilities on all platforms. You can use same commands no matter what platform you use: Windows, Linux or Mac.
LikeLike
I’ve followed your steps and it’s working like a charm until last week everything stopped working and I can’t create an external data source anymore I got login failed for user contosoadmin, nothing changed from our end. Are you aware of any changes to the backend.
LikeLike
I’m in the same boat 😦
LikeLike
Exactly the same error? Can you share the exact message Graham?
LikeLike
Very well explained, this works well
LikeLike
New-AzureRmDataLakeAnalyticsCatalogSecret : Operation returned an invalid status code ‘NotFound’
Anyone got this?
LikeLike
This doesn’t seem to work anymore. Any updates to work around?
LikeLike
This post just saved me. Got me unblocked for trying to run CLI command “az dla catalog credential create”. The big help was you made it clear the expected database to name is a Data Lake DB, and not the source Azure SQL database. All other articles/docs I found led me to think otherwise. So, a big THANK YOU to you!
LikeLiked by 1 person
Hi Jorg,
for some reason I cannot seem to create a catalog secret. You should enter the azure data analytics account name right? It keeps on telling me to re login. Any ideas what this can be?
LikeLike
how do I Insert into the federated external ASQLDB table
LikeLike
Hi Vijay,
You can only read from external tables, inserting data is not possible.
LikeLike
Hi guys, I saw that someone is getting errors, and don’t know if they might be related, but the CREATE CREDENTIAL has been deprecated and it doesn’t work anymore, therefore you should use the New-AzureRmDataLakeAnalyticsCatalogCredential cmdlet to do that. The rest of the U-SQL script is fine.
HTH, bye, Francesco
LikeLiked by 1 person
Hi , I’m trying to insert data into AzureSQLDB from USQL . Is it possible?
LikeLike