You want to create an encrypted Azure Data Lake Store (ADLS) with a master encryption key that is stored and managed in your own existing Azure Key Vault.
With this setup, which is shown in the diagram below, all data in your Data Lake Store is encrypted before it is stored on disk. To decrypt the data, a master encryption key is required.
This scenario uses a “customer managed” key, which means the key is created and managed in your own Azure Key Vault. It is an alternative to a key that is managed and owned by the Data Lake Store service, which is the default. Managing keys in the Key Vault gives you additional possibilities, such as revoking access to the key for the ADLS service identity or even permanently deleting the key from the Key Vault.
In this blog post I’ll guide you through the 3 steps below, all in an automated way, using PowerShell scripting and an Azure Resource Manager (ARM) template to create your encrypted ADLS. I plan to blog later about the possibilities that Visual Studio Team Services offers to perform these deployment tasks.
1. Create a new “customer managed” key in an existing Azure Key Vault
2. Create a new ADLS with data encryption enabled
3. Grant the ADLS service principal access to the Azure Key Vault and enable Key Vault managed encryption using your “customer managed” key
Prerequisites:
· Create an Azure Resource Group. I have created one named “adls-keyvault-demo” (akd).
· Create an Azure Key Vault if you do not already have one. I have created one named “akd-keyvault”.
· Install the AzureRM module, version 4.1.0 or later, from the PowerShell Gallery. This is required because the script uses the new Enable-AzureRmDataLakeStoreKeyVault cmdlet.
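The module version prerequisite can be checked before running anything else. The sketch below is only one way to do that: Test-MinimumVersion is a hypothetical helper name of my own, not part of AzureRM, and the check assumes the module was installed as the AzureRM rollup module so that Get-Module can find it under that name.

```powershell
# Hypothetical helper (not part of AzureRM): returns $true when the installed
# version is present and at least the required version.
function Test-MinimumVersion {
    param(
        $Installed,
        [Parameter(Mandatory)][version]$Required
    )
    if (-not $Installed) { return $false }
    return ([version]$Installed -ge $Required)
}

# Newest AzureRM version available on this machine ($null if none is installed).
$azureRm = Get-Module -ListAvailable -Name AzureRM |
    Sort-Object -Property Version -Descending |
    Select-Object -First 1

if (Test-MinimumVersion -Installed $azureRm.Version -Required '4.1.0') {
    Write-Host "AzureRM $($azureRm.Version) found."
} else {
    Write-Warning "AzureRM 4.1.0 or later is required. Try: Install-Module -Name AzureRM -MinimumVersion 4.1.0"
}
```

Comparing [version] objects rather than strings matters here, because a string comparison would rank 4.10.0 below 4.2.0.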
The PowerShell script below creates the new key in your existing Azure Key Vault, then creates a new ADLS using an ARM template (see below), and finally enables Key Vault managed encryption for your new ADLS. The comments in the script give further explanation, and messages are written to the Windows PowerShell console during execution to inform you about what’s happening. Make sure you have at least AzureRM 4.1.0 installed and that the account you use has sufficient permissions.
The following variables are used:
· subscriptionId – Azure Subscription ID
· rg – Azure Resource Group name
· keyVaultUri – Key Vault DNS Name. Check your Key Vault Properties in Azure Portal.
· keyName – Name of Key Vault key that will be used for the ADLS
· armTemplateFileAdls – Path of your ADLS ARM template JSON file. You can find the definition below the PowerShell script; copy/paste it into a JSON file and store it on disk
· adlsName – Name of your ADLS
# Variables; modify
$subscriptionId = "00000000-0000-0000-0000-000000000000"
$rg = "adls-keyvault-demo"
$keyVaultUri = "https://akd-keyvault.vault.azure.net/"
$keyName = "akd-adls-key"
$armTemplateFileAdls = "C:\CreateEncryptedADLS.JSON"
$adlsName = "akdadls"
# Authenticate to Azure and set the subscription context
Login-AzureRmAccount
Set-AzureRmContext -SubscriptionId $subscriptionId
Write-Host "Get Key Vault Name from URI $keyVaultUri"
$keyVaultHost = ([System.Uri]$keyVaultUri).Host
$keyVaultName = $keyVaultHost.Substring(0, $keyVaultHost.IndexOf('.'))
Write-Host "Creating software-protected key $keyName in Key Vault $keyVaultName"
$adlsKey = Add-AzureKeyVaultKey -Destination Software -Name $keyName -VaultName $keyVaultName
# Get the current version identifier of the key; it is passed to the ARM template to create the ADLS (encryptionKeyVersion)
$adlsKeyId = $adlsKey.Version.ToString()
Write-Host "Create new encrypted ADLS by deploying ARM script $armTemplateFileAdls in resource group $rg"
New-AzureRmResourceGroupDeployment -ResourceGroupName $rg -TemplateFile $armTemplateFileAdls `
    -DataLakeStoreName $adlsName -KeyVaultName $keyVaultName -DataLakeStoreKeyVaultKeyName $keyName -DataLakeStoreKeyVaultKeyVersion $adlsKeyId
# Get the ADLS account and its Service Principal Id
$adlsAccount = Get-AzureRmDataLakeStoreAccount -Name $adlsName
$adlsAccountSPId = $adlsAccount.Identity.PrincipalId
Write-Host "Grant ADLS account Service Principal $adlsAccountSPId required permissions on the Key Vault"
# Grant the ADLS account access to perform encrypt, decrypt and get operations with the Key Vault
Set-AzureRmKeyVaultAccessPolicy -VaultName $keyVaultName -ObjectId $adlsAccountSPId -PermissionsToKeys encrypt,decrypt,get -BypassObjectIdValidation
Write-Host "Enable ADLS Key Vault managed encryption"
# Enable-AdlStoreKeyVault is the short alias for this cmdlet
Enable-AzureRmDataLakeStoreKeyVault -Account $adlsAccount.Name
Write-Host "ADLS $adlsName is now encrypted using key $keyName in Key Vault $keyVaultName"
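The script derives the Key Vault name from the URI with Substring and IndexOf, which throws a fairly cryptic error when the value contains no dot or is not a URI at all. As a possible hardening, the same step could be wrapped in a small function that validates its input first. Get-KeyVaultNameFromUri is a hypothetical name I made up for this sketch, and it assumes the vault name is always the first DNS label of the host, as in the URI used in this post.

```powershell
# Hypothetical helper: extracts the vault name from a Key Vault DNS name / URI.
function Get-KeyVaultNameFromUri {
    param(
        [Parameter(Mandatory)][string]$KeyVaultUri
    )
    $uri = $null
    # Reject anything that is not an absolute URI before touching its Host part.
    if (-not [System.Uri]::TryCreate($KeyVaultUri, [System.UriKind]::Absolute, [ref]$uri)) {
        throw "'$KeyVaultUri' is not an absolute URI."
    }
    # Assumption: the vault name is the first DNS label of the host.
    return $uri.Host.Split('.')[0]
}

Get-KeyVaultNameFromUri -KeyVaultUri 'https://akd-keyvault.vault.azure.net/'
# akd-keyvault
```

In the script above, the two lines that set $keyVaultHost and $keyVaultName could then be replaced by a single call to this function.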
ARM Template ADLS
"location": "North Europe",
"displayName": "Datalake Store"
"keyVaultResourceId": "[resourceId('Microsoft.KeyVault/vaults', parameters('KeyVaultName'))]",
After you successfully execute the PowerShell script, navigate to the Azure portal to check if everything is OK.
Data Lake Store → Settings → Encryption
The account is successfully encrypted using the Key Vault key. The ADLS account has a generated Service Principal named “RN_akdadls”, to which we granted permissions on the Key Vault in the PowerShell script.
Key Vault → Settings → Keys
The key has been created and is enabled.
Key Vault → Settings → Access policies
The ADLS Service Principal has an access policy that we set with the PowerShell script.
Opening it shows the key permissions:
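If you prefer to check the access policy from PowerShell instead of the portal, something along these lines could work. This is a rough sketch: Get-MissingKeyPermission is a hypothetical helper of my own, and it assumes the vault object returned by Get-AzureRmKeyVault exposes an AccessPolicies collection whose entries carry ObjectId and PermissionsToKeys. The sample data at the end is made up so the function can be tried without an Azure session.

```powershell
# Hypothetical helper: lists the required key permissions that a principal's
# access policy does NOT contain. An empty result means nothing is missing.
function Get-MissingKeyPermission {
    param(
        [object[]]$AccessPolicies,
        [Parameter(Mandatory)][string]$ObjectId,
        [Parameter(Mandatory)][string[]]$Required
    )
    # Find the policy that belongs to the given principal, if there is one.
    $policy = $AccessPolicies |
        Where-Object { "$($_.ObjectId)" -eq $ObjectId } |
        Select-Object -First 1
    if (-not $policy) { return $Required }
    return @($Required | Where-Object { $policy.PermissionsToKeys -notcontains $_ })
}

# Usage against a live vault (requires an authenticated AzureRM session):
# $vault = Get-AzureRmKeyVault -VaultName $keyVaultName
# Get-MissingKeyPermission -AccessPolicies $vault.AccessPolicies -ObjectId $adlsAccountSPId -Required encrypt,decrypt,get

# Made-up sample data for a dry run:
$samplePolicies = @(
    [pscustomobject]@{
        ObjectId          = '11111111-1111-1111-1111-111111111111'
        PermissionsToKeys = @('get', 'encrypt')
    }
)
Get-MissingKeyPermission -AccessPolicies $samplePolicies `
    -ObjectId '11111111-1111-1111-1111-111111111111' `
    -Required 'encrypt', 'decrypt', 'get'
# decrypt
```

When a principal has no access policy at all, the function returns the full list of required permissions.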
Special thanks to my Macaw colleague Simon Zeinstra for working together on this solution!