Last week Microsoft released CTP3 of SQL Server Integration Services (SSIS), code-named Denali. In this blog post I will look at the key new features and some of the minor improvements in SSIS Denali.
You can also create Shared Cache Connection Managers for your cached lookups. This will be very useful for lookups that are performed multiple times, for example when you look up dimension tables multiple times from your fact table ETL. You can just convert the local cache connection manager by right clicking it:
What happened to the Shared Data Sources we knew from prior SSIS versions? Well they are no more. Shared Data Sources only lived at design time and not at runtime so they were not really useful anyway. Shared Connection Managers do live at runtime and offer even more possibilities as I will show later on, so they replace the old Shared Data Sources.
Data Flow – Column mappings
SSIS always mapped columns from source to transformations or destinations with the help of lineage ids. Every column had a unique metadata ID that was known by all components in the data flow. If something changed in the source, this would break the lineage ids and raise error messages like: The external metadata column collection is out of synchronization with the data source columns.
To fix this error you would re-map all broken lineage ids with the “Restore Invalid Column References Editor”.
In Denali lineage-ids are no longer used. Mappings are done on column names, which is great because you can now use auto map on column names and even copy/paste pieces of another data flow and connect them by mapping the corresponding column names.
Data Flow – Flexible order of authoring
This improvement helps you edit data flow components even when they don’t have an input attached. Theoretically you can build your data flow backwards; start with the destination and track back to the source.
Data flow groups are 100% eye candy; you can’t set any properties on them.
Data Flow – Data Quality Services Cleansing transformation
With this transformation you can apply data quality rules in the data flow. This is done by using a Knowledge Base which can be created by yourself or downloaded from the Windows Azure Marketplace. For example you could apply a rule that checks if a given postal code column is valid for a particular town column in your record.
Data Flow – Data Tap
In Denali, we have the possibility to attach a “tap” at a data flow path (arrow). This tap captures all data coming through and dumps it in CSV files.
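Data taps can also be added programmatically with the `catalog.add_data_tap` stored procedure when you create an execution on the SSIS server. A minimal T-SQL sketch; all folder, project, package, task, and path names below are hypothetical, and the path identification string must match the one shown in your data flow path's properties:

```sql
-- Create an execution for the package (names are placeholders)
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
    @folder_name  = N'Test',
    @project_name = N'TestProject',
    @package_name = N'LoadFact.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Tap a data flow path and dump the rows passing through it to a CSV file
EXEC SSISDB.catalog.add_data_tap
    @execution_id            = @execution_id,
    @task_package_path       = N'\Package\Data Flow Task',
    @dataflow_path_id_string = N'Paths[OLE DB Source.OLE DB Source Output]',
    @data_filename           = N'TapOutput.csv';

EXEC SSISDB.catalog.start_execution @execution_id;
```

The CSV file is written to the DataDumps folder of the SSIS server, which makes this a handy debugging tool for packages running in production.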
SSIS Denali is fundamentally different from its predecessors when it comes to the concept and usage of configurations. SSIS package configurations are obsolete (they will still be available if you really want to use them ;-)) and parameters and environments are the new kids in town.
SSIS Parameters look a lot like SSIS variables but there are some differences. There are two types of these parameters:
1. Package Parameters:
Think of Package Parameters as C# parameters, which are passed as input to a C# function (= your package). You can set them when executing (calling) a package, and the lifecycle of the parameters is limited to the SSIS package itself.
The difference with SSIS variables? You can set the parameters while executing a package with SSMS or an Execute Package Task.
2. Project Parameters:
Identical to Package Parameters except for their scope: these parameters exist at project level and can be referenced throughout all SSIS packages in your project, whereas Package Parameters only live inside a single SSIS package.
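Once a project is deployed to the SSIS server, both kinds of parameters can be set at execution time through the catalog stored procedures. A hedged T-SQL sketch; all object and parameter names are hypothetical:

```sql
-- Create an execution for the package (names are placeholders)
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
    @folder_name  = N'Test',
    @project_name = N'TestProject',
    @package_name = N'LoadDW.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Set a parameter value for this execution
-- @object_type 30 = package parameter, 20 = project parameter
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id,
    @object_type     = 30,
    @parameter_name  = N'LoadDate',
    @parameter_value = N'2011-07-20';

EXEC SSISDB.catalog.start_execution @execution_id;
```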
Loads of capabilities here! I now realize that I have created a workaround for Package Parameters with my SSIS Package design pattern for loading a data warehouse where I (mis)used Package Configurations as Package Parameters. Creating a new package design pattern for Denali definitely goes on my TODO list!
Environments are a collection of SSIS package settings that can be defined on the SSIS Server. At runtime, the environment will override these settings in the SSIS packages. You can create multiple environments, and when you run your SSIS packages you can select which environment they should use. It's also possible to let multiple SSIS projects run under one environment, so there's flexibility all around.
To make the principle of Environments clear right away, I have created a simple example that you will commonly use: create two Environments, one with development settings and one with production settings.
Next create another Environment with the name Production and also create an Environment Variable with the name SCM_EnvironmentVar. The only difference between these two variables should be the value of the variable; it contains the production server connection string instead of the development value.
You now have two Environments with one Environment Variable for the Shared Connection Manager each:
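The same setup can also be scripted with the catalog stored procedures instead of the SSMS dialogs. A sketch, assuming a folder called `Test`; the connection strings are placeholders:

```sql
-- Development environment with its variable
EXEC SSISDB.catalog.create_environment
    @folder_name = N'Test', @environment_name = N'Development';
EXEC SSISDB.catalog.create_environment_variable
    @folder_name      = N'Test',
    @environment_name = N'Development',
    @variable_name    = N'SCM_EnvironmentVar',
    @data_type        = N'String',
    @sensitive        = 0,
    @value            = N'Data Source=DEVSERVER;Initial Catalog=DWH;Integrated Security=SSPI;',
    @description      = N'Connection string for the Shared Connection Manager';

-- Production environment: same variable name, production value
EXEC SSISDB.catalog.create_environment
    @folder_name = N'Test', @environment_name = N'Production';
EXEC SSISDB.catalog.create_environment_variable
    @folder_name      = N'Test',
    @environment_name = N'Production',
    @variable_name    = N'SCM_EnvironmentVar',
    @data_type        = N'String',
    @sensitive        = 0,
    @value            = N'Data Source=PRODSERVER;Initial Catalog=DWH;Integrated Security=SSPI;',
    @description      = N'Connection string for the Shared Connection Manager';
```

Because both environments expose a variable with the same name, a project can switch between them without any changes to the packages themselves.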
Both environments are now referenced from the Test SSIS project and are available. Let's configure the Shared Connection Manager so it will use the value specified in one of the environments. Click on the Parameters page, the Connection Managers tab and the … next to the ConnectionString property of the Shared Connection Manager:
We have now successfully attached the Environment Variable to override the ConnectionString property with a value from either the Development or the Production Environment. Executing the package under one of both Environments is very easy. To do so, right click the package and choose Run:
Pretty straightforward and easy to use. The big difference? All the configuration work has been done in SQL Server Management Studio! Exactly the place where it should be done. In the current SSIS versions most of this work was done within Visual Studio and the configuration was done by editing tables. So unfriendly! Big improvement, great work MS!
In this example I have set a property of a Shared Connection Manager but of course it is possible to set Package or Project Parameters with Environment Variables.
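Behind the SSMS dialogs, both the binding and the environment-aware execution are again just catalog procedure calls. A sketch with assumed names (a Shared Connection Manager named `SCM` in project `TestProject`, folder `Test`):

```sql
-- Bind the connection string property to the environment variable
-- (@value_type 'R' = referenced, i.e. take the value from an environment variable)
EXEC SSISDB.catalog.set_object_parameter_value
    @object_type     = 20,   -- project-level object
    @folder_name     = N'Test',
    @project_name    = N'TestProject',
    @parameter_name  = N'CM.SCM.ConnectionString',
    @parameter_value = N'SCM_EnvironmentVar',
    @value_type      = 'R';

-- Reference the Production environment from the project
DECLARE @reference_id bigint;
EXEC SSISDB.catalog.create_environment_reference
    @folder_name      = N'Test',
    @project_name     = N'TestProject',
    @environment_name = N'Production',
    @reference_type   = 'R',  -- relative: environment lives in the same folder
    @reference_id     = @reference_id OUTPUT;

-- Run the package under that environment
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
    @folder_name  = N'Test',
    @project_name = N'TestProject',
    @package_name = N'LoadDW.dtsx',
    @reference_id = @reference_id,
    @execution_id = @execution_id OUTPUT;
EXEC SSISDB.catalog.start_execution @execution_id;
```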
We have had the option to deploy SSIS packages from Visual Studio for a long time thanks to BIDS Helper. Microsoft finally added this functionality out of the box in Denali. They even did more: you can now also build your projects from within Visual Studio!
There are two ways to deploy (push) your projects to the new SSIS server, directly from Visual Studio or by using an Integration Services Project Deployment File. They both use the new Integration Services Deployment Wizard.
Deployments directly from Visual Studio can be done by right clicking your project and then choose Deploy. Now, the Deployment Wizard will pop straight up:
The other way is building your project first. This was already possible in SSIS versions prior to Denali, but it was not really useful.
In Denali this is different. Building your project will generate an .ispac file which contains your entire project and this can be used by the Deployment Wizard. Again a very nice new feature.
Handing over a new release of an SSIS solution to a customer's IT department for deployment can now be done in a very nice, professional manner. .ispac files can also be opened by Visual Studio to import your SSIS project.
Building your project can be done by right clicking your project and choosing Build:
Double clicking (or choosing Deploy in Visual Studio) this file will start the new Integration Services Deployment Wizard:
You can now select the project that you wish to deploy. You can either use the .ispac Project deployment file or choose to deploy an existing project located in any Integration Services catalog. The last option is useful when you wish to deploy, for example, a TEST SSIS solution that is already located in a catalog to an ACCEPTANCE environment in another catalog.
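If you prefer scripting over the wizard, an .ispac file can also be deployed with the `catalog.deploy_project` stored procedure. A sketch; the file path and names are placeholders:

```sql
-- Read the .ispac file into a binary stream (path is hypothetical)
DECLARE @project_stream varbinary(max);
SELECT @project_stream = BulkColumn
FROM OPENROWSET(BULK N'C:\Builds\TestProject.ispac', SINGLE_BLOB) AS ispac;

-- Deploy it to the Test folder of the SSISDB catalog
DECLARE @operation_id bigint;
EXEC SSISDB.catalog.deploy_project
    @folder_name    = N'Test',
    @project_name   = N'TestProject',
    @project_stream = @project_stream,
    @operation_id   = @operation_id OUTPUT;
```

This makes unattended deployments from a build server straightforward to script.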
The results show a successful deployment. Notice the protection level has been changed. What happened here?
The wizard clears the protection level, as we know from Visual Studio, to prevent “cannot decrypt password” errors.
If you deploy a newer version later on, the existing version(s) stay on the server and you can easily roll back to a previous version if you've made a bad deployment.
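Rolling back can be done from the project's Versions dialog in SSMS, or with the `catalog.restore_project` procedure; a sketch, where the version LSN comes from the `catalog.object_versions` view and the names are placeholders:

```sql
-- Find the available versions of the project
SELECT object_version_lsn, created_time
FROM SSISDB.catalog.object_versions
WHERE object_name = N'TestProject';

-- Restore a previous version (the LSN value here is hypothetical)
EXEC SSISDB.catalog.restore_project
    @folder_name        = N'Test',
    @project_name       = N'TestProject',
    @object_version_lsn = 12345;
```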
The new SSIS Server is the central storage and administration point of your SSIS solutions. It is not a standalone server that you need to manage separately; basically it is a database with stored procedures.
Under the Integration Services node you will find your SSISDB catalog which holds all your SSIS solutions with its packages, environments, etc. The physical storage of these objects will be conducted in a SQL Server database with the same name as the catalog:
With the help of these stored procedures you can manage your SSIS Server: e.g. your parameter values, connection managers, and override properties by using environments.
Next to the configuration functionalities I have discussed earlier you can also implement security to control access to the catalog, both on folder level and package level. When you right click a folder/package in the catalog and choose the properties you’ll get the following window where you can manage security access:
Finally you’ll get an out of the box reporting dashboard which is built on the out of the box logging functionality that SSIS offers. On this dashboard you’ll get information about the execution history of your package and its sub packages, view which parameters were used, view specific messages and get a performance report over time.
All the information is logged automatically if any package runs on the server. The information is very detailed; you can even get the row counts between SSIS data flow transformations!
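Those row counts live in the `catalog.execution_data_statistics` view, so you can also query them yourself. For example (the execution_id is an example value; statistics are logged per buffer, hence the SUM):

```sql
-- Rows sent over each data flow path in a given execution
SELECT data_flow_task_name,
       source_component_name,
       destination_component_name,
       SUM(rows_sent) AS rows_sent
FROM SSISDB.catalog.execution_data_statistics
WHERE execution_id = 42
GROUP BY data_flow_task_name, source_component_name, destination_component_name;
```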
When running a package you are able to select a logging level on the Advanced tab:
You can choose from:
– None: turns logging off for performance reasons
– Basic: logs errors, warnings and package/task events
– Performance: logs performance statistics plus errors and warnings
– Verbose: logs everything, including diagnostic information for debugging
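When you start a package through T-SQL instead of the dialog, the same choice is made with the `LOGGING_LEVEL` execution property. A sketch with placeholder names:

```sql
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
    @folder_name  = N'Test',
    @project_name = N'TestProject',
    @package_name = N'LoadDW.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Set the logging level for this execution
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id,
    @object_type     = 50,   -- execution-level property
    @parameter_name  = N'LOGGING_LEVEL',
    @parameter_value = 2;    -- 0=None, 1=Basic, 2=Performance, 3=Verbose

EXEC SSISDB.catalog.start_execution @execution_id;
```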
So no need for creating your own logging framework anymore, it’s all out of the box!