SSIS – Unpack a ZIP file with the Script Task

Posted by

A while ago I needed to unpack a couple of zip files from SSIS. There is no Microsoft SSIS task that contains this functionality so I searched the Internet. It seems that there are quite some third party tools that offer this functionally. It’s also possible to download custom SSIS tasks. I personally always try to avoid third party tools and custom tasks so I searched on.
It seemed there is a way to unzip files from SSIS with the Script Task. With some Visual Basic code using the Visual J# Library you can do the job. In this blog post I will use a Foreach Loop Container to loop through a folder that contains multiple zip files and unzip them one-by-one.

Make sure you have the Microsoft Visual J# Redistributable Package installed because a reference to vjslib.dll (Visual J# Library) is needed in the Script Task. Download it here for free.

Drag and drop a Foreach Loop Container on the Control Flow and create three variables with scope on the Foreach Loop Container:
Variables

Now configure the Foreach Loop Container:
– Enumerator: Foreach File Enumerator
– Files: *.zip
– Retrieve file name: Name and extension
Editor

Next click on the + next to Expressions add the following expression to connect the SourceFolder variable to the Directory property of the Foreach Loop Container:
Properties

Now go to the Variable Mappings and select the FileName variable on Index 0. Doing this we will be able to access the current file name when the Foreach Loop Container enumerates the zip files.
Editor2

Now drag and drop a Script Task on the Control Flow, inside the Foreach Loop Container:
Control Flow

Open the Script Task Editor and do the following:
– Set the ScripLanguage on: Microsoft Visual Basic 2008
– Select our three ReadOnlyVariables using the new SSIS2008 Select Variables window:
SelectVariables

Now click Edit Script and copy/paste the following script:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports java.util.zip

    Public Sub Main()

        Try

            Dim strSourceFile As String
            Dim strDestinationDirectory As String

            ‘MsgBox(“Current File: ” & Dts.Variables(“FileName”).Value.ToString)

            strDestinationDirectory = Dts.Variables(“DestinationFolder“).Value.ToString 
            strSourceFile = Dts.Variables(“SourceFolder“).Value.ToString & Dts.Variables(“FileName“).Value.ToString

            Dim oFileInputStream As New java.io.FileInputStream(strSourceFile)
            Dim oZipInputStream As New java.util.zip.ZipInputStream(oFileInputStream)
            Dim bTrue As Boolean = True
            Dim sbBuf(1024) As SByte

            While 1 = 1

                Dim oZipEntry As ZipEntry = oZipInputStream.getNextEntry()

                If oZipEntry Is Nothing Then Exit While

                If oZipEntry.isDirectory Then

                   If Not My.Computer.FileSystem.DirectoryExists(strDestinationDirectory & oZipEntry.getName) Then

                        My.Computer.FileSystem.CreateDirectory(strDestinationDirectory & oZipEntry.getName)

                    End If

                Else

                    Dim oFileOutputStream As New java.io.FileOutputStream(strDestinationDirectory.Replace(“\”, “/”) & oZipEntry.getName())

                    While 1 = 1

                        Dim iLen As Integer = oZipInputStream.read(sbBuf)

                        If iLen < 0 Then Exit While

                        oFileOutputStream.write(sbBuf, 0, iLen)

                   End While

                    oFileOutputStream.close()

               End If

            End While

            oZipInputStream.close()
            oFileInputStream.close()

        Catch ex As Exception

            Throw New Exception(ex.Message)

        End Try

    End Sub

End Class

Now only one thing needs to be done, add a reference to vjslib.dll (Visual J# Library):
Add Reference

&
Libary

Your unzip solution is ready now! For testing purposes you can uncomment the following line in the script to see the file name of each processed zip file in a message box at runtime:

‘MsgBox(“Current File: ” & Dts.Variables(“FileName”).Value.ToString)

MsgBox

You can use this solution in many ways, for example, I used it in the solution below where I download multiple zip files from an FTP. These zip files contain CSV’s that are used as source for the loading of a data warehouse.
Solution

 

18 comments

  1. Hello Jorg Klein,
    Excellent artical. I really like this but your code is in j# can you please provide me same code in C#. It will really help me.
    Thanks
    Rahul.

    Like

  2. Hi Rahul,
    The code is in ‘Microsoft Visual Basic 2008’ and is ready to use in a script task. If you really need C# I advice you to search for a code converter…
    Good luck!
    -Jorg

    Like

  3. Can you please extend this to extract one single file from the .ZIP file please? I need this and dropped here during google.
    Thanks,
    Prabhat

    Like

  4. Hi Jorg,
    I used the same method as yours.
    Previously I worked fine.But this month, the zip file size is increased to 288 MB (After Manual unzip it is 5.58 GB files)
    And this script task failed with this error msg :
    Error: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. —> System.Exception: compressionMethod used: 9 (neither STORED nor DEFLATED)
    Do you know how to handle this case?
    Thanks in Advance

    Like

  5. Hi Jorg,
    Your method works brilliantly for us, even extended this by making this a parameters driven callable package.
    However, we now need to do the same thing for an external file that is supplied to us in .GZ format. Is there a similar java or other utility we can call from SSIS to unzip these types of files? May well need to deal with 7zip file soon as well.
    And before anyone assks, no we cannot request the format to be changed, as this is the standard format the data ias delivered in, and we are expected to handle it. We can do it manually, but that sort of defeats the whole purpose of “automating” the loading process.
    Thanks,
    Graeme

    Like

  6. stement is not valid  in name space
    End Class must be preceded by a matching class
    I pasted your script I ge the the erros above

    Like

  7. I fixed the script and now it does not complain. but the following happens. The last task to unzip file blinks green a few time and says complete but no files were unzipped.

    Like

  8. Good lord Jorg, these guys writing in reply to your extremely helpful article are freaking LAZY!!
    They should use this as a sample for what’s possible in SSIS and configure what you’ve provided to use in their own environment…  
    I’m so used to seeing this when we interface with typical IT folks where everything has to be spoon fed…  Try paying attention and LEARNING what Jorg did instead of looking to take his guide and using it without any idea of what’s going on…  
    Cheers!!

    Like

  9. Hi Jorg,
    Can you please help me,
    I am getting below errors.
    at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
      at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
      at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
      at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
      at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
      at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()
    Thanks
    Anilkumar

    Like

  10. Hi Jorg,
    Thanks for the great article , i have a doubt here , what if the zip files are having the passwords ? how can we handle in the script task ?
    Awaiting for your reply
    Thanks,
    Narendra

    Like

  11. Jorg,
    I like your solution, however I’m migrating to 2012 and seems like the library is not recognized by the system. Appreciate your input.
    Regards,
    Gabe

    Like

  12. Jorg,
    I used this solution. it is working brilliantly in my development environment but, when I move it to the testing environment and run it using a scheduler, it is throwing an error which is indicated below:
    Error: 2015-06-22 09:46:02.04                                                  
      Code: 0x00000002                                                            
      Source: SCT UNZIP FILE                                                      
      Description: The script threw an exception: Exception of type ‘java.io.IOException’ was thrown.
    End Error                                                                      
    any thoughts?

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s