Friday, July 9, 2010

Dynamic Database Connection using SSIS ForEach Loop Container

Did you ever come across a requirement where you have to run the same SQL statement(s) in multiple database server/instances using SSIS?

Many of my friends have asked about how to connect through multiple Databases from different Server using single Dynamic Connection. I want to explain this feature in this article. Basically, I want to execute one query (to calculate Record Counts for a given table) on a set of servers (which can be Dev, Test,UAT, PreProduction and Production servers). In my example, I am using ForEach Loop to connect to the servers one by one--> Execute the Query --> Fetch and store the data.

So here is the approach:
  • Create a Table in your local database (whatever DB you want) and load all the connection strings. Within SSIS package, use Execute SQL Task to query all the connection strings and store the result-set in a variable of object type.
  • Use ForEach Loop container to shred the content of the object variable and iterate through each of the connection strings.
  • Place an Execute SQL task inside ForEach Loop container with the SQL statements you have to run in all the DB instances. You can use Script Task to modify your query as per your need.



Below is the details with an example:

STEP1:
To begin, Create two tables as shown below in on of the environment:

-- Table to store list of Sources
CREATE TABLE SourceList (
   ID [smallint],
   ServerName [varchar](128),
   DatabaseName [varchar](128),
   TableName [varchar](128),
   ConnString [nvarchar](255)
)

GO

-- Local Table to store Results
CREATE TABLE Results(
   TableName  [varchar](128),
   ConnString [nvarchar](255),
   RecordCount[int],
   ActionTime [datetime]
)
GO

STEP 2:
Insert all connection strings in SourceList table using below script:
INSERT INTO SourceList

SELECT 1 ID,
'(local)' ServerName, --Define required Server
'TestHN' DatabaseName,--Define DB Name
'TestTable' TableName,
'Data Source=(local);Initial Catalog=TestHN;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;' ConnString
 
Insert as many connections as you want.
 
STEP 3:
Add new package in your project and rename it with ForEachLoopMultipleServers.dtsx. Add following variable:
 
VariableTypeValuePurpose
ConnStringStringData Source=(local);
Initial Catalog=TestHN;
Provider=SQLNCLI10.1;
Integrated Security=SSPI;
Auto Translate=False;
To store default connection string
QueryStringSELECT '' TableName,
N'' ConnString,
0 RecordCount,
GETDATE() ActionTime
Default SQL Query string.
This can be modified at runtime based on other variables
SourceListObjectSystem.ObjectTo store the list of connection strings
SourceTableStringAny Table Name.
It can be blank.
To store the table name of current connection string.
This table will be queried at run time

STEP 4:
Create two connection managers as shown below:


Local.TestHN: For local database which has table SourceList. Also this will be used to store the result in Results table.
DynamicConnection: This connection will be used for setting up dynamic connection with multiple servers.
Now click on DynamicConnection in connection manager and click on ellipse to set up dynamic connection string. Map connection String with variable User::ConnString.

STEP 5:
Drag and drop Execute SQL Task and rename with "Execute SQL Task - Get List of Connection Strings". Now click on properties and set following values as shown in snapshot:
Result Set: Full Result Set
Connection: Local.TestHN
ConnectionType: Direct Input
SQL Statement: SELECT ConnString,TableName FROM SourceList

Now click on Result Set to store the result of SQL Task in variable User::SourceList.

STEP 6:
Drag and drop ForEach Loop container from toolbox and rename with "Foreach Loop Container - DB Tables". Double click on ForEach Loop container to open Foreach Loop Editor. Click on Collection  and select Foreach ADO Enumerator as Enumerator. In Enumerator configuration, select User::SourceList as ADO object source variable as shown below:

STEP 7: Drag and drop Script Task inside ForEach Loop container and double click on it to open Script Task Editor. Select User::ConnString,User::SourceTable as ReadOnlyVariables and User::Query as ReadWriteVariables. Now click on Edit Script button and write following code in Main function:

public void Main()

{
   try
   {
      String Table = Dts.Variables["User::SourceTable"].Value.ToString();
      String ConnString = Dts.Variables["User::ConnString"].Value.ToString();
      MessageBox.Show("SourceTable = " + Table + "\nCurrentConnString = " + ConnString);
      //SELECT '' TableName,N'' ConnString,0 RecordCount,GETDATE() ActionTime
      string SQL = "SELECT '" + Table + "' AS TableName, N'" + ConnString + "' AS ConnString, COUNT (*) AS RecordCount, GETDATE() AS ActionTime FROM " + Dts.Variables["User::SourceTable"].Value.ToString() + " (NOLOCK)";

      Dts.Variables["User::Query"].Value = SQL;
      Dts.TaskResult = (int)ScriptResults.Success;
   }
   catch (Exception e)
   {
      Dts.Log(e.Message, 0, null);
   }
}
 
STEP 8:
Drag and drop Data Flow Task and double click on it to open Data Flow tab. Add OLE DB Source and Destination. Double click on OLE DB Source to configure the properties. Select DynamicConnection as OLE DB connection manager and SQL command from variable as Data access mode. Select variable name as User::Query. Now click on columns to genertae meta data.

Double click on OLE DB Destination to configure the properties. Select Local.TestHN as OLE DB connection manager and Table or view - fast load as Data access mode. Select [dbo].[Results] as Name of the table or the view. now click on Mappings to map the columns from source. Click OK and save changes.
Finally DFT will look like below snapshot:

STEP 9: We are done with package development and its time to test the package.
Right click on the package in Solution Explorer and select execute. The message box will display you the current connection string.
 Once you click OK, it will execute Data Flow Task and load the record count in Results table. This will be iterative process untill all the connection are done. Finally package will execute successfully.


You can check the data in results table:
Here is the result:

SELECT * FROM SourceList




SELECT * FROM Results

17 comments:

  1. I was looking for the same approach... Thanks a lot Hari.

    ReplyDelete
  2. This is great article. I had this question for a while and this helps. Thanks

    ReplyDelete
  3. Thanks folks.... I am happy to help you!

    ReplyDelete
  4. can i have dtsx file to download it.
    Thaks for the help.

    ReplyDelete
  5. Nice article. Could you expand this article to:
    Given variables are: SourceServer,SourceDatabase, SourceTable, DestinationServer, DestinationDatabase,DestinationTable. if we have the same tablename in destinationDatabase, rename it with current date(talename+date) and import the new table.
    all source and destination variables are dynamic.

    ReplyDelete
  6. Follow this link to find out dynamic connection in SSIS
    http://www.abhyast.com/abhyastcom/post/How-to-connect-to-multiple-servers-dynamically-and-collect-information-using-SSIS.aspx

    ReplyDelete
  7. It helps me alot. Thank you!

    After retrieving data from OLEDB Datasource, I got all Strings returned with limit size =1. And an error of truncate raised.

    Please help me to fix it.

    ReplyDelete
  8. How have you taken "DynamicConection" Connection manager

    ReplyDelete
  9. Azar, I am using a table (SourceList) to loop through server details. Please go through STEP 3 and let me know if you still have questions..

    ReplyDelete
  10. How do you get the foreach loop to change the connstring variable in each iteration? I only get execution against the default connection.

    ReplyDelete
  11. Hi Hari, Just wondering if package configuration would achieve the same goal?

    ReplyDelete
  12. also hari, did you use windows authentication or sql server authentication? I did not find the step which you supplied user id and password to the soucelist.

    Thanks

    ReplyDelete
  13. Didn't see the Parameter Mapping in the ForEachLoop container to map the Connection String sitting in the object to the USER::ConnString variable. Without it it will continue to use the default connection string for each loop instead of changing it.

    ReplyDelete
  14. I m using for Flat file
    while setting variable to connection string validation getting fail , so am not able to move forward
    is it possible in case of flat files ?

    ReplyDelete
  15. Sorry for the slow response Jeff, but I did not get your question. Point 4 clearly defines how to configure connection manager called DynamicConnection.

    ReplyDelete
  16. SaS, Are you looking for a solution something like this:
    http://sql-bi-dev.blogspot.com/2010/06/foreach-loop-container-in-ssis.html

    ReplyDelete

Note: Only a member of this blog may post a comment.