Monday, February 20, 2012

Online Data Transform

Dear All,

I am new to SSIS, and just watched some of the webcasts and take a look at SSIS Hands on Lab on the MSDN site.

I want to handle an online data transformation scenario which has some data mappings between. I want the SSIS to add each new record in source table, insert in the destination tables online. If this is possible, please let me know where can I find a sample package and configurations.

Happy new year,
Sassan

What is an "online data transformation"?

What do you mean by "insert into the destination table online"?

Do you mean that you want each record in a source table to be inserted into a destination table? If so, what exactly are you having trouble achieving?

Please be more specific.

-Jamie

|||

Dear Jamie,

Thanks for taking time to answer my question.

Your assumption from my scenario is right. I just wanted to know whether SQL 2005 support this scenario and if yes, I need a sample package and its configuration.

Look forward to hearing from you ASAP.

Regards,
Sassan

PS. I use BizTalk Server 2004 now and I want to change it using SSIS

|||

There are lots of SSIS samples provided with SQL 2005, take a look.

The task you are describing seems to fit exactly Import/Export Wizard goals - just run the wizard, follow it, and it will create the package you need.

If needed, you may then open this package in BI Dev Studio and modify it further.

|||

Dear Michael

As I am working with the BI Dev Studio and ran some of the Hands On Labs, I see that after I create the package, I must run it (Or schedule to run it regurarly) and then all the data wil be transformed by the rules I set.

I need something like trigger in order my package run by each record addition in source.

Therefore I need a sample for a scenario I explained.

Regards,

Sassan

|||

So, you mean you want the SSIS package to execute whenever a certain event occurs? Well, you can execute a SSIS package using a command line utility called dtexec.exe. It is up to you how you capture the events that you need trigger the call to dtexec.exe

-Jamie

|||

Dear Jamie,

You are right. I have two questions which needs help:

1- I have an application server which stores data via its Data Access Layer and My DB Server is not on the same server, Therefore is there another way except using triggers?

2- How can I call the package for transforming just the new record that I want to be added in the old system. I see that running package cause all data to be transformed. The only solution I see is that I set a flag in the source record after transformation and then in my package use a conditional block to just transform the records with the flag unset.

Look forward to hearing from you ASAP.

Regards,
Sassan

|||

sassankz wrote:

You are right. I have two questions which needs help:

1- I have an application server which stores data via its Data Access Layer and My DB Server is not on the same server, Therefore is there another way except using triggers?

Don't really understand what you're trying to do. When a record arrives in your "data access layer" (which I presume is a database), do you want to automatically put it into your DB server? If so....SSIS is NOT the tool that you should be using to do this. SQL Server has a number of methods of achieving this. Look at replication, log shipping or data mirroring.

sassankz wrote:

2- How can I call the package for transforming just the new record that I want to be added in the old system. I see that running package cause all data to be transformed. The only solution I see is that I set a flag in the source record after transformation and then in my package use a conditional block to just transform the records with the flag unset.

If you are going to continue using SSIS (which I don't think you should) then that sounds like a possible solution.

-Jamie

|||

Dear Jamie,

Let me explain the whole scenario. I am developing a Human Resource Management System for my client. Therefore I have my database design based on the customer requirements. The client has a payroll application which is working now and will continue to use it also. Therefore only HRM handles the Employee Introduction and I must feed the Payroll required data in order to work properly parallel with HRM.

Now there exists some data transformation from HRM to Payroll and also the data must be synchronized. It means that if I even can delete an employee from HRM, for sure I can do that it in payroll and must do that.

Now I want to have SSIS for this scenario. I want to develop some packages which handles the transformation and like you said some how data mirroring.

Please let me know whether you see SSIS useful for the above task or not.

Regards,
Sassan

|||

SSIS is inherently a tool for batch operations. If you want the changes between databases to be propogated immediately then you should be loking at some other tool in my opinion.

If the data has to be transformed in some way then perhaps you may want to look at BizTalk. I don't know much about BizTalk but this seems to be the sort of scenario for which it is designed.

-Jamie

|||

Dear Jamie,

Thanks for your recommendations. I decided to schedule package runs.

Is there a solution in SSIS in order to update destination source if the row exists, since it truncate the whole table or append data in its wizard? (I know I can write SQL Codes, I just want to know whether it has a ready configuration to support this.)

Regards,
Sassan

|||

Sassan,

SSIS is really not good in updates... Doing that in SQL is much better... So it might be a good idea to design something like an "update table" which is filled within the data flow where you "buffer" all the changes and commit that updates to the "real" table using a SQL update...

|||

Sassan,

just to add a little thing...

I just wrote something about the updates on the whislist maintained by Kirk in this forum... So I thought about how to implement that "update table"...

There might be (at least) to approaches, one is "clean" and one is "dirty", but might be a little bit more performant...

The "clean" way would be using a lookup to find out if a record already exists in the destination. If it does, then it's an update (redirect the record to the update table), if not it's new and you can send it to the destination.

The "dirty" way would be to send all records to the destination. If the write fails (because the primary key already exists) you redirect the error records to the update table. I didn't try this approach, yet. But it might work...

You might get into trouble if you can have a record more than once in a run. Then it might be a "new" and an "update" in the same batch. That might be interesting to find out...

No comments:

Post a Comment