Home Dashboard Directory Help
Search

[SSIS] Consolidate three type systems into one by Jamie Thomson


Status: 

Closed
 as Won't Fix Help for as Won't Fix


39
0
Sign in
to vote
Type: Suggestion
ID: 651324
Opened: 3/14/2011 9:37:00 AM
Access Restriction: Public
0
Workaround(s)
view

Description

SSIS today includes three different type systems:
1) Types used in the dataflow (DT_STR, DT_I4 etc...)
2) Types used for variables (String, Int32 etc...)
3) Types used for Execute SQL Task parameters (VARCHAR, LONG etc...)

This is clearly not an ideal situation because
a) It makes the product more difficult to learn
b) Packages are rife with implicit/undocumented conversions which are liable to lead to errors and are possibly detrimental to performance too

Clearly it would be better to have one heterogenous type system used throughout the product, especially as we constantly use these different type systems in the same place (e.g. Assigning a value to a variable from inside the Execute SQL Task, comparing a dataflow column value to a variable)
Details
Sign in to post a comment.
Posted by Brett Gerhardi on 5/17/2011 at 9:28 AM
A pointer to a website that contains a mapping between the three would be very useful here!

If we can't fix it we might as well reduce the pain levels to minimum. Is there such a thing that exists?
Posted by Michael J. Swart on 4/7/2011 at 9:34 AM
Thanks Jeff, I hope your Microsoft colleagues on other SQL Server teams take your lead in providing feedback on connect items.
Posted by Jamie Thomson on 4/6/2011 at 1:32 PM
Jeff,
Thank you very much for the extremely candid feedback. I knew that, realistically, I was clutching at straws asking for this but figured I would try anyway. Your efforts here are very much appreciated.

Regards
Jamie
Posted by NULLgarity on 4/6/2011 at 1:18 PM
Thank you, Microsoft, for your frank feedback and history lesson below. I wish I had this type of back story to a number of annoyances and perceived issues that I have run into over the years. I have usually assumed they were there and often attempted to guess at what led to current circumstances, but usually assumed I would never truly know.
Posted by Microsoft on 4/6/2011 at 11:58 AM
Thanks for the frank feedback, sometime we need to be reminded to give more details on these things.

In this case, there was some unfortunate timing and bad assumptions between when this bug was filed and when the ‘triage’ happened. Enough excuses on the mechanism, let me get to the excuses on why we have to say “won’t fix”



The three type systems make me crazy too. They make all of the developers working on the guts of SSIS crazy. The history is interesting so I will share it:

The first part of SSIS that was built in the proto-days of 1999 was the replacement for the DTS import export wizard. I know, hard to believe. This was the first ‘host’ for the dataflow pipeline.

The pipeline was built to use the OLEDB type system to make reading and writing from OLEDB really fast. Our internal buffers are really just OLEDB bound memory layouts. The DT_Foo type system is an exact copy of the OLEDB type system with some extra types added for SSIS specific use. That type system is built in deep to the dataflow and all transforms.

Next to arrive was the Runtime, we needed a real host for pipelines and a way to coordinate activities. In those days, the way software was built and extended here at Microsoft was to use COM and OLE Automation friendly interfaces. We expected that most folks that embedded and extended SSIS (then still called DTS) was to use native code and COM interfaces. Naturally, that is how the Runtime was designed and built; the COM type system is pervasive and runs deep inside the native Runtime. This is why we see the VARIANT types with BSTRs, SAFEARRAYS, etc. (as a fun note, the VARIANT type was design to copy the internal type system of Visual Basic, a VARIANT is a Var J )

When it came time to build a designer for these interfaces (and believe it or not, the designer started a year or so after the internals were working) we took a bet on managed code and c#. This was a bet at the time; managed code was new and un-proven. There was a lot of anxiety. The UI is all managed code and so its chock full of the new and fancy CLR type system. We expected that some people might use managed code to host or extend SSIS so a lot of ‘wrappers’ were put together to make that work. Of course the seams on the type system show through.



So here we are with three type systems. Getting rid of the DT_Foo system would mean re-writing the pipeline (probably in managed code) and putting in a big shunting system to maintain backward compatibility with older transforms.

Getting rid of the VARIANT type system would mean re-writing the Runtime (probably managed) and providing some shunting and upgrade system for old packages and tasks.

Re-writing the UI is crazy talk, and would still need all of the back-compat layers put in.

So as much as we hate where we are, fixing this issue is a HUGE undertaking. For instance, it may have been the only work we took on for Denali.

Certainly we thought about it, but the value delivered to customers of this one change is low when compared to all of the other things we could do for the same effort.



What does the future hold? It is hard to imagine us ever investing the effort to fix the SSIS type problem, mostly because of all of the backward compatibility issues that it would create.

As we look to new problem spaces and new ways of solving problem in the cloud world, I expect the CLR type system will become the common language of data movement for us.

We will remember the pain that our short sighted decision making in 2001 unleashed on our devoted developer friends and we will strive to avoid making these mistakes again.



-Jeff Bernhardt

Posted by Russ Loski on 3/19/2011 at 12:39 PM
The different type systems have been confusing from the beginning. Trying to figure out how to map variables to ExecuteSQL parameters is unnecessarily difficult. I have perhaps lost several days of productivity trying to map date variables to their appropriate Execute SQL Parameter type.

At the least, I would appreciate a comment from MSFT explaining why they closed this issue.
Posted by Valentino Vranken on 3/17/2011 at 1:45 AM
I agree with Jamie here: just the fact that you need to know three different data type definition systems to be able to use just one product is plainly confusing!

As several people already mentioned: please explain why the request got closed only 3 hours after it was opened. To be honest, I feel that there's often a lack of communication from Microsoft's side here on Connect. It feels too much like one-way communication which in the end makes people feel like they're wasting their valuable time. So please, try to give more feedback and we're all happy.
Posted by Cliff Buckley on 3/15/2011 at 9:20 AM
I'm going to add to the voices here. Just closing it as "Resolved, won't fix" is not good enough. We all would like to know why it won't be fixed. While it is not a bug, at least considering a facade over top of the data types would make the tool more intuitive vs having to have a mapping of Data Type X maps to Data Type Y for this system. I understand not using the SQL data types as they are different among database venders, but why not use the variable type system throughout? It's clear what the data types are. DT_STR vs DT_WSTR vs String, give me String and let me set a property as to indicate if it's unicode or not.
Posted by Eric Wisdahl on 3/15/2011 at 8:41 AM
Just closing as by design is not really good enough here unless if this was already considered and rejected during the Denali design sessions. If so, please give more information when closing the issue to let us know that it has been at least considered and you aren't just ignoring the suggestions presented.

Thanks!
Posted by Dave Lumley on 3/15/2011 at 2:31 AM
I'm with Jamie on this one - whilest it may not cause an issue, it can certainly complicate matters and cloud the waters as it were.

MS should reall follow the KISS philosiphy ..... Keep It Simple Stupid.
Posted by notalreadytaken on 3/14/2011 at 9:50 PM
This *is* very annoying.

If you won't fix it, I think you should at least make it easy to find a mapping to Database Engine datatypes when having to choose one from another list. There are a few online, but some of them are wrong.

If this is part of the SQL Server, then the database type should be clear, or easily discoverable.

Things like this make SSIS look duct taped on.
Posted by Jamie Thomson on 3/14/2011 at 1:54 PM
Michael,
Just to be clear, I haven't filed it as a bug.
Posted by Jamie Thomson on 3/14/2011 at 1:54 PM
This was closed approximately 3 hours after I raised it, without explanation. I find it hard to believe that in those 3 hours you have given this the due diligence that it warrants. Did anyone even read it before it was closed?

I would appreciate a reply.

thanks
Jamie
Posted by Michael K Campbell on 3/14/2011 at 12:12 PM
This isn't a bug - it's by design.

But the DESIGN SUCKS and Microsoft should REALLY address this issue as it's a very common problem that their customers constantly have to address.
Sign in to post a workaround.