Dan McKinley
Math, Programming, and Minority Reports

@mcfunley.com

DataSetSurrogate Remoting Sink
July 4th, 2005

I have received a trickle of requests for some code that I have written and alluded to here and here: namely, a custom Remoting sink that swaps DataSets for DataSetSurrogates as they pass by.

Unfortunately, it would not be legal for me to release this code. Sorry. However, I think I can give a general outline of it without giving away the farm.

The first thing I should say is that the cure for your remoting performance woes is probably not the DataSetSurrogate in every case. (Review what I have said here and here for more information.)

If it’s at all practical, my recommendation to you is to simply avoid using DataSets in n-tier situations.

Still here? Great.

Ok, with the disclaimers out of the way, the code works in two pieces: the remoting sinks and the ISerializationSurrogate. The serialization surrogate is relatively easy to implement, if you’ve gotten this far.

The sinks are a little more clever. I wanted to continue using the framework BinaryFormatter sinks, since they added a lot of things to the mix that I didn’t feel like re-implementing.

However, there’s no way for you to modify how the BinaryFormatterClient/Server sinks do their serialization. More explicitly, there’s nowhere to plug in the ISerializationSurrogate object that you’ve written.
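For contrast, here is roughly what "plugging in" a surrogate looks like when you own the formatter yourself. This is a minimal sketch rather than the code described in this post: `SurrogateFormatterFactory` and its `Create` method are hypothetical names, the surrogate passed in is whatever `ISerializationSurrogate` you have written for `DataSet`, and it assumes the .NET Framework, where `BinaryFormatter` is available.

```csharp
using System.Data;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper for illustration only; not from the original code.
public static class SurrogateFormatterFactory
{
    // dataSetSurrogate: your own ISerializationSurrogate for DataSet.
    public static BinaryFormatter Create(ISerializationSurrogate dataSetSurrogate)
    {
        SurrogateSelector selector = new SurrogateSelector();

        // Ask the formatter to hand DataSet instances to the surrogate,
        // whatever the streaming context happens to be.
        selector.AddSurrogate(
            typeof(DataSet),
            new StreamingContext(StreamingContextStates.All),
            dataSetSurrogate);

        BinaryFormatter formatter = new BinaryFormatter();
        formatter.SurrogateSelector = selector;
        return formatter;
    }
}
```

The framework formatter sinks create their formatters internally, so there is no equivalent place to assign the `SurrogateSelector` property on them.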

The client and server sinks actually pull the old switcheroo on the messages coming through the pipe. That’s the clever bit.

So I am leaving the framework sinks there, but altering the messages that go through them. The messages they see have a binary representation of the “real” messages appended as a parameter.
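The packing step might be sketched along these lines. This is only an illustration under assumptions: `MessageEnvelope`, `Pack`, and `Unpack` are hypothetical names, the formatter passed in is assumed to already have the surrogate registered on its `SurrogateSelector`, and exactly what gets packed (the whole message, or just its arguments and return value) and how the bytes are attached to the outgoing message are details this post does not spell out.

```csharp
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper for illustration only; not from the original code.
public static class MessageEnvelope
{
    // Client side (or server side, for the reply): turn the "real" payload
    // into bytes using a formatter that knows about the surrogate.
    public static byte[] Pack(object payload, BinaryFormatter formatter)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, payload);
            return stream.ToArray();
        }
    }

    // Opposite side: rebuild the payload from the bytes that rode along
    // as an ordinary byte[] parameter.
    public static object Unpack(byte[] bytes, BinaryFormatter formatter)
    {
        using (MemoryStream stream = new MemoryStream(bytes))
        {
            return formatter.Deserialize(stream);
        }
    }
}
```

To the framework sinks, a `byte[]` is just another serializable argument, so they never see a DataSet and never apply their own DataSet serialization to it.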

That’s the fastest way to write the sinks. If you implement complete formatter sinks, you can accomplish it a little more efficiently.

Hope this helps. Sorry that I can’t be more specific. If you’ve gotten this far without your head exploding, trust me, you’re smart enough to write these classes.

Anyway, the result is significantly faster for many kinds of data, though not for all of them.
