Posts Tagged ‘C#’

Native Posix Python Condition Implementation

So I wrote this replacement version of Condition using the native posix support. Event and Semaphore are both written in terms of Condition, so you can use this as a fast route to getting native versions of those synchronization primitives. (Note, though, that there is a native posix semaphore, so implementing it in terms of a condition variable is not really necessary.)

I have trained myself to ask, "why hasn't anybody done this before?" when writing this sort of thing. And as always there's a really good reason: for a percentage of applications that is probably very close to one hundred percent, the difference in performance between this and the pure python version (which, at least for timed waits, is implemented using polling) is not going to amount to a hill of beans. That was indeed the case for the application where I was trying to put this to use, and I reverted to the much simpler python Condition.

But there was no way to know for sure that that was the case without trying this, so if you find yourself in a similar situation, here is the code.

#include "Python.h"
#include "pthread.h"
#include "structmember.h"

typedef struct {
  PyObject_HEAD
  int set;
  pthread_cond_t cond;
  pthread_mutex_t lock;
} ConditionObject;

static int cond_init( ConditionObject* self, PyObject* args,
              PyObject* kwargs );
static void cond_free( ConditionObject* self );
static PyObject* cond_acquire( ConditionObject* self );
static PyObject* cond_release( ConditionObject* self );
static PyObject* cond_wait( ConditionObject* self, PyObject* args );
static PyObject* cond_notify( ConditionObject* self );
static PyObject* cond_notifyAll( ConditionObject* self );

static PyMemberDef cond_members[] = {
  {NULL}
};

static PyMethodDef cond_methods[] = {
  { "acquire", (PyCFunction)cond_acquire, METH_NOARGS, "" },
  { "release", (PyCFunction)cond_release, METH_NOARGS, "" },
  { "wait", (PyCFunction)cond_wait, METH_VARARGS, "" },
  { "notify", (PyCFunction)cond_notify, METH_NOARGS, "" },
  { "notifyAll", (PyCFunction)cond_notifyAll, METH_NOARGS, "" },
  { NULL }
};

static PyTypeObject ConditionType  = {
  PyObject_HEAD_INIT(NULL)
  0,                         /*ob_size*/
  "_pthread_cond.Condition", /*tp_name*/
  sizeof(ConditionObject),   /*tp_basicsize*/
  0,                         /*tp_itemsize*/
  (destructor)cond_free,     /*tp_dealloc*/
  0,                         /*tp_print*/
  0,                         /*tp_getattr*/
  0,                         /*tp_setattr*/
  0,                         /*tp_compare*/
  0,                         /*tp_repr*/
  0,                         /*tp_as_number*/
  0,                         /*tp_as_sequence*/
  0,                         /*tp_as_mapping*/
  0,                         /*tp_hash */
  0,                         /*tp_call*/
  0,                         /*tp_str*/
  0,                         /*tp_getattro*/
  0,                         /*tp_setattro*/
  0,                         /*tp_as_buffer*/
  Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,        /*tp_flags*/
  "",                        /* tp_doc */
  0,                       /* tp_traverse */
  0,                       /* tp_clear */
  0,                       /* tp_richcompare */
  0,                       /* tp_weaklistoffset */
  0,                       /* tp_iter */
  0,                       /* tp_iternext */
  cond_methods,              /* tp_methods */
  cond_members,              /* tp_members */
  0,                         /* tp_getset */
  0,                         /* tp_base */
  0,                         /* tp_dict */
  0,                         /* tp_descr_get */
  0,                         /* tp_descr_set */
  0,                         /* tp_dictoffset */
  (initproc)cond_init,       /* tp_init */
  0,                         /* tp_alloc */
  0,                         /* tp_new */
};

static int cond_init( ConditionObject* self, PyObject* args,
              PyObject* kwargs ) {
  self->set = 0;
  int err = pthread_mutex_init( &self->lock, NULL );
  if( err != 0 ) {
    // pthread functions return the error code rather than setting errno
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return -1;
  }
  err = pthread_cond_init( &self->cond, NULL );
  if( err != 0 ) {
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return -1;
  }
  return 0;
}

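static void cond_free( ConditionObject* self ) {
  pthread_mutex_destroy( &self->lock );
  pthread_cond_destroy( &self->cond );
  // release the object's own memory; without this every instance leaks
  self->ob_type->tp_free( (PyObject*)self );
}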

static PyObject* cond_acquire( ConditionObject* self ) {
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_mutex_lock( &self->lock );
  Py_END_ALLOW_THREADS;
  if( err != 0 ) {
    errno = err;  // pthread functions return the error code directly
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyObject* cond_release( ConditionObject* self ) {
  int err = pthread_mutex_unlock( &self->lock );
  if( err != 0 ) {
    errno = err;  // pthread functions return the error code directly
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

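static PyObject* cond_wait( ConditionObject* self, PyObject* args ) {
  // for now timed waits are not supported (the argument is ignored)
  // `set` acts as a wakeup counter: notify bumps it and a waiter sleeps
  // until it changes. Setting a flag and clearing it again inside notify
  // does not work, because the waiter cannot look at the flag until the
  // notifier has released the lock, by which time it is already cleared.
  int err = 0;
  int seen = self->set;
  while( self->set == seen && err == 0 ) {
    Py_BEGIN_ALLOW_THREADS;
    err = pthread_cond_wait( &self->cond, &self->lock );
    Py_END_ALLOW_THREADS;
    if( PyErr_CheckSignals() ) {
      return NULL;
    }
  }
  if( err != 0 ) {
    errno = err;  // pthread functions return the error code directly
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyObject* cond_notify( ConditionObject* self ) {
  // unsigned arithmetic so the counter wraps instead of overflowing
  self->set = (int)( (unsigned int)self->set + 1u );
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_cond_signal( &self->cond );
  Py_END_ALLOW_THREADS;
  if( err != 0 ) {
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyObject* cond_notifyAll( ConditionObject* self ) {
  self->set = (int)( (unsigned int)self->set + 1u );
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_cond_broadcast( &self->cond );
  Py_END_ALLOW_THREADS;
  if( err != 0 ) {
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}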

static PyMethodDef module_methods[] = {
  {NULL}
};

PyMODINIT_FUNC
init_pthread_cond(void)
{
  ConditionType.tp_new = PyType_GenericNew;
  if( PyType_Ready( &ConditionType ) < 0 ) {
    return;
  }
  PyObject* mod = Py_InitModule( "_pthread_cond", module_methods );
  if( mod == NULL ) {
    return;
  }
  Py_INCREF( &ConditionType );
  PyModule_AddObject( mod, "Condition", (PyObject*)&ConditionType );
}
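To illustrate the remark above that Event is written in terms of Condition, here is a rough Python sketch of a minimal event built on any Condition-like object. The ConditionEvent class and its condition_factory parameter are names made up for this illustration, and the stock threading.Condition stands in for the native type so the sketch runs without the extension. Note that the C type above spells the broadcast method notifyAll, while this sketch calls notify_all, so a thin adapter would be needed to plug the native type in.

```python
import threading


class ConditionEvent:
    """A minimal Event written in terms of a Condition-like object (illustrative)."""

    def __init__(self, condition_factory=threading.Condition):
        # condition_factory is a hypothetical hook for swapping in another Condition.
        self._cond = condition_factory()
        self._flag = False

    def is_set(self):
        return self._flag

    def set(self):
        # Plain acquire/release, since the native type has no context-manager support.
        self._cond.acquire()
        try:
            self._flag = True
            self._cond.notify_all()
        finally:
            self._cond.release()

    def clear(self):
        self._cond.acquire()
        try:
            self._flag = False
        finally:
            self._cond.release()

    def wait(self):
        self._cond.acquire()
        try:
            # Re-check the flag after every wakeup; wakeups may be spurious.
            while not self._flag:
                self._cond.wait()
            return True
        finally:
            self._cond.release()


def demo():
    """Start one waiter thread, set the event, and report what the waiter did."""
    event = ConditionEvent()
    results = []

    def waiter():
        event.wait()
        results.append("woke")

    thread = threading.Thread(target=waiter)
    thread.start()
    event.set()
    thread.join(timeout=5)
    return results
```

The flag is the real state here; the condition is only used to sleep until the flag might have changed, which is why the waiter loops on it.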

Crazy Hacker Tricks

Let me be the first to say that this is probably a really bad idea, unless you are very desperate.

On the other hand, this was really fun to write.

Today I wrote some code that patches the first few bytes of user32!MessageBoxExW, in order to keep a pesky third-party library from showing a prompt when nobody was around to click on it. This overwrites the start of that function with a relative jump to one of my own, which then forwards to a managed function that logs what the message box was trying to tell us.

Actually, this was for a friend's program. I would probably shy away from doing this if I weren't only indirectly responsible.

void InstallHook()
{
    HMODULE hUser32 = LoadLibrary(L"user32.dll");
    FailIf(hUser32 == NULL);
    PROC pMsgBox = GetProcAddress(hUser32, "MessageBoxExW");
    FailIf(pMsgBox == NULL);

    // Need to change the memory protection for the
    // (read/execute) page in user32.dll before we
    // write to it. Keep the execute bit set so other
    // threads can still run code in this region.
    MEMORY_BASIC_INFORMATION mbi = {0};
    FailIf(VirtualQuery(pMsgBox, &mbi,
        sizeof(MEMORY_BASIC_INFORMATION)) == 0);
    FailIf(!VirtualProtect(mbi.BaseAddress, mbi.RegionSize,
        PAGE_EXECUTE_READWRITE, &mbi.Protect));

    // Copy in the jump instruction... you could of course
    // opt to not write this in assembly, but if we've gone this
    // far, what's the difference?
    __asm
    {
         // edi <- user32!MessageBoxExW
        mov edi, pMsgBox;              

        // write the opcode for JMP rel32
        mov byte ptr [edi], 0xe9;       

        // eax <- _MessageBoxEx, then
        // eax <- _MessageBoxEx - user32!MessageBoxExW
        mov eax, offset _MessageBoxEx;
        sub eax, edi;                   

        // One byte for jmp, four for the address.
        sub eax, 5;
        inc edi;
        stosd;
    };

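    // Make sure no stale copy of the old bytes gets executed
    FlushInstructionCache(GetCurrentProcess(), (LPCVOID)pMsgBox, 5);

    // Restore the old protection
    DWORD oldProtect = 0;
    FailIf(!VirtualProtect(mbi.BaseAddress,
        mbi.RegionSize, mbi.Protect, &oldProtect));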
}

With that in place it's not hard to erect some managed scaffolding around it, and nobody needs to know your secret. Until they try to run it on an x64, I guess:

static void Main(string[] args)
{
    HookMonkey.Install(delegate(MessageBoxInfo mbi)
    {
        Console.WriteLine(mbi.Text);
        return DialogResult.OK;
    });
    MessageBox.Show("foo bar");
}

You can download some sample code here.
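The displacement arithmetic in the assembly above is the part that is easiest to get wrong, so here is a small Python sketch of just that calculation. It only builds the five bytes in memory and does not touch any process. The function names and the addresses in the usage note are made up for illustration.

```python
import struct

JMP_REL32_OPCODE = 0xE9   # opcode for JMP rel32
JMP_REL32_LENGTH = 5      # one byte for the opcode, four for the displacement


def encode_jmp_rel32(patch_address, target_address):
    """Return the 5 bytes of a JMP rel32 placed at patch_address that lands on target_address."""
    # The displacement is relative to the address of the *next* instruction,
    # which is why 5 is subtracted. Mask to 32 bits so backward jumps wrap.
    displacement = (target_address - (patch_address + JMP_REL32_LENGTH)) & 0xFFFFFFFF
    return bytes([JMP_REL32_OPCODE]) + struct.pack("<I", displacement)


def decode_jmp_rel32(patch_address, instruction):
    """Inverse of encode_jmp_rel32: recover the jump target from the 5 bytes."""
    if len(instruction) != JMP_REL32_LENGTH or instruction[0] != JMP_REL32_OPCODE:
        raise ValueError("not a JMP rel32 instruction")
    (displacement,) = struct.unpack("<i", instruction[1:])
    return (patch_address + JMP_REL32_LENGTH + displacement) & 0xFFFFFFFF
```

For example, encode_jmp_rel32(0x1000, 0x2000) gives the bytes e9 fb 0f 00 00, and decoding those bytes at 0x1000 gives back 0x2000.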

TLB to XML

I wrote a small program that generates xml from a type library. The type library can be a .tlb, or embedded as a resource in a PE (.dll, .ocx, .exe, etc).

I'm using this as a build tool–basically it's the glue that makes a big project using C#, C++, VB6, WiX and NAnt hold together.

The source can be downloaded from this link. This doesn't grab everything from the tlb, but it's pretty simple and not hard to extend.

/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

	Copyright (c) 2005, Dan McKinley
	All rights reserved.

	Redistribution and use in source and binary forms, with or without
	modification, are permitted provided that the following conditions
	are met:

	-	Redistributions of source code must retain the above copyright notice,
		this list of conditions and the following disclaimer. 

	-   Redistributions in binary form must reproduce the above
		copyright notice, this list of conditions and the following
		disclaimer in the documentation and/or other materials
		provided with the distribution.

	-	The name of Dan McKinley may not be used to endorse or promote products
		derived from this software without specific prior written permission. 

	THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
	CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
	WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
	OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
	DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS
	BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
	EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
	TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
	DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
	ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
	TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
	THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
	SUCH DAMAGE.

 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

using System;
using System.Collections.Generic;
using System.Text;
using isvc = System.Runtime.InteropServices;
using System.IO;
using System.Xml;
using System.Runtime.InteropServices.ComTypes;

namespace tlbspit
{
	class Program
	{
		[isvc.DllImport("oleaut32.dll", PreserveSig = false)]
		static extern void LoadTypeLib(
			[isvc.MarshalAs(isvc.UnmanagedType.LPWStr)] string szFile,
			[isvc.MarshalAs(isvc.UnmanagedType.Interface)] ref ITypeLib pLib);

        static int Main(string[] args)
        {
            if (args.Length == 0)
            {
                Usage();
                return -1;
            }
            try
            {
                string path = string.Join(" ", args);
                using (XmlTextWriter w = new XmlTextWriter(Console.Out))
                {
                    w.Formatting = Formatting.Indented;
                    Run(path, w);
                }
            }
            catch (Exception e)
            {
                // Errors go to stderr so they don't get mixed into the xml on stdout
                Console.Error.WriteLine(e.ToString());
                return -1;
            }
            return 0;
        }

        static void Run(string path, XmlWriter w)
        {
            w.WriteStartElement("library");
            ITypeLib tlb = null;
            LoadTypeLib(path, ref tlb);
            WriteTlbAttribs(tlb, w);
            int ct = tlb.GetTypeInfoCount();
            for (int i = 0; i < ct; i++)
            {
                ITypeInfo t = null;
                tlb.GetTypeInfo(i, out t);
                WriteType(t, w);
            }
            w.WriteEndElement();
        }

		static void WriteType(ITypeInfo t, XmlWriter w)
		{
			w.WriteStartElement("type");
			ITypeInfo2 t2 = (ITypeInfo2)t;
			TYPEKIND k;
			t2.GetTypeKind(out k);
			w.WriteAttributeString("name", NameOf(t));
			w.WriteElementString("guid", GuidOf(t));
			w.WriteElementString("kind", k.ToString());
			w.WriteEndElement();	// type
		}

		static unsafe string GuidOf(ITypeInfo t)
		{
			IntPtr pAttr = IntPtr.Zero;
			try
			{
				t.GetTypeAttr(out pAttr);
				TYPEATTR* attr = (TYPEATTR*)pAttr;
				Guid g = attr->guid;
				return g.ToString();
			}
			finally
			{
				if (pAttr != IntPtr.Zero)
					t.ReleaseTypeAttr(pAttr);
			}
		}

		static string NameOf(ITypeInfo t)
		{
			string name = null, doc = null, hlpfile = null;
			int i = 0;
			t.GetDocumentation(-1, out name, out doc, out i, out hlpfile);
			return name;
		}

		static string NameOf(ITypeLib tlb)
		{
			string name = null, doc = null, hlp = null;
			int hc = 0;
			tlb.GetDocumentation(-1, out name, out doc, out hc, out hlp);
			return name;
		}

		static unsafe void WriteTlbAttribs(ITypeLib tlb, XmlWriter w)
		{
			IntPtr pAttr = IntPtr.Zero;
			tlb.GetLibAttr(out pAttr);
			try
			{
				TYPELIBATTR* attr = (TYPELIBATTR*)pAttr;
				w.WriteAttributeString("guid", attr->guid.ToString());
				w.WriteAttributeString("version",
					string.Format("{0}.{1}", attr->wMajorVerNum, attr->wMinorVerNum));
				w.WriteAttributeString("name", NameOf(tlb));
			}
			finally
			{
				if (pAttr != IntPtr.Zero)
					tlb.ReleaseTLibAttr(pAttr);
			}
		}

		static void Usage()
		{
			Console.WriteLine(@"
TLBSPIT by Dan McKinley

  Writes out (some of) the attributes of a type library as xml to
  standard output.

  Usage: tlbspit.exe <type library file>

  The file can be a dll/ocx/etc. or a tlb. It can optionally include a
  backslash followed by the number of a resource.

  Examples:
	tlbspit foo.tlb
	tlbspit bar.dll\3
");
		}
	}
}

Wheels I Have Wasted Time Re-inventing

This site has been silent for quite a while. My lame excuse for this is as follows.

Sometime around the beginning of the summer I started trying to learn functional programming in earnest. I had some exposure to this in school (ML, if I remember correctly), but not nearly enough, as it turns out. C#'s anonymous methods provided the "eureka" moment that made me want to revisit the topic. I had also been asked to contribute to The Hub, but I had felt like I was lost in the world of F# and that fizzled out altogether (maybe now I might have a few things to say).

So that's my story–I have been exploring strange new(-ish) topics and I've found that I haven't had anything insightful to contribute to the world.

Recently I've been trying to complete a few large projects in Common Lisp. I'm pleasantly surprised to find myself saying that most or all of the highfalutin crap that is spouted about Lisp is true. As to why it isn't the most-used language in the world right now, I am too much of a novice to speculate. But purely from a language standpoint it is certainly making all of the others feel, well, tedious.

Lisp seems to supersede every single neat trick I have ever come up with programming professionally. It either contains them outright, or it makes them trivial through macros or completely unnecessary for some other reason. Let me give you one simple example.

Writing applications in C#, I've occasionally had a singleton or some kind of global object that I wanted to be redefined within a certain call stack. Let's say I have this API:

class Foo
{
    public static Foo Default { get; }
}

And I want this to be redefined for a little while as I call some method called Bar(). Impersonation of users is one concrete example of where this is useful.

My neat-trick solution to this problem is to make the implementation of Default use thread local storage (and maybe fall back on a static instance too), then create a "scope" class that allows me to write code that looks like this:

// Original value of Foo.Default out here
using(FooScope s = new FooScope(someOtherFoo))
{
    // In here, Foo.Default gives someOtherFoo
    Bar();
}
// Original value of Foo.Default out here

(I am skipping over a lot of irritating details here and let me be clear, I don't claim to have invented this. I just mean that I burned a lot of hours arriving at it basically independently of the ten million other programmers who probably recognized the same need and did the same thing. It is purely my own ignorance and stupidity that made "re-discovering" it necessary.)
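For the curious, the scope trick described above (thread-local storage with a fallback to a static instance) can be sketched roughly like this in Python. This is not the C# implementation, just an illustration of the idea; foo_scope, the default() classmethod and the name attribute are stand-ins invented for the sketch.

```python
import threading
from contextlib import contextmanager


class Foo:
    """Hypothetical stand-in for the Foo class in the post."""

    _static_default = None        # process-wide fallback instance
    _local = threading.local()    # per-thread override slot

    def __init__(self, name):
        self.name = name

    @classmethod
    def default(cls):
        # Prefer this thread's override, else fall back to the static instance.
        override = getattr(cls._local, "override", None)
        return override if override is not None else cls._static_default


@contextmanager
def foo_scope(replacement):
    """Rebind Foo.default() for the current thread until the block exits."""
    previous = getattr(Foo._local, "override", None)
    Foo._local.override = replacement
    try:
        yield replacement
    finally:
        # Restore the previous binding even if the body raises, so scopes nest.
        Foo._local.override = previous


Foo._static_default = Foo("original")


def bar():
    return Foo.default().name
```

Inside a with foo_scope(Foo("other")) block, bar() returns "other" on the current thread, while other threads keep seeing "original". Saving and restoring the previous override is what lets the scopes nest.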

Anyway, I thought that was clever at the time. But with dynamic scope (as opposed to global scope, which is what static fields have in C#) Lisp has this as part of the language. Writing this:

(let ((*default-foo* some-other-foo))
  (bar))

Is basically the same thing with no effort. The Lisp syntax has the added advantage of working with objects other than the ones I author myself.

This reminds me of the first time I was exposed to .NET events. I realized that I had implemented the same god-damned thing in other languages a handful of times (albeit as a less-seasoned programmer) and it never worked as beautifully or as simply.

Naturally, CLOS seems to have some features that I now realize I have been mimicking imperfectly with events. It also has features that will probably replace events in many of the problems where I have applied them. That is an altogether different story.

First Impressions of CCR

I got a chance to mess around with the Concurrency and Coordination Runtime (CCR) bits recently. Before I get into that, first check out this real-life code I wrote this week.

public static class NotificationQueue
{
    private static Queue<Notification> _queue;
    private static Semaphore _work;

    static NotificationQueue()
    {
        _queue = new Queue<Notification>();
        _work = new Semaphore(0, int.MaxValue);
    }

    public static void Enqueue(Notification n)
    {
        lock (_queue)
        {
            _queue.Enqueue(n);
        }
        _work.Release();
    }

    public static Notification Dequeue()
    {
        _work.WaitOne();
        lock (_queue)
        {
            return _queue.Dequeue();
        }
    }
}

The idea here is to post notifications to the queue from many threads, and have a single notification thread sending the messages out from something like this:

private static void NotifyThreadProc()
{
    while(!Abort())
        NotificationQueue.Dequeue().Send();
}
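For readers who don't have a C# compiler handy, the same shape (a lock around the queue, a semaphore counting pending items, many producers and one sender) looks roughly like this in Python. This is only a sketch: the STOP sentinel takes the place of the Abort() check, appending to a list takes the place of Send(), and run_demo and its parameters are made up for the illustration.

```python
import threading
from collections import deque


class NotificationQueue:
    """Lock-protected queue with a semaphore counting the items waiting in it."""

    def __init__(self):
        self._queue = deque()
        self._lock = threading.Lock()
        self._work = threading.Semaphore(0)   # starts at zero: nothing queued yet

    def enqueue(self, item):
        with self._lock:
            self._queue.append(item)
        self._work.release()                  # one more item is available

    def dequeue(self):
        self._work.acquire()                  # block until an item is available
        with self._lock:
            return self._queue.popleft()


def run_demo(producers=4, per_producer=25):
    """Post from several threads and drain from a single notification thread."""
    queue = NotificationQueue()
    sent = []
    STOP = object()   # hypothetical sentinel standing in for Abort()

    def notify_thread():
        while True:
            item = queue.dequeue()
            if item is STOP:
                return
            sent.append(item)                 # stand-in for n.Send()

    def producer(producer_id):
        for i in range(per_producer):
            queue.enqueue((producer_id, i))

    consumer = threading.Thread(target=notify_thread)
    consumer.start()
    workers = [threading.Thread(target=producer, args=(p,)) for p in range(producers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    queue.enqueue(STOP)                       # queued after all real items
    consumer.join()
    return sent
```

The semaphore is released outside the lock, as in the C# version, so the consumer never wakes up only to block on the lock the producer still holds.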

The documentation for CCR is pretty sparse at this point, limited to just this paper [pdf], the Channel9 video, and this Wiki. Some of the particulars seem to have changed significantly, but I was able to figure out how to replicate my explicitly-threaded notification queue:

public class NotificationService : CcrServiceBase
{
    private Port<Notification> _port;

    public NotificationService()
        : base(new DispatcherQueue("foo"))
    {
        _port = new Port<Notification>();
        Activate(Arbiter.Receive(true, _port,
        delegate(Notification n)
        {
            n.Send();
        }));
    }

    public void Post(Notification n)
    {
        _port.Post(n);
    }
}

That's pretty awesome: notice that no explicit locks or waits are necessary, and I don't need to write the NotifyThreadProc or the code required to start it. I just need to make a NotificationService and start posting to it.

I can't say it's the most immediately comprehensible API I've ever seen, but hopefully that will change with more documentation and some polish. It's also possible that I am just warped from years of using the lower-level concepts. Overall this is awfully impressive for a library.

Debugging and Symbols FAQ

This is something I typed up internally, to help resolve the confusion that precipitates when Visual Studio begins stepping through comments. Hopefully this will be helpful to someone else.

What is required for source mode debugging?

  • Binaries (.dll, .exe, .ocx, …).
  • Symbols (the .pdb file).
  • Source files.

The symbol files tell your debugger how points in the binary relate to the source files. When you set a breakpoint in a source file, the debugger uses the symbol file to look up the corresponding instruction in the binary.

I have all of these things. Why is my debugger behaving strangely or not working at all?

Well, there are a bunch of different things that can go wrong.

Your symbols might be mismatched.

You can't (as an example) use the symbols for version 2.0 of a DLL to debug version 1.0 of that DLL. Perhaps, in this case, you have copied a new binary into your run directory without copying the corresponding .pdb file. It's also possible that your IDE or compiler did that for you.

Some debuggers, like WinDbg, will NOT load a mismatched PDB unless you force this to happen. Others, like Visual Studio 2003, will do this without so much as a complaint. That can be very confusing in some situations.

The module you are trying to debug is not loaded into the process.

If you are attempting to hit a breakpoint, a good sanity check is to make sure that the module you are trying to debug is being loaded. If it isn't, you are experiencing something other than a PDB problem. For example, your application could be experiencing an exception before your code is called. Make sure you have "break on exceptions" turned on.

In Visual Studio, you can look at the "modules" window to make sure your binary is loaded. The corresponding command in WinDbg is lm.

You are attached with the wrong debugger.

This is another non-symbol issue. There are many different kinds of debuggers:

  • Native debuggers: cdb, ntsd, windbg, Visual Studio.
  • CLR debuggers: cordbg, mdbg, deblector, Visual Studio.
  • Oddballs: the VB6 IDE, VS.NET's script debugger, etc.

You should make sure you are attaching to or starting your process using the right kind of debugger. Visual Studio gives you a dialog to choose the appropriate debugger(s).

The source file is not recognized.

Say I build a .dll from c:\foo\x.cpp, and you take that library from me and try to set a breakpoint in c:\bar\x.cpp. You might need to tell the debugger that your source path is different. Visual Studio is usually pretty good at figuring this out on its own and/or asking you when it is having a problem.

The _NT_SOURCE_PATH environment variable is honored as a search path for source files by most of the Windows debuggers. The .srcpath command in WinDbg/CDB can be used to display and change the source search path in those debuggers.

The symbols are corrupt.

This is rare, but it can happen. The fix is usually to just rebuild the binary and regenerate the PDB.

The binary is not debuggable.

.NET assemblies have the added limitation that they must be built in /debug mode for conventional CLR debugging to work. You can tell if a module is debuggable by opening it up in Reflector or ILDASM. You should see this attribute:

[assembly: Debuggable(true, true)]

Where does the debugger get the symbols?

Most Windows debuggers work like this:

  • The folder containing the binary is searched for a matching PDB.
  • The debugger's symbol search path is then searched. Most Windows debuggers (including Visual Studio) take their default search path from the _NT_SYMBOL_PATH environment variable.
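The two-step search order above can be sketched in Python as a function that lists the places a PDB would be looked for. This is a simplification for illustration, not how any particular debugger is implemented: it assumes the PDB has the same base name as the binary, it assumes the symbol path is a semicolon-separated list of plain directories, and it simply skips entries containing an asterisk (such as symbol server entries). The function name and the paths in the usage note are made up.

```python
import ntpath   # Windows path rules, usable on any platform


def candidate_pdb_paths(binary_path, nt_symbol_path):
    """List PDB locations in search order: the binary's folder, then the symbol path."""
    # Assumption: the PDB shares the binary's base name.
    pdb_name = ntpath.splitext(ntpath.basename(binary_path))[0] + ".pdb"

    # 1. The folder containing the binary.
    candidates = [ntpath.join(ntpath.dirname(binary_path), pdb_name)]

    # 2. Each plain directory on the symbol search path, in order.
    for entry in nt_symbol_path.split(";"):
        entry = entry.strip()
        if not entry or "*" in entry:
            continue   # empty entry, or a symbol-server style entry this sketch ignores
        candidates.append(ntpath.join(entry, pdb_name))
    return candidates
```

For example, with a binary at C:\app\bin\foo.dll and a symbol path of C:\symbols;D:\more, the candidates are C:\app\bin\foo.pdb, C:\symbols\foo.pdb and D:\more\foo.pdb, in that order.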

How can I tell where the debugger is getting its symbols?

In WinDbg/CDB, the !sym noisy command can be used to show verbose output when the debugger is trying to resolve symbols. Visual Studio 2005 shows the symbol path as a column in the Modules window.

I am not aware of a simple way to identify where Visual Studio 2003 is loading its symbols from. The easiest way might be to use the handle utility from Sysinternals:

C:\src>handle foo.pdb

Handle v3.01
Copyright (C) 1997-2005 Mark Russinovich
Sysinternals - www.sysinternals.com

devenv.exe      pid: 4832   C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\Temp
orary ASP.NET Files\someapp\dbf846f9\1ab0397c\assembly\dl2\c33a0bcb\eca89e48_3
b78c601\foo.PDB

In this case ASP.NET has copied the binary and the PDB to a temporary location. This is another way a PDB can wind up being mismatched.

How are PDBs matched to binaries?

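The short version: the PDB contains a signature (a GUID in current PDB formats, a timestamp in older ones) and an age counter, and these are matched against the values recorded in the binary's debug information.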

How can I tell if a PDB matches a binary?

The symchk utility that comes with the Debugging Tools for Windows can be used to do this. There are many options for this utility, so you should read the output of symchk /?.

Pointless Minesweeper Source Code

The source code for my Pointless Minesweeper Clone can be found here. Enjoy it.

The license is BSD (you may know this as the "hey, go nuts" license).

Pointless Minesweeper Clone

Minesweeper++

My roommate, a grizzled veteran of Cornell's CS 211, made an announcement a week or two ago. It was something along the lines of, "gee whiz, I'm glad they didn't make us write minesweeper as a project. It looks impossible." To prove my vast intellectual superiority I made a winforms clone in about six or seven hours, including the time it took to draw the graphics and perform extensive "QA."

You can start the clickonce app from this link.

Screenshot:

Minesweeper++ screenshot

The source will be forthcoming. If you find bugs and so forth, feel free to contact me. The rendering is a bit on the slow side for my tastes, so you needn't mention that.

Debugger Exercise: Displaying a Function’s Return Value

I found myself needing to automatically manipulate the return value of a managed function in release code today. I thought this would make an interesting little writeup.

For the purposes of the demonstration, let's use this example program:

class Program
{
    static void Main()
    {
        while (!Console.KeyAvailable)
        {
            Console.WriteLine(FooBar(false));
            Console.WriteLine(FooBar2());
        }
    }

    static string FooBar(bool b)
    {
        if (b)
        {
            return "b";
        }
        return "a";
    }

    static string FooBar2()
    {
        return FooBar(true);
    }
}

The function we're interested in will be FooBar. I've made things interesting by calling the function from two places–let's assume we're not sure where the function is called from. We'll make things more interesting still by accepting the limitation that we won't cheat and set a breakpoint on the ret instruction in FooBar. The function I needed to instrument today was a little monolithic, so let's take it as a given that we need to be able to handle a function that might be long and might have more than one exit point.

Compile this program and start it in WinDbg. Start by loading SOS:

0:003> .loadby sos mscorwks

The first thing we'll do is set a breakpoint on FooBar. (I'm going to do this manually so that these steps will work with the CLR v1.1. The new version of SOS makes this slightly easier. I'm also going to cover this quickly, so if you want a clearer explanation of what's going on in this next step see Eran Sandler's writeup here.)

To set the breakpoint, let the program run for a second so that the method is compiled by the JIT. Then we need to find and dump out the MethodTable for the class:

0:003> !name2ee retaddr!retaddr.Program
Module: 00912d5c (retaddr.exe)
Token: 0x02000002
MethodTable: 00913140
EEClass: 009112ec
Name: retaddr.Program

0:003> !dumpmt -md 00913140
EEClass: 009112ec
Module: 00912d5c
Name: retaddr.Program
mdToken: 02000002
BaseSize: 0xc
ComponentSize: 0x0
Number of IFaces in IFaceMap: 0
Slots in VTable: 8
--------------------------------------
MethodDesc Table
   Entry MethodDesc      JIT Name
...
00d00070   00913120      JIT retaddr.Program.Main()
00d000d8   00913128      JIT retaddr.Program.FooBar(Boolean)
00d00130   00913130      JIT retaddr.Program.FooBar2()
...

Now that we have the MethodTable, we need to dump out the MethodDesc for FooBar to get the virtual address of the jitted code.

0:003> !dumpmd 00913128
Method Name: retaddr.Program.FooBar(Boolean)
Class: 009112ec
MethodTable: 00913140
mdToken: 06000002
Module: 00912d5c
IsJitted: yes
m_CodeOrIL: 00d000d8

The compiled method begins at m_CodeOrIL (this is called "Method VA" in SOS 1.1). We can set a breakpoint there with "bp 00d000d8".

Here comes the trick. We know that the return address to the method should be on the top of the stack when the method is called. In other words, the ESP register points at the return address. We can set a dynamic breakpoint on that return address using a command like this:

bp 00d000d8 "bp poi(@esp); g"

If you try this, you'll notice that we break right after FooBar has returned to its caller. As with any function obeying one of the standard x86 calling conventions, the return value is stored in the EAX register (here, a reference to the returned string). We can give the dynamic breakpoint a command of its own that inspects this value, like this:

bp 00d000d8 "bp poi(@esp) \"!do -nofields @eax; g\"; g"

If we run the sample program now, we'll see this output repeating itself:

breakpoint 1 redefined
Name: System.String
MethodTable: 790fa3e0
EEClass: 790fa340
Size: 20(0x14) bytes
String: a

breakpoint 2 redefined
Name: System.String
MethodTable: 790fa3e0
EEClass: 790fa340
Size: 20(0x14) bytes
String: b

Voila.
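The nested quoting in that last breakpoint command is easy to get wrong by hand. As a rough sketch, here is a small Python helper that builds the same string from the method address and the command to run on return. The function names are made up, and the helper deliberately refuses commands that themselves contain quotes or backslashes, since the post only shows this one level of escaping.

```python
def quote_command(command):
    """Wrap a command in double quotes, escaping any quotes inside it with a backslash."""
    return '"' + command.replace('"', '\\"') + '"'


def return_value_breakpoint(method_address, on_return):
    """Build the breakpoint-within-a-breakpoint command used in the post.

    method_address: address of the jitted method, as text (e.g. "00d000d8").
    on_return: command to run once the method has returned to its caller.
    """
    if '"' in on_return or "\\" in on_return:
        # Deeper nesting needs escaping rules this sketch does not attempt.
        raise ValueError("on_return must not contain quotes or backslashes")
    # Inner breakpoint: set on the return address at the top of the stack.
    inner = "bp poi(@esp) " + quote_command(on_return + "; g") + "; g"
    # Outer breakpoint: set on the method entry, carrying the inner one as its command.
    return "bp " + method_address + " " + quote_command(inner)
```

Calling return_value_breakpoint("00d000d8", "!do -nofields @eax") produces exactly the command shown earlier.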

The Russian Doll Approach to Web Services

Here's an anecdote for the WTF inbox. I assure you this is very real, but I cannot divulge any of the specifics.

Some time ago a friend of mine was talking to a web services vendor, who was explaining his versioning scheme. The vendor's approach was to make all of the web service functions accept a single parameter describing the function being called, and the version of the function requested. Prototypically:

public object Foo(FunctionCallInfo)

And what is FunctionCallInfo, you ask? Why, it is a strict XML document adhering to a schema that he would provide.

My observation, which my friend also arrived at independently, was that this person was basically creating an implementation of web services inside of web services. So nat'ralists observe, a flea / Hath smaller fleas that on him prey.

XML: the cause of, and solution to, all of your development problems.

What would you do in this situation?