Posts Tagged ‘C#’

C++ and VS2005 Deployment PSA

This is a straightforward fix, but I couldn't find anything about it anywhere else. I have a handful of old ATL projects that have been upgraded over the years from Visual C++ 6.0, through VS2003 and are now being built using Visual C++ 8. I was trying to add them to a deployment project today, and kept running into this error:

Two or more objects have the same target location
('[targetdir]\regsvr32.trg').

regsvr32 error building a migrated C++ project

The issue is that the way libraries are registered has changed over the various incarnations of Visual Studio. My old projects were still using a post-build step to perform registration:

VC++ project using a post-build step to register itself

This apparently can create the output file that the deployment package is complaining about. The fix is to delete the post-build steps and enable the newer registration option (Register Output) on the Linker's General tab.

VC++ project being registered via the new method.

The Worst Possible Way to Handle Exceptions

The worst possible way to "handle" exceptions is to show the user a message box with __FILE__ as the text. This is extremely poor form.

Intel VTune error dialog

Apparently, Intel did not get the memo. Worse, this happens as many as ten times when I open Visual Studio. Why do I feel like I was mistakenly given a debug build of VTune?

Generic and Threadsafe Singleton Implementation

I googled around, and couldn't find a generic singleton implementation that was 1) correct and 2) met all of my needs. This is a clever approach, but unfortunately it is limited to objects that are created by calls to constructors.

So I went ahead and wrote the singleton generic that I fully expect to be included in the next BCL. (Not really. It's completely straightforward. However, the Steelers won the Super Bowl yesterday, so I'm not really in a modest mood.)

I started out by defining a class factory type, which is responsible for creating an instance in a purposefully vague way. I also wrote the default version, which just calls new().

ClassFactory object model

/// <summary>
/// Interface for objects that create instances of
/// another type.
/// </summary>
public interface IClassFactory<T> where T : class
{
        T CreateInstance();
}

/// <summary>
/// A default <see cref="IClassFactory{T}"/> implementation,
/// which uses a parameterless constructor to create the
/// instance.
/// </summary>
public class DefaultClassFactory<T> : IClassFactory<T>
        where T : class, new()
{
        public T CreateInstance()
        {
                return new T();
        }
}

From there, I went for the slam dunk in writing both a singleton class with a class factory and a default version that doesn't require one.

Singleton object model

/// <summary>
/// A base (or helper) singleton class. Defines the
/// singleton instance.
/// </summary>
/// <typeparam name="T">
/// The type of the singleton object.
/// </typeparam>
/// <typeparam name="class_factory">
/// The type of the class factory to use to create an
/// instance of type <typeparamref name="T"/>.
/// </typeparam>
public class Singleton<T, class_factory>
    where T : class
    where class_factory : IClassFactory<T>, new()
{
    private static object _sync = new object();
    private static T _default;

    /// <summary>
    /// Gets the singleton instance.
    /// </summary>
    public static T Default
    {
        get
        {
            EnsureDefault();
            return _default;
        }
    }

    /// <summary>
    /// Ensures that the singleton has been created.
    /// </summary>
    private static void EnsureDefault()
    {
        if (_default == null)
        {
            lock (_sync)
            {
                if (_default == null)
                {
                    CreateDefault();
                }
            }
        }
    }

    /// <summary>
    /// Uses the class factory to create the instance.
    /// </summary>
    private static void CreateDefault()
    {
        class_factory cf = new class_factory();
        T value = cf.CreateInstance();

        // This ensures that writes in the creation of
        // the default instance won't be shuffled beyond
        // the write to _default. Only matters on multiproc
        // machines where the hardware allows this. Does
        // nothing on an x86.
        Thread.MemoryBarrier();
        _default = value;
    }
}

/// <summary>
/// A basic singleton type that can be used with objects
/// that are created with a parameterless constructor.
/// </summary>
public class Singleton<T> :
    Singleton<T, DefaultClassFactory<T>>
    where T : class, new()
{
}

(For more on why I added that call to Thread.MemoryBarrier(), see this writeup.)

Here's a really simple example that does not take advantage of the customizability:

public class Foo : Singleton<Foo>
{
}

If you can't use Singleton as a base class, you have to write a little more code.

public class Foo : Bar
{
    public static Foo Default
    {
        get { return Singleton<Foo>.Default; }
    }
}
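
The ordering concern behind the Thread.MemoryBarrier() call can also be spelled out with explicit acquire/release atomics. What follows is only a rough C++11 analog of the double-checked lock, not a translation of the class above: LazySingleton and Counter are hypothetical names, there is no class factory, and the instance is deliberately never deleted.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical C++11 analog of the double-checked lock in Singleton<T>.
// The release store plays the role of the barrier before "_default = value":
// writes made while constructing T cannot be reordered past the publication.
template <class T>
class LazySingleton
{
public:
    static T& Default()
    {
        // Fast path: an acquire load pairs with the release store below.
        T* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr)
        {
            std::lock_guard<std::mutex> guard(sync_);
            // Second check, under the lock.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr)
            {
                p = new T(); // intentionally leaked in this sketch
                instance_.store(p, std::memory_order_release);
            }
        }
        return *p;
    }

private:
    static std::atomic<T*> instance_;
    static std::mutex sync_;
};

template <class T> std::atomic<T*> LazySingleton<T>::instance_{nullptr};
template <class T> std::mutex LazySingleton<T>::sync_;

// Hypothetical payload type used to exercise the sketch.
struct Counter { int value = 0; };
```

Calling LazySingleton<Counter>::Default() twice should hand back the same object. In real C++11 code a function-local static would usually be simpler; the hand-rolled version is here only because it makes the publication ordering visible.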

Faking Multiple Inheritance with Event Handlers

I'm going to take a page from the Book of Darrel and post some design patterns whenever I realize that I'm using them. I'd like to be careful because I think that there are a few camps when it comes to design patterns:

  1. Design patterns are very important, very specific, and are a great thing to quiz people about in an interview.
  2. Design patterns are more of a feel thing, something that is figured out when you're factoring code. There is no way you will not discover these on your own, if you are experienced and smart.

I lean to the second category. The quizzers tend to be solipsists and may know little else (you'll see the same attitude with plenty of other development topics).

Now that you know how to take my writing about design patterns, let's suppose that you want to make a Windows form that becomes a little translucent when it is inactive. That may or may not be annoying, but there is a very straightforward way to make it happen:

// Form that becomes transparent when it is inactive
public class Foo : Form
{
    private const double c_InactiveOpacity = 0.25d;
    private double _prevOpacity;

    protected override void OnDeactivate(EventArgs e)
    {
        _prevOpacity = this.Opacity;
        this.Opacity = c_InactiveOpacity;
        base.OnDeactivate(e);
    }

    protected override void OnActivated(EventArgs e)
    {
        this.Opacity = _prevOpacity;
        base.OnActivated(e);
    }

    public Foo()
    {
        _prevOpacity = 1.0d;
    }
}

Which works fine, unless you want TWO forms that do that, the second with a special base class not related to the hierarchy of the first.

public class Foo2 : SpecialForm
{
    // criminy
}

This conundrum could be resolved in a few ways:

  1. Ctrl+C, Ctrl+V.
  2. Multiple base classes.
  3. What I am going to call the "Extender Pattern."

If you picked the first option, I think it'd be better for both of us if you unsubscribed from my feed. The second option would work fine, but CLR languages don't support it. Some days this bothers me, but most days I think it's a good thing. Especially when I factor a solution that is just as elegant.

I'm calling that solution the "Extender Pattern" without bothering to find out if anyone has used this term to describe anything else. As I explained above, this is more about artistic sensibility than it is about hard rules and book learning. Not that I'm anti-book learning.

The basic idea is this: you have a set of components that support certain properties, methods, and events. The forms, in this case. You add into the mix a set of small objects whose only responsibility is to wait for events from the components and make very directed changes to them.

So here is such a class for the problem stated above:

// Alters the target form's transparency level
// when it is activated/deactivated.
public class TransparencyExtender
{
    private Form _target;
    private double _prevOpacity;
    private const double c_opacity = 0.25d;

    public TransparencyExtender(Form target)
    {
        _target = target;
        _prevOpacity = target.Opacity;
        target.Activated +=
             new EventHandler(Activated);
        target.Deactivate +=
             new EventHandler(Deactivate);
    }

    void Deactivate(object sender, EventArgs e)
    {
        _prevOpacity = _target.Opacity;
        if (_prevOpacity >= c_opacity)
        {
            _target.Opacity = c_opacity;
        }
    }

    void Activated(object sender, EventArgs e)
    {
        _target.Opacity = _prevOpacity;
    }
}

Given this, we can write our forms very easily:

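public class Foo : Form
{
    private readonly TransparencyExtender _tExt;

    // A field initializer cannot reference 'this', so
    // the extender is attached in the constructor.
    public Foo() { _tExt = new TransparencyExtender(this); }
}

public class Foo2 : SpecialForm
{
    private readonly TransparencyExtender _tExt;

    public Foo2() { _tExt = new TransparencyExtender(this); }
}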

Keep in mind that I am not claiming that this is original. It's really the same as a small secondary base class in C++. However, I think there's value in looking at the same problem in different ways.

In a real application I'm working on, I've written a handful of other window/control styles that can be applied and removed on the fly or from the designer. Again, not original, but I like my little framework. Perhaps I'll post this someday, or turn it into a product.

The more I use event/message-driven programming, the more I like it. I'm using Windows Forms as an example here, but it would be wrong to assume that's the only place this could be applied. I've done the same kinds of things in ASP.NET.

Using LeakDiag to Debug Unmanaged Memory Leaks

I have been getting quite a few google hits for search strings like this:

unmanaged memory leaks windbg

It's the second-most-common combination of search terms, trailing "hank goldberg picks" by a hell of a lot. I don't think the searches are coming from the same demographic. Anyway, I thought I would write up one of the easiest techniques that I'm aware of for debugging a memory leak in unmanaged code. This one doesn't touch WinDbg, but rather uses a few other Microsoft PSS tools specifically built for this purpose.

For this example, I fired up the MFC wizard and created a new scratch application. To that I added some logic to leak roughly 2K of memory every tenth of a second.

#include <vector>

using namespace std;

BEGIN_MESSAGE_MAP(CMainFrame, CFrameWnd)
    ON_WM_CREATE()
    ON_WM_TIMER()
END_MESSAGE_MAP()

// Incredibly stupid memory leak
void CMainFrame::OnTimer(UINT_PTR nIDEvent)
{
    UNREFERENCED_PARAMETER(nIDEvent);

    vector<int>* pvec = new vector<int>();
    for(int i = 0; i < 500; i++)
    {
        pvec->push_back(i);
    }
}

int CMainFrame::OnCreate(LPCREATESTRUCT lpCreateStruct)
{
    // ...
    this->SetTimer(1, 100, NULL);
    return 0;
}
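
For contrast, here is a minimal sketch of the same loop with the leak removed. FillWithoutLeaking is a hypothetical free function standing in for the body of OnTimer, so that it compiles without MFC. It returns the payload size, which is where the "roughly 2K" figure comes from when an int is four bytes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the body of CMainFrame::OnTimer.
// The vector has automatic storage, so its buffer is released
// when the function returns instead of being orphaned.
std::size_t FillWithoutLeaking()
{
    std::vector<int> vec;
    for (int i = 0; i < 500; i++)
    {
        vec.push_back(i);
    }
    // Payload bytes only; the vector's own bookkeeping and any
    // spare capacity are not counted.
    return vec.size() * sizeof(int);
}
```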

Supposing we had not done this on purpose, it would be clear from looking at the process in Perfmon that we were dealing with a memory leak. The Private Bytes counter for this process grows steadily while the application is doing nothing in particular.

Perfmon output for a program leaking memory

The tools that we'll be using to look at this problem are LeakDiag, LDParser, and LDGrapher. You can download them all from ftp://ftp.microsoft.com/PSS/Tools/ (LeakDiag and LDParser are bundled together).

After opening the problem application, start LeakDiag.exe. In Tools->Options, we want to increase the stack depth to the maximum (32). This is because, in an application written in any medium- to high-level language, you are typically pretty far from the actual call to malloc when you are leaking memory.

Adjusting the stack depth

There are a few options available (on the main dialog) for the specific allocator to monitor. Several may generate hits for the same leak (the CRT malloc will ultimately call the NT APIs, for example), but try to pick the one that best describes your application. Click Start and create a few logs as the leak manifests itself. In the MFC application I wrote, the leak is occurring constantly. Your application may need to run for many hours before you can get any worthwhile data.

LeakDiag

After doing this, you can use the LDParser application to open up one of the log files. You'll see something like this:

LDParser

The upper-right pane lists the unique stack traces that were active when the specified allocator was invoked. The list should be sorted by the total amount of data allocated by each. The bottom pane shows the stack trace for the active stack ID. In my case, the stack allocating the most memory is my intentional leak (notice that CMainFrame::OnTimer is in frame ten).

If your situation is more complicated than mine, as it almost certainly will be, there is one other tool you should be aware of. LDGrapher can take a set of logs generated by LeakDiag and generate a set of graphs of allocations over time. Here is the output of my application over a few minutes:

LDGrapher

Each stack ID is represented by a line on the graph. Hopefully, this will help some of you debugging unmanaged memory leaks.

Concurrency Approaches Contrasted

Here are a few roughly-equivalent class declarations using different languages and libraries. I say "roughly equivalent" because the implementation details and performance characteristics may actually be quite different in each case. However, the end result of each is that a function on a member variable is called in a non-blocking fashion, synchronized (presumably against other operations on the same class) with the use of some resource A.


// Vanilla C# with locks
public class FooBar
{
    private object A = new object();
    private T _member = new T();
    private delegate void FooBarDel();

    public void Foo()
    {
        FooBarDel del = new FooBarDel(DoFooBar);
        del.BeginInvoke(
          new AsyncCallback(FooBarDone), del);
    }

    // The worker cannot be named FooBar: a member may
    // not share the name of its enclosing type.
    private void DoFooBar()
    {
        lock(A)
        {
            _member.DoSomething();
        }
    }

    // You can typically get away without this, but
    // the documentation recommends otherwise.
    private void FooBarDone(IAsyncResult r)
    {
        // AsyncState is typed as object, so cast it back.
        FooBarDel d = (FooBarDel)r.AsyncState;
        d.EndInvoke(r);
    }
}

// Comega using a chord join pattern
public class FooBar
{
    private T _member = new T();
    private async A();
    public async Foo() & A()
    {
        _member.DoSomething();
        A();
    }

    public FooBar()
    {
        A();
    }
}

// C# 2.0+ using CCR
public class FooBar
{
    private T _member = new T();
    private Port<int> _p = new Port<int>();

    public void Foo()
    {
        _p.Post(1);
    }

    public FooBar()
    {
        activate(!_p.with(delegate(int i)
        {
            _member.DoSomething();
        }));
    }
}

Looking at the other two examples, I think it's clear that C# as it stands today is lacking. That's not exactly a groundbreaking statement, I realize, but I haven't seen many examples putting all of these together in one spot.

Comega is very elegant in a situation as simple as this. CCR also seems less mistake-prone than the plain C# version, but the example I picked isn't where CCR really shines.

(One thing I should say is that I'm not sure that the CCR example is correct or even compiles, since the library isn't public yet. I worked this out from reading the whitepaper [pdf]. It's only a few pages, so I recommend giving it a read.)

The goal of CCR has been to allow complex constructs such as this (also from the whitepaper) to be made easily, and dynamically if necessary:


    // The finish operation or the update operation
    // execute with exclusive control, whereas the
    // GetState and QueryState operations execute
    // concurrently. Operations of either type are
    // interleaved safely (using the '^' operator).
    activate(exclusive(p.with(DoneHandler),
        !p.with(UpdateState))
        ^
        concurrent(!p.with(GetState),
        !p.with(QueryState))
    );

Comega, on the other hand, provides static language features. The good news is that since CCR is just a library, it would be usable from a future C# (4.0, one would hope) that incorporated some of the ideas in Comega at the language level for the easier cases.

I had thought it was a shame that none of the Comega concurrency constructs had made it into the 3.0 C# specification. However, with CCR I think I can see why this is the case. CCR, or something like it, should offer a lot of power and flexibility well within the 3.0 timeframe without taking the risk of baking the 'wrong' keywords into C# itself.

Wrote a Concurrency Library for Everyone

Herb Sutter says the concurrency revolution is coming. This guy says, “*** you!! the concurrency revolution is coming!”

That is the kind of attitude we need in the industry. We need more in-your-face Greeks and fewer genial mustachioed C++ architects. Call me crazy.

I know this is the type of enthusiasm that is typically found in insane people building anti-gravity machines in their backyards, but I think these guys might be on to something with their Concurrency and Coordination Runtime (CCR).

I highly recommend watching the video–even if you have no idea what these people are talking about. Seriously mind-blowing stuff.

Making a DataSet Read-Only

Let's say you're writing a component that makes a DataSet available to a number of clients. The DataSet is expected to persist in your application for a while, and be used in code written by many different developers.

You could carefully craft an email explaining that,

Hey everybody, this DataSet is shared. Please don't change the data in it for the purposes of your own piece of the application. That could create some odd bugs that would be really hard to track down. If you need to do something like that, please make a copy of the DataSet first.

I suppose you could do that, if you wanted to waste your time. You could always create a copy of the DataSet before handing it over to anybody, but that may introduce some unnecessary performance issues if the DataSet is large. It feels a little like punishing everyone for the bad behavior of a few. If you wanted to actually prevent such problems, you could write a clever class like this that acts like a "lock" on the DataSet.

/// <summary>
/// Class that ensures that clients keep their pesky hands
/// off of a particular dataset.
/// </summary>
internal class DataSetLock
{
    [Conditional("DEBUG")]
    public static void Install(DataSet ds)
    {
        if (ds == null)
        {
            throw new ArgumentNullException("ds");
        }
        EventHandler thrower = delegate(object sender, EventArgs e)
        {
            string msg = string.Format(
                "You can't change: {0}.", ds.DataSetName);
            throw new InvalidOperationException(msg);
        };

        foreach (DataTable t in ds.Tables)
        {
            t.RowChanging += new DataRowChangeEventHandler(thrower);
            t.RowDeleting += new DataRowChangeEventHandler(thrower);
            t.ColumnChanging += new DataColumnChangeEventHandler(thrower);
            t.TableClearing += new DataTableClearEventHandler(thrower);
            t.TableNewRow += new DataTableNewRowEventHandler(thrower);
        }
    }
    private DataSetLock() { }
}

The "lock" is really an anonymous method that we attach, willy-nilly, to all of the change events in the DataSet. We use lexical closure to add the name of the DataSet passed in to the exception message that dozens of caffeine-addled developer brains will be processing shortly. [Note: if you don't think it's lexical closure, go talk to this guy. He's really into the topic.]

I put the ConditionalAttribute on the Install method as well, so calls to it are compiled away in retail code. As long as you have reasonably good test coverage in debug builds, that should be all the checking you need.

Here's a quick usage example:


static void Main(string[] args)
{
    DataSet ds = new DataSet();
    ds.DataSetName = "My Data Set";
    DataTable t = ds.Tables.Add();
    t.Columns.Add("foo", typeof(string));

    DataSetLock.Install(ds);

    // Pow!
    t.LoadDataRow(new object[] { "asdf" }, true);
}

This prints:

System.InvalidOperationException: You can't change: My Data Set.

Enjoy.

The Debugger Extension, Part 6 - Scanning Threads

The Debugger Extension

We already have an extension that might be pretty useful in some scenarios, but another common situation is determining what a particular thread is doing. You might want to look at instances of a particular type on the stack of a thread that is maxing out your CPU, or you might want to look at two or more threads that appear to be deadlocked.

We can get something like this out of the extension we have written already if we alter it to search only the stack for the current thread. How do we do that? On an x86 machine, the stack looks something like this:

Thread Stack

Finding one end of the stack region is very easy. The top of the stack (but the "bottom" address, since the stack grows down) should always be in the ESP register. To get the base of the stack we need to be able to read an NT structure called the Thread Environment Block, or TEB.

The TEB is defined as follows in the Platform SDK.

typedef struct _TEB {
    BYTE Reserved1[1952];
    PVOID Reserved2[412];
    PVOID TlsSlots[64];
    BYTE Reserved3[8];
    PVOID Reserved4[26];
    PVOID ReservedForOle;
    PVOID Reserved5[4];
    PVOID TlsExpansionSlots;
} TEB, *PTEB;

We're all ecstatic that the TEB is undocumented when this allows the kernel team to freely implement new features, I guess, but this is no help to us right now. This is closer to what the TEB header really looks like.

typedef struct tagTEB_INTERNAL
{
    DWORD dwExceptionList;
    DWORD dwStackBase;
    DWORD dwStackLimit;
    DWORD lpTIB;
    DWORD lpFiberInfo;
    DWORD lpUserPointer;
    DWORD lpSelf;
    DWORD lpEnvironmentPointer;
    DWORD dwProcessId;
    DWORD dwThreadId;
    DWORD dwActiveRPCHandle;
    DWORD lpThreadLocalStorage;
    DWORD lpPEB;
    DWORD dwLastError;
    // More fields follow but are not included here.
} TEB_INTERNAL, *PTEB_INTERNAL;

I took that from chapter six of Microsoft Windows Internals by Mark Russinovich. He's done many useful and awe-inspiring things besides discovering the Sony Rootkit DRM. The DbgEng SDK exposes a method to us (IDebugSystemObjects::GetCurrentThreadTeb) that makes it trivial to write a function to read in this structure in the debugger (download the source if you want to see it).

We can now write a templated search function (much like those we wrote in part four) to search only the current stack. Since the stack will contain handles/pointers to the objects, we'll also need a function that searches with a level of indirection.

// Performs a range search on the current
// thread's stack.
//
template<class search_command>
inline void SearchStack(ULONG64 pattern)
{
    TEB_INTERNAL teb = {0};
    HRESULT hr = GetCurrentTEB(&teb);
    if( FAILED(hr) )
    {
        Out("Could not retrieve the TEB.\n");
        return;
    }

    ULONG64 esp = 0;
    hr = m_Registers->GetStackOffset(&esp);
    if( FAILED(hr) )
    {
        Out("Could not read the stack pointer.\n");
        return;
    }
    Out("Thread %d:\n", teb.dwThreadId);

    // Thunk the dword to a 64-bit integer.
    // Otherwise we'll take the previous dword
    // field in the TEB structure with us into
    // the Search call.
    ULARGE_INTEGER li = { teb.dwStackBase, 0L };
    this->SearchPointers<search_command>(
         pattern,
         esp,
         li.QuadPart);
}

// Searches for the pattern with a level
// of indirection.
//
template<class search_command>
inline void SearchPointers(ULONG64 pattern,
     ULONG64 start, ULONG64 end)
{
    search_command sc;
    int hits = 0;
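    for(ULONG64 offs = start;
        offs <= end;
        offs += m_PtrSize)
    {
        // Read the stack slot itself. Skip it if the
        // memory is unreadable.
        ULONG64 ptr = 0;
        HRESULT hr = m_Data->ReadPointersVirtual(
            1L, offs, &ptr);
        if( hr != S_OK )
        {
            continue;
        }
        // Follow the slot's value and read the first
        // pointer-sized field of whatever it refers to.
        ULONG64 ptrVal = 0;
        hr = m_Data->ReadPointersVirtual(
            1L, ptr, &ptrVal);
        if( hr == S_OK && ptrVal == pattern )
        {
            if( sc.HandleMatch(ptr) )
            {
                ++hits;
            }
        }
    }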
    sc.ShowResults(hits);
}
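
To see the indirection logic in isolation, here is a small sketch that runs the same two-step read against a fake address space instead of a live debuggee. Everything in it is an assumption made for illustration: FakeMemory, ReadPointer, FindIndirect, and MakeSample are hypothetical names, pointers are fixed at 32 bits, the range is treated as half-open, and the addresses are invented to resemble the output later in this post.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the debuggee: address -> 32-bit value.
typedef std::map<std::uint32_t, std::uint32_t> FakeMemory;

// Reads one pointer-sized value. Returns false for unmapped
// addresses, mimicking a failed ReadPointersVirtual call.
bool ReadPointer(const FakeMemory& mem, std::uint32_t addr,
                 std::uint32_t* out)
{
    FakeMemory::const_iterator it = mem.find(addr);
    if (it == mem.end())
    {
        return false;
    }
    *out = it->second;
    return true;
}

// Walks [start, end) one slot at a time and collects every slot
// value that points at memory whose first word equals pattern.
std::vector<std::uint32_t> FindIndirect(const FakeMemory& mem,
    std::uint32_t start, std::uint32_t end, std::uint32_t pattern)
{
    std::vector<std::uint32_t> hits;
    for (std::uint32_t offs = start; offs < end; offs += 4)
    {
        std::uint32_t ptr = 0;
        std::uint32_t first = 0;
        if (ReadPointer(mem, offs, &ptr) &&
            ReadPointer(mem, ptr, &first) &&
            first == pattern)
        {
            hits.push_back(ptr);
        }
    }
    return hits;
}

// Builds a four-slot "stack" and three "heap" locations.
FakeMemory MakeSample()
{
    FakeMemory mem;
    mem[0x0012ff00] = 0x01272bf8; // slot -> object
    mem[0x0012ff04] = 0x00000005; // plain integer, not a pointer
    mem[0x0012ff08] = 0x01272bdc; // slot -> object
    mem[0x0012ff0c] = 0x7ffd0000; // slot -> unrelated memory
    mem[0x01272bf8] = 0x009131b0; // first word: MethodTable
    mem[0x01272bdc] = 0x009131b0; // first word: MethodTable
    mem[0x7ffd0000] = 0x11111111; // something else entirely
    return mem;
}
```

Searching the sample from 0x0012ff00 to 0x0012ff10 for 0x009131b0 should report the two object addresses and skip both the plain integer and the pointer to unrelated memory.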

The EngExtCpp framework makes it very easy to add a switch to enable searching with this method:

0:000> !atstat /?
!atstat [/s] <The MethodTable for SampleApp.ArbitraryType.>
  <The MethodTable for SampleApp.ArbitraryType.>
  /s - Searches only the current stack.
Displays statistics about ArbitraryType instances in memory.

That lets us build some cool composite commands in WinDbg like this:

0:000> ~*e!atstat /s 009131b0
Thread 2624:
Searching for ArbitraryTypes...
--------------------------------------------
01272bf8 : Purple
01272bf8 : Purple
01272bdc : Blue
01272bf8 : Purple
--------------------------------------------
Found 4 total instances.
Totals:
  Blue: 1
  Purple: 3

Thread 1924:
Searching for ArbitraryTypes...
--------------------------------------------
--------------------------------------------
Found 0 total instances.
Totals:

Thread 2096:
Searching for ArbitraryTypes...
--------------------------------------------
--------------------------------------------
Found 0 total instances.
Totals:

I think that's where we'll leave it for now. Not bad for a day of work or so, when you consider that we're empowered to crank out similar utilities in no time at all. I hope you've enjoyed the debugger extension series.

The Debugger Extension, Part 5 - Manipulating Managed Types

The Debugger Extension

In the last post in this series, we succeeded in writing a working extension that searched memory for instances of a particular type. Trouble is, we haven't done anything useful yet. We've merely duplicated a very small subset of the functionality offered by SOS's !DumpHeap command, and poorly at that.

In the problem setup, we said we wanted to show statistics about a particular property of these instances–for the purposes of this example, we're calling that their "Color." A sensible step in this direction would be to write some utility C++ code to accompany the Colors enumeration that we wrote earlier in C#.

// Some definitions that correspond to the managed
// SampleApp.Colors enum.
typedef ULONG Color;
const Color COLOR_RED = 0;
const Color COLOR_GREEN = 1;
const Color COLOR_BLUE = 2;
const Color COLOR_PURPLE = 3;
const Color MAX_COLOR = COLOR_PURPLE;

PCSTR g_szColorNames[] = { "Red", "Green",
    "Blue", "Purple" };

bool IsColor(Color c)
{
    if( c <= MAX_COLOR )
    {
        return true;
    }
    return false;
}

PCSTR ColorName(Color c)
{
    if( !IsColor(c) )
    {
        return NULL;
    }
    return g_szColorNames[static_cast<int>(c)];
}

Part of the point of this series has been to develop a framework that deals with instances of .NET objects. To that end, we should try to write a generic base class that loads a managed instance in the debuggee into the debugger's process. This is my implementation of such a class.

// ------------------------------------------------------
// mtypes.h
//        Some base classes for dealing with instances
//        of managed objects.
//
#pragma once

template<class object_fields>
class ManagedInstance
{
protected:
    object_fields m_Fields;
    ULONG64 m_offset;
    bool m_valid;

    // Can be used by derived classes to refer
    // to this class.
    typedef ManagedInstance<object_fields> base_t;

    // Constructor - pass the offset of the managed
    // object. Check the result of IsValid() before
    // using an instance derived from this class.
    ManagedInstance(ULONG64 offset) : m_Fields()
    {
        // Load the data for the fields from the
        // debuggee process / memory dump
        IDebugDataSpaces* pData = g_Ext->m_Data;
        m_offset = offset;
        ULONG read = 0L;
        HRESULT hr = pData->ReadVirtual(
            m_offset,
            reinterpret_cast<void*>(&m_Fields),
            sizeof(object_fields),
            &read);
        m_valid = SUCCEEDED(hr) &&
            (read == sizeof(object_fields));
    }

    virtual ~ManagedInstance() {}

public:
    // Returns true if the object was successfully
    // read from the debuggee. Doesn't validate that
    // it actually is a managed object of the desired
    // type, but overridden implementations should
    // do this.
    virtual bool IsValid() { return m_valid; }
};

// This can be used as a base class for classes used
// as the object_fields template parameter.
class ObjectFields
{
public:
    ULONG pMethodTable;
};

The template parameter for the ManagedInstance class takes a POD ("plain old data") type that should just list the fields in the instance. The constructor for ManagedInstance loads the data at the specified offset as those fields. I declared a virtual function that indicates whether or not the managed instance is valid. In this base class, all we can really say about that is whether or not we could read the data at the provided address.

I've also defined a base class for the fields of a managed object, and put the MethodTable address in it. Given these classes, it's not a lot of work to write the implementations for ArbitraryType.

// Represents the fields of a
// SampleApp.ArbitraryType instance.
class ArbitraryTypeFields
     : public ObjectFields
{
public:
    Color col;
    ULONG id;
};

// Represents a single ArbitraryType instance.
class ArbitraryType :
    public ManagedInstance<ArbitraryTypeFields>
{
public:
    ArbitraryType(ULONG64 offset)
        : base_t(offset) {}
    virtual bool IsValid();
    Color GetColor() { return m_Fields.col; }
};

// Returns true if the loaded data is a valid
// ArbitraryType instance.
bool ArbitraryType::IsValid()
{
    if( base_t::IsValid() &&
        IsColor(m_Fields.col) )
    {
        return true;
    }
    return false;
}
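
The load-then-validate idea can be tried outside a debugger as well. The sketch below is an assumption-laden stand-in rather than the extension's code: a byte vector plays the part of debuggee memory, std::memcpy replaces ReadVirtual, and SketchFields, LoadFields, LooksLikeArbitraryType, and MakeBuffer are hypothetical names. The field layout assumes 32-bit fields, as on x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical field layout mirroring ArbitraryTypeFields on x86.
struct SketchFields
{
    std::uint32_t methodTable;
    std::uint32_t color;
    std::uint32_t id;
};

// Copies sizeof(Fields) bytes starting at offset out of a buffer
// that stands in for debuggee memory. Returns false on a short
// read, like the m_valid check in ManagedInstance.
template <class Fields>
bool LoadFields(const std::vector<unsigned char>& memory,
                std::size_t offset, Fields* out)
{
    if (offset > memory.size() ||
        memory.size() - offset < sizeof(Fields))
    {
        return false;
    }
    std::memcpy(out, &memory[offset], sizeof(Fields));
    return true;
}

// Same flavor of check as ArbitraryType::IsValid, plus a
// comparison against the expected MethodTable address.
bool LooksLikeArbitraryType(const SketchFields& f,
                            std::uint32_t expectedMethodTable)
{
    const std::uint32_t kMaxColor = 3; // COLOR_PURPLE
    return f.methodTable == expectedMethodTable &&
           f.color <= kMaxColor;
}

// Builds a buffer holding one instance after some zero padding.
std::vector<unsigned char> MakeBuffer(const SketchFields& f,
                                      std::size_t padding)
{
    std::vector<unsigned char> memory(padding + sizeof(SketchFields), 0);
    std::memcpy(&memory[padding], &f, sizeof(f));
    return memory;
}
```

Loading at the padded offset should succeed and pass validation, while an offset too close to the end of the buffer should be rejected as a short read.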

In the last post, the only criterion we used for finding ArbitraryType instances was that we had found an INT_PTR containing the address of its MethodTable. That's obviously going to result in false positives, because the CLR's execution engine will certainly have this pointer in several places in its own internal data structures. We can do a little better now by making sure that the Color field is within the range of expected values. While we're changing our HandleMatch function to implement this, we'll add an STL map to the mix to keep track of the colors we find.

class AtStatCmd : public SearchCommand
{
protected:
    typedef map<Color, int>  CountMap_t;
    CountMap_t m_counts;

public:
    virtual void ShowResults(int totalHits);
    virtual bool HandleMatch(ULONG64 offset);
    AtStatCmd();
};

bool AtStatCmd::HandleMatch(ULONG64 offset)
{
    ArbitraryType at(offset);
    if( at.IsValid() )
    {
        Color c = at.GetColor();
        g_Ext->Out("%08I64x : %s\n",
           offset, ColorName(c));
        m_counts[c]++;
        return true;
    }
    return false;
}
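
ShowResults isn't shown here, so the following is only a guess at the kind of formatting it might do with the map. FormatTotals is a hypothetical helper that builds a string instead of calling Out, and the color names are repeated so the sketch stands alone.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

typedef unsigned long Color;
const char* const kColorNames[] = { "Red", "Green", "Blue", "Purple" };
const std::size_t kColorCount =
    sizeof(kColorNames) / sizeof(kColorNames[0]);

// Hypothetical helper: renders the per-color tallies. std::map
// iterates in key order, so colors come out in enum order.
std::string FormatTotals(const std::map<Color, int>& counts,
                         int totalHits)
{
    std::ostringstream out;
    out << "Found " << totalHits << " total instances.\n";
    out << "Totals:\n";
    for (std::map<Color, int>::const_iterator it = counts.begin();
         it != counts.end(); ++it)
    {
        // Guard the lookup in case an unexpected key sneaks in.
        const char* name = it->first < kColorCount
            ? kColorNames[it->first] : "(unknown)";
        out << "  " << name << ": " << it->second << "\n";
    }
    return out.str();
}
```

Only colors that were actually counted appear in the totals, because operator[] in HandleMatch creates an entry the first time a color is seen.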

The output of the extension in WinDbg now looks like this:

    0:000> !atstat 009131b0
    Searching for ArbitraryTypes...
    --------------------------------------------
    Searching 00000000 to 7fffffff.
    009100cc : Red
    009130e0 : Red
    01271ce4 : Red
    01271d00 : Blue
    01271d1c : Red
    01271d38 : Blue
    01271d54 : Red
    --------------------------------------------
    Found 7 total instances.
    Totals:
      Red: 5
      Blue: 2

Incidentally, this is debugging the same sample code that, in the previous post, was declared to have 20 instances in memory. As you can see, adding some basic validation reduced the number of bogus hits considerably. A more complex object–one you might actually use, I suppose–could have an even smaller incidence of errors.

There's one obvious problem with this that I can think of, and that is the fact that these objects are not necessarily alive (rooted) on the GC heap. They could very well be collected and sitting in freed memory. In my case, this is not a concern since I am interested mostly in debugging leaked memory. This may be an issue for other users, however.

Since the extension we set out to build is basically complete, I'll post the code now even though I have one more feature in mind for the next post. You can download the code here.