The Debugger Extension, Part 3: A Crash Course in .NET Object Layout
November 24th, 2005

The Debugger Extension

To write this extension, we need at least a cursory understanding of how managed objects are represented in memory. On a 32-bit machine, the basic layout is:

Offset
        +---------------------+
 +0x0   |  MethodTable*       |
        +---------------------+
 +0x4   |  Field 1            |
        +---------------------+
 +0x8   |  Field 2            |
        +---------------------+
        |  ...                |
        +---------------------+
 +0x4*N |  Field N            |
        +---------------------+

If you are wondering what the method table is, it is roughly what it sounds like: a table of pointers to the methods that the type defines, along with some other per-type data. If we wanted to dive deeper into the type metadata that supports Reflection and other magical CLR APIs, the MethodTable is where we would start, but that is beyond the scope of this series.

The object’s fields follow the MethodTable pointer. Types derived from System.ValueType (structures) that are held as fields are inlined into the object instance. So if we have a class with a DateTime field named _dt, the layout becomes:

Offset
         +---------------------+
  +0x0   |  MethodTable*       |
         +---------------------+
  +0x4   |  _dt.dateData       |
         |                     |
         +---------------------+
  +0xc   |  Field 2            |
         +---------------------+
         |  ...                |
         +---------------------+
+0x4*N+4 |  Field N            |
         +---------------------+

The DateTime field occupies two slots because it is a 64-bit value. Reference types (objects) that are held as fields are stored as pointer-sized references.
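
To make the slot arithmetic concrete, here is a minimal sketch that computes field offsets the way the diagram above is drawn. The function name and the second and third fields are hypothetical, and the sketch assumes fields are packed back to back in the order given; it ignores any padding or reordering the runtime may apply.

```python
POINTER_SIZE = 4  # 32-bit process, as in the diagrams above


def field_offsets(fields):
    """Return {name: offset} for fields packed after the MethodTable pointer.

    `fields` is a list of (name, size_in_bytes) pairs in in-memory order.
    Assumption: no padding and no reordering.
    """
    offsets = {}
    offset = POINTER_SIZE  # the first slot holds the MethodTable pointer
    for name, size in fields:
        offsets[name] = offset
        offset += size
    return offsets


layout = field_offsets([
    ("_dt", 8),     # DateTime: a single inlined 64-bit value
    ("field2", 4),  # hypothetical 32-bit field
    ("field3", 4),  # hypothetical reference field (pointer-sized)
])

for name, offset in layout.items():
    print(f"+0x{offset:x}  {name}")
```

The 8-byte _dt pushes the next field from +0x8 out to +0xc, which matches the second diagram.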

The fields may not appear in the same order as they are written in the source, but the layout should be stable from one process to the next. I can’t guarantee that, since I don’t work for Microsoft and don’t have access to the CLR source, but I’ve observed this consistency many times. To view the field layout of any managed type, you can use the !do (DumpObject) command in the SOS extension, or !DumpClass if you do not have an instance handy. Below is the output for an instance of the type we are using in this example, ArbitraryType. This instance has a _color field of zero (Colors.Red) and an _id field of 22.

0:000> !do 012723e4
Name: SampleApp.ArbitraryType
MethodTable: 009131b0
EEClass: 00911410
Size: 16(0x10) bytes
(C:\src\samples\dmext\SampleApp\bin\Release\SampleApp.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
00913104  4000006        4         System.Int32  0 instance        0 _color
790fed1c  4000007        8         System.Int32  0 instance       22 _id

We can use the dd command to display the raw memory for the same instance. I’ve added comments here for the object’s data.

0:000> dd /c1 012723e4 l3
012723e4  009131b0        ; MethodTable*
012723e8  00000000        ; _color = (int)Colors.Red = 0
012723ec  00000016        ; _id = 0x16 = 22
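
The same decoding can be sketched in a few lines of Python. The byte string below is just the three DWORDs from the dump above written out in memory order, assuming a little-endian x86 process; the variable names are mine.

```python
import struct

# The three DWORDs shown by dd, as they sit in memory (little-endian).
raw = bytes.fromhex(
    "b0319100"  # 009131b0  MethodTable*
    "00000000"  # 00000000  _color
    "16000000"  # 00000016  _id
)

# One unsigned 32-bit pointer followed by two signed 32-bit integers.
method_table, color, obj_id = struct.unpack("<Iii", raw)

print(f"MethodTable* = {method_table:08x}")
print(f"_color       = {color}")
print(f"_id          = {obj_id}")
```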

The MethodTable pointer is the same for every instance of the type within a process, but its value may change from one run of the program to the next. The fact that it is constant within a single run enables a nice hack solution to our problem.

Rather than root through the CLR’s internal structures to find ArbitraryType instances, we will simply search memory for DWORDs that look like they’re pointers to the MethodTable for our type. This may result in some bogus hits, but they’ll just be noise.
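
As a rough illustration of the idea, and not the extension code itself, the sketch below scans a block of bytes for aligned DWORDs equal to a known MethodTable address. The function name is hypothetical, and a plain bytes object stands in for the debuggee's memory, which a real extension would read through the debugger's API. The second instance and its field values are made up for the example.

```python
import struct


def find_candidates(memory, base_address, method_table):
    """Return addresses of 4-byte-aligned DWORDs equal to method_table.

    `memory` is a bytes object standing in for a region of the target
    process; `base_address` is the address of its first byte.
    """
    hits = []
    for offset in range(0, len(memory) - 3, 4):
        (value,) = struct.unpack_from("<I", memory, offset)
        if value == method_table:
            hits.append(base_address + offset)
    return hits


METHOD_TABLE = 0x009131B0  # from the !do output above; differs per run

# The instance from the dump, a filler DWORD, then a made-up second instance.
memory = struct.pack(
    "<8I",
    METHOD_TABLE, 0, 22,  # the instance at 012723e4
    0,                    # filler
    METHOD_TABLE, 1, 7,   # hypothetical second instance
    0,                    # filler
)

hits = find_candidates(memory, 0x012723E4, METHOD_TABLE)
for address in hits:
    print(f"candidate at {address:08x}")
```

Any DWORD that happens to hold the same value would also be reported, which is where the bogus hits come from.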

In the next post in this series, we will actually start coding the extension.

If you are interested in further reading on CLR internals, I recommend this MSDN article and the SSCLI codebase.