Minidump Explorer v0.4 released

I’ve released a new version of Minidump Explorer on CodePlex. You can find the download here.

Included in this release:

Posted in Crash Dumps | Tagged | Leave a comment

Minidump Explorer v0.3 released

I’ve released a new version of Minidump Explorer on CodePlex. You can find the download here.

Included in this release:

Write-ups on each stream will follow soon.

Minidump Explorer v0.2: Reading minidump ThreadListStream

Version 0.2 of Minidump Explorer included 4 new data streams: MemoryListStream, Memory64ListStream, HandleDataStream and ThreadListStream. These are all fairly simple streams to read and use but 3 of them are going to be vitally important when we start hooking into CorDebug: MemoryListStream, Memory64ListStream and ThreadListStream. I spoke about the 2 memory list streams last time. This time I’ll talk about the ThreadListStream.

ThreadListStream

First up, let’s have a look at the structures we’ll be using when reading ThreadListStream:

typedef struct _MINIDUMP_THREAD_LIST {
  ULONG32         NumberOfThreads;
  MINIDUMP_THREAD Threads[];
} MINIDUMP_THREAD_LIST, *PMINIDUMP_THREAD_LIST;

typedef struct _MINIDUMP_THREAD {
  ULONG32                      ThreadId;
  ULONG32                      SuspendCount;
  ULONG32                      PriorityClass;
  ULONG32                      Priority;
  ULONG64                      Teb;
  MINIDUMP_MEMORY_DESCRIPTOR   Stack;
  MINIDUMP_LOCATION_DESCRIPTOR ThreadContext;
} MINIDUMP_THREAD, *PMINIDUMP_THREAD;

There’s no need to discuss MINIDUMP_THREAD_LIST as it’s quite self-explanatory: it has the same count-followed-by-array shape we’ve seen previously when reading the ModuleListStream and memory streams. Our interest is in the MINIDUMP_THREAD structure.

I haven’t really needed to use most of the fields provided by MINIDUMP_THREAD, so I won’t be going into too much detail about most of them for now. I will provide references to extra information just in case they’re of interest to you.

The first 4 fields provide fairly basic thread info: its ID, its suspend count, and its priority class and priority. The ID is self-explanatory and obviously quite important, since it’s how you tell one thread from another. I haven’t needed to use the suspend count and priority fields yet. You can find more information about the suspend count by looking at the documentation for SuspendThread and ResumeThread. Simply put: each call to SuspendThread increases the count and each call to ResumeThread decreases it. While the count is greater than 0 the thread is suspended and will not run; once it drops back to 0 the thread is allowed to run again.

There’s a detailed write-up on the priority fields here. The documentation is clear enough (mostly), but I looked at a sample crash dump and I couldn’t tie the priority/priority class I saw back to the documentation.

The last 3 fields are where things start to get interesting.

Teb. This field holds the address of the “thread environment block”. The TEB contains very low-level information about a thread and, as you can see from the documentation, its layout can change between different versions of Windows.

Stack. I mentioned in a previous post that the minidump API (or rather the DbgHelp API) only provides methods to read the raw data contained in a minidump. Well, here’s a good example of that. The Stack field tells us where the data for the thread’s stack is located, and how big it is, but it doesn’t provide a way to decode the information contained there. If you want to make sense of the data you have to go through the stack frames one by one and piece the information together yourself. That’s quite a daunting task. This is where the CorDebug API comes in: it’ll reconstruct the stack for us. It does still leave some work for us to do, but it’s a lot better than piecing it together ourselves. Suffice it to say that I haven’t found a reason to use this field yet.

ThreadContext. This field points to the location in the crash dump where the context information of the thread can be found. This is also very low-level thread information, e.g. the values of the CPU registers. The explanation about the Stack field applies here too: you can access the raw context data, but decoding it is up to you. Luckily I haven’t needed to get into the details of the context structure; all CorDebug needs is the raw data, and it’ll figure the rest out itself.
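Getting at that raw data is just a matter of following the location descriptor. Here’s a minimal sketch, assuming the whole dump has already been loaded into a byte array; LocationReader and ReadRaw are hypothetical names of mine, not something from DbgHelp or Minidump Explorer:

```csharp
using System;

// Hypothetical helper: copies the raw bytes that a
// MINIDUMP_LOCATION_DESCRIPTOR (DataSize, Rva) points at.
public static class LocationReader
{
    public static byte[] ReadRaw(byte[] dumpBytes, uint rva, uint dataSize)
    {
        // An RVA in a minidump is an offset from the start of the file.
        if ((ulong)rva + dataSize > (ulong)dumpBytes.Length)
        {
            throw new ArgumentOutOfRangeException(
                nameof(rva), "Descriptor points outside the dump.");
        }

        var raw = new byte[dataSize];
        Buffer.BlockCopy(dumpBytes, (int)rva, raw, 0, (int)dataSize);
        return raw;
    }
}
```

Calling ReadRaw with a thread’s ThreadContext.Rva and ThreadContext.DataSize would give you the undecoded context blob, which is the form the post says CorDebug wants.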

Show me the code

I won’t go through all of the field types individually as they’re all types I have covered previously. The definitions for MINIDUMP_MEMORY_DESCRIPTOR and MINIDUMP_LOCATION_DESCRIPTOR were covered when I discussed reading the two memory streams.

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_THREAD_LIST
{
    public UInt32 NumberOfThreads;
    public IntPtr Threads; // Marks where the inline MINIDUMP_THREAD[] begins; the bytes here are array data, not a pointer
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_THREAD
{
    public UInt32 ThreadId;
    public UInt32 SuspendCount;
    public UInt32 PriorityClass;
    public UInt32 Priority;
    public UInt64 Teb;
    public MINIDUMP_MEMORY_DESCRIPTOR Stack;
    public MINIDUMP_LOCATION_DESCRIPTOR ThreadContext;
}

As far as reading the stream: you’ll follow the same steps as we did when reading the ModuleListStream. You can find the full source code on my CodePlex project (I’ve made a few small updates since the original article).
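If you’d rather see the field order spelled out, here is a rough sketch that walks the stream with a BinaryReader instead of marshalling the structs. It assumes the ThreadListStream bytes have already been copied into a managed array; ThreadRecord and ThreadListParser are names I’ve made up for illustration, and this is not how the CodePlex code does it:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical managed view of the MINIDUMP_THREAD fields used later on.
public sealed class ThreadRecord
{
    public uint ThreadId;
    public uint SuspendCount;
    public ulong Teb;
    public uint ContextDataSize; // ThreadContext.DataSize
    public uint ContextRva;      // ThreadContext.Rva
}

public static class ThreadListParser
{
    // streamBytes is assumed to hold the ThreadListStream exactly as stored in the dump.
    public static List<ThreadRecord> Parse(byte[] streamBytes)
    {
        var threads = new List<ThreadRecord>();
        using (var reader = new BinaryReader(new MemoryStream(streamBytes)))
        {
            uint count = reader.ReadUInt32();    // NumberOfThreads
            for (uint i = 0; i < count; i++)
            {
                var t = new ThreadRecord();
                t.ThreadId = reader.ReadUInt32();
                t.SuspendCount = reader.ReadUInt32();
                reader.ReadUInt32();             // PriorityClass (skipped)
                reader.ReadUInt32();             // Priority (skipped)
                t.Teb = reader.ReadUInt64();
                reader.ReadUInt64();             // Stack.StartOfMemoryRange (skipped)
                reader.ReadUInt32();             // Stack.Memory.DataSize (skipped)
                reader.ReadUInt32();             // Stack.Memory.Rva (skipped)
                t.ContextDataSize = reader.ReadUInt32();
                t.ContextRva = reader.ReadUInt32();
                threads.Add(t);
            }
        }
        return threads;
    }
}
```

Each MINIDUMP_THREAD is 48 bytes with this layout (4 x 4 + 8 + 16 + 8), so the reader lands on the next record after every pass through the loop.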

That’s it for the ThreadListStream. Other than the ThreadId and ThreadContext fields there’s not much I’ll be using from this stream. Those 2 fields are vital though!

Next time: the HandleDataStream.

Minidump Explorer v0.2: Reading minidump MemoryListStream and Memory64ListStream

Version 0.2 of Minidump Explorer included 4 new data streams: MemoryListStream, Memory64ListStream, HandleDataStream and ThreadListStream. These are all fairly simple streams to read and use but 3 of them are going to be vitally important when we start hooking into CorDebug: MemoryListStream, Memory64ListStream and ThreadListStream. The 2 memory list streams will naturally give us access to some, or all, of the memory of the crashed process, while the thread list stream, amongst other things, gives us access to each thread’s context information. The handle data is interesting, but not essential at this point in time.

MemoryListStream and Memory64ListStream

MemoryListStream and Memory64ListStream provide you with a list of memory regions that are contained in the crash dump. The difference between the two is that Memory64ListStream is used for full-memory dumps, while MemoryListStream is used when only partial memory is available. You can see the difference when you look at the declaration of the structures they return:

MemoryListStream (C++)

typedef struct _MINIDUMP_LOCATION_DESCRIPTOR {
  ULONG32 DataSize;
  RVA     Rva;
} MINIDUMP_LOCATION_DESCRIPTOR;

typedef struct _MINIDUMP_MEMORY_DESCRIPTOR {
  ULONG64                      StartOfMemoryRange;
  MINIDUMP_LOCATION_DESCRIPTOR Memory;
} MINIDUMP_MEMORY_DESCRIPTOR, *PMINIDUMP_MEMORY_DESCRIPTOR;

typedef struct _MINIDUMP_MEMORY_LIST {
  ULONG32                    NumberOfMemoryRanges;
  MINIDUMP_MEMORY_DESCRIPTOR MemoryRanges[];
} MINIDUMP_MEMORY_LIST, *PMINIDUMP_MEMORY_LIST;

If you look at MINIDUMP_MEMORY_LIST you’ll notice that it contains an array of MINIDUMP_MEMORY_DESCRIPTORs (MemoryRanges). Each MINIDUMP_MEMORY_DESCRIPTOR represents a region of memory included with the minidump. It contains the starting address of the region of memory represented (StartOfMemoryRange) and a MINIDUMP_LOCATION_DESCRIPTOR (Memory) indicating how big the region is and where in the minidump file to find it. What’s important to note here is that each region of memory could be in a different physical location inside the minidump file. Because of this each MINIDUMP_LOCATION_DESCRIPTOR has an Rva field which you need to use to find the correct location in the minidump to read from.

Full-memory dumps are different: the memory is all stored in one sequential block at the end of the dump. As a result you don’t need individual RVAs for each region. You can see the difference here:

Memory64ListStream (C++)

typedef struct _MINIDUMP_MEMORY_DESCRIPTOR64 {
    ULONG64 StartOfMemoryRange;
    ULONG64 DataSize;
} MINIDUMP_MEMORY_DESCRIPTOR64, *PMINIDUMP_MEMORY_DESCRIPTOR64;

typedef struct _MINIDUMP_MEMORY64_LIST {
    ULONG64 NumberOfMemoryRanges;
    RVA64 BaseRva;
    MINIDUMP_MEMORY_DESCRIPTOR64 MemoryRanges [0];
} MINIDUMP_MEMORY64_LIST, *PMINIDUMP_MEMORY64_LIST;

You’ll notice that MINIDUMP_MEMORY64_LIST also has an array of descriptors (MINIDUMP_MEMORY_DESCRIPTOR64). The difference here is that MINIDUMP_MEMORY_DESCRIPTOR64 doesn’t include a location descriptor as MINIDUMP_MEMORY_DESCRIPTOR did. This is because all of the memory is in one block at the end of the minidump: you don’t need individual RVAs for each block since they all follow each other starting from MINIDUMP_MEMORY64_LIST.BaseRva.

Reading memory data from a minidump

So how do you go about reading from a location in memory?

If you have a crash dump containing partial memory (MINIDUMP_MEMORY_LIST) you would loop through each MINIDUMP_MEMORY_DESCRIPTOR and check for a region containing the address you were looking for: (myReadAddress >= StartOfMemoryRange) && (myReadAddress < (StartOfMemoryRange + Memory.DataSize)). If you find a matching MINIDUMP_MEMORY_DESCRIPTOR you would then add its Memory.Rva field (from MINIDUMP_LOCATION_DESCRIPTOR) to the address of the minidump file mapping to get the physical location of the start of that region. Adding the offset into the region (myReadAddress - StartOfMemoryRange) then gives you the exact location to read from.
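As a rough sketch of that lookup, assuming the descriptors have already been read into a managed array (MemoryRange and MemoryListLookup are hypothetical names used only for this example), the search could look like this. It returns an offset into the dump file, so you would still add the address of the file mapping to get a pointer:

```csharp
using System;

// Hypothetical managed mirror of MINIDUMP_MEMORY_DESCRIPTOR.
public struct MemoryRange
{
    public ulong StartOfMemoryRange;
    public uint DataSize; // Memory.DataSize
    public uint Rva;      // Memory.Rva
}

public static class MemoryListLookup
{
    // Returns the offset into the dump file for the given process address,
    // or -1 if no captured region covers it.
    public static long FindFileOffset(MemoryRange[] ranges, ulong address)
    {
        foreach (var range in ranges)
        {
            ulong end = range.StartOfMemoryRange + range.DataSize;
            if (address >= range.StartOfMemoryRange && address < end)
            {
                // Start of the region in the file plus the distance into the region.
                return (long)(range.Rva + (address - range.StartOfMemoryRange));
            }
        }
        return -1;
    }
}
```

A miss is perfectly normal for a partial dump: most of the process’s address space simply wasn’t captured.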

Reading from MINIDUMP_MEMORY64_LIST is a bit different. You would still loop through each MINIDUMP_MEMORY_DESCRIPTOR64 looking for the region that holds the address you want to read from, but the difference is in how you read the data once you’ve found the correct region. Since a full-memory dump has all of the memory stored sequentially at the end of the dump file there is only one RVA and that RVA points to the beginning of the memory data inside the minidump. In order to read the actual memory you need to keep a running total of the DataSize of each MINIDUMP_MEMORY_DESCRIPTOR64 that precedes the one that you need and add that to the BaseRva. Then add that to the address of the memory mapped file of the crash dump. So the end result logic would look similar to this:

// addressOfBlockToReadFrom is the address in the minidump file of 
// the start of the block you want to read from.
long addressOfBlockToReadFrom = memoryMappedFileAddress + memory64List.BaseRva + allPrecedingDataSizes;

// offsetToReadFrom is the offset from the beginning of the 
// block that you want to read from.
// e.g. if you want to read from 0x23 and the block starts 
// at 0x20 then the offset is 0x3.
long offsetToReadFrom = addressIWantToReadFrom - blockToReadFrom.StartOfMemoryRange;

// addressToReadFrom is the physical address in the minidump file
// where you should start reading from.
long addressToReadFrom = addressOfBlockToReadFrom + offsetToReadFrom;
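To make the running total concrete, here is a small self-contained sketch of the full-memory case. It works on a managed array of descriptors and returns an offset into the dump file rather than a pointer; MemoryRange64 and Memory64ListLookup are hypothetical names for illustration only:

```csharp
using System;

// Hypothetical managed mirror of MINIDUMP_MEMORY_DESCRIPTOR64.
public struct MemoryRange64
{
    public ulong StartOfMemoryRange;
    public ulong DataSize;
}

public static class Memory64ListLookup
{
    // Returns the offset into the dump file for the given process address,
    // or -1 if no region covers it.
    public static long FindFileOffset(ulong baseRva, MemoryRange64[] ranges, ulong address)
    {
        // Sum of the DataSize of every descriptor before the current one.
        ulong precedingDataSizes = 0;
        foreach (var range in ranges)
        {
            if (address >= range.StartOfMemoryRange &&
                address < range.StartOfMemoryRange + range.DataSize)
            {
                ulong offsetIntoBlock = address - range.StartOfMemoryRange;
                return (long)(baseRva + precedingDataSizes + offsetIntoBlock);
            }
            precedingDataSizes += range.DataSize;
        }
        return -1;
    }
}
```

With a BaseRva of 0x2000 and a first block of 0x10 bytes starting at 0x20, reading address 0x23 resolves to file offset 0x2003, matching the 0x3 offset in the comments above.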

Show me the code

I won’t go through all of the field types individually as they’re all types I have covered previously. The only new one is RVA64, which is a ULONG64 and translates to a UInt64 in C#.

MemoryListStream (C#)

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_LOCATION_DESCRIPTOR
{
    public UInt32 DataSize;
    public uint Rva;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_MEMORY_DESCRIPTOR
{
    public UInt64 StartOfMemoryRange;
    public MINIDUMP_LOCATION_DESCRIPTOR Memory;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_MEMORY_LIST
{
    public UInt32 NumberOfMemoryRanges;
    public IntPtr MemoryRanges; // Marks where the inline MINIDUMP_MEMORY_DESCRIPTOR[] begins; the bytes here are array data, not a pointer
}

Memory64ListStream (C#)

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_MEMORY_DESCRIPTOR64
{
    public UInt64 StartOfMemoryRange;
    public UInt64 DataSize;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
internal struct MINIDUMP_MEMORY64_LIST
{
    public UInt64 NumberOfMemoryRanges;
    public UInt64 BaseRva;
    public IntPtr MemoryRanges; // Marks where the inline MINIDUMP_MEMORY_DESCRIPTOR64[] begins; the bytes here are array data, not a pointer
}

As far as reading the streams: you’ll follow the same steps as we did when reading the ModuleListStream. You can find the full source code on my CodePlex project (I’ve made a few small updates since the original article).

That’s it for the memory streams. Next time I’ll cover the ThreadListStream.

Minidump Explorer v0.2 released

I’ve released a new version of Minidump Explorer on CodePlex. You can find the download here.

Included in this release:

Write-ups on each stream will follow soon :)

Reading minidump files, part 4 of 4: Putting it all together

All code is available on my CodePlex project.

This is the last in a 4-part series on how to use MiniDumpReadDumpStream to read data streams contained within minidump files. The first article discussed how to access memory mapped files from C#, the second how to call MiniDumpReadDumpStream and the third how to interpret the data returned (using ModuleListStream as an example). This post wraps everything up and introduces the Minidump Explorer Windows Forms application. The posts that follow this one will discuss each data stream as I work towards creating an application (Minidump Explorer is just a stepping stone) that will let you analyze and visualize minidumps of processes that were running the CLR.

Analyzing the information contained in minidumps is still going to take some time; that requires using an entirely different API and a lot more work. For now we’ll just look at the raw data contained in a minidump file and create an application that can view that data. We’ll use that as a stepping stone to something bigger. Given that, let me introduce the Minidump Explorer Windows Form application! Remember those Firefox crash dumps I found lying around in my first post? Well here’s our first look at the contents of those files:

Minidump Explorer displaying a module list stream.

You can see that I’m using the module stream information I spoke about in the previous article to display a list of modules that were loaded at the time of the crash. Some of this module information (e.g. the base address, timestamp, size and module name) is going to be vital to us in the future when we start trying to interpret the minidump, i.e. trying to look at threads and call stacks, or heaps and objects. That’s still a ways off, but I want to take a moment to explain the difference between looking at the raw minidump stream data and looking at what’s happening inside the application that crashed.

What is a minidump and how do we interpret them?

You’ll notice that I always mention analyzing minidumps of processes “running the CLR”. I’ve been very careful to deliberately word it that way. Minidumps are crash dumps of a process from an operating system point of view. By that I mean that from the operating system point of view, it was running a process; that process stopped working and a crash dump was created. The operating system doesn’t know what that process was, or what it was doing. It knows the process had resources allocated to it, that there were one or more threads and that it had a list of instructions that it was executing. But it doesn’t know whether it was running a Python interpreter, a Java VM, the CLR or just a regular native application. From the operating system point of view everything is a native process. Think about it this way: the operating system is running the CLR, and the CLR is running your code.

When we look at a minidump we’re seeing what the operating system sees, i.e. the CLR. The operating system has no knowledge of the data structures or workings of the CLR; it just does what it’s told to do.

For example: the operating system doesn’t know about CLR threads. It knows about threads that the CLR asks it to create, but it doesn’t know about threads created internally by the CLR. Did you know that CLR threads don’t necessarily match one-to-one with operating system threads? Just because you create a System.Threading.Thread object in your .NET code doesn’t mean the CLR will create a matching thread in the operating system. This is because the CLR sits on top of the operating system and manages itself and its own resources, just like the Java VM does.

What does this all mean? Well, we don’t want to see what the operating system was doing when the application crashed. If we looked from that point of view we’d see that the operating system was running the CLR. We want to know what was happening in our .NET code when it crashed, or rather what the CLR was doing. This means we need something that knows how the CLR works and can translate between what the operating system saw and what the CLR was doing, like this:

Operating system running CLR code

Hopefully that helps to illustrate where we’re heading. It’s a bit more technical than that, but it should help you understand.

Viewing module information

Back to Minidump Explorer! As I was saying: I’ve used the module stream to display the list of modules. You can also double click each module to get more detailed information:

General module properties

Module fixed info

By the way: I’m not picking on Firefox! It just happens to be one of the few applications that will create a minidump when something does go wrong. I’m sure the problem was probably Flash related ;)

Creating minidumps

I’ve also included the ability to capture crash dumps. I discussed that here. It’s a simple wizard that lets you select a process, choose what information you want captured and create the crash dump. The process will be unaffected once the dump has been created.

CodePlex

I’ve created a project on CodePlex and uploaded all of the code I’ve done so far. You’ll find it at https://minidumps.codeplex.com. Have a look around, download the application and give some comments or suggestions; feedback is very welcome.

Moving forward

Now that the basics are done we should be able to add the other data streams fairly quickly. Next up: thread, memory/memory 64 and handle streams.

Converting from time_t to System.DateTime

I was busy reviewing my minidump source code before publishing it to CodePlex and I noticed that there was one field in the MINIDUMP_MODULE structure that I didn’t mention previously: MINIDUMP_MODULE.TimeDateStamp.

The documentation for MINIDUMP_MODULE says this field is stored as a time_t, which turns out to be a C/C++ type. Trying to find out how to convert that into a System.DateTime in .NET has been a painful experience. The number of different date/time formats in the Windows API is quite scary. You can find a list of some of them here.

So to save you the time and effort of trying to figure out how to deal with time_t values, here’s the code to convert them to a System.DateTime:

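public static DateTime TimeTToDateTime(UInt32 time_t)
{
    // A time_t counts seconds since 1 January 1970 (UTC); a Win32 file time
    // counts 100-nanosecond intervals since 1 January 1601 (UTC).
    // 10 000 000 converts seconds to 100 ns intervals, and
    // 116444736000000000 is 1 January 1970 expressed as a file time.
    long win32FileTime = 10000000 * (long)time_t + 116444736000000000;

    // FromFileTime adjusts for the local timezone; use FromFileTimeUtc
    // instead if you want the result kept in UTC.
    return DateTime.FromFileTime(win32FileTime);
}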

And here’s why it works.
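One way to convince yourself of the arithmetic is to ask .NET for the file time of the Unix epoch and compare it with the constant. The sketch below does that, and also shows an equivalent conversion that skips file times entirely by adding the seconds to the epoch. EpochCheck and its method names are mine, purely for illustration:

```csharp
using System;

public static class EpochCheck
{
    // File time (100 ns intervals since 1601-01-01 UTC) of the time_t epoch,
    // 1970-01-01 00:00:00 UTC.
    public static long UnixEpochAsFileTime()
    {
        return new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).ToFileTimeUtc();
    }

    // Equivalent conversion that stays in UTC: add the seconds to the epoch.
    public static DateTime FromTimeTUtc(uint timeT)
    {
        return new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).AddSeconds(timeT);
    }
}
```

UnixEpochAsFileTime() should come back as 116444736000000000, and FromTimeTUtc should agree with DateTime.FromFileTimeUtc applied to the same formula used in TimeTToDateTime.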

Another route that (should) work is to first convert the time_t to a FILETIME, as above, and then convert that to a system time using FileTimeToSystemTime.
