John Robbins' Blog

The Case of the Corrupt PE Binaries

After installing Windows 7, I also installed the Windows 7 SDK as I wanted to poke around the updated headers and documentation files to see what was new at a low level. Additionally, I wanted to make sure all my code compiled against the new headers and libraries in case someone taking my native debugging class tried it and ran into problems. After many years of the SDK team completely ignoring Visual Studio, the Windows 7 SDK installation now looks for Visual Studio and properly integrates with it (I believe this started with the Vista SDK). After hundreds of thousands of emails over the years from people who couldn't compile code from my books and columns because they hadn't gone through the manual gyrations to integrate the latest SDK with the development environment, it's a huge help.

As I have all my builds automated, I let rip and got a build failure on a few release build x86 and x64 binaries. The failure was like the following in all those cases:

mt.exe : general error c101008d: Failed to write the updated manifest to the resource of file "..\..\..\..\Release\FTSimpTest.exe". The binary is not a valid Windows image.

MT.EXE is the tool used to embed the manifest into your binaries.

As I was running a beta OS, I wondered if this was a problem with anti-virus, so I disabled eTrust and tried the build again but still got the same error.

As I'm using Visual Studio 2008 SP1, CL.EXE and LINK.EXE are doing all the main work of compiling, but MT.EXE comes from the SDK. Since this code compiled correctly on my Vista computer with the Vista SDK installed, my next step was to see if this was possibly a problem with MT.EXE or if CL.EXE and LINK.EXE were exposing a bug in the operating system DLLs they were using. I uninstalled the Windows 7 SDK, which reverted Visual Studio 2008 SP1 to using the Vista SDK. Giving the recompiles a go, I got the same error from MT.EXE. I verified that I was in fact using the MT.EXE from the Vista SDK. One thing that was confusing to me was that MT.EXE from the Vista SDK and Windows 7 SDK both report the same version number, but the binaries are different sizes.

Looking closer at which binaries were getting corrupted, I found it was only four out of 56 .EXE files in my build. Interestingly, they were all console applications that were unit tests. Firing up Visual Studio, I created the canonical test console application, Hello World, and verified that it compiled. Looking at the BUILD.HTM file, I saw that Hello World had the following in it:

Creating command line "mt.exe @c:\Junk\cruft\HelloWorld\Release\RSP00000840485968.rsp /nologo"
Creating temporary file "c:\Junk\cruft\HelloWorld\Release\BAT00000940485968.bat" with contents
[
@echo Manifest resource last updated at %TIME% on %DATE% > .\Release\mt.dep
]

For two seconds, I was a little confused about the temporary file creation because my failing builds didn't have that. What the output told me was that the temporary batch file creates the resource update time file (mt.dep) after MT.EXE runs successfully.

My hypothesis at this point was that either LINK.EXE was creating a corrupt binary before MT.EXE worked on it, or MT.EXE itself was corrupting the binary. Using one of my projects that produced a corrupt binary, I copied out the command lines for CL.EXE, LINK.EXE, and MT.EXE from its BUILD.HTM and ran them directly from a batch file at the command line (properly set up with VCVARS.BAT). I wanted to look at what was in the Portable Executable (PE) data to see if anything was amiss. By the way, you need to read Matt Pietrek's definitive "An In-Depth Look into the Win32 Portable Executable File Format," Part 1 and Part 2; you'll learn a ton about how Windows works.

Running DUMPBIN.EXE /headers on the resulting EXE allowed me to look at the main portions. Because MT.EXE puts the manifest into the resource section of the binary, I paid special attention to it:

SECTION HEADER #4
   .rsrc name
       0 virtual size
    4000 virtual address (00404000 to 00403FFF)
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         Read Only

Do you see the problem? Here's a good resource section for comparison:

SECTION HEADER #4
    .rsrc name
      2B0 virtual size
     4000 virtual address (00404000 to 004042AF)
      400 size of raw data
     1800 file pointer to raw data (00001800 to 00001BFF)
        0 file pointer to relocation table
        0 file pointer to line numbers
        0 number of relocations
        0 number of line numbers
40000040 flags
         Initialized Data
         Read Only

Interesting! There's a resource section, but it is obviously corrupt because something is not filling out the raw data information. As the manifest goes in the resource section, MT.EXE really can't add to a section whose raw data starts at zero.
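If you would rather script that check than eyeball DUMPBIN output, here is a minimal sketch in Python that walks the PE section table and reports any section whose virtual size, raw data size, and raw data pointer are all zero, like the bad .rsrc section above. The function names read_sections and empty_sections are made up for illustration, and the field offsets come from the documented PE/COFF layout; this is not how DUMPBIN or MT.EXE do it.

```python
import struct


def read_sections(image: bytes):
    """Return (name, virtual_size, raw_size, raw_pointer) for each PE section."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    _machine, section_count, _stamp, _symptr, _nsyms, opt_size, _chars = (
        struct.unpack_from("<HHIIIHH", image, pe_offset + 4)
    )
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for index in range(section_count):
        # Each section header is 40 bytes; only the first 24 are needed here.
        name, vsize, _vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", image, table + 40 * index
        )
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((text, vsize, raw_size, raw_ptr))
    return sections


def empty_sections(image: bytes):
    """Names of sections with no virtual size and no raw data at all."""
    return [
        name
        for name, vsize, raw_size, raw_ptr in read_sections(image)
        if vsize == 0 and raw_size == 0 and raw_ptr == 0
    ]
```

Feed it the bytes of a freshly linked .EXE (before MT.EXE runs) and a non-empty result is a hint you are about to hit the same failure. Sections with uninitialized data can legitimately have zero raw data, so the check also requires a zero virtual size.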

Since the app wizard generated project is not producing corrupt binaries but I have several projects that are, it's time to look at the LINK.EXE and MT.EXE switches to see what my projects have set that is tripping either of those tools up. The easy way for me to do that is to look at the .VCPROJ files in the fantastic, and free, SourceGear DiffMerge. (I can't rave enough about DiffMerge!)

DiffMerge pointed me right to the VCLinkerTool RandomizedBaseAddress attribute in the .VCPROJ as the only real difference between the files. In the app wizard project, the value is 2; in my bad project, the value is 1. A check in the project properties shows that a value of 1 maps to the linker switch /DYNAMICBASE:NO. In my bad project, I changed the setting to /DYNAMICBASE, and the previously corrupt release build built and worked perfectly.
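With dozens of projects, diffing them one at a time gets old. As a rough sketch, the Python below pulls the RandomizedBaseAddress value out of every configuration in a .VCPROJ so you can list them side by side. It assumes the usual Visual Studio 2008 layout of Configuration elements containing Tool elements; the function name randomized_base_settings is hypothetical, and it only reports the raw attribute values without interpreting them.

```python
import xml.etree.ElementTree as ET


def randomized_base_settings(source):
    """Map each configuration name to its VCLinkerTool RandomizedBaseAddress.

    `source` is a path or a binary file object holding a .vcproj file.
    Configurations whose linker tool does not set the attribute map to None.
    """
    root = ET.parse(source).getroot()
    settings = {}
    for config in root.iter("Configuration"):
        for tool in config.iter("Tool"):
            if tool.get("Name") == "VCLinkerTool":
                settings[config.get("Name")] = tool.get("RandomizedBaseAddress")
    return settings
```

Running it over every .VCPROJ in a source tree and printing the results makes the odd projects out easy to spot.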

Once I fixed all my broken projects to use /DYNAMICBASE instead of /DYNAMICBASE:NO, all my build problems went away. As I mentioned earlier, the corrupt builds happened to be several of my unit tests, which is where you might see it as well. The exact situation in which I was getting the corrupt binary is as follows:

  1. On Windows 7 with Visual Studio 2008 SP1
  2. There's no resource file in the binary.
  3. It's a release build
  4. You are specifically not setting /DYNAMICBASE to the linker

After I spent the 20 minutes tracking this problem down, I thought maybe I should read the Windows 7 SDK ReadMe file to see if this was a known issue. Section 5.3.6, titled "Problem Running MT.EXE on Windows 7 Beta," says MT.EXE fails if the .EXE does not contain a resource (.rsrc) section, and the workaround is to add an empty .RC file to your project.

After smacking my forehead wondering if this was the exact bug I was running into, I took the batch file where I ran all the CL.EXE, LINK.EXE, and MT.EXE commands directly and removed all the /MANIFEST related switches from the LINK.EXE command line. I also commented out the call to MT.EXE and rebuilt. Of course, with MT.EXE completely out of the way, the binary wasn't corrupt.

I think I'm seeing a manifestation (pun intended!) of the bug mentioned in section 5.3.6 of the ReadMe. A little experimentation compiling and linking with the bare minimum switches necessary shows that the linker defaults to /DYNAMICBASE:NO: the optional header's DLL characteristics do not show Dynamic base when you dump the binary.
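The Dynamic base line that DUMPBIN prints corresponds to the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the optional header's DllCharacteristics field. Here is a minimal Python sketch of the same check, assuming the documented PE layout where that field sits 70 bytes into the optional header for both 32-bit and 64-bit images. The function name has_dynamic_base is my own invention for illustration.

```python
import struct

# Documented PE flag: the image opts in to address space layout randomization.
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def has_dynamic_base(image: bytes) -> bool:
    """Report whether a PE image was linked with the dynamic base flag set."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    # DllCharacteristics is a 16-bit field at offset 70 of the optional header.
    (dll_characteristics,) = struct.unpack_from("<H", image, optional_header + 70)
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
```

Calling it on the bytes of each binary in the build output gives a quick list of which ones were linked without /DYNAMICBASE.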

While I thought I was going to have a nice Case Of… story, it turns out that I should have read the Windows 7 SDK ReadMe first, which is always a very wise idea when dealing with betas. There's nothing like reproducing a known bug. However, I was able to find an additional workaround to the MT.EXE bug by setting the /DYNAMICBASE switch. In my situation that was better because if I added a .RC file to all the unit tests, I was going to have to add those files to my code installs and patches.

On Jan 24 2009 9:48 AM By jrobbins With 14 Comments

Comments (14)

  1. Thank you! Thank you! Thank you! This has been driving me CRAZY for a week! The workaround you show does work, but I am not using the Windows 7 SDK but rather straight VS2008 Professional.

  2. I can't tell you how frustrated I was getting with this issue. I was trying to run nmake from the VC++ 2005 Express CLI on Windows 7 and it would always fail with mt.exe. Yet in the VC++ GUI it worked fine!

    Thank you, thank you, thank you.

    Regards,
    Andrew

  3. OMG thank you so much...I've been fighting this issue all day and scratching my head!

  4. Roberto Dalmonte

    I run into a problem after installing Windows 7 RC build 7100.
    If I try to build with Visual studio 2008 sp1 a project I get no errors, so everything builds fine, but when I try to debug it I get the following error when I try to load an Image using a ResourceManager. Of course this very same code works successfully on Vista.
    System.Drawing.Image.get_FrameDimensionsList(). Apparently there is a problem loading a png file from the resource file. I couldn't find any suggestion so ... I've redone my machine back to Vista.
    Do you have any idea?
    Best Regards

  5. marc ochsenmeier

    Hi,

    PeStudio (www.winitor.net/en/pestudio.html) inspects any application regarding its support for ASLR...and many other flags and information hidden in the binary file.

    Hope it helps.

    Regards,
    Marc Ochsenmeier
