debugging services

Wintellect is unique in that we have a separate practice with the skills on staff to debug the nastiest and meanest bugs plaguing our industry. We have solved hundreds of “show stopper” bugs that have meant the difference between successfully shipping a product and the company going out of business or losing an important contract. When you add up what you lose in development time, client dissatisfaction, and the cost to your reputation, Wintellect’s debugging services pay for themselves. Read some of our debugging “war stories” below.

debugging case file

client
CONFIDENTIAL
status
RESOLVED
the challenge

This story involves a large national bank that contacted Wintellect in the midst of extreme pain. They had deployed a mission-critical call center application into nationwide production, but it could not stay up for more than an hour at a time without crashing. With 500 newly hired call center employees trained on only this system, the downtime was costing them $2 million per day!

the result

The application was a standard ASP.NET application which used a Windows Forms client, hosting Internet Explorer. The problem was on the ASP.NET server side and manifested itself by the server completely locking up – to the point that the only way to recover was to power off the server. Thinking it might have been a hardware problem, the client deployed the application on a different server and was able to duplicate the issue. Wintellect also learned on the initial call that the problem only occurred in production and could not be duplicated in a test environment.

As Wintellect began scouring the source code, we asked the client to get a minidump of the ASP.NET process from the production server. A minidump is essentially a photograph of the process that can be mined in a debugger to determine the state of the application. Wintellect worked with the client’s IT department to support them in the creation of the minidumps. Coordinating with the customer, we set up open phones in the call center area, since the call center representatives “could tell when there were problems because everything got slow.” Our hope was to catch the minidump right after the call center reported slowdowns so that the dump file could be written just before the server locked up.

As the IT rep was sitting at the server waiting to be told to create the minidump, we also had them running Process Explorer, the free tool from www.sysinternals.com. With Process Explorer, we could obtain performance counter data as well as Windows handle usage for the ASP.NET worker process. We instructed the IT rep to save the Process Explorer output to a text file every 30 seconds.
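As an illustration of what periodic samples like these can reveal, here is a minimal Python sketch that estimates how fast a handle count is growing. It is not the procedure Wintellect used: the function names, the sample values, and the 50-handles-per-minute threshold are all assumptions made for the example, and the handle counts are presumed to have already been extracted from the saved output.

```python
from typing import Sequence


def handle_growth_per_minute(samples: Sequence[int],
                             interval_seconds: float = 30.0) -> float:
    """Least-squares slope of handle count over time, in handles per minute.

    `samples` are handle counts taken at a fixed interval (hypothetical data;
    the 30-second default mirrors the sampling interval in the story).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    # Sample times in minutes: 0, 0.5, 1.0, ... for a 30-second interval.
    times = [i * interval_seconds / 60.0 for i in range(n)]
    mean_t = sum(times) / n
    mean_h = sum(samples) / n
    numerator = sum((t - mean_t) * (h - mean_h) for t, h in zip(times, samples))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


def looks_like_leak(samples: Sequence[int],
                    interval_seconds: float = 30.0,
                    threshold_per_minute: float = 50.0) -> bool:
    """Flag a sustained upward trend. The threshold is an arbitrary assumption."""
    return handle_growth_per_minute(samples, interval_seconds) > threshold_per_minute


if __name__ == "__main__":
    leaking = [1000, 1100, 1200, 1300]   # made-up counts that climb steadily
    steady = [1000, 1002, 999, 1001]     # made-up counts that hover in place
    print(looks_like_leak(leaking), looks_like_leak(steady))
```

A fitted slope is used rather than the difference between the first and last sample so that a single noisy reading does not dominate the estimate.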

The application was turned on and calls were routed to the idle call center representatives. For the first 30 minutes or so, the application performed flawlessly. At approximately 45 minutes the call center reported the application was starting to slow down. We instructed the IT rep to create the dump. Right after the command to create the dump was issued, the server’s UI hung; however, the disk usage light indicated disk activity. We decided to let the machine sit for a couple of hours in the hope that the dump would eventually get written. As we were waiting for the dump to finish, we dove into the testing environment and worked hard at trying to create a scenario that would duplicate this bug. Remember, Wintellect did not have direct access to the testing environment, so we were guiding the client over the phone. We had the QA department crank their automated tests to 10,000 users, hoping to stress the ASP.NET application sufficiently. Wintellect continued to read the source code, but that was proving a dead end: our customer had followed every best practice in the book and the code was beautiful.

After waiting two hours for the minidump to complete, we requested a reboot of the hung server and asked to see the state of the minidump. We got some seriously bad news. The minidump was zero bytes, which meant that it didn’t get written. Fortunately, the Process Explorer text files were saved until minute 46 so at least we had something to work with.

As soon as we opened up the mail with the Process Explorer files, we were shocked at what we saw. The ASP.NET process was obviously leaking handles like crazy, and they were all security tokens. We immediately dove back into the code to look for anything working with security tokens and found nothing. There were a couple of third-party components, but they were written in .NET, so we scanned those components’ code with .NET Reflector; however, they weren’t doing anything obvious that would use security tokens. What was even stranger was that all the security tokens were named with actual user names.

It was time to step back and look at everything about the application, from deployment on down. In examining the top level WEB.CONFIG file, we noticed the customer had set up impersonation but had used “*” instead of the usual account name. We set up a quick test with ten different user accounts on one of Wintellect’s servers and a “Hello World” ASP.NET application. Within 10 minutes we were seeing the same security token handle leak the client had.

We called our customer, explained what we were seeing, and asked why they had set the impersonation to “*”. They explained that in a future version of the product they wanted to do role-based security. We directed them to remove the impersonation for this version and re-release the application. It ran like a dream for the rest of the day.

the lesson

One of Wintellect’s strengths is our ability to leverage direct relationships with the various Microsoft product teams. Knowing the developers on the Microsoft ASP.NET team, we contacted them and explained the bug we were seeing. After two minutes of looking at the ASP.NET code the developers confirmed that this was a bug in ASP.NET and it turned into one of the first hot fixes for the product.

In working through a post mortem with our customer, we wanted to find out why they didn’t see this bug in test even though they had up to 10,000 users hitting the site. Digging into the testing configuration, it turned out that they had misconfigured their testing software and were actually only testing with one user hitting the site with 10,000 browsers.
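A check along the following lines could catch that kind of misconfiguration before a test run is trusted. This is a minimal Python sketch, not a feature of any particular load-testing tool: it assumes the list of user names assigned to each simulated session can be exported, and the function names and the 0.9 ratio are hypothetical choices for illustration.

```python
from collections import Counter
from typing import Dict, Sequence


def summarize_virtual_users(session_users: Sequence[str]) -> Dict[str, object]:
    """Count total sessions and distinct identities in a load-test run.

    `session_users` holds the user name each simulated session logged in as
    (assumed to be exported from the load-testing tool's configuration or logs).
    """
    counts = Counter(session_users)
    return {
        "sessions": len(session_users),
        "distinct_users": len(counts),
        # (user name, session count) for the most heavily reused identity.
        "busiest_user": counts.most_common(1)[0] if counts else None,
    }


def identities_are_varied(session_users: Sequence[str],
                          min_distinct_ratio: float = 0.9) -> bool:
    """True when most sessions use their own identity. The ratio is an assumption."""
    if not session_users:
        return False
    return len(set(session_users)) / len(session_users) >= min_distinct_ratio


if __name__ == "__main__":
    one_user_many_browsers = ["testuser"] * 10_000
    many_users = [f"user{i}" for i in range(10_000)]
    print(summarize_virtual_users(one_user_many_browsers))
    print(identities_are_varied(one_user_many_browsers),
          identities_are_varied(many_users))
```

Both runs above generate 10,000 sessions, so request volume alone cannot tell them apart; only the count of distinct identities shows whether per-user code paths were actually exercised.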

What made this bug especially interesting to Wintellect is that in over ten years of our Debugging Services, this is the only bug we have found that didn’t belong to the client. The important point to realize is that the bug is almost always in your code, not anywhere else.

testimonials

  • Wintellect is among the best for .NET architecture, design and development work. Their consultants b

    Senior Architect
  • We have always found the real world experience that the Wintellect consultants bring to be invaluabl

    Application Development Team (Insurance Division)