FastCGI: Debugging "The FastCGI process exited unexpectedly"

FastCGI: Debugging "The FastCGI process exited unexpectedly"

Update 17-Oct: Added .lines, fixed some spelling mistakes, added pageheap Here is the sequence of events for two back-to-back FastCGI requests to a freshly started worker process:
  1. Request 1: HTTP GET /hello.php
  2. IIS (iiscore.dll) looks up handler in system.webServer/handlers, find FastCGI module
  3. The FastCGI module (iisfcgi.dll) looks up handlers@scriptProcessor in fastCgi/applications
  4. The FastCGI module spawns a new FastCGI process
  5. The module then starts communicating with the process, according to the FastCGI protocol
  6. The FastCGI process generates and returns a response
  7. Request 2: HTTP GET /hello.php
  8. (same as above, except request 1's process is re-used)

As you can see there are many places where the FastCGI module expects to have a FastCGI process to communicate with. If the FastCGI process disappears at any of these steps, the "The FastCGI process exited unexpectedly" error occurs. If this happens frequently Rapid Failure Protection may be enabled.

Here are some common causes for this error:

  • The FastCGI process calls some sort of ExitProcess API, e.g. the C exit() function
  • An unhandled exception occurs, e.g. Access Violation

ExitProcess:

  • If the FastCGI is a script interpreter, the scripting language may support some sort of ExitProcess function and the script may call it. (though PHP's exit() function does not seem to exit the process)
  • FastCGI processes often voluntarily exit after performing a certain number of requests. (e.g. PHP 5.2.1 does so at about 250 and can be configured) If the FastCGI module is expecting (fastCgi/application@instanceMaxRequests) the FastCGI process to perform more requests then this can also be a source of this error. This error can only happen in the time between when the next request arrives and the module detects the process' exit. This happens most often on busy servers, where requests are queuing for a FastCGI process.

Access Violations are more difficult to debug. The first step is to create a simple repro, here are repros sorted from simplest to most challenging:

  • Command line, without FastCGI. (e.g. \php\php.exe \inetpub\wwwroot\hello.php)
  • Command line, with a web environment. (e.g. Ruby's Webrick, or my fake FastCGI server)
  • Single HTTP request. (e.g. recycle AppPool, GET /hello.php)
  • Only under load. (e.g. Only when wcat is hitting the site)
  • Randomly. (e.g. Only in production)
Once you have the simplest script with the simplest repro, you need to debug it. Download Windbg or the IIS debug toolkit. The rest of this post discusses Windbg debugging, though the debug toolkit is often simpler and easier.
  1. Attach the debugger:

    • For command-line failures, prefix the debugger onto the command. (e.g. c:\debuggers\ntsd -g -G \php\php.exe \inetpub\wwwroot\hello.php)
    • For AppPool debugging, you'll need to attach after starting the AppPool, e.g.
              D:\>net start w3svc
              The World Wide Web Publishing Service service is starting.
              The World Wide Web Publishing Service service was started successfully.
      
      
              D:\>start http://localhost/dne
      
              D:\>\Windows\system32\inetsrv\appcmd list wps
              WP "4012" (applicationPool:DefaultAppPool)
      
              D:\>c:\debuggers\ntsd -o -g -G -p 4012
              
  2. Reproduce the Access Violation, you should see something like this in the debugger:
    Executable search path is:
    ModLoad: 00370000 00374000   av.exe
    ModLoad: 77380000 774a6000   ntdll.dll
    ModLoad: 76a80000 76b5b000   D:\Windows\system32\kernel32.dll
    ModLoad: 75eb0000 75f5a000   D:\Windows\system32\msvcrt.dll
    (ce4.d30): Access violation - code c0000005 (first chance)
    First chance exceptions are reported before any exception handling.
    This exception may be expected and handled.
    eax=00000000 ebx=00000000 ecx=00372374 edx=00000000 esi=00000001 edi=00372374
    eip=00371190 esp=0014fe24 ebp=0014fe28 iopl=0         nv up ei pl zr na pe nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
    *** ERROR: Module load completed but symbols could not be loaded for av.exe
    av+0x1190:
    00371190 c700adbaadba    mov     dword ptr [eax],0BAADBAADh ds:0023:00000000=???
    ?????
    0:000>
    
    If you don't see an access violation occur in the debugger, then you don't have a reliable repro. Go back to creating a small simple repro. (very rarely certain AVs only happen when NOT under a debugger).
  3. Setup symbols:

    0:000> .sympath SRV*c:\localsymbols*http://msdl.microsoft.com/download/symbols
    Symbol search path is: SRV*c:\localsymbols*http://msdl.microsoft.com/download/sy
    mbols
    0:000> .lines
    Line number information will be loaded
    0:000> k
    *** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntdl
    l.dll -
    ChildEBP RetAddr
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0014fe28 00371312 av+0x1190
    *** ERROR: Symbol file could not be found.  Defaulted to export symbols for D:\W
    indows\system32\kernel32.dll -
    0014fe6c 76ac1d02 av+0x1312
    0014fe78 773b85db kernel32!BaseThreadInitThunk+0x12
    0014feb8 773b85ae ntdll!RtlInitializeExceptionChain+0x64
    0014fed0 00000000 ntdll!RtlInitializeExceptionChain+0x37
    0:000> lml
    start    end        module name
    00370000 00374000   av         (no symbols)
    7c800000 7c8f5000   kernel32   (pdb symbols)          c:\symbols\kernel32.pdb\CE
    65FAF896A046629C9EC86F626344302\kernel32.pdb
    7c900000 7c9b0000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\36515
    FB5D04345E491F672FA2E2878C02\ntdll.pdb
    
    If symbols are working, then, after "k" (stack back trace), "lml" (list loaded modules) will not show "no symbols" or "export symbols". The Microsoft symbol server only gets Microsoft symbols. You'll also need symbols for the application, e.g. If you're debugging PHP use the PHP Snapshot php bits, and then use the "Debug pack" symbols (.sympath+ \path\to\debugpack; .reload). If you're using Ruby, you'll probably need to build from source and then ".sympath+ " to wherever ruby puts the .pdb files.
  4. Share out the stack backtrace (the "k" command's output) with whoever you think may own it.
Other useful tricks:
  • Take a memory dump. (".dump /ma \av.dmp") and then share out the av.dmp file and any symbols you did not get from the Microsoft symbol server.
  • The Windows heap has a built-in memory-checking feature called pageheap. It makes debugging certain memory-usage errors extremely easy. On the downside, it slows the process down substantially and uses massive amounts of memory, so use caution before enabling it and remember to turn it off before perf testing. Here is a sample usage:
    1. ensure no ruby.exe processes exist
    2. gflags /p /enable ruby.exe /full
    3. start ruby.exe, cause av, debug
    4. gflags /p /disable ruby.exe
    5. gflags /p (should say "no application has page heap enabled")
Happy Debugging!

2 Comments

Comments have been disabled for this content.