Thursday, March 22, 2007 12:20 PM brian-murphy-booth

How to troubleshoot an IIS "Event ID: 1009" error.

An error that most IIS 6.0 administrators have probably encountered is "Event ID: 1009", which usually leads to a "503 Service Unavailable" error being displayed in the browser. A "503" usually indicates that the Application Pool has been disabled for some reason. The IIS support team frequently gets calls to help resolve this issue, and over the years I have compiled a list of steps I use to troubleshoot it. If any of the following information is confusing, please let me know and I'll clarify. The following list is broken down into sections for the various "exit codes" that appear in Event 1009.

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

*** Problem Description ***
Often when an IIS6 application pool terminates, an event with the following details is logged:

  1. The Name of the App Pool
  2. The PID
  3. The exit code.

The exit code is the most useful part of the event entry.

  Event Type: Warning
  Event Source: W3SVC
  Event Category: None
  Event ID: 1009
  Date:  1/29/2004
  Time:  10:01:14 AM
  User:  N/A
  Computer: COMPUTERNAME
  Description:
  A process serving application pool 'DefaultAppPool' terminated unexpectedly. The
process id was '3908'. The process exit code was '0x80'.
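
For anyone collecting these events from several servers, the three fields can be pulled out of the description text with a small script. This is only a sketch: the pattern is based solely on the sample description above, and `parse_event_1009` is a name made up for illustration, not part of any IIS tool.

```python
import re

# Pattern for the Event ID 1009 description text shown above.
# Whitespace is matched loosely because the message may be wrapped.
EVENT_1009 = re.compile(
    r"application pool '(?P<pool>[^']+)' terminated unexpectedly\.\s+"
    r"The process id was '(?P<pid>\d+)'\.\s+"
    r"The process exit code was '(?P<code>0x[0-9a-fA-F]+)'",
    re.IGNORECASE,
)


def parse_event_1009(description):
    """Return (pool name, PID, exit code) or None if the text does not match."""
    # Collapse line wraps so the pattern sees a single line.
    match = EVENT_1009.search(" ".join(description.split()))
    if match is None:
        return None
    return match.group("pool"), int(match.group("pid")), int(match.group("code"), 16)


sample = """A process serving application pool 'DefaultAppPool' terminated unexpectedly. The
process id was '3908'. The process exit code was '0x80'."""

pool, pid, code = parse_event_1009(sample)
print(pool, pid, hex(code))
```

The exit code is returned as an integer so that it can be compared directly against the values discussed in the Resolution section.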

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

*** Resolution ***

Constants: 

// the WAS killed the worker process
#define KILLED_WORKER_PROCESS_EXIT_CODE 0xFFFFFFFD

// the worker process exited ok
#define CLEAN_WORKER_PROCESS_EXIT_CODE  0xFFFFFFFE

// the worker process exited due to a fatal error
#define ERROR_WORKER_PROCESS_EXIT_CODE  0xFFFFFFFF
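
Some tools may print an exit code as a signed decimal rather than as hex, so it can help to see both forms of these constants side by side. A small sketch: `to_signed32` and `was_constant_name` are illustrative helper names, and only the three values come from the #define lines above.

```python
# Values copied from the #define lines above.
WAS_EXIT_CODES = {
    0xFFFFFFFD: "KILLED_WORKER_PROCESS_EXIT_CODE",
    0xFFFFFFFE: "CLEAN_WORKER_PROCESS_EXIT_CODE",
    0xFFFFFFFF: "ERROR_WORKER_PROCESS_EXIT_CODE",
}


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def was_constant_name(exit_code):
    """Name of the matching constant, or None for any other exit code.

    Accepts the code either unsigned (0xFFFFFFFF) or signed (-1).
    """
    return WAS_EXIT_CODES.get(exit_code & 0xFFFFFFFF)


for code, name in WAS_EXIT_CODES.items():
    print(f"{name}: 0x{code:08X} (signed {to_signed32(code)})")
```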

-----------------------------------------------------------
Concepts that apply to exit codes of both 0x80 and 0xffffffff
  · Make sure "Network Service" and "IWAM_MACHINE" are members of IIS_WPG
  · If a custom identity is being used for the app pool ensure it is in IIS_WPG
  · Make sure that IIS_WPG is included somehow in these User Rights assignments
(example: Everyone group includes IIS_WPG so that is sufficient)
      a. Access this computer from network
      b. Log on as batch job
      c. Bypass traverse checking
  · Ensure that IIS_WPG or members of that group are not in any of the
corresponding "Deny" User Rights
  · Ensure that "NT AUTHORITY\Authenticated Users" and "NT AUTHORITY\Interactive"
are part of the "Users" group.
  · Use FileMon.exe and RegMon.exe to identify ACCESS DENIED results.
  · If customer has more than ~60 app pools with unique identities, set the
following key:
      HKLM\System\CurrentControlSet\Services\W3SVC\Parameters\UseSharedWPDesktop
(REG_DWORD with value of 1)
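
If the key has to be set on several servers, one option is a .reg file. The sketch below only builds the text of such a file for the value named above. `reg_dword_fragment` is a hypothetical helper, and the output should be reviewed before it is imported on a server.

```python
def reg_dword_fragment(key_path, value_name, value):
    """Render the text of a .reg file that sets one REG_DWORD value under HKLM."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[HKEY_LOCAL_MACHINE\\{key_path}]\n"
        f'"{value_name}"=dword:{value:08x}\n'
    )


text = reg_dword_fragment(
    r"System\CurrentControlSet\Services\W3SVC\Parameters",
    "UseSharedWPDesktop",
    1,
)
print(text)
```

Generating text instead of writing to the registry directly keeps the change reviewable and means the script can be run on any machine, not only on the IIS server.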
-----------------------------------------------------------
The process exit code was '0x80'.
  In certain cases when w3wp.exe goes away IIS will attempt to determine the exit code by calling the "GetExitCodeProcess" API.
  If the reason cannot be determined by this API then the code returned is 0x80.
  "0x80" means "ERROR_WAIT_NO_CHILDREN" which means "There are no child processes to wait for"

  This typically means that W3WP.exe never started at all, which could be related to
User Rights or NTFS permissions.
  If it is NTFS permissions, the AppPool identity doesn't have read
permission to the w3wp.exe file and/or supporting DLLs.
  - Use FileMon to troubleshoot.
  If it is User Rights related, the AppPool identity failed to log on.
  - Check the 3 User Rights listed above.
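
Since 0x80 is simply the Win32 error ERROR_WAIT_NO_CHILDREN (128), small exit codes can be checked against the Win32 system error list. Below is a sketch with a handful of well-known values. The table is deliberately tiny, `guess_win32_name` is an illustrative name, and a match is only a guess, because a process is free to exit with any number it likes.

```python
# A few Win32 system error codes (values as defined in winerror.h).
WIN32_ERRORS = {
    0x1: "ERROR_INVALID_FUNCTION",
    0x2: "ERROR_FILE_NOT_FOUND",
    0x3: "ERROR_PATH_NOT_FOUND",
    0x5: "ERROR_ACCESS_DENIED",
    0x80: "ERROR_WAIT_NO_CHILDREN",
}


def guess_win32_name(exit_code):
    """Return the Win32 error name for an exit code, or None if not in the table."""
    return WIN32_ERRORS.get(exit_code)


for code in (0x80, 0x3, 0x7B):
    print(hex(code), guess_win32_name(code))
```

On a real server, err.exe does the same lookup against the full set of error codes.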
-----------------------------------------------------------
The process exit code was '0xffffffff'.
  Means the W3WP.exe process partially started but could not load a dependency for
some reason.
  This is either permissions/security related or due to mismatched DLL's.
  Try running the AppPool as "Local System"
  - If this works it is a permissions problem for the AppPool Identity. Check the
following "Scenarios" section then follow the "concepts" section above
  - If this fails using System it is a mismatch or missing DLL problem. Check
scenario #1 below then follow the "Loader Snaps" section

Scenarios for 0xffffffff

  1. The first thing that should be checked is whether a Windows service pack is installed.
    - If it is, check whether "c:\windows\System32\inetsrv\w3core.dll" is the
RTM version or the SP1 version.
    - If it is the RTM version then reapplying the service pack should fix the problem.

  2. Is this IIS server a DC and have you reinstalled IIS on another DC?
    I have had two cases where there was a permissions failure reading nodes in the
metabase.
    When IIS is installed it creates an IIS_WPG group with a somewhat random SID.
    Permissions in the metabase are then set using this unique version of
IIS_WPG.
    When IIS is removed from a DC, and Win2k3 SP1 is not installed, the IIS_WPG
group is deleted, even though it is used by all the other DCs running IIS.
    When IIS is then added back on to this DC, a new IIS_WPG group is created that
has a new SID.
    The pre-existing permissions (older SID) in the other metabases will show
"Unknown User" for the permissions, and spawning W3WP.exe under anything other than
SYSTEM will fail with 0xffffffff.
    - I found this by enabling Tracing then searching through source code.
    - If you get a Debug Trace using DbgView.exe look for a line that says:
        w3core!W3_SERVER::Initialize [\w3server.cxx @ 526]:Error reading
UseDigestSSP property.  hr = 80070005
    - However, it would be easy enough to skip Debug Tracing and just look at the
following nodes using Metabase Explorer.
    - IIS_WPG needs permissions to the following nodes:
      1. MachineName - Read
      2. w3svc/1/Filters (or any other filters node) - Read/Write
      3. w3svc/AppPools - Special (Query Unsecure Property)

  3. Has the customer modified default DCOM security?
    - We have run into an issue where the customer had modified
the default Launch and Activation Permissions in Component Services.
    - The customer removed Local Launch and Local Activation for the Everyone
group.
    - Here is the section of loader snaps output that is a hint of this scenario:

LDR: LdrGetProcedureAddress by NAME - CoMarshalInterface
LDR: LdrGetProcedureAddress by NAME - CoUnmarshalInterface
LDR: LdrGetProcedureAddress by NAME - CoReleaseMarshalData
(15c8.914): Unknown exception - code 80070005 (first chance)
LDR: UNINIT LIST
          (1) [iisres.dll] c:\windows\system32\inetsrv\iisres.dll (0) deinit 0
LDR: Unmapping [iisres.dll]
LDR: Derefcount IISMAP.dll (0)
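
The "hr = 80070005" in the debug trace and the "code 80070005" in the loader snaps output are the same HRESULT, and it can be taken apart by hand. A minimal sketch of that decoding follows. `decode_hresult` is an illustrative name, while the facility and error values are the standard winerror.h ones.

```python
FACILITY_WIN32 = 7        # HRESULT wraps a Win32 error code
ERROR_ACCESS_DENIED = 5   # Win32 "Access is denied."


def decode_hresult(hr):
    """Split a 32-bit HRESULT into (is_failure, facility, code)."""
    hr &= 0xFFFFFFFF
    is_failure = bool(hr >> 31)       # top bit set means failure
    facility = (hr >> 16) & 0x1FFF    # same mask as the HRESULT_FACILITY macro
    code = hr & 0xFFFF                # low 16 bits carry the error code
    return is_failure, facility, code


failed, facility, code = decode_hresult(0x80070005)
print(failed, facility == FACILITY_WIN32, code == ERROR_ACCESS_DENIED)
```

So 80070005 is a failure in the Win32 facility with code 5, which is access denied. That fits both scenarios above, since each one comes down to missing permissions.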

-----------------------------------------------------------
The process exit code was '0xc0000005'.
  This is a crash.
  Troubleshoot using a Debugger. (Debug Diagnostics)
-----------------------------------------------------------
The process exit code was '0xff'.
  Process shut down "gracefully" for some reason.
  Troubleshoot as a crash and see who called TerminateProcess or ExitProcess.
-----------------------------------------------------------
The process exit code was '0x0'.
  This would be typical if you had w3wp.exe configured to launch under a debugger
and you never did a "Go" in the debugger window.
  Launch gflags.exe and clear the debugger setting for w3wp.exe.
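
Pulling the exit-code sections above into one lookup can be handy when triaging many events. A sketch only: the `TRIAGE` table and `triage` function are illustrative names, and the hint strings are condensed from this post rather than taken from any Microsoft tool.

```python
# Exit code -> first troubleshooting step, condensed from the sections above.
TRIAGE = {
    0x80: "w3wp.exe probably never started: check User Rights and NTFS permissions",
    0xFFFFFFFF: "dependency failed to load: try Local System, then permissions or loader snaps",
    0xC0000005: "crash: troubleshoot with a debugger (Debug Diagnostics)",
    0xFF: "treat as a crash: find who called TerminateProcess or ExitProcess",
    0x0: "check gflags.exe for a debugger configured for w3wp.exe",
}

NOT_LISTED = "not covered here: find out who called ExitProcess"


def triage(exit_code_text):
    """Map an exit code string such as '0x80' to a first troubleshooting step."""
    code = int(exit_code_text, 16)
    return TRIAGE.get(code, NOT_LISTED)


for text in ("0x80", "0xffffffff", "0x3"):
    print(text, "->", triage(text))
```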
-----------------------------------------------------------
Loader Snaps Section - These steps are not for getting memory dumps. They explain
how to easily find the reason that a module (DLL) failed to load, which is one cause of a 0xffffffff.

  1. Send customer "gflags.exe" (can be obtained from Debugging Tools for
Windows)
  2. Double-click gflags.exe
  3. Go to the "Image File" tab
  4. Enter "w3wp.exe" for the image then press "tab"
  5. Put a check in "Show loader snaps"
  6. Put a check in "Debugger" and enter
        "NTSD.exe -logo c:\temp\LDR.log -g -G -r 0" (<-- that is a zero)
  7. Ensure the folder from the previous step has adequate read/write permissions
for the Identity that is launching the AppPool.
  8. Ensure that the AppPool is enabled
  9. Reproduce the problem. - At this point w3wp.exe should spawn under NTSD.exe,
write to the log, then shut down.
  10. Have customer send the LDR.log output found in the c:\temp folder.
  11. Search for the text of "exception" or "failed" (you'll probably find what you
want near the end of the log)
  12. Look up the listed error number using err.exe or hrplus.exe. The DLL that it
is having trouble with is on the line just before the "exception/failed" line.
  13. Take the logical steps to address whatever the error is describing.

  Notes
    - Don't forget to reverse these settings when done.
    - If you identify a DLL that is the wrong version, simply reapplying the
relevant hotfix or service pack will usually resolve the issue.
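
Steps 11 and 12 can be sketched as a small script: scan the log for "exception" or "failed" and keep the line just before each match. `find_loader_failures` is an illustrative name, and the sample lines are taken from the loader snaps excerpt shown earlier in this post.

```python
def find_loader_failures(log_lines):
    """Return (previous line, matching line) pairs for 'exception' or 'failed'."""
    hits = []
    previous = ""
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines so "previous" is always real content
        lowered = line.lower()
        if "exception" in lowered or "failed" in lowered:
            hits.append((previous, line))
        previous = line
    return hits


sample = """\
LDR: LdrGetProcedureAddress by NAME - CoUnmarshalInterface
LDR: LdrGetProcedureAddress by NAME - CoReleaseMarshalData
(15c8.914): Unknown exception - code 80070005 (first chance)
LDR: UNINIT LIST
""".splitlines()

for before, hit in find_loader_failures(sample):
    print("line before:", before)
    print("match:      ", hit)
```

For a real log, pass the open file object for c:\temp\LDR.log instead of `sample`. The last few pairs returned are usually the interesting ones.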
 


<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Comments

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Thursday, August 30, 2007 8:45 AM by Jay

Excellent post!! Had a permissions issue in the IIS meta database, it turns out. Thanks a lot, saved me hours of searching for this permissions issue!! Cheers, Jay

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Wednesday, November 14, 2007 8:02 AM by Rob

Have you any information on the exit code of 0x1?

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, November 30, 2007 6:07 PM by Chris Osterdock

Completely awesome! - Server was patched to SP1 however the dll wasn't listed as being patched - I applied SP2 and the problem went away - thanks!!

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, January 25, 2008 5:10 PM by Ed Grant

This is very useful in understanding reasons for 1009s. We were getting 1009 events in the system log from a worker process. The application is Livelink 9.7 with IIS 6.0 as the web server.

The w3wp.exe process constantly died with exit code 0x2 and eventually exit code 0xffffffff which was fatal. This means that the process was unable to load a dependency which proved to be a corrupt registry entry.

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Thursday, February 21, 2008 8:53 AM by brian-murphy-booth

Rob,

Anybody can call "ExitProcess" and pass whatever number they want to the function. Let's say I decide that my app will have 3 possible exit codes. 0 means success perhaps... and 1 means warning... and maybe 2 means error. It's all up to what I want to put in there. So maybe that 0x1 means nothing to anybody except the code that used it. But if Microsoft code called ExitProcess then we'd probably pick a code that makes sense (in theory anyway... right?) Most likely 0x1 is this: ERROR_INVALID_FUNCTION

What you'd probably need to do to troubleshoot that code is find out "who" is calling ExitProcess. If you install DebugDiag 1.1 and set up a crash rule that includes a breakpoint on kernel32!ExitProcess, then I think you'd catch that. Keep in mind, however, that with a debugging breakpoint on ExitProcess you're going to get DMPs for things that you don't care about, such as AppPool recycles or IIS restarts. After you have your DMPs you can do a "Crash/Hang Analysis" on them and an MHT report will be generated. Look for the callstack that is doing ExitProcess and then at which DLL called into ExitProcess, and that might give some insight into the problem.

Debugging is a skill that takes years to learn. If you get that far and can't make sense of what you see in the report then opening a support incident with Microsoft would get you the rest of the way.
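
To see that an exit code is just whatever number the exiting code picked, here is a small sketch that has nothing to do with IIS. It starts a child Python process that exits with a chosen number and prints what the parent observes. `run_and_get_exit_code` is a made-up helper name.

```python
import subprocess
import sys


def run_and_get_exit_code(code):
    """Start a child Python process that exits with `code`; return what the parent sees."""
    child = subprocess.run([sys.executable, "-c", f"import sys; sys.exit({code})"])
    return child.returncode


for chosen in (0, 1, 2, 3):
    print(chosen, "->", run_and_get_exit_code(chosen))
```

The parent only ever sees the number. What it means is known only to the code that chose it, which is why finding the caller matters.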

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, May 16, 2008 10:52 AM by Steven

I am having the same issue but with an exit code of 0x0. Any thoughts?

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Tuesday, May 27, 2008 9:13 AM by brian-murphy-booth

Steven,

In my experience, I have encountered an exit code 0x0 when a debugger was attaching to w3wp.exe, then exiting. Typically a debugger exiting will terminate the host process as well. As the debugger exited on my machine, an exit code of 0 was being logged.

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, June 6, 2008 7:41 AM by Dan

Getting tons of W3SVC 1009 errors all with exit code of "0xe0434f4d", along with some W3SVC 1013 errors.

Our users are experiencing unresponsive page loads, but is sporadic per user (since we are on a 2-server load balanced cluster) and it only seems to affect one server at a time. The unresponsiveness will last 20min to 2hrs...most of the time we are recycling the "problem server" to correct the problem.

Is troubleshooting these 1009 and 1013 errors the right track?

Does anyone have info on the exit code of "0xe0434f4d"?

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, July 25, 2008 12:36 PM by brian-murphy-booth

Dan

The 0xe0434f4d exception is a generic code that .NET uses to notify the OS that an "exception" occurred. I think technically a .NET exception isn't really an exception, so .NET calls kernel32!RaiseException and passes that generic code. The real goal is to allow debuggers to hear that exception and pause the host process if applicable so we can figure out what's going on and fix the problem. With that in mind... if your process is exiting with 0xe0434f4d, that means .NET called RaiseException but also decided that the underlying exception type was serious enough to warrant closing the process. In ASP.NET this usually only happens when there is a stack overflow. Or... if you are using .NET 2.0 and an exception occurred on a thread not directly under the control of ASP.NET, instead of doing "ExitThread" (v1.1 behavior) after the exception, it does "ExitProcess". Regardless of the actual root of the problem, you will want to get a crash dump using DebugDiag. And you will need to add the kernel32!ExitProcess breakpoint.
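
As a side note on why that number looks the way it does: the low three bytes of 0xe0434f4d are printable ASCII. A tiny sketch, with `exception_code_tag` as an illustrative name:

```python
def exception_code_tag(code):
    """Return the low three bytes of a 32-bit code as ASCII text."""
    low = code & 0x00FFFFFF
    return low.to_bytes(3, "big").decode("ascii", errors="replace")


print(hex(0xE0434F4D >> 24), exception_code_tag(0xE0434F4D))
```

The top byte is 0xe0 and the remaining bytes spell "COM", which makes the code easy to recognize in an event log.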

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Sunday, August 3, 2008 4:11 PM by Venkat

Hi,

In my application server, I received a warning from source W3SVC with an Event ID of 1009. It had a description of "A process serving application pool 'SUCA' terminated unexpectedly. The process ID was '9845'. the exit code was '0x3'. "

Every time I try to access the pages hosted on this application, I get this error message and the page does not load. In this application website, I have a Webgate ISAPI filter (webgate.dll - used for SSO - Oracle product, Oblix) loaded. When I remove the filter, the application works just fine, but if I reload the filter, I start getting this error.

What would this mean? An error with the webgate or an error with IIS 6.0? I tried to look up the reasons for the exit code 0x3 and was not able to find it anywhere. Any help in this regard is highly appreciated.

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Wednesday, December 24, 2008 11:10 AM by Ethan

This is absolutely the best post I've seen on the error condition you describe, Brian. Specifically, it helped me on a sysprepped VM. After the computer rename, I had to modify metabase permissions to get a web site to run with a unique identity. I sincerely hope that Microsoft takes your post as an excellent example of how to enable the technical community to resolve problems on their own. Imagine how much time and effort it would save the end-user and how much money it would save Microsoft with common PSS calls.

Thanks again!

Ethan Wilansky

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Friday, January 2, 2009 9:51 AM by brian-murphy-booth

Venkat,

Exit code 0x3 is most likely ERROR_PATH_NOT_FOUND. To understand why this is happening it is best to know "who" (what DLL) called ExitProcess using that code. The only way to get the "who" is to get a memory DMP of w3wp.exe using something like DebugDiag.

# re: How to troubleshoot an IIS "Event ID: 1009" error.

Monday, June 15, 2009 9:43 AM by brian-murphy-booth

Yes, the list above includes 0x80.