APPLICATION_FAULT_NULL_POINTER_READ_INVALID_POINTER_READ – IIS Worker Process stopped working and was closed

Ran into another weird issue today. An ASP.NET 4.0 website started throwing an exception, and the weird part was that only one of the pages would throw a network error (a tcp_error, not a 500 or 503) on the first hit. The rest of the site worked normally up until you navigated to that particular page; after that, the whole site became unavailable and started throwing a 503. The Application EventLog did provide clues to the failure, I just wasn't looking hard enough. If you are only looking for the solution, skip the rest of this section and jump to the last paragraph.

Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7a5f8
Faulting module name: clr.dll, version: 4.7.2053.0, time stamp: 0x58fa6bb3
Exception code: 0xc0000005
Fault offset: 0x00192f7b
Faulting process id: 0x96c
Faulting application start time: 0x01d2fe12bb42ecc2
Faulting application path: C:\Windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll

Attaching WinDbg to a running instance of w3wp and capturing a stack trace confirmed that the CLR itself had crashed. A bit of googling suggested this might be heap corruption, but nothing concrete. A similar issue had been reported to Microsoft, but it appears to still be open at this point.

0:032> !analyze -v

STACK_TEXT: 
12d6e400 7079005e 12d6e740 0000000a 5516af7c clr!BlobToAttributeSet+0x63
12d6e878 6f0f910b 12d6e894 00000001 07625170 clr!COMCustomAttribute::GetSecurityAttributes+0x265
12d6e8a4 6f0f8f56 12d6e8bc 00000001 6f225210 mscorlib_ni+0x3c910b
12d6e8cc 6f16d7ea 03624f34 6d1af3e3 00000000 mscorlib_ni+0x3c8f56
12d6e8fc 6d1af39e 00000000 03624f34 12d6e948 mscorlib_ni+0x43d7ea
12d6e90c 6d3b5145 12d6e8b4 0b31f594 70787cd8 System_Web_Extensions_ni+0x5f39e
12d6e948 6d3b4c0c 036183dc 12d6ecb8 12d6e970 System_Web_Extensions_ni+0x265145
12d6e958 10444234 0361d190 00000000 00000000 System_Web_Extensions_ni+0x264c0c
WARNING: Frame IP not in any known module. Following frames may be wrong.
12d6e970 104416a3 00000000 00000000 00000000 0x10444234
12d6ecc0 015ff870 00000000 00000000 00000000 0x104416a3
12d6ed04 015ff735 00000000 00000000 00000000 0x15ff870
12d6ed2c 6adfc921 036183dc 036183dc 00000000 0x15ff735
12d6ed5c 6adfc8a9 00000001 03472ad8 03472ad8 System_Web_ni+0x21c921
12d6ed94 6adfc857 036183dc 12d6edb4 6adfc83b System_Web_ni+0x21c8a9
12d6eda0 6adfc83b 00000000 0761bec8 034b5aa8 System_Web_ni+0x21c857
12d6edb4 015ff676 0761bec8 036183dc 12d6ee08 System_Web_ni+0x21c83b
12d6edc4 6adff96d 12d6ee08 6adf637b 034b243c 0x15ff676
12d6ee08 6adb9ec6 6adcc959 6f0e1af6 00000000 System_Web_ni+0x21f96d
12d6ee44 6adc813d 12d6eef0 00000000 00000000 System_Web_ni+0x1d9ec6
12d6ef28 6adba850 00000000 00000000 12d6ef80 System_Web_ni+0x1e813d
12d6ef3c 6adc6ccb 034a7e54 12d6ef74 12d6f02c System_Web_ni+0x1da850
12d6ef8c 6adbb6ef 0761bec8 47cb644c 706ff6e8 System_Web_ni+0x1e6ccb
12d6f084 6adbb39f 00000000 0000000c 00000002 System_Web_ni+0x1db6ef
12d6f0ac 00c1de6a 00000000 0000000c 6ae10878 System_Web_ni+0x1db39f
12d6f0dc 71acac76 0124106c 01288c08 0000000c 0xc1de6a
12d6f100 71acbc75 0128990c 712e3863 00000000 webengine4!W3_MGD_HANDLER::ProcessNotification+0x62
12d6f12c 71acac15 00000080 00000000 02feb79c webengine4!W3_MGD_HANDLER::DoWork+0x32a
12d6f174 71b29a88 00000080 00000000 02feb79c webengine4!RequestDoWork+0x39f
12d6f194 712e03b5 012883b0 01288408 012883b4 webengine4!CMgdEngHttpModule::OnExecuteRequestHandler+0x18
12d6f1a8 712e11c0 02feb79c 012883b0 00000000 iiscore!NOTIFICATION_CONTEXT::RequestDoWork+0x128
12d6f220 712cdf13 00000000 00000000 00000000 iiscore!NOTIFICATION_CONTEXT::CallModulesInternal+0x305
12d6f268 712d068b 00000000 00000000 00000000 iiscore!NOTIFICATION_CONTEXT::CallModules+0x28
12d6f28c 712d267b 00000000 00000000 00000000 iiscore!W3_CONTEXT::DoStateRequestExecuteHandler+0x36
12d6f4f4 712d62e4 00000000 00000000 00000001 iiscore!W3_CONTEXT::DoWork+0xd7
12d6f514 712d633d 00000000 00000000 712d0765 iiscore!W3_MAIN_CONTEXT::ContinueNotificationLoop+0x1f
12d6f528 712d07da 00000000 012883b0 12d6f54c iiscore!W3_MAIN_CONTEXT::ProcessIndicateCompletion+0x1f
12d6f538 71b25abd 00000000 12d6f65c 71b27e10 iiscore!W3_CONTEXT::IndicateCompletion+0x75
12d6f54c 71b27e32 00000000 0303cc58 12d6f59c webengine4!W3_MGD_HANDLER::IndicateCompletion+0x45
12d6f55c 6ae0e181 01288c08 12d6f65c 47cb644c webengine4!MgdIndicateCompletion+0x22
12d6f59c 6adbb892 47cb644c 706ff6e8 12d6f778 System_Web_ni+0x22e181
12d6f690 6adbb39f 00000004 0000000d 0303cc58 System_Web_ni+0x1db892
12d6f6b8 00c1de6a 00000004 0000000d 6ae10878 System_Web_ni+0x1db39f
12d6f6d8 70724861 12d6f898 00000010 00c1de48 0xc1de6a
12d6f738 70724792 12d6f82c 5516bf00 0fcb6236 clr!UM2MThunk_Wrapper+0x76
12d6f804 707246fa 00000002 707247f0 12d6f82c clr!Thread::DoADCallBack+0xbc
12d6f85c 00c1dec3 0fcb6220 00c1de48 12d6f898 clr!UM2MDoADCallBack+0x92
12d6f890 71acac76 00000000 01288c08 0000000d 0xc1dec3
12d6f8b4 71acacd3 70d6d368 03053068 01288c08 webengine4!W3_MGD_HANDLER::ProcessNotification+0x62
12d6f8c8 70715423 01288c08 5516bfdc 70d6d368 webengine4!ProcessNotificationCallback+0x33
12d6f914 70711d53 12d6f945 12d6f947 0303cc58 clr!UnManagedPerAppDomainTPCount::DispatchWorkItem+0x1d6
12d6f92c 707119f9 5516be3c 707118d0 00000000 clr!ThreadpoolMgr::ExecuteWorkRequest+0x4f
12d6f994 7089de11 00000000 ffffffff ffffffff clr!ThreadpoolMgr::WorkerThreadStart+0x3d3
12d6fcb4 75ff336a 030533c0 12d6fd00 775f9902 clr!Thread::intermediateThreadProc+0x55
12d6fcc0 775f9902 030533c0 65b1ffb7 00000000 kernel32!BaseThreadInitThunk+0xe
12d6fd00 775f98d5 7089ddc0 030533c0 ffffffff ntdll!__RtlUserThreadStart+0x70
12d6fd18 00000000 7089ddc0 030533c0 00000000 ntdll!_RtlUserThreadStart+0x1b

The next step was to load sos.dll (after launching WinDbg as an Administrator via Right Click, Run as administrator):

.load C:\Windows\Microsoft.NET\Framework\v4.0.30319\sos.dll
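
When the debugger is attached to a live process on the same machine, a handy alternative is to let WinDbg pick up sos.dll from the folder of the clr.dll that is already loaded:

.loadby sos clr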

Reviewing the .NET stack trace using !clrstack:

0:034> !clrstack
OS Thread Id: 0xee0 (34)
Child SP IP Call Site
13a2e264 6eb52f7b [HelperMethodFrame_2OBJ: 13a2e264] System.Reflection.PseudoCustomAttribute._GetSecurityAttributes(System.Reflection.RuntimeModule, Int32, Boolean, System.Object[] ByRef)
13a2e6a0 6bac910b System.Reflection.PseudoCustomAttribute.GetCustomAttributes(System.Reflection.RuntimeAssembly, System.RuntimeType, Boolean, Int32 ByRef)
13a2e6cc 6bac8f56 System.Reflection.CustomAttribute.GetCustomAttributes(System.Reflection.RuntimeAssembly, System.RuntimeType)
13a2e6ec 6bb3d7ea System.Reflection.RuntimeAssembly.GetCustomAttributes(Boolean)
13a2e6f4 662cf3e3 System.Web.UI.AssemblyCache.SafeGetAjaxFrameworkAssemblyAttribute(System.Reflection.ICustomAttributeProvider)
13a2e71c 662cf39e System.Web.UI.AssemblyCache.GetAjaxFrameworkAssemblyAttribute(System.Reflection.Assembly)
13a2e72c 664d5145 System.Web.UI.ScriptManager.get_DefaultAjaxFrameworkAssembly()
13a2e768 664d4c0c System.Web.UI.ScriptManager..ctor()
13a2e778 01f2dfaa ASP.JethroGibbs_aspx.__BuildControlScriptManager1()
13a2e788 01f2afc6 ASP.JethroGibbs_aspx.__BuildControlJB()
13a2eb10 01f298af ASP.JethroGibbs_aspx.__BuildControlTree(ASP.JethroGibbs_aspx)
13a2eb24 01f2939e ASP.JethroGibbs_aspx.FrameworkInitialize()
13a2eb30 66cfc921 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
13a2eb64 66cfc8a9 System.Web.UI.Page.ProcessRequest()
13a2eb98 66cfc857 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
13a2eba4 66cfc83b System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
13a2ebb8 01f29335 ASP.JethroGibbs_aspx.ProcessRequest(System.Web.HttpContext)
13a2ebbc 66cff96d System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
13a2ec00 66cb9ec6 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
13a2ec40 66cc813d System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception)
13a2ed20 66cba850 System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext, System.AsyncCallback)
13a2ed38 66cc6ccb System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)
13a2ed88 66cbb6ef System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)
13a2ed8c 66cbb39f [InlinedCallFrame: 13a2ed8c] 
13a2ee84 66cbb39f System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)
13a2f360 0123de6a [InlinedCallFrame: 13a2f360] 
13a2f35c 66d0e181 DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)
13a2f360 66cbb892 [InlinedCallFrame: 13a2f360] System.Web.Hosting.UnsafeIISMethods.MgdIndicateCompletion(IntPtr, System.Web.RequestNotificationStatus ByRef)
13a2f394 66cbb892 System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)
13a2f398 66cbb39f [InlinedCallFrame: 13a2f398] 
13a2f490 66cbb39f System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)
13a2f568 0123de6a [ContextTransitionFrame: 13a2f568]

Looking into the ScriptManager didn’t yield much, since the application itself hadn’t changed in a while. Going back to the CLR and System_Web_ni.dll, I noticed the files had been updated a few hours earlier (I should have spotted the CLR version in the EventLog, my bad).

Using the .NET Framework Setup Verification Tool, I noticed that .NET Framework 4.7 was installed. We had only certified the application to run on 4.6.1. .NET Framework 4.7 wasn’t listed under “View installed updates” but as a standalone component. The only drawback of removing it is that the AppPool’s .NET Framework version gets reset to 2.0 after the uninstall, so it is worth taking a screenshot of your IIS AppPool framework versions before uninstalling. I downloaded the standalone .NET Framework 4.6.1 installer, uninstalled 4.7, and installed 4.6.1. A reboot, reconfiguration of the AppPool, and an IISReset returned the application to a working state.
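
If you prefer the command line over screenshots, appcmd can capture and restore the same settings (the pool name below is just a placeholder):

%windir%\system32\inetsrv\appcmd.exe list apppool
%windir%\system32\inetsrv\appcmd.exe set apppool "MyAppPool" /managedRuntimeVersion:v4.0

The first command lists every application pool along with its managed runtime version; the second sets a pool back to v4.0 after the uninstall has reset it to 2.0.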

 


Memory Leak – Microsoft.Reporting.WinForms.LocalReport

Several weeks ago I was asked to assist with a sluggish ASP.NET C# MVC application. The application is an OLTP system that processes several thousand orders a day, hosted on a 16-core Windows 2012 R2 64-bit machine. The developer told me that the application performs acceptably for a while and then becomes sluggish (behavior that suggests some kind of leak). At times it becomes extremely slow, and rebooting the machine solves the problem. The problem showed up more frequently on days with above-average volumes.

After downloading Process Explorer from Sysinternals, I looked at the w3wp.exe processes (there were several, but two of them stood out). Process Explorer does not display handle information by default; I will probably blog about that separately. There are many articles out there that explain how to look at handles using Process Explorer, and this should get you started. Here is another excellent blog explaining the theoretical limit on handles. It is a theoretical limit because there are many other thresholds an application can hit that cause severe performance degradation first.
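
For a quick command-line view of the same information, the Sysinternals handle utility (which makes another appearance later in this post) can summarize a process’s handles by type (the PID below is a placeholder):

handle64.exe -s -p 2412 -accepteula

The -s switch prints a count of each handle type instead of listing every handle, which makes it easy to snapshot the totals periodically and watch for growth.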

I knew we had a potential handle leak. There were also other issues within the application that were not related to the handle leak. The challenge was to isolate the handle leak from defects in the program logic, third-party hardware behaviors (like non-genuine printer cartridges), and various other user errors. Debugging a live system with several hundred users connected wasn’t happening, but we were able to get a 4.5 GB memory dump.
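
For anyone following along, Sysinternals ProcDump is one way to capture a full dump of a specific worker process (the PID below is a placeholder):

procdump.exe -ma 2412 w3wp_full.dmp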

When I moved the memory dump to a machine where I could analyze it, I found that the production server was running an older version of the .NET Framework, and the Microsoft symbol server wasn’t much help. I ended up copying the complete Framework64 folder from the server to the local machine (into a separate folder, not the local Microsoft.NET\Framework folder) and loaded SOS using the .load command.

[Screenshot: loading sos.dll]
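
A sketch of that setup, assuming the server’s Framework64 folder was copied to C:\dumps (the path is a placeholder): point the debugger’s data access loader at the copied folder, then load SOS from the same place:

.cordll -ve -u -lp C:\dumps\Framework64\v4.0.30319
.load C:\dumps\Framework64\v4.0.30319\sos.dll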
Using the .time command to compare the overall CPU time consumed by the process against the system uptime showed that this wasn’t CPU contention.

[Screenshot: .time output]
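
For reference, the commands for that check: .time reports the process uptime and its total kernel and user CPU time, while !runaway breaks the CPU time down per thread:

.time
!runaway 7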

The !dumpheap command displays the managed heap. With -stat, the output contains the method table, object instance count, total memory allocated, and the type name, from left to right. !dumpheap -stat reported about 164k lines of output and 9.7 million object references. Many of these objects pointed to dynamically created report expressions.

[Screenshot: !dumpheap -stat output]

Typically the next logical step is to use !dumpheap -mt <method table address> to display all the individual object instances and then review individual objects to understand them better.

[Screenshot: !dumpheap output]
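
The drill-down usually looks like this (addresses are placeholders):

!dumpheap -mt <method table address>
!do <object address>
!gcroot <object address>

!do dumps the fields of a single instance, and !gcroot walks the chain of references keeping it alive.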

Looking at loaded modules using the !eeheap -loader command showed that the worker process had 2622 AppDomains (with a loader heap size of 537 MB). Dumping one of the modules using the !DumpModule command pointed to a dynamically loaded expression library associated with SQL Server Reporting Services.

[Screenshots: !eeheap -loader output]
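
For reference, the commands behind those screenshots (the module address is a placeholder):

!eeheap -loader
!DumpModule -mt <module address>

The -mt switch additionally lists the types defined in and referenced by the module.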

At this point, I knew the issue might be related to the dynamically loaded expression assembly and its associated AppDomain failing to unload. What was keeping the AppDomain from unloading? The dynamic expression library is used by the LocalReport object, which is part of the Microsoft.Reporting.WinForms namespace.

Further research showed that creating a new AppDomain is by design. The developers at Microsoft had identified an issue with the dynamic expression library being loaded into the default AppDomain and using up more memory each time a LocalReport object was created. An assembly, once loaded into an AppDomain, can never be unloaded; hence the dynamic assembly is loaded into a new AppDomain which, if all goes well, can be safely unloaded. As per the documentation, calling LocalReport.ReleaseSandBoxAppDomain should unload that AppDomain. Unfortunately, the unload never happens because there is a problem with the internal implementation of the Dispose method.
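
For context, here is a minimal sketch of how the API is meant to be used (the report path is hypothetical); as described above, the explicit release call did not actually free the AppDomain in the build we were running:

' Minimal sketch (report path is hypothetical): render a report and ask the
' sandbox AppDomain to unload afterwards. As described above, in the build we
' were running this call did not actually free the AppDomain.
Using report As New Microsoft.Reporting.WinForms.LocalReport()
    report.ReportPath = "C:\Reports\Label.rdlc"
    Dim pdf As Byte() = report.Render("PDF")
    ' ... write pdf to the response / send it to the printer ...
    report.ReleaseSandBoxAppDomain()
End Using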

Using a .NET memory profiler and JetBrains dotPeek, I was able to identify the event delegate:

[Screenshot: event delegate in the profiler]

Looking at the implementation, the event handlers are not removed in the Dispose method, and this in turn keeps the AppDomain from unloading. Hopefully this gets resolved in a future release. For now, we are recycling the application pool more frequently to work around the memory leak, and the developers are looking at an alternate way to print the labels.

[Screenshot: Dispose implementation]
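
A scheduled recycle can also be configured from the command line; the pool name and four-hour interval below are only examples:

%windir%\system32\inetsrv\appcmd.exe set apppool "MyAppPool" /recycling.periodicRestart.time:04:00:00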

One of the questions I was asked while presenting my findings to the project manager was why this suddenly became a problem, since the application had been in production for over two years. The issue had been there from the very beginning: the out-of-the-box application pool recycle settings were causing the host process to shut down after a period of inactivity (this app is used only between 8am and 5pm). So the process had plenty of time to shut down after hours, and the problem never surfaced until the transaction volume increased, at which point the application started hitting thresholds.

It always helps to set baselines. Keep tabs on system configuration, transaction volume, and major system change events; it will greatly shorten the time it takes to troubleshoot performance issues.

 

 

.foreach WinDbg

Executing a command repeatedly over the output of another command is one of WinDbg’s coolest features. In the example below, I used one method table from the output of a !FinalizeQueue command. !FinalizeQueue, like !DumpHeap, lists the method table, object count, allocated size, and type.

  • Here 6dcf1230 is one of the method tables I was interested in; I wanted to see whether any objects with this method table were rooted.
  • myvar is the variable that holds each address returned by the !DumpHeap command; -short returns only the addresses. Maybe in another article I will cover how to process the tokens.
  • The .echo command needs no introduction; in this context it is used twice, once to display the address and once to separate the output.
.foreach (myvar {!DumpHeap /d -mt 6dcf1230 -short}) {.echo myvar;!gcroot -all myvar;.echo **************;}
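
The same pattern works with any per-object command; for example, to dump the fields of each instance instead of rooting them:

.foreach (obj {!DumpHeap /d -mt 6dcf1230 -short}) {!DumpObj obj}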

 

Launching a Process in .NET & redirecting the StandardOutput to a LogFile for future Analysis…

Here is a code snippet I used a while ago to identify and kill a process that would hang in the middle of the night. It shows how to launch a process and redirect its output to a log file (instead of the console) for later analysis.

The intention was to capture the open handles and then kill the process so that the next scheduled run could resume. The agent that was causing the issue used files for exclusive locks, and I suspected an open file might be the cause of the hang. The randomness of the event made it difficult to debug in real time in production.

There are other pieces to the Windows Service that are not listed here. Some aspects of the code are explained below:

APP_NAME : the name of the application I was looking for, in uppercase, as displayed in Task Manager.

Logger : the Enterprise Library Logging framework.

handle64.exe : downloaded from the Sysinternals Suite and placed in a folder that is on the Path environment variable.

The arguments passed to handle64.exe limit the output to the target process ID and suppress the EULA prompt if and when it appears.

' Find every instance of the hung application by name (APP_NAME is the
' uppercase process name as shown in Task Manager).
Dim prc() As Process = Process.GetProcesses()

For Each p As Process In prc
    If p.ProcessName.ToUpper() = APP_NAME Then

        Logger.Write(String.Format("Terminating Process / Process ID {0}", p.Id))

        ' Launch handle64.exe against the hung process and capture its output.
        ' UseShellExecute must be False for StandardOutput redirection to work.
        Using newP As New Process()
            newP.StartInfo.FileName = "handle64.exe"
            newP.StartInfo.WorkingDirectory = "c:\temp\"
            newP.StartInfo.RedirectStandardOutput = True
            newP.StartInfo.UseShellExecute = False
            newP.StartInfo.Arguments = String.Format("-p {0} -accepteula", p.Id)
            newP.Start()

            ' Read the redirected output before waiting for exit (avoids a
            ' deadlock on a full output buffer), then log it for later analysis.
            Dim output As String = newP.StandardOutput.ReadToEnd()
            Logger.Write(output)
            newP.WaitForExit()
        End Using

        ' With the open handles captured, kill the hung process so the next
        ' scheduled run can resume.
        p.Kill()
    End If
    p.Dispose()
Next