VISTA bug: iterating a COM collection with foreach causes your thread to be reentered during the iteration via a message pump. This happens with free-threaded COM objects that do not cross apartments - by HalibutHandle

Status: External

This item may be valid but belongs to an external system out of the direct control of this product team. A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


ID: 330906
Status: Closed
Type: Bug
Repros: 3
Opened: 2/29/2008 1:34:51 PM
Access Restriction: Public

Description

In the following line of code, OutputMessage and OutputMessages are free-threaded COM objects. I have verified by stepping into the code that no calls into these objects cross apartment boundaries, including the call to obtain the COM enumerator.

foreach (OutputMessage outputMessage in outputMessages) { /* loop body omitted */ }

The bug is that while this loop is running, it starts a message pump, which allows the thread to be reentered. In my case, I get a paint message, which leads me to use the data structure I'm in the process of rebuilding with the above loop.

I cannot get this bug to happen at all on XP. It happens reliably on Vista. The specific problem is that the foreach construct calls CustomMarshalers.EnumeratorViewOfEnumVariant.MoveNext, which in turn calls ole32.dll!_CoWaitForMultipleHandles@20(), which pumps messages.
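As a rough illustration of the failure mode described above, here is a minimal sketch of a guard that lets a paint handler notice it has been reentered while a rebuild loop is still running, and postpone the repaint until the loop finishes. The class and member names (RebuildGuard, Run, TryBeginPaint) are hypothetical and not part of any framework. It does not stop the message pump inside MoveNext; it only assumes the paint handler can be made to back off.

```csharp
using System;

// Hypothetical helper; all names here are illustrative, not a real API.
public sealed class RebuildGuard
{
    private int _depth;          // > 0 while a rebuild loop is on the stack
    private bool _paintDeferred; // a paint arrived while rebuilding

    public bool IsRebuilding { get { return _depth > 0; } }

    // Wrap the foreach over the COM collection in 'rebuild'.
    // 'onDeferredPaint' is assumed to request a repaint, for example by
    // calling Control.Invalidate() in a Windows Forms application.
    public void Run(Action rebuild, Action onDeferredPaint)
    {
        _depth++;
        try
        {
            rebuild();
        }
        finally
        {
            _depth--;
            if (_depth == 0 && _paintDeferred)
            {
                _paintDeferred = false;
                onDeferredPaint();
            }
        }
    }

    // Call at the top of the paint handler. Returns false when the handler
    // was reentered during a rebuild and must not touch the data structure.
    public bool TryBeginPaint()
    {
        if (_depth > 0)
        {
            _paintDeferred = true;
            return false;
        }
        return true;
    }
}
```

In use, the iteration would run as guard.Run(() => { foreach (...) { ... } }, () => control.Invalidate()), and the paint handler would return early when guard.TryBeginPaint() is false. This only helps when every reentrant path into the half-built data can be identified, which is not always practical.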

I hope you can understand that having your thread reentered while iterating a collection of non-marshalled objects is completely wrong. There is no way to safely program if your thread can reenter itself at arbitrary unknown points. If you think this is not a bug, please stop and think again.


Here is the stack crawl where it occurred:

System.Windows.Forms.dll!System.Windows.Forms.Control.WmPaint(ref System.Windows.Forms.Message m) Line 13199	C#
 	System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m) Line 13638 + 0xb bytes	C#
 	System.Windows.Forms.dll!System.Windows.Forms.ScrollableControl.WndProc(ref System.Windows.Forms.Message m) Line 1491	C#
 	System.Windows.Forms.dll!System.Windows.Forms.ContainerControl.WndProc(ref System.Windows.Forms.Message m) Line 1898	C#
 	System.Windows.Forms.dll!System.Windows.Forms.UserControl.WndProc(ref System.Windows.Forms.Message m) Line 378	C#
 	System.Windows.Forms.dll!System.Windows.Forms.Control.ControlNativeWindow.OnMessage(ref System.Windows.Forms.Message m) Line 14059	C#
 	System.Windows.Forms.dll!System.Windows.Forms.Control.ControlNativeWindow.WndProc(ref System.Windows.Forms.Message m) Line 14114	C#
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg = 15, System.IntPtr wparam, System.IntPtr lparam) Line 777 + 0xa bytes	C#
 	user32.dll!_InternalCallWinProc@20()  + 0x23 bytes	
 	user32.dll!_UserCallWinProcCheckWow@32()  + 0xb3 bytes	
 	user32.dll!_DispatchClientMessage@20()  + 0x4b bytes	
 	user32.dll!___fnDWORD@4()  + 0x24 bytes	
 	ntdll.dll!_KiUserCallbackDispatcher@12()  + 0x2e bytes	
 	user32.dll!_NtUserDispatchMessage@4()  + 0xc bytes	
 	user32.dll!_DispatchMessageWorker@8()  - 0x5e1e bytes	
 	user32.dll!_DispatchMessageW@4()  + 0xf bytes	
 	ole32.dll!CCliModalLoop::HandleWakeForMsg()  + 0x60fbc bytes	
 	ole32.dll!CCliModalLoop::BlockFn()  - 0x210d bytes	
 	ole32.dll!_CoWaitForMultipleHandles@20()  - 0x382a bytes	
>	CustomMarshalers.dll!System.Runtime.InteropServices.CustomMarshalers.EnumeratorViewOfEnumVariant.MoveNext() + 0x159 bytes	
 	ResourceUsageProfilerControl.dll!NationalInstruments.TestStand.ResourceUsageProfiler.EventAnalyzer.Analyze(NationalInstruments.TestStand.Interop.API.OutputMessages outputMessages = {System.__ComObject}) Line 1190 + 0x41 bytes	C#
 	


Note that I have a main GUI thread, and a secondary GUI thread in which I create a new app domain. I'm not sure whether these details matter.

An unrelated issue is that although I am set up to step through the .NET source code with VS2008, I still don't get source for System.Runtime.InteropServices.CustomMarshalers, although I do get method names.


[2/29 10:19pm] Updated: I'm over the 5k limit. Additional info in attached file: [2-29] Update.txt
Posted by Arno_S on 11/25/2010 at 3:39 AM
I just want to add that you cannot control this behavior by returning PENDINGMSG_WAITDEFPROCESS vs PENDINGMSG_WAITNOPROCESS from IMessageFilter::MessagePending. In fact Windows makes no distinction between the two return values, although

http://msdn.microsoft.com/en-us/library/ms694352(v=VS.85).aspx

mentions WM_PAINT explicitly in the PENDINGMSG_WAITDEFPROCESS description.
Posted by Mike on 11/17/2008 at 9:21 PM
C++ workaround: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=4139006&SiteID=1&mode=1
Posted by dmex on 11/10/2008 at 2:47 PM
This bug has been designated: Resolved (External)

Where are the details?
Posted by Nick42blah on 10/9/2008 at 6:52 PM
I think I might be having a similar problem iterating IVCCollection objects on XP; it crashes after many iterations with a heap corruption error, and sometimes I see _CoWaitForMultipleHandles@20() on the stack. My app is a C++/CLI app calling a C# library through COM. The heap error is: HEAP: Free Heap block 4b61bb8 modified at 4b61be8 after it was freed.
Posted by HalibutHandle on 4/16/2008 at 11:18 AM
I'm re-opening the issue because it was marked closed-by-design shortly after I posted a comment explaining why closing it for that reason would be a bad idea. Because there was no reply to my comment, I'm concerned that the issue was closed without reading it or considering the merits.

PS. Not to be snarky, but I think the video at this link illustrates the consequences of dismissing issues like this:
http://arlingtoncardinal.blogharbor.com/blog/_archives/2008/4/4/3620186.html
Posted by HalibutHandle on 4/15/2008 at 10:16 AM
RE: "By Design Response",

I'm sorry, but that is B.S. I'm confident the designer failed to realize that they would be causing literally thousands of applications that work perfectly on XP to unpredictably crash on Vista, in exchange for a purely cosmetic improvement that was never important enough to put into XP. This is not "by design". Rather, this is a "design flaw". Also, the claimed benefit of this change was responsiveness, but it allegedly affects painting only, not user input. Perhaps it is not as beneficial as imagined?

The "by design" response also prematurely cuts off your investigation. I'm sure you can have your Vista-repaint-kludge without randomly hijacking threads that call from .NET to COM. The problem is that you happen to be using the CoWaitForMultipleHandles function in the interop layer and are thus getting the kludge where you don't need it and probably never intended to have it. The interop should use the old behavior, even if the kludge applies elsewhere. All you need is to use another copy of the function without the kludge in the interop layer. Perhaps it could be called CoWaitMultipleHandlesWithNoVistaWMPAINTKludge. Of course, everywhere the kludge remains should be double-checked to make sure it doesn't have issues as well.

Aside from dismissing the potential crashing of all existing .NET/COM applications, the suggested workaround of rewriting existing applications for Vista so that they copy and swap all view data is not generally applicable. Problems include:
- The data updates in existing applications can be distributed across various operations done throughout the code and, as a practical matter, are not necessarily findable or centralizable.
- The displayed data might be huge and/or uncopyable (that is what virtual controls are for!). As a simplistic example, you wouldn't make copies of a hard disk folder (or even the textual file list and displayed attributes, since the folder might contain any number of files) just because a virtual control was displaying its contents.

I must admit though that your conclusion that the app compat database workaround has no deployment story and is not useful at all for components (DLLs, controls, etc) is exactly correct.

Partially or completely backing out this change is clearly in the best interest of both your customers and especially for Microsoft, who obviously has an interest in Vista being perceived as a viable stable platform. Please reconsider this issue with more care.

Posted by Microsoft on 4/14/2008 at 4:28 PM
This is a change to Vista that is by design. You can work around this by applying an app compat shim. The app compat flag is DisableNewWMPAINTDispatchInOLE. Here are instructions about app compat:

The flag should show up in the Microsoft Application Compatibility Toolkit:
http://technet.microsoft.com/en-us/windowsvista/aa905066.aspx

This is used to create a private app compat database for the machine, mainly for testing purposes. A couple general points worth noting:
•    Currently the base app compat functionality does not support applying these flags based on the dlls loaded in a process. They are applied at process startup time based on the executable image.
•    There is no real deployment story for private app compat databases (and this is to a large extent by design). What needs to happen for the affected apps is the app compat team should add the flag to the app in the official Windows database.

Note that the best way to fix this would be to make your WM_PAINT handler access stable data. I would think about creating one set of data that is stable for painting and another set that is generated by the iteration and is never accessed while processing WM_PAINT. After the iteration is complete and the new data is generated, swap the two sets of data and then free the old painting data.
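The swap suggested above might look roughly like the following. This is only a sketch under stated assumptions: PaintDataHolder, Stable, and Rebuild are hypothetical names, the view data is assumed small enough to copy, and all access is assumed to happen on the one GUI thread that is being reentered.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for view data; the names are illustrative only.
public sealed class PaintDataHolder<T>
{
    // Snapshot read by the paint handler. It is replaced as a whole and
    // never modified in place once published.
    private List<T> _stable = new List<T>();

    // What a WM_PAINT / OnPaint handler would read.
    public IList<T> Stable { get { return _stable; } }

    // Build the new data off to the side, then publish it with a single
    // reference assignment once the iteration has finished.
    public void Rebuild(IEnumerable<T> source)
    {
        List<T> pending = new List<T>();
        foreach (T item in source)
        {
            // A paint dispatched during this loop still sees the old snapshot.
            pending.Add(item);
        }
        _stable = pending; // the old snapshot becomes garbage once unreferenced
    }
}
```

The paint handler reads only holder.Stable, so a WM_PAINT dispatched in the middle of Rebuild draws the previous, complete data instead of a half-built structure.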

This change does cause the UI to be quite a bit more responsive on Vista.
Posted by HalibutHandle on 3/5/2008 at 2:59 PM
Important Update: I've attached a super simple way to reproduce this. See SimpleWndMsgReentrancy.zip.

Here are the updated instructions:

Instructions:

On Vista:
1) Build the simple COM server: TryToReproduceWndMsgReentrancyCOMServer.sln
2) Build and run the .NET client in the debugger: TryToReproduceWndMsgReentrancy.sln
3) Click the test button; within a second or two you should hit the System.Diagnostics.Debugger.Break() call when the reentrancy occurs.
4) Build and run the MFC client in the debugger: mfcclient.sln
5) Click the test button and note that the problem doesn't happen in the MFC app, even with many more iterations than the .NET client had. The paint message isn't handled until after the button click callback finishes completely.
Posted by spurdog on 3/3/2008 at 6:01 AM
I am having the same kind of reentrancy issues in a .Net application. All managed blocking (Monitor.Wait etc) ends up in CoWaitForMultipleHandles, which will dispatch WM_PAINT on Vista (at least in an STA).
There seems to be a method in Ole32.dll which is called _DisableNewWM_PAINTDispatch, so there may be some undocumented way to prevent paint messages from being dispatched?
Posted by Microsoft on 3/2/2008 at 8:05 PM
Thanks for your feedback.

We are escalating this issue to the appropriate group within the Visual Studio Product Team for triage and resolution.
These specialized experts will follow up on your issue.

Thank you,
Visual Studio Product Team