Code optimizer fails to eliminate memcpy/memset when "number of bytes" is zero - by Dmitry Me

Status : Fixed

This item has been fixed in the current or upcoming version of this product.
A more detailed explanation for the resolution of this particular item may have been provided in the comments section.

ID 687527 Comments
Status Closed Workarounds
Type Bug Repros 0
Opened 9/7/2011 8:14:03 AM
Access Restriction Public
Moderator Decision Sent to Engineering Team for consideration


This behavior is observed in version 10.0.30319.1 RTMRel

I have the following two snippets that I compile with /O2, put a breakpoint, run and then open the Disassembly window to see the machine code.

Snippet 1:

int _tmain(int argc, _TCHAR* argv[])
{
	char dummy1 = 0;
	char dummy2 = 0;
	memcpy( &dummy1, &dummy2, 0 );
	memset( &dummy1, 0, 0 );
	if( dummy1 || dummy2 ) {
		Sleep( 0 );
	}
	return 0;
}

It yields:


45: int _tmain(int argc, _TCHAR* argv[])
    46: {
00401000  push        ebp  
00401001  mov         ebp,esp  
00401003  push        ecx  
00401004  push        ebx  
    47: 	char dummy1 = 0;
00401005  xor         ebx,ebx  
    48: 	char dummy2 = 0;
    49: 	memcpy( &dummy1, &dummy2, 0 );
00401007  push        ebx  
00401008  lea         eax,[dummy2]  
0040100B  push        eax  
0040100C  lea         eax,[dummy1]  
0040100F  push        eax  
00401010  mov         byte ptr [dummy1],bl  
00401013  mov         byte ptr [dummy2],bl  
00401016  call        memcpy (401844h)  
    50: 	memset( &dummy1, 0, 0 );
0040101B  push        ebx  
0040101C  lea         eax,[dummy1]  
0040101F  push        ebx  
00401020  push        eax  
00401021  call        memset (40183Eh)  
00401026  add         esp,18h  
    51: 	if( dummy1 || dummy2 ) {
00401029  cmp         byte ptr [dummy1],bl  
0040102C  je          wmain+35h (401035h)  
    52: 		Sleep( 0 );
0040102E  push        ebx  
0040102F  call        dword ptr [__imp__Sleep@4 (402000h)]  
    53: 	}
    54: 	return 0;
00401035  xor         eax,eax  
00401037  pop         ebx  
    55: }
00401038  leave  
00401039  ret 


Snippet 2:

int _tmain(int argc, _TCHAR* argv[])
{
	char dummy1 = 0;
	char dummy2 = 0;
	memcpy( &dummy1, &dummy2, 1 );
	memset( &dummy1, 0, 1 );
	if( dummy1 || dummy2 ) {
		Sleep( 0 );
	}
	return 0;
}

Snippet 2 is the same as snippet 1, but "number of bytes" is now 1 instead of 0. It yields:

45: int _tmain(int argc, _TCHAR* argv[])
    46: {
00401000  push        ebp  
00401001  mov         ebp,esp  
00401003  push        ecx  
00401004  push        edi  
    47: 	char dummy1 = 0;
    48: 	char dummy2 = 0;
    49: 	memcpy( &dummy1, &dummy2, 1 );
    50: 	memset( &dummy1, 0, 1 );
00401005  xor         eax,eax  
00401007  lea         edi,[dummy1]  
0040100A  stos        byte ptr es:[edi]  
0040100B  pop         edi  
    51: 	if( dummy1 || dummy2 ) {
0040100C  cmp         byte ptr [dummy1],al  
0040100F  je          wmain+18h (401018h)  
    52: 		Sleep( 0 );
00401011  push        eax  
00401012  call        dword ptr [__imp__Sleep@4 (402000h)]  
    53: 	}
    54: 	return 0;
00401018  xor         eax,eax  
    55: }
0040101A  leave  
0040101B  ret

The machine code is massively different - in the second snippet the compiler evidently knows how memcpy() and memset() are implemented and expands them inline instead of calling them.

The problem is that in the first snippet the compiler fails to see that memset()/memcpy() with "number of bytes" equal to zero is a no-op. As a result, the first snippet carries argument setup and two library calls that do nothing useful.
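To make the no-op claim concrete, here is a minimal sketch, not taken from the original report. The helper names `copy_bytes`, `fill_bytes` and `zero_length_calls_are_noops` are hypothetical. The first two show a source-level guard that skips the library call when the count is zero; the third checks that zero-length calls leave both bytes untouched.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copies n bytes, skipping the library call when n is zero.
inline void* copy_bytes(void* dst, const void* src, size_t n)
{
    if (n == 0) {
        return dst;  // nothing to copy; same result as memcpy(dst, src, 0)
    }
    return memcpy(dst, src, n);
}

// Hypothetical helper: fills n bytes, skipping the library call when n is zero.
inline void* fill_bytes(void* dst, int value, size_t n)
{
    if (n == 0) {
        return dst;  // nothing to fill; same result as memset(dst, value, 0)
    }
    return memset(dst, value, n);
}

// Returns true when zero-length memcpy/memset leave both bytes unchanged.
inline bool zero_length_calls_are_noops()
{
    char a = 1;
    char b = 2;
    memcpy(&a, &b, 0);
    memset(&a, 0, 0);
    return a == 1 && b == 2;
}
```

When the count is a compile-time constant, an optimizer would normally be expected to fold the guard away, so a wrapper like this is only a stopgap for a compiler that misses the zero-length case.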
Posted by Microsoft on 9/13/2011 at 11:53 AM

I’m also having trouble reproducing the exact behavior that you describe in your post. Using the project file that you submitted, I see that you’re compiling with /Os in addition to /O2. This may be part of the problem, as there is a known issue in VC++ when compiling for size with a zero-length memset. This issue will be addressed in future releases of VC++. You can work around it by compiling the function for speed rather than size.
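One way to apply the suggested workaround to a single function, rather than to the whole project, is the MSVC `optimize` pragma. This is only a sketch: the function name `zero_length_sample` is hypothetical and stands in for the reporter's `_tmain` body. Whether it actually removes the calls on the affected compiler has not been verified here.

```cpp
#include <string.h>

#ifdef _MSC_VER
#pragma optimize("t", on)   // favor fast code for the functions that follow
#endif

// Hypothetical stand-in for the reporter's _tmain body (snippet 1).
int zero_length_sample()
{
    char dummy1 = 0;
    char dummy2 = 0;
    memcpy(&dummy1, &dummy2, 0);
    memset(&dummy1, 0, 0);
    return (dummy1 || dummy2) ? 1 : 0;
}

#ifdef _MSC_VER
#pragma optimize("", on)    // restore the command-line optimization settings
#endif
```

The `_MSC_VER` guards keep the sketch compilable on other compilers, where the pragma has no meaning.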

Looking at your original sample code I see that the compiler has inserted calls to memset and memcpy functions in the CRT. This indicates to me that perhaps you’re compiling without /Oi (which is included in the macro switch /O2) or there is a #pragma function(memset) in your code.
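The two explanations above can be illustrated with the MSVC `intrinsic` and `function` pragmas. The sketch below uses hypothetical function names and is not code from the submitted project. It shows how the same one-byte memcpy/memset pair can be requested either in intrinsic form or as real CRT calls. The observable result is identical; only the generated machine code is expected to differ.

```cpp
#include <string.h>

#ifdef _MSC_VER
#pragma intrinsic(memcpy, memset)   // request the intrinsic (inline) forms
#endif

// Hypothetical name: one-byte copy and fill with intrinsics requested.
int one_byte_with_intrinsics()
{
    char src = 7;
    char dst = 0;
    memcpy(&dst, &src, 1);   // dst becomes 7
    memset(&src, 0, 1);      // src becomes 0
    return dst + src;
}

#ifdef _MSC_VER
#pragma function(memcpy, memset)    // force real function calls from here on
#endif

// Hypothetical name: the same operations, compiled as calls into the CRT.
int one_byte_with_calls()
{
    char src = 7;
    char dst = 0;
    memcpy(&dst, &src, 1);
    memset(&src, 0, 1);
    return dst + src;
}
```

Comparing the disassembly of the two functions under /O2 is one way to check which form a given build is actually using.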

You can read more about optimization switches, including the macro switches /O1, /O2, and /Ox here:

A description of the function pragma directive can be found here:

Ian Bearman
VC++ Code Generation and Optimization Team
Posted by Microsoft on 9/12/2011 at 12:46 AM
Thanks for your feedback.
We are routing this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow-up with your issue.
Posted by Dmitry Me on 9/9/2011 at 3:00 AM
Attached the full project.
Posted by Microsoft on 9/8/2011 at 11:47 PM
Hi Dmitry Me,

I have tested this with VS2010 RTM and VS2010 SP1, and neither of them behaves as described.
Could you please provide us with some more details about your project's configuration? A zipped sample project would be even better.

Thanks again for your efforts and we look forward to hearing from you.
Posted by MS-Moderator09 [Feedback Moderator] on 9/8/2011 at 5:09 AM
Thank you for submitting feedback on Visual Studio 2010 and .NET Framework. Your issue has been routed to the appropriate VS development team for review. We will contact you if we require any additional information.
Posted by Dmitry Me on 9/7/2011 at 10:49 PM
If SP1 behaves better - great.
Posted by Mike Danes on 9/7/2011 at 10:38 AM
There's something fishy going on here. My VC++ 2010 SP1 produces the following code for your first example:

00401000 xor         eax,eax

Yes, that's right, just a single instruction for return 0. Now, they might have fixed some bugs between RTM and SP1, but this is not the first time this has happened with your examples. Are you sure your compiler installation and the options you use are OK?
Posted by MS-Moderator01 on 9/7/2011 at 8:41 AM
Thank you for your feedback, we are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly(