C compiler optimization bug with parameter / local variable and memcpy - by Andy Kilpatrick

Status : 

  By Design<br /><br />
		The product team believes this item works according to its intended design.<br /><br />
		A more detailed explanation for the resolution of this particular item may have been provided in the comments section.

Sign in
to vote
ID 547423 Comments
Status Closed Workarounds
Type Bug Repros 1
Opened 4/1/2010 10:08:27 AM
Access Restriction Public


VS2008 C compiler optimization bug.  The C code below fails if compiled with -O2 or -Ox optimization - printing out "FAILED!" in foo1() and foo2().
It looks like the optimizer is getting confused and optimizing out the parameter / local variable, which it shouldn't do in this case.  
I've reproduced this (on XP and Windows 7) with  cl version:
VS2008 Express: Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86

Source for code.c starts here:
/* Test code for VS2008 -O2 / -Ox optimization bug.                          */

/* memcpy macro checks that TO and FROM don't overlap.                       */
#define MEMCPY(TO, FROM, SIZE)                                                \
   if ((((char *)(TO) + (SIZE)) <= (const char *)(FROM)) ||                   \
              (((const char *)(FROM) + (SIZE)) <= (char  *)(TO)))             \
   {                                                                          \
     printf("OK!\n");                                                         \
   }                                                                          \
   else                                                                       \
   {                                                                          \
     printf("FAILED!\n");                                                     \
   }                                                                          \
   memcpy((TO), (FROM), (SIZE))

/* I use 64-bit "Correlators" for various integer types.                     */
#define CORR_GET_VALUE(VALUE, CORRELATOR)                                     \
        MEMCPY(&(VALUE), &(CORRELATOR), sizeof(VALUE))

typedef unsigned long     CORRELATOR;
typedef unsigned long     ULONG;

/* This version fails with -O2, -Ox; works with -O1 or no optimization.      */
void foo1(CORRELATOR key_corr)
  ULONG  pid = 0;

  CORR_GET_VALUE(pid, key_corr);
  printf("pid now %ul\n", pid);

/* This version fails with -O2, -Ox; works with -O1 or no optimization.      */
void foo2(CORRELATOR key_corr)
  CORRELATOR local_correlator = key_corr;
  ULONG  pid = 0;

  CORR_GET_VALUE(pid, local_correlator);
  printf("pid now %ul\n", pid);

/* This version works with -O2, -O1, -Ox, no optimization.                   */
void foo3(CORRELATOR key_corr)
  ULONG  pid = 0;

  printf("&key_corr %x\n", &key_corr);
  CORR_GET_VALUE(pid, key_corr);
  printf("pid now %ul\n", pid);

int main (int argc, char *argv[])
  printf("Hello world\n");

  /* Expect all these to pass (print "OK!")                                  */


Sign in to post a comment.
Posted by BlueRaja on 4/27/2010 at 8:16 AM
Not sure that I agree that bugs should be closed as "by design" simply because we understand why they happen...
Posted by Lawrence [MSFT] on 4/22/2010 at 2:08 PM
Hello Andy,

Your assumptions are correct. The compiler is clever enough to figure out the situation within the function. So if you just had “memcpy(&local, &pram)” without the error-checking; the compiler will actually figure out that you don’t even need the param or local variable and just call printf on the constant value. This is because the compiler knows about memcpy and will try to convert it to a MOV instruction. After that, the compiler can do a variety of optimizations (such as Constant Folding, Stack Packing, Dead Code Removal, etc) which (depending on the code) can result in removing the MOV instruction as long as any dependency on the value are not broken.

An example is with why foo3 was able to print “OK!”. With foo3, you had a printf statement with the address of key_corr as an argument. The compiler has little knowledge of printf and can only deduce that printf can potentially hold a dependency on &key_corr (either by storing or retrieving information from that address space). So the compiler had to make sure that &key_corr does not share its address space with any other variables that can potentially change the value in the address. Foo1 and foo2 did not have this problem since there were no dependencies on the variables.

If you are interested you can look at the assembly listing using compiler flag /FA (this creates an assembly listing file with .asm extension). If you look under “_main PROC” you can see that there is only one MOV instruction (this is related to the memcpy called by foo3), the rest of the function has code on comparing the address values and calling printf on the constant value.

I hope this answers your question. Since this is the nature of the compiler during an optimize build I am going to close this item as resolved. Please do not hesitate to reactivate this MSConnect bug if you have related questions or found related issues.

Thank you,
Lawrence Joel
Posted by Andy Kilpatrick on 4/21/2010 at 4:03 AM
In the case of my code, we memcpy the contents of the parameter (key_corr - or param in Daniel's example) to the local variable (pid, or local in Daniel's example). Our memcpy function asserts if the source and target are the same (as this is considered a bug in our coding standards) - the optimization causes this assert to be hit. (I've put this as a printf in my example code above rather than an assert()).

I assume the optimization can only happen because local=param at all times (or the values are not subsequently accessed). The optimizer is being clever enough to spot that the memcpy doesn't result in different values for the parameter / local variable? So if we had "memcpy(&local, &param);" instead of my error-checking version, would this be removed by the optimizer, or executed (with &local == &param)?

Posted by Lawrence [MSFT] on 4/19/2010 at 4:27 PM
Hello Daniel,

I am the developer assigned to investigate this issue. Thank you for the more readable repro in your last post.    I will be referring to it when I describe the issue.

So this behavior is a result of the compiler inlining the function failOptimize which would then make the parameter param be a local variable. This would make param subject to optimizations; more specifically stack packing (an optimization that places local variables in the same stack address space to reduce stack space). This makes param and “local” share the same stack offset and so they would overlap. Based on the nature of the compiler this is by design for optimize builds.

What is the nature of your program that you require the parameter and local variable to not overlap? A simple workaround is to prevent the function from being inlined or to wrap the function with #pragma optimize ( “”, off).

Thank you,
Lawrence Joel
Posted by Helen [MSFT] on 4/1/2010 at 8:34 PM
Thanks for your feedback.

We are rerouting this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow-up with your issue.

Thank you
Posted by BlueRaja on 4/1/2010 at 2:08 PM
sorry forgot to mention, also using VS 2008 SP1, XP Pro.
Posted by BlueRaja on 4/1/2010 at 2:06 PM
I can confirm this behavior. Here is a version I think makes the problem more clear:

void failOnOptimize(unsigned long param)
    unsigned long local = 0;
    char* localEnd = ((char*)&(local)) + sizeof(local);
    char* paramEnd = ((char*)&(param)) + sizeof(param);

    //Check for overlap
    if (localEnd < (char *)&(param) ||
        paramEnd < (char *)&(local))