Search

Bug in cl for x64 (c compiler) by Dmitry Volkinshtein

Closed
as Fixed Help for as Fixed

1
0
Sign in
to vote
Type: Bug
ID: 676417
Opened: 6/22/2011 1:23:24 AM
Access Restriction: Public
Moderator Decision: Sent to Engineering Team for consideration
0
Workaround(s)
0
User(s) can reproduce this bug
The following c program gives incorrect results when compiled with full optimization for x64 machine:
#include <stdio.h>

typedef struct {
    int flags;
    int n;
} bug_t;

bug_t _bug;
unsigned short _byteswap_ushort(unsigned short val); /* from <intrin.h> */
#pragma intrinsic(_byteswap_ushort)

int main(int argc, char *argv[])
{
    volatile bug_t *bug = (volatile bug_t *)&_bug;
    if (bug->flags & 0x200000)
    printf("bad %d\n", _byteswap_ushort(bug->n));
    else
    printf("good %d\n", _byteswap_ushort(bug->n));
    getchar();
    return 0;
}

Thanks in advance,
Dmitry Volkinshtein
dmitry@hola.org
Details (expand)

Visual Studio/Team Foundation Server/.NET Framework Tooling version

Visual Studio 2010 SP1

Steps to reproduce

Compile the following program with optimization for x64 machine using the cl.exe compiler.

#include <stdio.h>

typedef struct {
    int flags;
    int n;
} bug_t;

bug_t _bug;
unsigned short _byteswap_ushort(unsigned short val); /* from <intrin.h> */
#pragma intrinsic(_byteswap_ushort)

int main(int argc, char *argv[])
{
    volatile bug_t *bug = (volatile bug_t *)&_bug;
    if (bug->flags & 0x200000)
    printf("bad %d\n", _byteswap_ushort(bug->n));
    else
    printf("good %d\n", _byteswap_ushort(bug->n));
    getchar();
    return 0;
}

Though bug->flags is 0, the machine executes the true branch of the if. The following is the assembly output:

; Listing generated by Microsoft (R) Optimizing Compiler Version 16.00.30319.01

include listing.inc

INCLUDELIB OLDNAMES

EXTRN    __security_check_cookie:PROC
EXTRN    __imp_getchar:PROC
EXTRN    __imp_printf:PROC
COMM    _bug:QWORD
$SG-7    DB    'bad %d', 0aH, 00H
$SG-8    DB    'good %d', 0aH, 00H
PUBLIC    main
;    COMDAT pdata
; File c:\users\steve\documents\visual studio 2010\projects\test2\test2\test2.c
pdata    SEGMENT
$pdata$main DD    imagerel $LN6
    DD    imagerel $LN6+62
    DD    imagerel $unwind$main
pdata    ENDS
;    COMDAT xdata
xdata    SEGMENT
$unwind$main DD    010401H
    DD    04204H
; Function compile flags: /Ogtpy
xdata    ENDS
;    COMDAT main
_TEXT    SEGMENT
argc$ = 48
argv$ = 56
main    PROC                        ; COMDAT

; 13 : {

$LN6:
    sub    rsp, 40                    ; 00000028H

; 14 :     volatile bug_t *bug = (volatile bug_t *)&_bug;
; 15 :     if (bug->flags & 0x200000)

    mov    eax, DWORD PTR _bug

; 16 :     printf("bad %d\n", _byteswap_ushort(bug->n));

    lea    rcx, OFFSET FLAT:$SG-7
    bt    eax, 21
    mov    eax, DWORD PTR _bug+4
    ror    ax, 8
    movzx    edx, ax
    jne    SHORT $LN5@main

; 17 :     else
; 18 :     printf("good %d\n", _byteswap_ushort(bug->n));

    lea    rcx, OFFSET FLAT:$SG-8
$LN5@main:
    call    QWORD PTR __imp_printf

; 19 :     getchar();

    call    QWORD PTR __imp_getchar

; 20 :     return 0;

    xor    eax, eax

; 21 : }

    add    rsp, 40                    ; 00000028H
    ret    0
main    ENDP
_TEXT    ENDS
END


There are two weird things in the assembly output, which might be connected to the same problem:
1) bt 15 results in carry flag when bit 15 is high, but zero flag is tested for branching in jne
2) the _byteswap_ushort is compiled to ror. It is ok, but this operation is done between the "bt 15" and its branch (jne) and changes the carry flag that was set by bt 15 and need not to be touched till the branch.

This is very serious since we cannot trust the optimization / intrinsic functions byteswap.

Product Language

English

Operating System

Windows XP

Operating System Language

English

Actual results

Output:
bad 0

Expected results

Output:
good 0
File Attachments
File Name Submitted By Submitted On File Size  
test2.zip 6/22/2011 651 KB
Sign in to post a comment.
Posted by Microsoft on 8/22/2011 at 8:04 PM
Thank you for reporting this issue. This issue has been resolved and the fix will be available in a future release.
Posted by Microsoft on 6/23/2011 at 12:10 PM
I did some more investigating and the problem actually lies in the creation of the BT instruction. The compiler is not changing the jcc instruction to use the CF after creating the BT instruction. This seems to only happen in the case where the rotate instruction sits between the BT and the JCC. You may be able to work around this by 1. writing code that the compiler won't change into BT or 2. write a custom version of _byteswap that doesn't use the ROR instruction. See the codegen from the x86 compiler for an example on how to implement #2.

ian Bearman
VC++ Code Generation and Optimization Team
Posted by Microsoft on 6/23/2011 at 11:46 AM
Dimtry, thanks for reporting this issue. I have verified that the compiler is modeling the rotate instructions incorrectly. The compiler may incorrectly move a rotate instruction between a compare/branch pair. This will be fixed in a future release of visual studio.

Disabling optimizations aroudn the rotate should allow you to work around the problem.

ian Bearman
VC++ Code Generation and Optimization Team
Posted by Dmitry Volkinshtein on 6/22/2011 at 4:32 AM
P.S. While I was working on a workaround, I found out that even without intrinsic _byteswap_ushort, there is a problem.
I wrote my inline byteswap:

static inline unsigned short _byteswap_ushort_fix(unsigned short val)
{
     return ((val<<8) | (val>>8));
}

The resulting assembly is still:
ror ax, 8
and the problem still exists

Dmitry Volkinshtein
dmitry@hola.org
Posted by MS-Moderator08 [Feedback Moderator] on 6/22/2011 at 2:34 AM
Thank you for reporting the issue.
We are routing this issue to the appropriate group within the Visual Studio Product Team for triage and resolution.These specialized experts will follow-up with your issue.
Posted by MS-Moderator01 on 6/22/2011 at 1:50 AM
Thank you for your feedback, we are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly(http://support.microsoft.com)
Sign in to post a workaround.