Wrong Release-code generation for SSE4 intrinsics by Matthias Straka

Closed as Fixed

Type: Bug
ID: 683152
Opened: 8/5/2011 7:46:25 AM
Access Restriction: Public
Moderator Decision: Sent to Engineering Team for consideration
Workaround(s): 0
User(s) who can reproduce this bug: 0
I upgraded to the new VS2010 C++ compiler and found that my existing code no longer works. I traced the problem to how the compiler compiles the following example (I want to load a 3-vector (1, 2, 3) and set the 4th component to 1 using SSE4 intrinsics). See the "Steps to reproduce" section for the code.

What I expect is the following output: "Result: 1, 2, 3, 1", and this is what I get in a Debug build. In a Release build I get "Result: 1, 2, 3, 0" instead. The assembly code generated in Release is the following:

    __m128 xy = _mm_loadl_pi(ALL_ONES, (const __m64*)&x.x);
    __m128 z = _mm_load_ss(&x.z);
movss       xmm0,dword ptr [eax+8]
movlps      xmm1,qword ptr [eax]
    return _mm_insert_ps(xy, z, 0x20);
insertps    xmm1,xmm0,20h
movaps      xmm0,xmm1

Somehow ALL_ONES, which should occupy the high part of xmm1, has disappeared completely, leaving the upper two components of the result undefined. There should be a movaps xmm1, xmmword ptr [ALL_ONES] before the movlps.
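
To make the expected result concrete, here is a minimal scalar sketch of what the three intrinsics in the example compute, written in portable C++ without SSE. The names `Vec4`, `model_loadl_pi`, `model_load_ss`, `model_insert_ps` and `model_vs2010_bug` are hypothetical helpers invented for this illustration; they only mirror the documented semantics of `_mm_loadl_pi`, `_mm_load_ss` and `_mm_insert_ps` and are not part of any library.

```cpp
#include <array>

// Hypothetical scalar stand-in for __m128: element 0 is the lowest lane.
using Vec4 = std::array<float, 4>;

struct v3 { float x, y, z; };

// Models _mm_loadl_pi: the two loaded floats replace the low half,
// the high half is copied unchanged from `a`.
Vec4 model_loadl_pi(const Vec4& a, float lo0, float lo1)
{
    return Vec4{lo0, lo1, a[2], a[3]};
}

// Models _mm_load_ss: the loaded float goes to lane 0, lanes 1..3 are zeroed.
Vec4 model_load_ss(float value)
{
    return Vec4{value, 0.0f, 0.0f, 0.0f};
}

// Models _mm_insert_ps: imm8 bits 7:6 select the source lane of `b`,
// bits 5:4 select the destination lane in `a`, bits 3:0 are a zero mask.
Vec4 model_insert_ps(const Vec4& a, const Vec4& b, unsigned imm8)
{
    const unsigned src   = (imm8 >> 6) & 3u;
    const unsigned dst   = (imm8 >> 4) & 3u;
    const unsigned zmask = imm8 & 0xFu;

    Vec4 r = a;
    r[dst] = b[src];
    for (unsigned i = 0; i < 4; ++i) {
        if (zmask & (1u << i)) {
            r[i] = 0.0f;
        }
    }
    return r;
}

// Scalar model of the function from the report.
Vec4 model_vs2010_bug(const v3& x)
{
    const Vec4 all_ones{1.0f, 1.0f, 1.0f, 1.0f};
    const Vec4 xy = model_loadl_pi(all_ones, x.x, x.y); // (x, y, 1, 1)
    const Vec4 z  = model_load_ss(x.z);                 // (z, 0, 0, 0)
    return model_insert_ps(xy, z, 0x20);                // (x, y, z, 1)
}
```

With the immediate 0x20, the source lane is 0, the destination lane is 2 and the zero mask is empty, so lane 3 keeps the 1 that came from ALL_ONES. For the input (1, 2, 3) the model returns (1, 2, 3, 1), which matches the Debug output.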

Visual Studio/Team Foundation Server/.NET Framework Tooling version

Visual Studio 2010 SP1

Steps to reproduce

Default C++ compiler settings, 32-bit, Release mode

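#include <iostream>
#include <smmintrin.h>  // SSE4.1 intrinsics (_mm_insert_ps); also pulls in the earlier SSE headers

static const __m128 ALL_ONES = _mm_set1_ps(1.0f);
struct v3 { float x,y,z; };

__declspec(noinline) __m128 vs2010_bug(const v3& x)
{
    // Load x and y into the low half; the high half should stay (1, 1) from ALL_ONES.
    __m128 xy = _mm_loadl_pi(ALL_ONES, (const __m64*)&x.x);
    // Load z into the lowest element; the upper three elements are zeroed.
    __m128 z = _mm_load_ss(&x.z);
    // Copy element 0 of z into element 2 of xy; element 3 should still be 1.
    return _mm_insert_ps(xy, z, 0x20);
}

void TestBug()
{
    v3 x = {1, 2, 3};
    __m128 xyz1 = vs2010_bug(x);

    float v[4]; _mm_storeu_ps(v, xyz1);
    std::cout << "Result: " << v[0] << ", " << v[1] << ", " << v[2] << ", " << v[3] << std::endl;
}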

Product Language

English

Operating System

Windows 7

Operating System Language

English

Actual results

Result: 1, 2, 3, 0

Expected results

Result: 1, 2, 3, 1
Posted by Microsoft on 9/26/2011 at 4:28 PM
Hello,

We have fixed this problem in our compiler, and the fix will be included in the next major release of Visual Studio. Until then, if you need the fix, please submit a QFE request so that we can get it out to you.

Thanks for the report.

Daofa Li
--
VC++ CodeGen and Tools
Posted by MS-Moderator10 [Feedback Moderator] on 8/9/2011 at 2:45 AM
Thank you for submitting feedback on Visual Studio 2010 and .NET Framework. Your issue has been routed to the appropriate VS development team for investigation. We will contact you if we require any additional information.
Posted by MS-Moderator01 on 8/5/2011 at 7:50 AM
Thank you for your feedback. We are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).