
std::atomic load implementation is absurdly slow by CornedBee


Status: Closed as Deferred


Type: Bug
ID: 770885
Opened: 11/13/2012 5:10:38 AM
Access Restriction: Public
Workarounds: 0
Users who can reproduce this bug: 2

Description

The x64 (and probably x86) implementation of std::atomic::load in VS 2012 is so bad as to be practically useless. Given a std::atomic<std::size_t> ai, the expression ai.load(std::memory_order_relaxed) ultimately arrives at a call to the intrinsic _InterlockedOr64(&x, 0). This intrinsic, in turn, emits a cmpxchg loop that repeatedly loads the memory location and then compares the loaded value against the memory location it was just loaded from.
For reference, the correct code to emit for a relaxed load is "mov register, [memory location]".
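To make the difference concrete, here is a minimal sketch contrasting the two ways of reading the value. It is not the VS 2012 library source: fetch_or(0) is used only as a portable stand-in for the _InterlockedOr64(&x, 0) call described above, and the function names load_via_rmw, load_relaxed and check_loads_agree are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Read by way of a read-modify-write: OR-ing with 0 leaves the value
// unchanged, but the operation is still a locked update of the location.
// Assumption: a stand-in for the _InterlockedOr64(&x, 0) path, not the
// actual library code.
std::size_t load_via_rmw(std::atomic<std::size_t>& a) {
    return a.fetch_or(0, std::memory_order_seq_cst);
}

// Plain relaxed load: on x64 a compiler can emit a single mov for this.
std::size_t load_relaxed(const std::atomic<std::size_t>& a) {
    return a.load(std::memory_order_relaxed);
}

// Both functions observe the same value; only the cost differs.
void check_loads_agree() {
    std::atomic<std::size_t> ai(42);
    assert(load_via_rmw(ai) == 42);
    assert(load_relaxed(ai) == 42);
}
```

Note that the read-modify-write version needs a non-const reference because it formally writes to the object, while the plain load works on a const atomic.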

It seems that all atomic load operations are treated the same no matter what memory ordering is specified; all operations have sequential consistency. This is, strictly speaking, permitted behavior. It's just completely useless. Atomics are used in performance-critical lock-free data structures, and the abysmal implementation slows our lock-free hash map down by a factor of 10 or even 100 compared to the Intel atomics implementation or Boost.Atomic. The atomics become a major bottleneck and scalability issue in applications using them.
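A slowdown of this kind can be checked with a small timing loop. The following is only a rough sketch under stated assumptions: the names LoadTiming and time_relaxed_loads are hypothetical, it measures a single uncontended thread, and it is not the hash map benchmark mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Result of one timing run (hypothetical helper type).
struct LoadTiming {
    std::uint64_t checksum;        // sum of all loaded values, keeps the loop observable
    double nanoseconds_per_load;   // average wall-clock cost of one load
};

// Times `iterations` relaxed loads of `value` on the calling thread.
LoadTiming time_relaxed_loads(const std::atomic<std::size_t>& value,
                              std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    std::uint64_t checksum = 0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        checksum += value.load(std::memory_order_relaxed);
    }
    const auto stop = clock::now();

    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {checksum, ns / static_cast<double>(iterations)};
}
```

Running the same loop against another atomics implementation, in a release build, would give a comparable per-load figure; contended multi-threaded access would need a more careful harness.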
Details
Posted by Jonathan Potter on 12/9/2013 at 5:06 PM
I've done some simple benchmarking with atomic<DWORD> and it seems that in VS 2013 std::atomic is now the same speed as boost::atomic in a release build. In a debug build boost is still about twice as fast.
Posted by Microsoft on 3/22/2013 at 4:44 PM
Hi again,

We've fixed this bug, and the fix will be available in VC12. See the attached meow.zip for an example of the new codegen on x64:

mov    rax, QWORD PTR ?g_i@@3U?$atomic@_K@std@@A ; g_i
ret    0

Wenlei He from our compiler back-end team contributed a major rewrite of <atomic>'s implementation, improving performance for x86/x64/ARM.
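The contents of meow.zip are not reproduced on this page. As an assumption only, a source file along the following lines would be consistent with the listing above: the mangled name ?g_i@@3U?$atomic@_K@std@@A decodes to a global g_i of type std::atomic<unsigned __int64>, while the function name read_g_i and the choice of memory order are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Global matching the decoded symbol: std::atomic<unsigned __int64> g_i
// (unsigned long long is the same type on MSVC).
std::atomic<unsigned long long> g_i(0);

// Hypothetical function; the memory order actually used in the
// attachment is not known.
unsigned long long read_g_i() {
    return g_i.load(std::memory_order_relaxed);
}

// Small usage check: a stored value is read back unchanged.
void check_read_g_i() {
    g_i.store(5);
    assert(read_g_i() == 5);
}
```

On x64 a plain mov is enough for a load of any memory order, so the two-instruction listing does not by itself reveal which order the original test used.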

Note: Connect doesn't notify me about comments. If you have any further questions, please E-mail me.

Stephan T. Lavavej
Senior Developer - Visual C++ Libraries
stl@microsoft.com
Posted by Microsoft on 2/21/2013 at 1:52 PM
Hi,

Thanks for reporting this bug. I wanted to let you know what's happening with it. I'm still keeping track of it, but it's been resolved as "Deferred" because we may not have time to fix it in VC12. (Note: VC8 = VS 2005, VC9 = VS 2008, VC10 = VS 2010, VC11 = VS 2012.)

Note: Connect doesn't notify me about comments. If you have any further questions, please E-mail me.

Stephan T. Lavavej
Senior Developer - Visual C++ Libraries
stl@microsoft.com
Posted by JustManowar on 1/28/2013 at 2:13 AM
Agree with the bug submitter.
Unfortunately, the std::atomic<> implementation is completely unusable because of this issue.
Posted by Microsoft on 11/21/2012 at 2:03 AM
Thank you for submitting feedback on Visual Studio and .NET Framework. Your issue has been routed to the appropriate VS development team for investigation. We will contact you if we require any additional information.
Posted by Microsoft on 11/13/2012 at 6:19 PM
Thank you for your feedback, we are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).
Attachments
File Name: meow.zip
Submitted On: 3/22/2013
File Size: 2 KB