The following repro runs without issue when compiled with ICC, but produces an Access Violation when compiled with VC2010 in 64-bit debug builds:
#include <immintrin.h>
#include <stdlib.h> // _countof

void AVXRepro( __m256i* d, const __m256i* s )
{
    // This will cause an occasional access violation when compiled with VC2010.
    // It does not occur in 32-bit builds. Nor does it occur if you remove the
    // post-increment from the arguments.
    _mm256_store_si256( d++, *s++ );
}

int main()
{
    const size_t MaxElements = 2;
    __m256i v1[ MaxElements ], v2[ MaxElements ];
    // Fill v1 with distinct values; _countof needs one element, not the array itself.
    for( int i = 0; i < _countof( v1 ); ++i )
        for( int j = 0; j < _countof( v1[i].m256i_u32 ); ++j )
            v1[i].m256i_u32[j] = i * _countof( v1[i].m256i_u32 ) + j;
    AVXRepro( v2, v1 );
    return 0;
}
Please note that the access violation doesn't necessarily occur on every run, so be sure to run it a few times (I only have to try it once or twice).
Also, I'm running this on native AVX-enabled hardware running 64-bit Win7 SP1 beta. When running it through the Intel Software Development Emulator on non-AVX hardware (64-bit Win7 without SP1 beta), I get the following:
SDE ERROR: ALIGN32 FAILED PC=13f161421 MEMEA=18f4d0 vmovdqa ymmword ptr [rsp], ymm0
Again, it sometimes takes a few runs to trigger, and it never occurs when the repro is compiled with ICC.
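For what it's worth, the MEMEA value in the SDE message above is 16-byte aligned but not 32-byte aligned, and vmovdqa on a ymm register requires a 32-byte-aligned address. The x64 Windows calling convention only guarantees 16-byte stack alignment, so my guess (not confirmed) is that the compiler spills a ymm temporary to [rsp] without realigning the stack first, which would also explain why the failure is intermittent. The small check below only does the address arithmetic; IsAligned and kSdeMemea are names I made up for this sketch, not part of any library.

```cpp
#include <cstdint>

// Returns true when addr is a multiple of alignment.
// alignment must be a power of two (16 for xmm, 32 for ymm aligned moves).
constexpr bool IsAligned( std::uint64_t addr, std::uint64_t alignment )
{
    return ( addr & ( alignment - 1 ) ) == 0;
}

// MEMEA value copied from the SDE message (hypothetical constant name).
constexpr std::uint64_t kSdeMemea = 0x18f4d0;

// The address satisfies the 16-byte stack alignment guarantee...
static_assert( IsAligned( kSdeMemea, 16 ), "address is 16-byte aligned" );
// ...but not the 32-byte alignment that vmovdqa ymmword ptr requires.
static_assert( !IsAligned( kSdeMemea, 32 ), "address is not 32-byte aligned" );
```

Both assertions hold at compile time, so the faulting address sits exactly 16 bytes off a 32-byte boundary.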