The C++ compiler generates incorrect movups (8-byte write) instructions instead of movss (4-byte write) instructions, e.g. in the D3DXMatrixIdentity function. This causes stack corruption, among other problems.
It occurs with the following flag settings: the /Zp1 or /Zp2 option combined with the /arch:SSE or /arch:SSE2 option in a debug build (/Od).
I guess the movups instruction is chosen because it can handle unaligned accesses; however, it moves 8 bytes instead of 4.
BTW, I used the February 2010 DirectX SDK.
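
To check for the overrun without depending on the SDK headers, a small canary test along these lines might help. It is only a sketch: Matrix4, Guarded, set_identity and identity_write_stays_in_bounds are hypothetical names, Matrix4 is merely a stand-in for D3DXMATRIX, and I am assuming #pragma pack(1) gives the same layout as /Zp1. It may or may not trigger the same code generation as the real D3DXMatrixIdentity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#pragma pack(push, 1)   // assumption: equivalent to building with /Zp1
// Hypothetical stand-in for D3DXMATRIX: 16 floats, 64 bytes, alignment 1.
struct Matrix4 {
    float m[4][4];
};

struct Guarded {
    unsigned char lead;       // pushes the matrix onto an odd offset
    Matrix4 mat;
    unsigned char guard[16];  // canary bytes directly after the matrix
};
#pragma pack(pop)

static_assert(sizeof(Guarded) == 1 + 64 + 16, "packing of 1 expected");

// Writes the identity matrix one float at a time. Each store should
// touch exactly 4 bytes.
void set_identity(Matrix4* out) {
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            out->m[row][col] = (row == col) ? 1.0f : 0.0f;
        }
    }
}

// Returns true when the 64 matrix bytes hold the identity and the bytes
// around the matrix still hold the fill pattern.
bool identity_write_stays_in_bounds() {
    Guarded g;
    std::memset(&g, 0xCD, sizeof g);   // fill everything with a known pattern
    set_identity(&g.mat);

    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            const float expected = (row == col) ? 1.0f : 0.0f;
            if (g.mat.m[row][col] != expected) return false;
        }
    }
    for (std::size_t i = 0; i < sizeof g.guard; ++i) {
        if (g.guard[i] != 0xCD) return false;   // a store ran past the matrix
    }
    return g.lead == 0xCD;
}
```

Calling identity_write_stays_in_bounds() should return true with a correct compiler. A false result when built with /Zp1 /arch:SSE /Od would point to a store wider than 4 bytes.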