Feedback: 32-bit Development PCs build print drivers in half the time compared to 64-bit PCs (16 vs. 32 mins) by bugzapper


Status: Closed as Fixed


Type: Bug
ID: 717690
Opened: 1/9/2012 9:57:13 AM
Access Restriction: Public
Moderator Decision: Sent to Engineering Team for consideration
Workaround(s): 0
User(s) who can reproduce this bug: 0

Description

I mentioned to Justin Hutchings (DOX) that I wanted to set up a 32-bit Win8 PC because 32-bit builds on a 32-bit OS seemed to be so much faster than 64-bit builds on a 64-bit OS on the same hardware. He asked me to enter a bug to start a discussion on performance because Microsoft is very interested in performance.

Basically, what I found is that builds take about twice as long if I use a 64-bit OS rather than a 32-bit OS for development. This is bad since Microsoft seems to be emphasizing 64-bit over 32-bit OSes. Is there anything you can do to optimize the x64 compiler/linker to come closer to parity with the x86 tools?

I acknowledge that it is not ideal to have one test on Win7 and the other on Win8, but those were the only OSes I had installed on this type of PC.

We have had to go to Quad-core PCs with 16 GB of RAM and SSDs in order to achieve reasonable build times for 64-bit.

PC Hardware Used:
Intel Core 2 Duo E7500 @ 2.93 GHz – Dual Core
RAM: 4.0 GB
Disk: Standard Hard Drive (not SSD)

Project:
A very large, complex V3 print driver codebase with lots of STL and other templates. The SOURCES files were converted to Visual Studio .vcxproj files using the Win8 WDK’s conversion tool.

Dual-Boot OS #1: Win7 32-bit:
    Visual Studio 2010
    Win8 WDK 8090
    Time to build (x86 checked build): 16 min, 22 sec.

Dual-Boot OS #2: Win8 Build WDP 8102 64-bit:
    Visual Studio 11
    Win8 WDK 8102
    Time to build (x64 checked build): 31 min, 39 sec.
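
To put a number on the gap between the two dual-boot results above, here is a minimal Python sketch that converts both reported build times to seconds and computes the ratio. The helper name `to_seconds` is hypothetical; the two timings are the ones quoted above.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a 'min, sec' build time to seconds."""
    return minutes * 60 + seconds


# Timings copied from the two dual-boot configurations above.
x86_on_win7_32 = to_seconds(16, 22)  # x86 checked build, Win7 32-bit
x64_on_win8_64 = to_seconds(31, 39)  # x64 checked build, Win8 64-bit

# Ratio of the x64 build time to the x86 build time.
slowdown = x64_on_win8_64 / x86_on_win7_32
print(f"x64 build took {slowdown:.2f}x as long as the x86 build")
```

With these two data points the ratio comes out at about 1.93, which is the "about twice as long" figure mentioned in the description.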

========================================================

Switching gears: Here is a completely separate test comparing two different PC hardware configurations. I’m including this since it raises a different performance question which may also be of interest: native vs. cross compiler perf.

The following are build times for a relatively small V4 driver codebase. MSBuild was used to build the .SLN file from the command line.

PC #1 Configuration:
OS: Windows Developer Preview 8102
Visual Studio 11
Intel Core i7-2600 @ 3.4 GHz - Quad Core
RAM: 16 GB
Disk: Solid State Drive

PC #2 Configuration:
OS: Windows Developer Preview 8102
Visual Studio 11
Intel Core 2 Duo E7500 @ 2.93 GHz – Dual Core
RAM: 4.0 GB
Disk: Standard Hard Drive

PC #1 – Build Times: (Minutes:Seconds)
x64 Free Build using native 64-bit compiler: 0:58.9
x64 Free Build using the 32-bit cross compiler (x86_amd64): 0:48.31
x86 Free Build: 0:47.8

PC #2 – Build Times: (Minutes:Seconds)
x64 Free Build using native 64-bit compiler: 1:45.21
x64 Free Build using the 32-bit cross compiler (x86_amd64): 1:15.01
x86 Free Build: 1:15.84
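
The native vs. cross compiler gap in the two lists above can be expressed as a percentage. This is a minimal Python sketch; the function names `parse_mmss` and `native_overhead_percent` are hypothetical, and the timings are the x64 Free Build times reported above.

```python
def parse_mmss(text: str) -> float:
    """Parse a 'M:SS.ss' build time into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def native_overhead_percent(native: str, cross: str) -> float:
    """How much longer the native x64 build took, as a percentage of the cross build."""
    return (parse_mmss(native) / parse_mmss(cross) - 1.0) * 100.0


# (native x64 compiler, x86_amd64 cross compiler) times copied from the lists above.
timings = {
    "PC #1": ("0:58.9", "0:48.31"),
    "PC #2": ("1:45.21", "1:15.01"),
}

for pc, (native, cross) in timings.items():
    overhead = native_overhead_percent(native, cross)
    print(f"{pc}: native x64 build took {overhead:.0f}% longer than the cross build")
```

For these numbers the native compiler comes out roughly 22% slower on PC #1 and roughly 40% slower on PC #2.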

Analysis:
It seems very strange that the 32-bit cross compiler would be so much faster than the native 64-bit compiler at producing x64 binaries. We would have expected the 64-bit native compiler/linker to benefit greatly from its ability to access the full 16 GB of RAM. Does the 64-bit native compiler have some kind of performance problem?

Using the cross compiler may not be a good long-term solution, however. Based on past experience (please see Historical Data below), we would expect the cross compiler to slow down dramatically once the codebase becomes large enough for the linker to exceed the 4 GB RAM limit addressable under 32-bit.

Historical Data:
Here are some times for building a very large, complex V3 driver codebase with lots of templates, STL, etc. These data were mined from past e-mail. I believe they were all done on the same Dell Optiplex 960 with Quad cores, 8 GB RAM and an SSD. The Win7 WDK was used. Times without the SSD would be much longer.

o    Build x86 drivers on 32-bit OS: 9 mins    
o    Build x64 drivers on 32-bit OS: 22 mins
o    Build x64 drivers on 64-bit OS: 11 mins

The biggest bottleneck we’ve seen occurs when linking our large V3 driver codebase. Linking x64 binaries on 32-bit Windows is the worst offender. The build seems to run low on memory, and there are long delays where the build reports that it is waiting for a thread. Builds of x64 binaries take more than twice as long (22 mins vs. 9 mins) as builds of x86 binaries on 32-bit Windows.

Thanks for your time and interest!
Posted by Microsoft on 1/24/2012 at 10:36 AM
Hi,

Thank you for these new tests. I'm delighted to see the build performance behaving as it should.

I suspect the difference from previous results could be the combined effect of: uniform, fast hardware - plus improvements to the tools and compilers.

In a perfect world, we could trace back thru our internal history to locate which checkin, or series of checkins, to the compiler and tools contributed to fixing the throughput problem. But in reality, there are 100s and 100s of such checkins. We have automated server farms that look for functional and/or performance regressions several times each 24 hours - and change code to improve results steadily thru each milestone. So tracking down the relevant changes would be a substantial job.

So I propose to close this bug as fixed.

Thank you for all of the hard work and analysis you have invested to explore this issue,

Jim
Posted by bugzapper on 1/23/2012 at 10:58 AM
P.S. MSBuild logs are now attached.
Posted by bugzapper on 1/23/2012 at 10:55 AM
Thanks, Jim. I collected new data using the new Win8 WDP 4 build 8175 over the weekend when my PC wasn’t in use. The results were quite different this time compared to previous tests using older compilers and a mix of Win7 and Win8 PCs. I hope this is helpful!



Host PC Configuration:
--------------------------
Windows Developer Preview 4 build 8175, x64
Intel Core i7-2600 @ 3.4 GHz - Quad Core
RAM: 16 GB
Disk: Solid State Drive

Hyper-V VM Configurations:
-------------------------------
Windows Developer Preview 4 build 8175 – One each, x86, and x64
VMs were created on a Solid State Drive
VMs were fresh, and identically configured
# of CPUs and dedicated RAM were varied as noted with the test data

Details: Previous test data was from a mixture of Win7 and Win8, so it was hard to interpret. This test uses the latest Windows Developer Preview OS and Visual Studio 11 Beta compiler available to us. Each test was run on a freshly rebooted VM. I did a clean, then a full rebuild. All were checked builds because that’s what developers use on a daily basis. This is a large, complex codebase with lots of STL and templates. Compiler and linker timing data was logged using undocumented flags provided by the Visual Studio team.


Conclusions:
--------------

•    I expected, but no longer saw, bottlenecks when cross-compiling for x64 on a 32-bit OS. I'm not sure whether this is due to improvements in the tools, or differences in hardware compared to previous tests, or both.

•    I also no longer saw a large difference when comparing a 32-bit PC doing 32-bit builds vs. a 64-bit PC doing 64-bit builds. I found a slight benefit (~7%) to using a 32-bit PC for development. I'm not sure whether this is due to improvements in the tools, or differences in hardware compared to previous tests, or both.

•    Surprisingly, doubling the RAM dedicated to the VM from 4GB to 8GB had no appreciable effect.

•    Surprisingly, the x86_amd64 cross compiler continues to outperform the native x64 compiler, even when the latter was given twice as much memory. This seems like an opportunity for performance improvement.

•    It is important to use multiple cores - build speed scaled in a nearly linear fashion when the number of processors was increased. That is impressive. I suspect the SSD may be a key enabler for this based on past experience because builds are very disk-intensive.


Tests with One CPU Assigned to VM:
-----------------------------------------

x86 Compiler:

32-bit VM, 4 GB RAM:                    Time Elapsed 00:21:06.87
64-bit VM, 4 GB RAM:                    Time Elapsed 00:23:30.47
64-bit VM, 8 GB RAM:                    Time Elapsed 00:23:20.95

Native x64 Compiler:

64-bit VM, 4 GB RAM:                    Time Elapsed 00:26:29.59
64-bit VM, 8 GB RAM:                    Time Elapsed 00:26:23.34

x86_amd64 Cross Compiler Building x64:

32-bit VM, 4 GB RAM:                    Time Elapsed 00:21:56.24            
64-bit VM, 4 GB RAM:                    Time Elapsed 00:23:44.68
64-bit VM, 8 GB RAM:                    Time Elapsed 00:24:16.31            


Tests with Four CPUs Assigned to VM:
------------------------------------------

x86 Compiler:

32-bit VM, 4 GB RAM:                    Time Elapsed 00:07:17.79
64-bit VM, 4 GB RAM:                    Time Elapsed 00:07:58.92
64-bit VM, 8 GB RAM:                    Time Elapsed 00:07:47.05

Native x64 Compiler:

64-bit VM, 4 GB RAM:                    Time Elapsed 00:08:44.34
64-bit VM, 8 GB RAM:                    Time Elapsed 00:09:21.12 (Test anomaly? I expected 8 GB to be at least as fast as 4 GB.)

x86_amd64 Cross Compiler Building x64:

32-bit VM, 4 GB RAM:                    Time Elapsed 00:07:28.59            
64-bit VM, 4 GB RAM:                    Time Elapsed 00:08:06.51            
64-bit VM, 8 GB RAM:                    Time Elapsed 00:07:48.42                            
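
The multi-core scaling can be checked directly from the "Time Elapsed" values above. This is a minimal Python sketch that only uses the 4 GB rows; the helper names `parse_elapsed` and `speedup` are hypothetical.

```python
def parse_elapsed(text: str) -> float:
    """Parse an 'HH:MM:SS.ss' elapsed time into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def speedup(one_cpu: str, four_cpus: str) -> float:
    """Ratio of the one-CPU build time to the four-CPU build time."""
    return parse_elapsed(one_cpu) / parse_elapsed(four_cpus)


# (one CPU, four CPUs) elapsed times for the 4 GB VM rows listed above.
runs = {
    "x86 compiler, 32-bit VM": ("00:21:06.87", "00:07:17.79"),
    "native x64 compiler, 64-bit VM": ("00:26:29.59", "00:08:44.34"),
    "x86_amd64 cross compiler, 64-bit VM": ("00:23:44.68", "00:08:06.51"),
}

for name, (one_cpu, four_cpus) in runs.items():
    print(f"{name}: {speedup(one_cpu, four_cpus):.2f}x faster with four CPUs")
```

The same two helpers can be pointed at the 8 GB rows to compare memory configurations.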
Posted by Microsoft on 1/19/2012 at 3:55 PM
Here you go:

set _CL_=/Bt
set _LINK_=/TIME

then invoke the msbuild command.

You will get timing information interleaved with the invocations of cl.exe and link.exe. You then need to parse it and compare it against the baseline throughput of the previous compiler.
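
For anyone who drives builds from a script, the same setup can be sketched in Python: copy the environment, add the two variables above, and pass it to the build process. `timing_env` and `run_build` are hypothetical helper names, and the solution name in the usage comment is a placeholder, not a real project.

```python
import os
import subprocess


def timing_env(base_env=None):
    """Return a copy of the environment with the two timing variables added."""
    env = dict(os.environ if base_env is None else base_env)
    env["_CL_"] = "/Bt"      # read from the environment by cl.exe
    env["_LINK_"] = "/TIME"  # read from the environment by link.exe
    return env


def run_build(command, log_path):
    """Run the build with timing enabled and capture all output in one log file."""
    with open(log_path, "w") as log:
        result = subprocess.run(
            command,
            env=timing_env(),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


# Example call (placeholder solution name; substitute your own msbuild arguments):
# run_build(["msbuild", "MySolution.sln", "/t:Rebuild"], "build_timing.log")
```

Keeping the variables in a copied environment means they apply only to that one build and do not leak into other processes started from the same script.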

Can you also run comparison /verbose diagnostics? The comparison will tell whether the slowdown is caused by compiler/linker, or by the MSBUILD engine itself.

thanks,

Jim
Posted by Microsoft on 1/19/2012 at 9:48 AM
Hi - let me check with the MSBUILD guys and I'll get back to you. (May take a day or two - yesterday's snowfall seems to have brought the Seattle area to a standstill)

Jim
Posted by bugzapper on 1/19/2012 at 8:22 AM
Hi Jim,

Regarding my question below, "We're invoking MsBuild to build the solution like this. The solution contains dozens of projects, mostly C++. What's the best way to add the two flags you requested?"

Is there a simple way to set environment variables to add your requested compiler and linker flags to all builds? That might be the quickest and most reliable way.


(I'm setting up a pair of Win8 "WDP 4 build 8175" Hyper-V virtual machines - identical configuration, but one x86 and the other x64 - to collect better test data for you.)

Regards,
Al

Posted by bugzapper on 1/12/2012 at 7:16 AM
Hello Jim,

Thank you so much for your reply and interest!

>> Am I right that in all cases you are building an x64 image? The differences being whether you build it on x86 versus x64?

Actually, no, one of the most interesting questions is whether it is more efficient for developers to use x86 or x64 PCs for their daily work. For our codebase at least, it appears that doing x86 builds on an x86 OS is twice as fast as doing x64 builds on an x64 OS on the same PC (seen on a dual core with 4 GB RAM, no SSD).

A second, independent case of particular interest is the comparative performance of the native 64-bit compiler vs. the 32-bit cross compiler that Visual Studio 11 uses even when installed on an x64 PC. In this second scenario, we are always building x64 binaries on an x64 OS. The compiler is what changes. We don't understand why the native compiler that can use the full 16 GB of RAM seems slower than the "x86_amd64" cross compiler that is limited to 4GB.

>> I realize your test compared Win7 versus Win8. However, have you noticed similar differences where both were built on Win7? Ditto for where both were built on Win8?

My gut feeling is that it probably does not make a significant difference whether we use Win7 or Win 8, or whether we use VS 2010 or VS 11. But in fairness I don't have the necessary combinations of OSes and compilers all installed on the same PC hardware as would be required to answer this with data.


>> What is your current main concern? - builds on Win7, or builds on Win8? (it's ok to say "both"!)

Windows 8 V4 print drivers can only be developed on Win8, so we won't be using Win7.

>> Can you give us more info about the project - command-line switches to the build (e.g., are you using LTCG (whole-program optimization))?

I timed a checked build when I compared the time to build x86 binaries on an x86 OS vs. the time to build x64 binaries on a x64 OS, so optimizations should pretty much all have been disabled. I'll attach some project files and the .props files that the WDK project conversion tool created from our old SOURCES files. Those should tell you exactly how we're building our code.

>> Is it possible you could zip up a representative project and send it to us, so we can analyze what's going on? (there's a whole pile of facets to your project that might not show up in other scenarios, or internal tests we run) ...
>> Further to my reply yesterday: I realize that providing a repro, with your code, is likely difficult. As an alternate step along the way that would help with our analysis, could you rebuild (x86->x64 versus x64->x64), enable timing ("cl /Bt" and "link/time" - yep, they're not documented :-) and mail us the resulting build logs?

Question about how to do that... We're invoking MsBuild to build the solution like this. The solution contains dozens of projects, mostly C++. What's the best way to add the two flags you requested?

msbuild %MSB_CLEAN% /t:build %XSRC_DIR%\Voyager_MsBuild.sln %MSB_PLATFORM% %MSB_CONFIGURATION% /m:%XNUM_BUILD_THREADS% /p:BuildInParallel=true /v:n /fl /flp:Verbosity=normal
Posted by Microsoft on 1/11/2012 at 12:03 PM
Hi,

Further to my reply yesterday: I realize that providing a repro, with your code, is likely difficult. As an alternate step along the way that would help with our analysis, could you rebuild (x86->x64 versus x64->x64), enable timing ("cl /Bt" and "link/time" - yep, they're not documented :-) and mail us the resulting build logs?

Thanks again,

Jim
Posted by Microsoft on 1/10/2012 at 9:17 AM
Hi, thanks for posting this bug.

We will investigate internal tests, to see whether we've hit this issue.

Meantime, can you answer a few questions to help us narrow down the best cases to drill into ...

Am I right that in all cases you are building an x64 image? The differences being whether you build it on x86 versus x64?

I realize your test compared Win7 versus Win8. However, have you noticed similar differences where both were built on Win7? Ditto for where both were built on Win8?

What is your current main concern? - builds on Win7, or builds on Win8? (it's ok to say "both"!)

Can you give us more info about the project - command-line switches to the build (e.g., are you using LTCG (whole-program optimization))?

Is it possible you could zip up a representative project and send it to us, so we can analyze what's going on? (there's a whole pile of facets to your project that might not show up in other scenarios, or internal tests we run)

Thanks,

Jim Hogg

Posted by MS-Moderator08 [Feedback Moderator] on 1/9/2012 at 10:08 PM
Thank you for submitting feedback on Visual Studio 2010 and .NET Framework. Your issue has been routed to the appropriate VS development team for investigation. We will contact you if we require any additional information.
Posted by MS-Moderator01 on 1/9/2012 at 10:40 AM
Thank you for your feedback. We are currently reviewing the issue you have submitted. If this issue is urgent, please contact support directly (http://support.microsoft.com).
File Name                                Submitted By   Submitted On   File Size
sample_xerox_projects.zip (restricted)                  1/12/2012      -
MSBuild_Logs_21-Jan-2012.zip                            1/23/2012      1.35 MB