
Double Rounding Error When the 'f' Suffix is Used on Floating-Point Literals by DoctorBinary

Status: Closed as Won't Fix

Type: Bug
ID: 585669
Opened: 8/11/2010 8:05:17 PM
Access Restriction: Public
Visual C++ appears to double round floating-point literals when assigned to floats, even when the literals have the 'f' suffix attached. This can lead to double rounding errors, and as a result, incorrect decimal to floating-point conversions.

For example, the decimal value 0.5000000894069671353303618843710864894092082977294921875 is 0.1000000000000000000000010111111111111111111111111111111 in binary. The part after bit 24 is 0111111111111111111111111111111, and so as a float, it should be rounded down to 0.100000000000000000000001. Visual C++ rounds it up to 0.10000000000000000000001, so the conversion is too high by 1 ULP.

Double rounding appears to be the culprit. As a double, the decimal value above is rounded to 0.1000000000000000000000011. That result has bit 25 = 1 and nothing after it, so it sits exactly halfway between two adjacent floats. Under "round half to even" it is then rounded up to 0.10000000000000000000001 when going to single precision.

For the record, gcc on Linux does not have this problem.

For a more detailed description of this problem, see my article http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/
Details

Visual Studio/Silverlight/Tooling version

Visual Studio 2010

Steps to reproduce

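#include <stdio.h>

int main(void)
{
    /* The 'f' suffix should make this literal round once, directly to float. */
    float f = 0.5000000894069671353303618843710864894092082977294921875f;

    /* %a prints the exact value in hexadecimal floating-point notation. */
    printf("f = %a\n", f);
    return 0;
}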

Product Language

English

Operating System

Windows XP

Operating System Language

English

Actual results

f = 0x1.000004p-1

Expected results

f = 0x1.000002p-1

Comments
Posted by Microsoft on 8/31/2010 at 1:00 PM
Thank you for reporting this issue to Microsoft. This is indeed an issue with the compiler but we regret that we cannot fix it for the next release due to its priority. Please let us know if this is a blocking issue for you.

Tanveer Gani
Visual C++ Team.
Posted by DoctorBinary on 8/19/2010 at 12:59 PM
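A reader of my article (cited above) suggested a shorter example that shows the same double rounding problem: 1.0000000596046448. In binary it's 1.00000000000000000000000100000000000000000000000000000001110… . Written with the 'f' suffix, it should round to 1.00000000000000000000001 as a float, because it lies slightly above the halfway point between 1.0 and the next float. Visual C++ instead double rounds it to 1.0: the conversion to double discards the low-order bits and leaves an exact tie, which "round half to even" then resolves downward.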
Posted by Microsoft on 8/12/2010 at 5:12 AM
Thanks for your feedback. We are routing this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow up on your issue.