The misunderstanding of floating point arithmetic and its shortcomings is a major cause of surprise and confusion in programming (consider the number of Stack Overflow questions about "numbers not adding correctly"). Since many programmers have yet to grasp its implications, it has the potential to introduce many subtle bugs, especially into financial software. What can programming languages do to avoid its pitfalls for those unfamiliar with the concepts, while still offering its speed, when accuracy is not critical, to those who do understand them?
---
You say "especially for financial software", which brings up one of my pet peeves: money is not a float, it's an int. Sure, it looks like a float: it has a decimal point in there. But that's just because you're used to units that confuse the issue. Money always comes in integer quantities. In America, it's cents. (In certain contexts I think it can be mills, but ignore that for now.) So when you say $1.23, that's really 123 cents. Always, always, always do your math in those terms, and you will be fine.
Answering the question directly: programming languages should just include a Money type as a reasonable primitive. Update: OK, I should have only said "always" twice rather than three times. Money is indeed always an int; those who think otherwise are welcome to try sending me 0.3 cents and showing me the result on your bank statement. But as commenters point out, there are rare exceptions when you need to do floating-point math on money-like numbers, e.g. certain kinds of prices or interest calculations. Even then, those should be treated as exceptions. Money comes in and goes out as integer quantities, so the closer your system hews to that, the saner it will be.
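The integer-cents approach can be sketched in a few lines (the helper names here are illustrative, not from any real library):

```python
# Sketch of the "money is an int" approach: every amount is an integer
# count of cents, so addition is exact integer arithmetic.

def to_cents(dollars: int, cents: int) -> int:
    """Combine dollars and cents into a single integer number of cents."""
    return dollars * 100 + cents

def format_cents(total: int) -> str:
    """Render integer cents as a dollar string, only at the boundary."""
    return f"${total // 100}.{total % 100:02d}"

price = to_cents(1, 23)   # $1.23 stored as 123 cents
tax   = to_cents(0, 7)    # $0.07 stored as 7 cents
print(format_cents(price + tax))  # prints "$1.30"
```

The division by 100 happens only when formatting for display; all arithmetic stays in exact integers.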
---
Providing support for a decimal type helps in many cases. Many languages have one, but it is underused. Understanding the approximation that occurs when working with representations of real numbers is important whichever type you use. Providing an "approximately equals" operator might also help, but such comparisons are problematic: note that $0.9999 trillion is approximately equal to $1 trillion. Could you please deposit the difference in my bank account?
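Python's standard `decimal` module illustrates the difference between binary floating point and a decimal type:

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly, so their sum
# is not exactly 0.3; decimal arithmetic on the same values is exact.
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```

Note that the decimals are constructed from strings; `Decimal(0.1)` would faithfully capture the binary approximation instead.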
---
I don't believe anything can or should be done at a language level.
---
We were told what to do in the first-year (sophomore) lecture in computer science when I went to university (the course was a prerequisite for most science courses as well). I recall the lecturer saying: "Floating point numbers are approximations. Use integer types for money. Use FORTRAN or another language with BCD numbers for accurate computation." He then demonstrated the approximation with that classic example of 0.2, which is impossible to represent exactly in binary floating point. It also turned up that week in the laboratory exercises. Same lecture: "If you must get more accuracy from floating point, sort your terms. Add small numbers together before adding them to big numbers." That stuck in my mind. A few years ago I had some spherical geometry that needed to be very accurate, and still fast. 80-bit doubles on PCs were not cutting it, so I added types to the program that sorted terms before performing commutative operations. Problem solved. Before you complain about the quality of the guitar, learn to play. I had a co-worker four years ago who'd worked for JPL. He expressed disbelief that we used FORTRAN for some things. (We needed super-accurate numerical simulations calculated offline.) "We replaced all that FORTRAN with C++," he said proudly. I stopped wondering why they missed a planet.
---
The two biggest problems involving floating point numbers are:

1. Applying a value with the wrong units (losing track of what the number measures).
2. Treating approximate values as if they were exact.

The first type of failure can only be remedied by providing a composite type that includes value and unit information; a length type that carried its unit, for example, would prevent feet from being added to metres. The second type of failure is a conceptual failure. It manifests itself when people think of floating point values as absolute numbers; it affects equality operations, cumulative rounding errors, and so on. For example, for one system two measurements may be equivalent within a certain margin of error: .999 and 1.001 are roughly the same as 1.0 when you don't care about differences smaller than +/- .1. However, not all systems are that lenient. If any language-level facility is needed, I would call it equality precision. In NUnit, JUnit, and similarly constructed testing frameworks, you can control the precision that is considered correct by passing a delta to the assertion.
If, for example, C# or Java were altered to include a precision operator, equality could be written to tolerate a stated margin of error. However, if you supply a feature like that, you also have to consider the case where the +/- sides of the tolerance are not the same. For example, a tolerance of +1/-10 would consider two numbers equivalent if the second was within 1 more, or 10 less, than the first. To handle this case, you might need to let the two bounds be specified separately.
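As a sketch of what such an "equality precision" facility might look like, including the asymmetric +/- case, here is a small function (the name and signature are hypothetical, not from any language or framework):

```python
# Hypothetical "equality precision" check with independent upper and
# lower tolerances, as discussed above.

def approx_equal(a: float, b: float, plus: float, minus: float) -> bool:
    """True if b falls within [a - minus, a + plus]."""
    return a - minus <= b <= a + plus

print(approx_equal(1.0, 1.001, plus=0.1, minus=0.1))     # True
print(approx_equal(100.0, 99.5, plus=1.0, minus=10.0))   # True: within -10
print(approx_equal(100.0, 101.5, plus=1.0, minus=10.0))  # False: beyond +1
```

A symmetric version of this idea already exists in Python as `math.isclose`, which takes relative and absolute tolerances.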
---
By default, languages should use arbitrary-precision rationals for non-integer numbers. Those who need to optimize can always ask for floats. Using floats as the default made sense in C and other systems programming languages, but not in most languages popular today.
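Python's standard `fractions` module shows what exact rational arithmetic buys you, even though it is opt-in rather than the default:

```python
# Arbitrary-precision rationals: no rounding ever occurs, so identities
# that fail in floating point hold exactly.
from fractions import Fraction

third = Fraction(1, 3)
print(third + third + third == 1)          # True, not 0.9999...
print(Fraction(1, 10) + Fraction(2, 10))   # exactly 3/10
```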
---
What can programming languages do? I don't know if there's one answer to that question, because anything the compiler/interpreter does on the programmer's behalf to make his/her life easier usually works against performance, clarity, and readability. I think both the C++ way (pay only for what you need) and the Perl way (principle of least surprise) are valid, but it depends on the application. Programmers still need to work with the language and understand how it handles floating point, because if they don't, they'll make assumptions, and one day the prescribed behavior won't match up with those assumptions. At a minimum, the programmer needs to know how the language represents floating point values and where rounding can occur.
---
Use sensible defaults, e.g. built-in support for decimals. Groovy does this quite nicely: a decimal literal defaults to BigDecimal. Even so, with a bit of effort you can still write code that introduces floating point imprecision.
||||
|
I agree there's nothing to do at the language level. Programmers must understand that computers are discrete and limited, and that many of the mathematical concepts represented in them are only approximations. And never mind floating point: one has to understand that half of the bit patterns are used for negative numbers, and that 2^64 is actually quite small, to avoid typical problems with integer arithmetic.
||||
|
If more programming languages took a page from databases and allowed developers to specify the length and precision of their numeric data types, they could substantially reduce the probability of floating-point-related errors. If a language allowed a developer to declare a variable as a Float(2), indicating that they needed a floating point number with two decimal digits of precision, it could perform mathematical operations much more safely. If it did so by representing the variable as an integer internally and dividing by 100 before exposing the value, it could improve speed by using the faster integer arithmetic paths. The semantics of a Float(2) would also let developers avoid the constant need to round data before outputting it, since a Float(2) would inherently round to two decimal places. Of course, you'd need to allow a developer to ask for a maximum-precision floating point value when the developer needs that precision. And you would introduce problems where slightly different expressions of the same mathematical operation produce potentially different results, because of intermediate rounding when developers don't carry enough precision in their variables. But at least in the database world, that doesn't seem to be too big a deal. Most people aren't doing the sorts of scientific calculations that require lots of precision in intermediate results.
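A minimal sketch of how a hypothetical Float(2) might be backed by integer arithmetic (the class name, API, and rounding choice here are all illustrative):

```python
# Toy Float(2): store the value as an integer count of hundredths and
# only divide by 100 at the display boundary.

class Float2:
    def __init__(self, value: float):
        # Round once, on the way in, to two decimal digits of precision.
        self._scaled = round(value * 100)   # integer hundredths

    def __add__(self, other: "Float2") -> "Float2":
        out = Float2(0)
        out._scaled = self._scaled + other._scaled  # exact integer math
        return out

    def __str__(self) -> str:
        return f"{self._scaled / 100:.2f}"

print(Float2(0.1) + Float2(0.2))  # prints "0.30", not 0.30000000000000004
```

The intermediate-rounding hazard mentioned above shows up here too: multiplication or division would need a policy for rounding the scaled result.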
---
One thing I would like to see would be a recognition that a floating-point value does not identify a single exact quantity, but rather a range of quantities that all share the same representation. Although the IEEE-754 standard requires that floating-point math be performed as though every floating-point number represents the exact numerical quantity precisely at the center of its range, that should not be taken to imply that floating-point values actually represent those exact numerical quantities. Rather, the requirement that the values be assumed to be at the center of their ranges stems from three facts: (1) calculations must be performed as though the operands have some particular precise values; (2) consistent and documented assumptions are more helpful than inconsistent or undocumented ones; (3) if one is going to make a consistent assumption, no other consistent assumption is apt to be better than assuming a quantity represents the center of its range.

Incidentally, I remember that some 25 years or so ago, someone came up with a numerical package for C which used "range types", each consisting of a pair of 128-bit floats; all calculations would be done in such fashion as to compute the minimum and maximum possible value for each result. If one performed a big long iterative calculation and came up with a value of [12.53401391134 12.53902812673], one could be confident that while many digits of precision were lost to rounding errors, the result could still be reasonably expressed as 12.54 (and it wasn't really 12.9 or 53.2). I'm surprised I haven't seen any support for such types in any mainstream language, especially since they would seem a good fit with math units that can operate on multiple values in parallel.

In practice, it's often helpful to use double-precision values to hold intermediate computations when working with single-precision numbers, so having to use a typecast for all such operations could be annoying. Languages could help by having a "fuzzy double" type, which would perform computations as double and could be freely cast to and from single; this would be especially helpful in code that mixes single- and double-precision values.
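A toy version of such a range type can be sketched as interval arithmetic (everything here is illustrative; a faithful implementation would also round lower bounds down and upper bounds up at every step, which this sketch omits):

```python
# Toy "range type": carry a [lo, hi] interval through arithmetic so the
# result bounds every value the true answer could take.

class Interval:
    def __init__(self, lo: float, hi: float):
        self.lo, self.hi = lo, hi

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # The extremes of a product lie among the endpoint products.
        products = [a * b for a in (self.lo, self.hi)
                          for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

    def __repr__(self) -> str:
        return f"[{self.lo}, {self.hi}]"

x = Interval(1.9, 2.1)             # "about 2"
print(x * x + Interval(1.0, 1.0))  # guaranteed bounds on x*x + 1
```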
---
The suggestions above are applicable in some cases, but none is really a general solution for dealing with float values. The real solution is to understand the problem and learn how to deal with it. If you're using floating-point calculations, you should always check whether your algorithms are numerically stable. There is a huge field of mathematics/computer science devoted to this problem: it's called numerical analysis.
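A classic illustration of numerical instability is catastrophic cancellation in the quadratic formula; the naive form and an algebraically equivalent rearrangement below compute the same root with very different accuracy:

```python
# Roots of x^2 - 1e8*x + 1 = 0. The small root is about 1e-8, but the
# naive formula subtracts two nearly equal numbers (-b and d) and loses
# most of its significant digits; the rearranged form avoids that.
import math

a, b, c = 1.0, -1e8, 1.0
d = math.sqrt(b * b - 4 * a * c)

naive_small_root  = (-b - d) / (2 * a)   # catastrophic cancellation
stable_small_root = (2 * c) / (-b + d)   # algebraically equal, stable

print(naive_small_root)    # noticeably wrong
print(stable_small_root)   # close to the true root ~1e-8
```

Numerical analysis is largely the study of recognizing and rearranging computations like this.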
||||
|
One thing languages could do: remove the equality comparison from floating point types, other than a direct comparison to the NaN values. Equality testing would only exist as a function call taking the two values and a delta, or, for languages like C# that allow types to have methods, an EqualsTo that takes the other value and the delta.
||||
|
I find it strange that nobody has pointed out the Lisp family's rational number trick. Seriously, open sbcl and evaluate (+ 1/3 1/3 1/3): you get exactly 1, not 0.9999999. That should help somewhat in some situations, shouldn't it?
||||
|
As other answers have noted, the only real way to avoid floating point pitfalls in financial software is not to use it there. This may actually be feasible if you provide a well-designed library dedicated to financial math. Functions designed to import floating-point estimates should be clearly labelled as such, and provided with parameters appropriate to that operation, such as an explicit rounding policy.
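What such a clearly labelled import function might look like, as a sketch (the function name is hypothetical; the rounding modes come from Python's standard decimal module):

```python
# Converting a floating-point estimate into exact integer cents forces
# the caller to say how to round, making the lossy step explicit.
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def cents_from_estimate(estimate: float, rounding=ROUND_HALF_EVEN) -> int:
    """Import a float estimate (in dollars) as integer cents, explicitly."""
    return int(Decimal(str(estimate)).scaleb(2).to_integral_value(rounding))

print(cents_from_estimate(1.005))                          # 100 (banker's)
print(cents_from_estimate(1.005, rounding=ROUND_HALF_UP))  # 101
```

Because the rounding policy is a required part of the conversation, "why is this off by a cent" bugs become design decisions instead of surprises.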
The only real way to avoid floating point pitfalls in general is education: programmers need to read and understand something like What Every Computer Scientist Should Know About Floating-Point Arithmetic.
---
Most programmers would be surprised that COBOL got this right. In the first version of COBOL there was no floating point, only decimal, and the tradition continues in COBOL to this day: the first thing you think of when declaring a number is decimal, and floating point is used only if you really need it. When C came along, for some reason, there was no primitive decimal type, so in my opinion that's where all the problems started.
---