Why is 0.2 + 0.1 not equal to 0.3 in some programming languages?

If you are a programmer, you may have come across this: 0.2 + 0.1 is equal to 0.30000000000000004 and not 0.3, which makes 0.2 + 0.1 == 0.3 false.

There are several more examples such as 0.3 + 0.6 being equal to 0.8999999999999999 instead of 0.9.
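You can reproduce these results yourself. Here is a quick check in Python; JavaScript and most other languages behave the same way, since they all use IEEE 754 doubles:

```python
# The printed results are the exact decimal rendering of the stored doubles.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
print(0.3 + 0.6)         # 0.8999999999999999
```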
To understand the origin of the "problem", we'll start off by looking at how we represent fractions in binary.
Binary is a positional number system with base 2. As we move a digit one place to the left, the power we raise the base to (2 in binary) increases by 1. As we move one place to the right it decreases by 1 (going into negative powers for fractional digits).
For example, in decimal the number 156 actually translates as: 1 × 10² + 5 × 10¹ + 6 × 10⁰. In binary it is the same process, except we use powers of 2 instead.
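As a small sketch of this positional expansion (the helper name `from_digits` is made up for illustration), the same rule rebuilds a number from its digits in any base:

```python
# Rebuild a number from its digits: each digit is multiplied by the base
# raised to the power of its position (rightmost digit gets power 0).
def from_digits(digits, base):
    return sum(d * base ** i for i, d in enumerate(reversed(digits)))

print(from_digits([1, 5, 6], 10))                # 156 = 1*10^2 + 5*10^1 + 6*10^0
print(from_digits([1, 0, 0, 1, 1, 1, 0, 0], 2))  # 156 from binary 10011100
```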


Now let's do the opposite and convert a decimal fraction to binary. The easiest approach is to repeatedly multiply the fraction by 2, record whether the digit that appears to the left of the point is a 0 or a 1, then discard that digit and continue with the remaining fraction. Once you are done, you read the recorded digits from top to bottom. Let's take a look at an example:
Let's convert the fraction 0.1:
0.1 × 2 = 0.2 → 0
0.2 × 2 = 0.4 → 0
0.4 × 2 = 0.8 → 0
0.8 × 2 = 1.6 → 1
0.6 × 2 = 1.2 → 1
0.2 × 2 = 0.4 → 0
0.4 × 2 = 0.8 → 0
0.8 × 2 = 1.6 → 1
The result we get is 0.00011001. This example stops after 8 bits to the right of the binary point, but you may keep going as long as you like.
We notice that if we do the opposite operation on 0.00011001 we get 0.09765625, which is very close to 0.1 but not exactly equal to it.
As I said before, you can go on as much as you like rather than stopping at the eighth bit: continuing the conversion of 0.1 yields 0.00011001100110011001..., with the pattern 0011 repeating forever.
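The multiply-by-2 procedure above can be sketched in Python; `fraction_to_binary` is an illustrative helper written for this article, not a standard function:

```python
# Repeatedly double the fraction, record the digit that crosses to the left
# of the point, discard it, and continue with the fractional part.
def fraction_to_binary(x, bits):
    digits = []
    for _ in range(bits):
        x *= 2
        bit = int(x)        # 0 or 1: the digit left of the point
        digits.append(str(bit))
        x -= bit            # discard it, keep only the fraction
    return "0." + "".join(digits)

print(fraction_to_binary(0.1, 8))   # 0.00011001
print(fraction_to_binary(0.1, 20))  # 0.00011001100110011001
```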
What we have looked at so far are called fixed point binary fractions. These are a convenient way of representing numbers, but as soon as the number we want to represent is very large or very small we find that we need a very large number of bits to represent it. Even the modest decimal value 128 already requires 8 binary digits: 10000000.
To get around this we use a method of representing numbers called floating point. Floating point is quite similar to scientific notation as a means of representing numbers. We lose a little accuracy, but when dealing with very large or very small values that is generally acceptable.
This is an example of the scientific notation we all know: 4,500,000 can be written as 4.5 × 10⁶.

Now let's move on to the floating point method. Hardware manufacturers use the IEEE 754 standard, which specifies the following formats for floating point numbers:
Single precision, which uses 32 bits and has the following layout:
1 bit for the sign of the number. 0 means positive and 1 means negative.
8 bits for the exponent.
23 bits for the mantissa.
Double precision, which uses 64 bits and has the following layout:

1 bit for the sign of the number. 0 means positive and 1 means negative.
11 bits for the exponent.
52 bits for the mantissa.
Double precision has more bits, allowing for much larger and much smaller numbers to be represented. As the mantissa is also larger, the degree of accuracy is also increased. While double precision floating point numbers have these advantages, they also require more processing power.
In programming languages, the float and double variable types differ in how they store values. Precision is the main difference: float is a single-precision (32-bit) floating point data type, while double is a double-precision (64-bit) floating point data type.
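One way to see the precision difference is to round-trip 0.1 through a 32-bit float (here using Python's standard struct module) and compare it with the 64-bit double:

```python
import struct

# Pack 0.1 into 4 bytes (single precision), then unpack it back to a double.
single = struct.unpack('f', struct.pack('f', 0.1))[0]

print(f"{single:.20f}")  # 0.10000000149011611938 (single precision)
print(f"{0.1:.20f}")     # 0.10000000000000000555 (double precision)
```

The single-precision value drifts from 0.1 after roughly 7 decimal digits; the double holds out for roughly 16.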
Nowadays, with increases in CPU processing power and the move to 64-bit computing, a lot of programming languages just default to double precision. JavaScript is one of them. It does not define different types of numbers like integer, short, long, or floating-point; JavaScript numbers are always stored as double precision floating point numbers, following the IEEE 754 standard.
Computers work in binary, but obviously you can't store infinitely many digits, so we have to chop the number off, leaving some digits behind. Let's take 0.1 as an example and keep only the first 16 bits. If you do that, the number you are actually storing is equal to 0.0999908447265625, not 0.1.
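You can verify that truncated value by summing the first 16 bits of 0.1's binary expansion directly:

```python
# 0.1 truncated to 16 binary fraction bits: 0.0001100110011001
bits = "0001100110011001"
value = sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))
print(value)  # 0.0999908447265625
```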
This means that numbers like 0.1 = 1/10 and 0.2 = 1/5 can only be approximated. That's why adding them is imprecise. The numbers were imprecise even before the addition, but you did not see it; the addition made the imprecision large enough to be visible.
And that's exactly what happens here. Since most decimal fractions cannot be represented exactly in binary, you will almost always end up with slight inaccuracies. One of them leads to 0.1 + 0.2 == 0.3 evaluating to false.
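This is why the usual advice is to compare floating point numbers with a tolerance rather than with ==. In Python, for example, the standard library provides math.isclose:

```python
import math

# Exact equality fails, but a tolerance-based comparison succeeds.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True
```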
While modern computers are quite capable of doing math with actual decimal numbers, lots of languages don't support that very well and still use binary floating point, because it's fast and generally good enough for most things.
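Python's decimal module is one example of such decimal arithmetic support; it trades speed for exact base-10 behavior:

```python
from decimal import Decimal

# Construct from strings so the values start out as exact base-10 numbers.
print(Decimal("0.1") + Decimal("0.2"))                    # 0.3
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```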
