Your question is “Why does this happen?”,
but your implicit question (which others have addressed) is “How can I fix this?”
You figured out an approach, which you raised in a comment:
“So if I multiply it to 1000 to eliminate the point, I can get the exact result, can’t I?”
Yes. Well, 10000, since you have four decimal places. Consider this:
awk '{ s+=$1*10000; print $1, s/10000 }'
Unfortunately, this doesn’t work, because the rounding error has already been introduced
as soon as awk converts the token (a string) to a binary floating-point number.
For example, printf "%.20f\n" shows that the input value 0.4157
is actually stored as 0.41570000000000001394.
In this case, multiplying by 10000 gets you what you would expect: 4157.
But, for example, 0.5973
= 0.59730000000000005311,
and multiplying that by 10000 yields 5973.00000000000090949470.
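If you want to see these stored values for yourself, here is a small sketch. It assumes your awk keeps numbers as IEEE 754 doubles (the usual case), and show_double is just a name I made up for the illustration:

```shell
# show_double is a hypothetical helper: it prints each input token next to
# the value awk actually stores for it, to 20 decimal places.
show_double() {
    awk '{ printf "%s -> %.20f\n", $1, $1 }'
}

# The three sample values discussed in this answer.
printf '0.4157\n0.5973\n0.7130\n' | show_double
```

The exact trailing digits can differ between awk implementations and platforms, so treat the output as a diagnostic rather than something to rely on.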
So instead we try
awk '{ s+=int($1*10000); print $1, s/10000 }'
to convert the numbers that “should be” whole numbers (e.g., 5973.00000000000090949470)
into the corresponding whole numbers (5973).
But that fails because sometimes the conversion error is negative;
e.g., 0.7130 is 0.71299999999999996714.
And awk’s int(expr) function truncates (toward zero)
rather than rounding, so int(7129.99999999) is 7129.
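A quick check of the truncation, for both signs, might look like this (the literal values are just the example from this answer):

```shell
# int() drops the fractional part, so both values move toward zero.
awk 'BEGIN { print int(7129.99999999), int(-7129.99999999) }'
# prints: 7129 -7129
```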
So, when life gives you lemons, you make lemonade.
And when a tool gives you a truncate function, you round by adding 0.5.
7129.99999999+0.5≈7130.49999999, and, of course, int(7130.49999999)
is 7130.
But remember: int() truncates toward zero, and your input includes negative numbers.
If you want to round –7129.99999999 to –7130,
you need to subtract 0.5 to get –7130.49999999.
So,
awk '{ s+=int($1*10000+($1>0?0.5:-0.5)); print $1, s/10000 }'
which adds –0.5 (rather than 0.5) to $1*10000 if $1 is ≤ 0.
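To try the rounding approach on a few values, something like the following sketch should do. The function name round_sum and the sample input are my own, chosen so that a positive and a negative value cancel; it assumes four decimal places, as in your data:

```shell
# round_sum is a hypothetical helper name; it reads one number per line.
round_sum() {
    awk '{
        # Scale to a whole number of ten-thousandths, rounding half away
        # from zero, so the running sum s only ever holds integers.
        s += int($1 * 10000 + ($1 > 0 ? 0.5 : -0.5))
        print $1, s / 10000
    }'
}

# Sample input: 0.7130 and -0.7130 should cancel, leaving 0.5973.
printf '0.7130\n-0.7130\n0.5973\n' | round_sum
```

Because s holds only whole numbers, the additions are exact as long as the scaled total stays well within the range a double can represent exactly; the only inexact step left is the final division by 10000 for display.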