
I need to test whether a string is a number, which could be a double or an integer. I wrote a simple skeleton that parses a string like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[])
{

char* to_convert = argv[1];
char* p = NULL;
errno = 0;
long val = strtol(argv[1], &p, 0);

if (errno != 0)
return 1;// conversion failed (EINVAL, ERANGE)

if (to_convert == p){
// conversion to int failed (no characters consumed)
double val2 = strtod(p, &p);
if (*p){
printf("Not a number!\n");
return 1;
}
printf("Double %lf\n" , val2);
return 0;
}

if (*p != 0){
// conversion to int failed (trailing data)
double val2 = strtod(argv[1], &p);
if (*p){
printf("Not a number!\n");
return 1;
}
printf("Double %lf\n" , val2);
return 0;
}

printf("Int %ld\n" , val);
return 0;
}

/*
$ ./a.out 123
Int 123
$ ./a.out -123
Int -123
$ ./a.out 31.1
Double 31.100000
$ ./a.out -231
Int -231
$ ./a.out NaN
Double nan
$ ./a.out -Nan
Double nan
$ ./a.out -INF
Double -inf
$ ./a.out INF
Double inf
$ ./a.out 12.3
Double 12.300000
$ ./a.out -12,3  # Although some European locales (like German) would consider this a number
Not a number!
$ ./a.out -12.3
Double -12.300000
$ ./a.out foo
Not a number!
*/ 

I would like to know if this could be done better, without iterating on all the characters like here.

  1. Is there any case I didn't think of that breaks this code?
  2. Or one that causes undefined behaviour (other than numbers larger than long on my architecture)?

Another issue is the decimal separator. How do you convert a number where the decimal separator is a comma instead of a point, as in German? This would be nice to have, but it is not a must.

For example, my code says 12,3 is not a number, but for a German reader it is one (12.3 for English speakers).

Internationalization is a broad topic with many pitfalls. Is 十二 a number? Is 12,345 to be interpreted as (12 + 345/1000) or as (12 * 1000 + 345)? A fully general solution is very complex. –  200_success Dec 4 '14 at 8:35
    
You can achieve this by using if (sscanf(data, "%*lf%n", &count) == 1 && data[count] == '\0') { /* It's a number */ }. Also, scanf uses the current locale for number scanning. –  Loki Astari Dec 4 '14 at 19:12
    
@LokiAstari, does your suggestion differentiate between float and integer? Can you explain what "%*lf%n" means? –  Oz123 Dec 8 '14 at 22:08
    
@Oz123: The bit in the comment does not. But it is easy to expand the above code so that you can detect integers or floating-point numbers (in a locale-aware way). "%lf" is a long float, i.e. a double. "%*lf" is the same, but the star means don't store. "%n" means count the number of characters that have been read from the input. –  Loki Astari Dec 8 '14 at 22:16
    
OK, thank you. I like how compact C can sometimes be, but for readability, it can be really confusing sometimes. –  Oz123 Dec 8 '14 at 22:19
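
The sscanf() suggestion from the comments can be sketched roughly as follows. One caveat: a conversion with assignment suppressed ("%*lf") is not counted in sscanf()'s return value, so this sketch stores the value instead and checks for a return of 1. The helper name is_number is hypothetical, not something from the comments.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the whole string parses as a
 * floating-point number in the current locale, 0 otherwise. */
int is_number(const char *data)
{
    double value;
    int count = 0;

    /* "%lf" stores a double; "%n" stores the number of characters consumed
     * so far and is not counted in the return value. */
    if (sscanf(data, "%lf%n", &value, &count) != 1)
        return 0;               /* nothing numeric at the start (or empty input) */

    /* Accept only if the conversion consumed the entire string. */
    return data[count] == '\0';
}
```

Like the original comment, this does not distinguish integers from floating-point values, and it rejects trailing white-space.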

1 Answer

The int test and the double test are both incomplete.

errno = 0;
long val = strtol(argv[1], &p, 0);
if (errno) Fail();         // Range error (or possible other reasons)
if (argv[1] == p) Fail();  // no conversion
while (isspace((unsigned char) *p)) p++;
if (*p) Fail();            // Extra junk afterwards

if (*p) Fail(); needs to be applied to both the integer and the FP test.

while (isspace((unsigned char) *p)) p++; allows trailing white-space. Both strtol() and strtod() allow leading white-space.
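
Put together, the integer checks above could look something like this. It is only a sketch: parse_long and its return convention (1 on success, 0 on failure) are names I made up for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses s as a long (base auto-detected by strtol).
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_long(const char *s, long *out)
{
    char *end = NULL;

    errno = 0;
    long val = strtol(s, &end, 0);
    if (errno != 0)
        return 0;               /* range error (or possibly another failure) */
    if (end == s)
        return 0;               /* no conversion performed */
    while (isspace((unsigned char) *end))
        end++;                  /* allow trailing white-space */
    if (*end != '\0')
        return 0;               /* extra junk afterwards */

    *out = val;
    return 1;
}
```

With this, "  -42  " and "0x1F" are accepted, while "123xyz", "08" and the empty string are rejected.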

Double is more complex: in the case of underflow, as with "1e-10000", errno may be set to ERANGE, and in many uses that is not considered an error. The returned value will be either +/-0.0 or a small double value.

errno = 0;
double val2 = strtod(argv[1], &p);
if (errno == ERANGE && fabs(val2) <= DBL_MIN) NoFail();  // underflow: accept the tiny result
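
As a fuller sketch of the same idea, the following tolerates underflow but still rejects overflow, empty input and trailing junk. The name parse_double and the 1/0 return convention are assumptions for illustration only.

```c
#include <ctype.h>
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: parses s as a double.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_double(const char *s, double *out)
{
    char *end = NULL;

    errno = 0;
    double val = strtod(s, &end);
    if (end == s)
        return 0;               /* no conversion performed */
    if (errno == ERANGE && fabs(val) > DBL_MIN)
        return 0;               /* overflow: the result is +/-HUGE_VAL */
    /* errno == ERANGE with a tiny result is underflow, accepted here */
    while (isspace((unsigned char) *end))
        end++;                  /* allow trailing white-space */
    if (*end != '\0')
        return 0;               /* extra junk afterwards */

    *out = val;
    return 1;
}
```

Whether underflow should be accepted is a policy decision; an application that cares about the lost precision could treat it as a failure as well.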

Keep in mind that a string may translate nicely to a long as well as to a double. Should "-0" be a long 0 or a double -0.0? Aside from this case, I would favor long, especially if the test for extra junk is used.
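
One way to express the "favor long" policy is to try strtol() first and fall back to strtod() only when the integer parse does not cover the whole string. This is a sketch under that assumption; the enum, classify and only_space_left are hypothetical names.

```c
#include <ctype.h>
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

enum num_kind { NOT_A_NUMBER, IS_LONG, IS_DOUBLE };  /* hypothetical names */

/* Returns 1 if only white-space remains from p to the end of the string. */
int only_space_left(const char *p)
{
    while (isspace((unsigned char) *p))
        p++;
    return *p == '\0';
}

/* Classifies s, storing the value in *l or *d depending on the result. */
enum num_kind classify(const char *s, long *l, double *d)
{
    char *end = NULL;

    /* Favor long: anything that parses completely as a long is reported as one. */
    errno = 0;
    long lv = strtol(s, &end, 0);
    if (errno == 0 && end != s && only_space_left(end)) {
        *l = lv;
        return IS_LONG;
    }

    /* Otherwise try double; tolerate underflow, reject overflow. */
    errno = 0;
    double dv = strtod(s, &end);
    if (end == s || !only_space_left(end))
        return NOT_A_NUMBER;
    if (errno == ERANGE && fabs(dv) > DBL_MIN)
        return NOT_A_NUMBER;
    *d = dv;
    return IS_DOUBLE;
}
```

Under this policy "-0" is reported as the long 0, and the sign of zero survives only for inputs such as "-0.0" that have to go through strtod().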

Your test cases do not include big values "123...thousand_more...456", values around LONG_MIN (-1, +0, +1) and LONG_MAX (-1, +0, +1), DBL_MAX, nextafter(DBL_MAX, 0), strings with trailing garbage such as "123xyz", or -0.0. BTW: a harsh check is to use (char) -1 in various places in the string, or to lead the string with (char) (0x80 + ' ').

Nor do they test odd things like "++123", "0x0xABC", "08" (multiple signs, a doubled prefix, octal with the digit 8) and the newly emerging "0b0101" (binary constants, standardized in C23).

Extreme FP strings include "0.000...0001e320", "" and " " (white-space only strings.)

FP also has a hexadecimal format, see scanf("%a")/printf("%a") for additional strings.
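
To see how the library treats such strings, it helps to look at how many characters each function actually consumes. The two helpers below are hypothetical and only report that count; they make no decision about validity.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: number of characters strtol() (base 0) consumes from s. */
size_t strtol_consumed(const char *s)
{
    char *end = NULL;
    (void) strtol(s, &end, 0);
    return (size_t) (end - s);
}

/* Hypothetical helper: number of characters strtod() consumes from s. */
size_t strtod_consumed(const char *s)
{
    char *end = NULL;
    (void) strtod(s, &end);
    return (size_t) (end - s);
}
```

In the "C" locale, strtol() consumes nothing from "++123", only the leading "0" of "08", and only "0x0" of "0x0xABC", which is why the check for leftover characters matters. strtod() consumes all of a hexadecimal string such as "0x1.8p1" (the value 3.0).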

The original request was for "double or integer". The code conveniently uses long, but what if int was meant, or intmax_t? An example of an additional test for int:

long val = strtol(argv[1], &p, 0);
if (val < INT_MIN || val > INT_MAX) RangeError();
int i = (int) val;
...
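
As a complete function, the int variant might look like the sketch below. parse_int is a hypothetical name, and trailing white-space is not tolerated here to keep it short.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses s as an int via strtol().
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(const char *s, int *out)
{
    char *end = NULL;

    errno = 0;
    long val = strtol(s, &end, 0);
    if (errno != 0 || end == s || *end != '\0')
        return 0;               /* long range error, no digits, or trailing junk */
    if (val < INT_MIN || val > INT_MAX)
        return 0;               /* fits in a long but not in an int */

    *out = (int) val;
    return 1;
}
```

For intmax_t the same pattern applies with strtoimax() from <inttypes.h>, in which case the extra range check is unnecessary.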

To cope with the decimal separator being , or ., use setlocale(), which adjusts the behavior of strtol() and strtod(). But locale handling is a weak aspect of C.
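
A minimal sketch of the setlocale() approach follows. The helper parse_in_locale is hypothetical, and locale names such as "de_DE.UTF-8" are system-dependent assumptions: the locale must be installed, otherwise setlocale() returns NULL and the helper reports -1.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses s as a double under the named LC_NUMERIC locale.
 * Returns 1 on success, 0 if s is not a number in that locale,
 * and -1 if the locale could not be selected. */
int parse_in_locale(const char *locale_name, const char *s, double *out)
{
    /* Copy the current locale name; the string returned by setlocale()
     * may be overwritten by later calls. */
    char saved[128];
    const char *current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return -1;              /* locale not available on this system */

    char *end = NULL;
    double val = strtod(s, &end);
    int ok = (end != s && *end == '\0');

    setlocale(LC_NUMERIC, saved);   /* restore the previous locale */
    if (ok)
        *out = val;
    return ok;
}
```

On a system where the German locale is installed, parse_in_locale("de_DE.UTF-8", "12,5", &d) should succeed, whereas the same string is rejected under the "C" locale. Note that setlocale() changes process-wide state, so this is not safe to call from several threads at once.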

Lastly, a given locale could have "additional subject sequences" that convert into numbers.
