This looks so simple but it isn't:

[[ "1234+5678" =~ [0-9]+(\s*(\-|\*)\s*[0-9]+)* ]] && echo $?

returns 0. However, it should not, since only the minus (-) and multiplication (*) operators are allowed. I also tried this pattern in an online regex tool: the result was the null string, as expected.

In prose, this extended regex reads:

  • look for number (mandatory)
  • check if there is some white space
  • operator must be either a - or a *
  • check if there is some white space (again)
  • look for another number (must be there if preceded by an operator)

Also, the asterisk following the expression in parentheses says that the 2nd to nth operator-number pair is optional.

Where is my mistake in thinking here?


3 Answers

Without anchors, the regexp (the right-hand side) can match any part of the string, so your variant matches 1234. To meet your requirements you have to use anchors:

[[ "1234+5678" =~ ^[0-9]+(\s*(\-|\*)\s*[0-9]+)*$ ]] ; echo $?
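To see which part of the string the unanchored pattern actually matches, one can inspect BASH_REMATCH. This is only a sketch: the variable names are made up for illustration, and a literal space is used in place of \s.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; the patterns mirror the ones discussed above,
# with a literal space instead of \s.
unanchored='[0-9]+( *(-|\*) *[0-9]+)*'
anchored="^${unanchored}\$"

input='1234+5678'

if [[ $input =~ $unanchored ]]; then
    # BASH_REMATCH[0] holds the part of the string that matched.
    unanchored_match=${BASH_REMATCH[0]}
    echo "unanchored matched: '${unanchored_match}'"
fi

if [[ $input =~ $anchored ]]; then
    anchored_result='match'
else
    anchored_result='no match'
fi
echo "anchored: ${anchored_result}"
```

Keeping the pattern in a variable and expanding it unquoted on the right of =~ also sidesteps most of the shell quoting questions around the parentheses and the backslash.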

And shorter (if you'd like):

[[ "1234+5678" =~ ^[0-9\ *-]+$ ]] ; echo $?
    
SO simple? Just putting the ^ at the beginning did the trick. Thank you very much! (also for not inverting the logic for no reason ;)) Again: single numbers must work as well as numbers with addends. So it is not as simple as replacing * by +. –  syntaxerror Jun 23 at 21:09
    
the shorter expression also matches 123-*5* –  ikrabbe Jun 23 at 21:12
    
Note that \s* stands for various kinds of whitespace, e.g. tabs as well. Your shorter solution blindly assumes that whitespace = space, which is a little naughty ;) Plus, whatever you write into [ ] is arbitrary in order, whilst my expression dictates a definite order - for a reason! Your expression would even allow a * or - as the first character, which is plainly wrong, as they are arithmetic operators and must be preceded by a number. Shorter is not always better, as it may unintentionally alter the logic. –  syntaxerror Jun 23 at 21:15
    
@ikrabbe Yes, but whether to use a given variant (or a modification of it) depends on the data being processed and on the input you expect from users. –  Costas Jun 23 at 21:21
    
@syntaxerror I did say: if you'd like. Certainly, if you expect data with a * or - as the first character and so on, you are free to choose the first variant, but in most cases the shorter one is enough. –  Costas Jun 23 at 21:25
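The difference between the two variants debated in these comments can be checked directly. The following is a rough sketch, not anyone's posted solution: the names strict, loose and check are invented here, and the loose pattern is the character-class idea written with a single + quantifier.

```shell
#!/usr/bin/env bash
# Hypothetical names; "strict" enforces number/operator order,
# "loose" only restricts which characters may appear.
strict='^[0-9]+( *(-|\*) *[0-9]+)*$'
loose='^[0-9 *-]+$'

# check PATTERN STRING: prints "yes" if STRING matches PATTERN, else "no".
check() {
    if [[ $2 =~ $1 ]]; then echo yes; else echo no; fi
}

for s in '1234' '12 * 34' '123-*5*' '*12'; do
    printf '%-10s strict=%s loose=%s\n' "$s" \
        "$(check "$strict" "$s")" "$(check "$loose" "$s")"
done
```

Only the strict pattern rejects inputs such as 123-*5* or a leading operator; the loose one accepts any mix of the allowed characters.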

Your expression returns 0 because it matches the first number. Anchor the regexp to make it do what you want:

[[ "1234+5678" =~ ^[0-9]+( *(-|\*) *[0-9]+)*$ ]] && echo $?

On a side note: (1) you don't need to quote -, and (2) \s is not part of POSIX ERE (some regex libraries accept it as an extension); the portable spelling is [[:space:]].
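If tabs and other whitespace should be accepted between numbers and operators, the POSIX character class [[:space:]] can stand in for \s. A minimal sketch, with illustrative variable names that are not from the question:

```shell
#!/usr/bin/env bash
# Hypothetical variable names; [[:space:]] is the POSIX class for whitespace.
re='^[0-9]+([[:space:]]*(-|\*)[[:space:]]*[0-9]+)*$'

tabbed=$'12\t*\t34'      # operands separated by tabs
if [[ $tabbed =~ $re ]]; then tab_result='match'; else tab_result='no match'; fi
echo "tab-separated: ${tab_result}"

plus_input='1234+5678'   # + is not an allowed operator
if [[ $plus_input =~ $re ]]; then plus_result='match'; else plus_result='no match'; fi
echo "with plus sign: ${plus_result}"
```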

    
Thanks, also for the info about \s. Well, I wanted to support both tabulators (et al.) and "plain" spaces, hence I used \s on purpose. However, I wasn't aware that this is not part of ERE. Most peculiar... –  syntaxerror Jun 23 at 21:24
It isn't peculiar: \s and friends come from Perl's lineage of regexps, not the much older BRE / ERE. –  lcd047 Jun 24 at 3:25

[[ "1234+5678" =~ [0-9]+(\s*(\-|\*)\s*[0-9]+) ]] ; echo $?

The * regex operator at the end of your regex allows any number of repeats of the expression inside the (), including none. So you match [0-9]+ from the start and the rest is optional.

If you remove the * or replace it with a +, the expression returns the expected result.

Also, the asterisk following the expression in parentheses says that the 2nd to nth operator-number pair is optional.

That is the mistake: The * means that every operator-number pair is optional. Use the + operator!
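What the + quantifier does to a few sample inputs can be tried out as follows. This is a sketch only: the names plus and try are invented, a literal space replaces \s, and the pattern is left unanchored as in the line above.

```shell
#!/usr/bin/env bash
# Hypothetical names; the group must now occur at least once.
plus='[0-9]+( *(-|\*) *[0-9]+)+'

# try STRING: prints the matched part, or "no match".
try() {
    if [[ $1 =~ $plus ]]; then
        echo "matched '${BASH_REMATCH[0]}'"
    else
        echo 'no match'
    fi
}

try '1234+5678'   # contains no - or * pair at all
try '1234'        # a standalone number
try '1234*5678'
try '1+2*3'       # the unanchored pattern can still match a substring
```

With +, a standalone number no longer matches, and without anchors a string containing a disallowed operator can still match on a later substring.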

Ok, that was not the answer, but you can consume the expressions and see if the result is empty:

echo "1234*5678"|sed 's/[0-9]\+\([ \*\-][0-9]\+\)*//'
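The consume-and-check idea could be wrapped in a small function along these lines. This is a sketch under assumptions: is_expr is a made-up name, sed -E is assumed to be available (GNU and BSD sed both accept it), and the pattern is anchored at the start so that only a leading expression is consumed.

```shell
#!/usr/bin/env bash
# is_expr STRING: succeeds if STRING is fully consumed by the pattern,
# i.e. nothing is left over after sed removes the leading expression.
# Hypothetical helper, assuming a sed that supports -E (extended regexps).
is_expr() {
    local rest
    rest=$(printf '%s\n' "$1" | sed -E 's/^[0-9]+( *[*-] *[0-9]+)*//')
    # An empty input also leaves nothing behind, so reject it explicitly.
    [ -n "$1" ] && [ -z "$rest" ]
}

for s in '1234' '1234*5678' '1234+5678'; do
    if is_expr "$s"; then echo "$s: ok"; else echo "$s: rejected"; fi
done
```

For 1234+5678 the substitution removes only 1234 and leaves +5678 behind, which is why the string is rejected.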
    
Thanks, but it's meant to be that way. :) It should match on "1234" as well as "2345*6789*4321", or in prose, on single numbers as well as on "formulae" (i.e. additions in this case). However, the problem is that it also matches on additions + and divisions /, even though I'm only allowing subtractions and multiplications in this sample one-liner. Your answer will also exclude standalone numbers, as you no longer allow the addend to be omitted. –  syntaxerror Jun 23 at 20:52
    
Just add an end anchor ($) and you are set: [[ "1234+5678" =~ [0-9]+(\s*(\-|\*)\s*[0-9]+)*$ ]] || echo $? –  user1794469 Jun 23 at 20:52
    
@user1794469 No. Doesn't help me at all. By replacing && by ||, you've just inverted the logic. Coolish. But that's totally pointless, as I need the result of the expression itself, not the result if the expression is false. –  syntaxerror Jun 23 at 20:54
    
I thought so, but it doesn't work. I inverted the logic just to get some output from an expression that is true. –  ikrabbe Jun 23 at 20:56
    
@ikrabbe It definitely doesn't. :) So I now get it, you were the initiator of inverting the logic, and user179... just jumped on the bandwagon. LOL. –  syntaxerror Jun 23 at 20:57
