Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top
  • I want to first search for a specific regular expression.
  • If it is not found then I would like to search for another regular expression
  • If that one is also not found then I would like to search for a third
  • (And so on...)

Whenever a matching regular expression is found, I want to return a String array containing:

  • The text before the match
  • The match itself
  • The text after the match

I've only come up with this horribly if-else nesting way of doing it. There's gotta be a better (more extensible and more clean) way but I can't come up with how...

Here is the code, it is currently only containing two regular expressions to search for and it's getting nasty already:

private static String[] findNumbers(String string) {
    Matcher match = Pattern.compile("\\d+[\\:\\.\\,]\\d+").matcher(string);
    if (match.find()) {
        String found = string.substring(match.start(), match.end());
        return new String[]{ string.substring(0, match.start()),
            found, string.substring(match.end()) };
    }
    else {
        match = Pattern.compile("\\d{3,}").matcher(string);
        if (match.find()) {
            String found = string.substring(match.start(), match.end());
            return new String[]{ string.substring(0, match.start()),
                found, string.substring(match.end()) };
        }
        else {
            // Another if-else could be added here
            return new String[]{ string, "", "" };
        }
    }
}

@Test
public void test() {
    assertArrayEquals(new String[]{ "ABC ", "2000", " DEF" }, findNumbers("ABC 2000 DEF"));
    assertArrayEquals(new String[]{ "42 ABC ", "2000", " DEF" }, findNumbers("42 ABC 2000 DEF"));
    assertArrayEquals(new String[]{ "42 ABC ", "18:47", " DEF" }, findNumbers("42 ABC 18:47 DEF"));
}

I want to put priority to one regular expression before trying the next, that's why I have written this code.

share|improve this question
up vote 10 down vote accepted

Nesting is not necessary, you can iterate instead...

But first:

  • naming the input String string can be confusing. Of course it's a string, what is it used for though?
  • there is no need to escape the values inside the [...] construct. the pattern: "\\d+[:.,]\\d+" is just fine, and more readable.
  • you should be pre-compiling and then reusing the patterns, instead of re-compiling them every time.

Consider the following:

private static final Pattern[] patternorder = { 
        Pattern.compile("\\d+[:.,]\\d+"),
        Pattern.compile("\\d{3,}")
};

private static String[] findNumbers(String input) {
    for (Pattern pat : patternorder) {
        Matcher match = pat.matcher(input);
        if (match.find()) {
            return new String[] {
                input.substring(0, match.start()),
                input.substring(match.start(), match.end()),
                input.substring(match.end()),
            };
        }
    }
    return new String[]{input, "", ""};
}
share|improve this answer
    
Ah, apparently . don't match every character when inside []! – Simon Forsberg Jan 10 '14 at 20:40

You could use the (expression1|expression2|expression3) syntax. It makes for a long string, but you won't have to nest.

share|improve this answer
4  
Good instinct, but it wouldn't behave the same way as the original. Your solution would produce the leftmost longest match. The original falls back only when the first expression fails to match at all. – 200_success Jan 10 '14 at 20:24
    
Any developer worth their salt should be able to write a regex that looks like this: "^(.*)(matchthis|matchthat|matchsomethingelse)(.*)$". It's not that hard at all. I was merely suggesting that the OP employ the use of the grouping construct in the middle to prevent the need for iteration, BUT without deconstructing his code to write an exact working answer FOR him. – codenoir Jan 13 '14 at 15:17
1  
@codenoire the problem is, in description OP states: matchtis-> remove matches; matchthat -> remove matches; matchsomethingelse -> removematches. and your approach does not fulfil the requirement that the code outputs the same results as before... This makes your answer incorrect. either correct your answer or delete it, please.. – Vogel612 Mar 28 '14 at 12:46
1  
I've edited the question part to make it look more like a real answer. You should add a bit of explanation. – Marc-Andre Mar 28 '14 at 13:10

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.