I'm stuck trying to come up with a regular expression to split strings with the following properties:
- Delimited by the | (pipe) character
- If an individual value contains a pipe, the pipe is escaped with \ (backslash)
- If an individual value ends with a backslash, that backslash is escaped with another backslash
So for example, here are some strings that I want to break up:
One|Two|Three
should yield: ["One", "Two", "Three"]
One\|Two\|Three
should yield: ["One|Two|Three"]
One\\|Two\|Three
should yield: ["One\", "Two|Three"]
Now how could I split this up with a single regex?
UPDATE: As many of you already suggested, this is not a good application of regex. Also, the regex solution is orders of magnitude slower than just iterating over the characters. I ended up iterating over the characters:
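    import java.text.CharacterIterator;
    import java.text.StringCharacterIterator;
    import java.util.ArrayList;
    import java.util.List;

    public static List<String> splitValues(String val) {
        final List<String> list = new ArrayList<String>();
        boolean esc = false; // true when the previous character was an unescaped backslash
        final StringBuilder sb = new StringBuilder(1024);
        final CharacterIterator it = new StringCharacterIterator(val);
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            if (esc) {
                // Escaped character: take it literally, whatever it is.
                sb.append(c);
                esc = false;
            } else if (c == '\\') {
                // Start of an escape sequence; the backslash itself is dropped.
                esc = true;
            } else if (c == '|') {
                // Unescaped pipe: the current value is complete.
                list.add(sb.toString());
                sb.setLength(0);
            } else {
                sb.append(c);
            }
        }
        // Flush the last value; note that an empty trailing value is not added.
        if (sb.length() > 0) {
            list.add(sb.toString());
        }
        return list;
    }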
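For anyone who wants to try it, here is a small self-contained sketch that runs the method against the three examples from the question. The class name `SplitDemo` and the `main` method are just placeholders I made up for this check; the method body repeats the same logic as above so the snippet compiles on its own.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class, only here so the example compiles and runs.
public class SplitDemo {

    // Same character-by-character logic as splitValues above.
    public static List<String> splitValues(String val) {
        final List<String> list = new ArrayList<String>();
        boolean esc = false;
        final StringBuilder sb = new StringBuilder();
        final CharacterIterator it = new StringCharacterIterator(val);
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            if (esc) {
                sb.append(c);          // escaped character is taken literally
                esc = false;
            } else if (c == '\\') {
                esc = true;            // backslash starts an escape
            } else if (c == '|') {
                list.add(sb.toString()); // unescaped pipe ends the value
                sb.setLength(0);
            } else {
                sb.append(c);
            }
        }
        if (sb.length() > 0) {
            list.add(sb.toString());
        }
        return list;
    }

    public static void main(String[] args) {
        // Backslashes are doubled here because these are Java string literals.
        System.out.println(splitValues("One|Two|Three"));        // [One, Two, Three]
        System.out.println(splitValues("One\\|Two\\|Three"));    // [One|Two|Three]
        System.out.println(splitValues("One\\\\|Two\\|Three"));  // [One\, Two|Three]
    }
}
```

One thing to be aware of: because of the final `sb.length() > 0` check, an empty value at the end of the input is dropped, so `One|` yields `["One"]` rather than `["One", ""]`.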