0

I'm importing a file with umpteen lines of "##,##". Each number can be one or two digits.

I'd like to use String.split(regex) to get the two numbers without the adjacent quote marks.

Understanding that I could nibble off the first and last character and use a non-regex split, I'm hoping that there is a regular expression that will make this more graceful.

Suggestions?

EDIT:

In: "12,3"  
Out: 12  
      3
2
  • 1
    What do you mean by "non-regex split"? Also, can you provide an input/output example? Should "12,34" become 12,34 or 12 and 34?
    – jlordo
    Commented Jul 23, 2013 at 13:03
  • If I use String.split(",") I get the two halves. Each has a single quote mark on it... ok - not truly "non-regex" but not really using the strength of regex...
    – ethrbunny
    Commented Jul 23, 2013 at 13:04

3 Answers

7

How about using the regexp "(\d+),(\d+)"? Use Pattern.matcher(input) instead of String.split, and obtain the two numbers with Matcher.group(int).

Consider the following snippet:

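import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "\"1,31\"";  // the text "1,31", including its quote marks

// One capturing group per number; the quotes are matched but not captured.
Pattern pattern = Pattern.compile("\"(\\d+),(\\d+)\"");
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {  // true only if the whole line fits the pattern
    int firstNumber = Integer.parseInt(matcher.group(1));
    int secondNumber = Integer.parseInt(matcher.group(2));
    // do whatever with the numbers
}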
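
Since the question mentions a file with many such lines, here is a minimal sketch of the same Matcher idea applied to a list of lines. The class name QuotedPairParser and the method parseAll are hypothetical, and the pattern is narrowed to one or two digits per number on the assumption that the question's description of the input holds. Lines that do not match are silently skipped, which may or may not be what you want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedPairParser {
    // Compiled once and reused for every line.
    // Assumption: each number has one or two digits, as the question states.
    private static final Pattern PAIR =
            Pattern.compile("\"(\\d{1,2}),(\\d{1,2})\"");

    // Returns one int[2] per well-formed line; other lines are skipped.
    public static List<int[]> parseAll(List<String> lines) {
        List<int[]> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = PAIR.matcher(line.trim());
            if (m.matches()) {
                result.add(new int[] {
                    Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2))
                });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("\"12,3\"", "\"1,31\"", "garbage");
        for (int[] pair : parseAll(lines)) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```

Compiling the pattern once avoids recompiling it for each line, which matters more as the file grows.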
2

You can remove all double-quote characters from each line, then split the string on the comma:

String toSplit = "\"##,##\"";
String[] splitted = toSplit.replaceAll("\"", "").split(",");

The escaped \" characters in toSplit simulate the quoted "##,##" line.

0

You could split at the quotes as well, but the leading quote produces an empty first element (and the trailing quote an empty last one if you pass a negative limit), so the result still needs cleanup. Unfortunately, there's no way to split a string and remove other characters from it in a single String#split call.
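
To see what splitting on both the quote and the comma actually yields, here is a small sketch. The class and method names are hypothetical; the only library call is String.split, with and without a negative limit.

```java
import java.util.Arrays;

public class SplitOnQuotesDemo {
    // Splits on every quote or comma, using String.split's default limit,
    // which drops trailing empty strings but keeps a leading one.
    public static String[] splitDefault(String line) {
        return line.split("[\",]");
    }

    // Same regex with a negative limit, so trailing empty strings are kept.
    public static String[] splitKeepTrailing(String line) {
        return line.split("[\",]", -1);
    }

    public static void main(String[] args) {
        String line = "\"12,3\"";
        System.out.println(Arrays.toString(splitDefault(line)));      // [, 12, 3]
        System.out.println(Arrays.toString(splitKeepTrailing(line))); // [, 12, 3, ]
    }
}
```

Either way the numbers are not at indices 0 and 1, so the caller has to skip the empty element.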

As an alternative, you could use Apache's StringUtils:

String[] n = StringUtils.removeStart( StringUtils.removeEnd( "\"##,##\"", "\""), "\"").split(",");

Edit: as a side note, using StringUtils would allow for missing quotes at the start or end of the input string. If you're sure they're always present, a simple substring(...) might be sufficient. (credits go to @Ingo)

5
  • You could split the substring 1 to length-1
    – Ingo
    Commented Jul 23, 2013 at 13:00
  • @Ingo which substring do you mean? Could you elaborate?
    – Thomas
    Commented Jul 23, 2013 at 13:02
  • it should be obvious, shouldn't it? If I can't split "xx,xx" because of the quotes, I can split the substring xx,xx
    – Ingo
    Commented Jul 23, 2013 at 13:08
  • @Ingo now I understand your sentence. That's obvious, you're right. But with respect to my answer, I didn't understand your comment, since I meant to clearly state that it's not possible with one call of split alone - you'd need at least a substring or whatever to remove the quotes first ;)
    – Thomas
    Commented Jul 23, 2013 at 13:12
  • I did mean it as a supplement to your answer - one way of getting rid of the enclosing characters. Substring should be cheaper than all the solutions that use replaceAll, etc.
    – Ingo
    Commented Jul 23, 2013 at 13:29
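
The substring idea from the comments above could look roughly like this. It is a sketch that assumes every line starts and ends with a double quote and has no surrounding whitespace; the class and method names are hypothetical.

```java
public class SubstringSplit {
    // Drops the first and last character (assumed to be the quotes),
    // then splits what remains on the comma.
    public static String[] splitPair(String line) {
        return line.substring(1, line.length() - 1).split(",");
    }

    public static void main(String[] args) {
        String[] parts = splitPair("\"12,3\"");
        System.out.println(parts[0]); // 12
        System.out.println(parts[1]); // 3
    }
}
```

Unlike the StringUtils variant, this does not check that the quotes are really there, so a line without them would lose its first and last digit.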
