Is there a simple solution to parse a String by using regex in Java?

I have to adapt an HTML page. To do that, I have to rewrite several strings, e.g.:

href="/browse/PJBUGS-911"
=>
href="PJBUGS-911.html"

The strings differ only in the ID (e.g. 911). My first idea looks like this:

String input = "";
String output = input.replaceAll("href=\"/browse/PJBUGS\\-[0-9]*\"", "href=\"PJBUGS-???.html\"");

I want to replace everything except the ID. How can I do this?

Would be nice if someone can help me :)


3 Answers

up vote 3 down vote accepted

You can capture the substrings matched by parts of your pattern by wrapping those parts in parentheses. You can then refer to each captured group in the replacement string as $n, where n is the number of the group (counting opening parentheses from left to right). For your example:

String output = input.replaceAll("href=\"/browse/PJBUGS-([0-9]*)\"", "href=\"PJBUGS-$1.html\"");

Or if you want:

String output = input.replaceAll("href=\"/browse/(PJBUGS-[0-9]*)\"", "href=\"$1.html\"");
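To see the capture group at work on more than one link, here is a minimal, self-contained sketch. The class name HrefRewriteDemo, the method rewriteLinks and the sample HTML are made up for illustration. It also uses [0-9]+ instead of [0-9]*, on the assumption that every ID has at least one digit.

```java
public class HrefRewriteDemo {

    // Hypothetical helper: captures only the digits of the ID as group 1
    // and reuses them in the replacement via $1.
    // Assumption: every ID has at least one digit, hence + rather than *.
    static String rewriteLinks(String html) {
        return html.replaceAll(
                "href=\"/browse/PJBUGS-([0-9]+)\"",
                "href=\"PJBUGS-$1.html\"");
    }

    public static void main(String[] args) {
        // Made-up sample input containing two links
        String html = "<a href=\"/browse/PJBUGS-911\">911</a> "
                + "<a href=\"/browse/PJBUGS-42\">42</a>";

        System.out.println(rewriteLinks(html));
    }
}
```

Text that does not match the pattern, such as the link labels here, is left untouched; only the matched href attributes are rewritten.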
Thank you for your very quick answer and solution. Works fine :-) –  erwingun2010 Dec 3 '12 at 19:31

This is how I would do it:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class HrefDemo
    {
        public static void main(String[] args)
        {
            String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\" " +
                    "blahblah href=\"/browse/PJBUGS-34234\"";

            // Group 1 captures the issue key, e.g. PJBUGS-911
            Pattern ptrn = Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+)\"");

            Matcher mtchr = ptrn.matcher(text);

            while (mtchr.find())
            {
                String match = mtchr.group(0);    // the whole matched attribute
                String insMatch = mtchr.group(1); // the captured issue key

                // Build the replacement directly instead of using the match as a regex again
                String repl = "href=\"" + insMatch + ".html\"";

                System.out.println("orig = <" + match + "> repl = <" + repl + ">");
            }
        }
    }

This just shows the regex and replacements, not the final formatted text, which you can get by using Matcher.replaceAll:

String allRepl = mtchr.replaceAll("href=\"$1.html\"");

If you are only interested in replacing all matches, you don't need the loop; I used it just for debugging and to show how the regex works.
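For completeness, the loop-free variant could look roughly like the sketch below. The class name CompiledHrefRewriter, the constant BROWSE_LINK and the method rewrite are illustrative names, not part of any library. Compiling the pattern once is only a convenience if the same replacement runs over many pages.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompiledHrefRewriter {

    // Compiled once and reused; group 1 captures the issue key, e.g. PJBUGS-911
    private static final Pattern BROWSE_LINK =
            Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+)\"");

    // Hypothetical helper: replaces every matching href in one call, no loop needed
    static String rewrite(String html) {
        Matcher matcher = BROWSE_LINK.matcher(html);
        return matcher.replaceAll("href=\"$1.html\"");
    }

    public static void main(String[] args) {
        String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\"";
        System.out.println(rewrite(text));
    }
}
```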


This does not use regexp. But maybe it still solves your problem.

output = "href=\"" + input.substring(input.lastIndexOf('/') + 1, input.lastIndexOf('"')) + ".html\"";
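A small sketch of this substring idea, assuming the input is exactly one attribute such as href="/browse/PJBUGS-911" and not a whole HTML page. The class and method names are invented for the example. The ID is taken from just after the last slash up to, but not including, the closing quote.

```java
public class SubstringHrefDemo {

    // Hypothetical helper; assumes the input is a single href attribute
    // that contains at least one '/' and ends with a closing double quote.
    static String rewriteSingleHref(String attribute) {
        int idStart = attribute.lastIndexOf('/') + 1; // first character after the last slash
        int idEnd = attribute.lastIndexOf('"');       // position of the closing quote
        String id = attribute.substring(idStart, idEnd);
        return "href=\"" + id + ".html\"";
    }

    public static void main(String[] args) {
        System.out.println(rewriteSingleHref("href=\"/browse/PJBUGS-911\""));
    }
}
```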
    
Don't forget to add ".html" to the end –  ean5533 Dec 3 '12 at 19:07
    
Pretty simple and straightforward this is. –  Rohit Jain Dec 3 '12 at 19:09
    
@Vulcan Yes there is. He requires it for his answer. –  ean5533 Dec 3 '12 at 19:09
I believe input is not a single href="/browse/..." but a whole HTML file. Hence, the explicit mentioning of replaceAll in the question. –  m.buettner Dec 3 '12 at 19:09
    
Thanks for the edit. And yes you're probably right @m.buettner –  bert Dec 3 '12 at 19:14
