
Hi, I'm new to regex. I am trying to use \s{2,} to catch runs of two or more spaces in the junk text, but NOT the spaces inside the url value, e.g. "url":"https://x.com/a/C25/XPS - Connection - May 2013.docx". In my scenario the URL is not encoded yet, so it may contain spaces.

Sample Text:

"startofjunk      junkjunkjunkjunk","url":"https://x.com/a/C25/XPS  - Connection - May 2013.docx","contentsource":"AX","returpath":null,"detailpath":"https://ax.sample.com/Rep>ositories/form.aspx?path=C25/96/99&mode=Read","detailspath2":"samplepath"

Desired Text:

"startofjunk junkjunkjunkjunk","url":"https://x.com/a/C25/XPS  - Connection - May 2013.docx","contentsource":"AX","returpath":null,"detailpath":"https://ax.sample.com/Rep>ositories/form.aspx?path=C25/96/99&mode=Read","detailspath2":"samplepath"

Please help. Thanks.


2 Answers


Description

This regex finds every run of whitespace and replaces it with a single whitespace character, while bypassing the url section. In a run of spaces, the first space is captured into group 1, which is written back to the output as \1, and the additional spaces are dropped. The url section is bypassed because, when the second branch of the | alternation matches it, the whole field is captured into group 2 and written back unchanged by \2 in the replacement.

Regex: (\s)\s*|("url":"[^"]*")

Replace with: \1\2
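As a rough illustration of how the two branches cooperate, here is a minimal Python sketch of the same pattern. The function name `collapse_spaces` and the shortened sample string are assumptions made for this example, and Python is used only because the pattern itself is not tied to any one language. A callable replacement is used so that the group which did not take part in the match (it is None in Python) is simply skipped.

```python
import re

# Branch 1: (\s)\s* captures the first whitespace character of a run in group 1.
# Branch 2: ("url":"[^"]*") captures the whole url field in group 2.
PATTERN = re.compile(r'(\s)\s*|("url":"[^"]*")')


def collapse_spaces(text: str) -> str:
    """Collapse whitespace runs to one character, leaving the url field untouched."""
    # Exactly one group participates in each match; the other one is None.
    return PATTERN.sub(lambda m: m.group(1) or m.group(2), text)


# Shortened version of the sample text from the question (hypothetical cut).
sample = ('"startofjunk      junkjunkjunkjunk",'
          '"url":"https://x.com/a/C25/XPS  - Connection - May 2013.docx",'
          '"contentsource":"AX"')

print(collapse_spaces(sample))
```

Because the engine scans left to right, the url field is consumed as one match by the second branch before the first branch gets a chance to see the spaces inside it.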


Source String

"startofjunk        junkjunkjunkjunk","url":"https://x.com/a/C25/XPS - Connection - May 2013.docx","contentsource":"AX","returpath":null,"detailpath":"https://ax.sample.com/Rep>ositories/form.aspx?path=C25/96/99&mode=Read","detailspath2":"samplepath"

PHP example

This PHP example is included simply to show that the regex works.

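<?php
$sourcestring = "your source string";
// Replace with \1\2: group 1 restores a single space, group 2 restores the untouched url field.
echo preg_replace('/(\s)\s*|("url":"[^"]*")/im', '\1\2', $sourcestring);
?>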

$sourcestring after replacement:
"startofjunk junkjunkjunkjunk","url":"https://x.com/a/C25/XPS - Connection - May 2013.docx","contentsource":"AX","returpath":null,"detailpath":"https://ax.sample.com/Rep>ositories/form.aspx?path=C25/96/99&mode=Read","detailspath2":"samplepath"
  • Hi @Denomales Thanks! what should we add to modify the matching spaces to a single space? Like this: "startofjunk junkjunkjunkjunk" Commented Jun 4, 2013 at 5:48

Use a look-ahead to assert that your spaces occur before "url". Also use a look-behind so your whole match is the excess spaces:

(?<=\s)\s+(?=.*"url":)

To remove the excess spaces, replace the entire match with an empty string (i.e. nothing) or, if your application language allows it, remove the entire match.
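A minimal Python sketch of this look-around approach follows. The function name `trim_before_url` and the shortened sample string (including the made-up "AX  BX" value) are assumptions for illustration only. Note that, because of the look-ahead, only runs that occur somewhere before "url": on the same line are trimmed, so extra spaces after the url field are left alone.

```python
import re

# (?<=\s)  the match must be preceded by one whitespace character, which is kept
# \s+      the excess whitespace that will be removed
# (?=.*"url":)  only match when "url": still lies ahead on the same line
EXCESS_BEFORE_URL = re.compile(r'(?<=\s)\s+(?=.*"url":)')


def trim_before_url(text: str) -> str:
    """Delete excess whitespace that appears before the url key."""
    return EXCESS_BEFORE_URL.sub('', text)


# Shortened, hypothetical sample with extra spaces before, inside and after the url.
sample = ('"startofjunk      junkjunkjunkjunk",'
          '"url":"https://x.com/a/C25/XPS  - Connection - May 2013.docx",'
          '"contentsource":"AX  BX"')

print(trim_before_url(sample))
```

The spaces inside the url value survive because no further "url": follows them, which is also why the double space in "AX  BX" is not collapsed.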
