I am trying to parse a webpage and get the reference number that follows <li>YM#. For example, I need to get 1234-234234 into a variable from HTML that contains:

<li>YM# 1234-234234         </li>

Many thanks for your help!

Rich

What have you tried so far? – Jasper Apr 21 at 5:47
var text = document.body.innerHTML; console.log(text); text = text.replace(/(<([^>]+)>)/ig,''); var id = text.match(/YM#[0-9]-[0-9]/g); I know I am a way off! – Rich Apr 21 at 5:57

3 Answers

Accepted answer (1 vote)

Currently, your regex only matches if there is a single digit before the dash and a single digit after it. This version allows one or more digits in each place instead:

/YM#[0-9]+-[0-9]+/g

Then you also need to capture the number, so we use a capture group:

/YM#([0-9]+-[0-9]+)/g

Then we need to read the capture group, so we use the following code instead of String.match (with the g flag, String.match returns only the full matches, not the groups):

var regex = /YM#([0-9]+-[0-9]+)/g;
var match = regex.exec(text);
var id = match[1];
 // match[0]: match of the entire regex
 // match[1], match[2], ...: the capture groups, in order
Thank you for telling me exactly how to do it. It didn't actually work at first, but it was so close that I managed to fix it myself: it needed \s between # and [ because there is a space. Thanks so much Jasper! – Rich Apr 21 at 6:17
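
The fix described in the comment above can be sketched roughly as follows. This is not necessarily the exact code Rich ended up with: the sample `html` string and the helper name `extractYmNumber` are assumptions for illustration, and `\s*` is used so that any amount of whitespace after `#` (including none) is tolerated.

```javascript
// Hypothetical sample input; in a browser this could come from document.body.innerHTML.
var html = '<ul><li>YM# 1234-234234         </li></ul>';

// extractYmNumber is an illustrative helper, not part of any library.
function extractYmNumber(text) {
  // \s* allows for the space between "YM#" and the digits.
  // No g flag: only the first occurrence is needed, so exec keeps no state between calls.
  var match = /YM#\s*([0-9]+-[0-9]+)/.exec(text);
  // exec returns null when nothing matches, so guard before indexing.
  return match ? match[1] : null;
}

var id = extractYmNumber(html);
console.log(id); // 1234-234234
```

Returning null instead of indexing straight into the result avoids a TypeError on pages where the YM# item is missing.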

(?!<li>YM#\s)([\d-]+)

http://regexr.com?30ng5

This will match the numbers.
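
As a rough check of what the pattern does: `(?!...)` is a negative lookahead, so as written it does not tie the match to the `<li>YM# ` prefix and will match any run of digits and dashes. A lookbehind, `(?<=...)`, is one way to anchor it. The sketch below assumes a JavaScript engine with lookbehind support (ES2018 or later), and the sample string is made up for illustration.

```javascript
// Hypothetical sample containing an unrelated number before the one we want.
var sample = '<li>Ref 99-1</li><li>YM# 1234-234234         </li>';

// Pattern as posted: the negative lookahead does not require the prefix.
var asPosted = /(?!<li>YM#\s)([\d-]+)/;
// Variant with a lookbehind: only matches directly after "<li>YM#" plus one whitespace character.
var anchored = /(?<=<li>YM#\s)([\d-]+)/;

var loose = sample.match(asPosted)[0];
var strict = sample.match(anchored)[0];

console.log(loose);  // 99-1
console.log(strict); // 1234-234234
```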


Try this:
(<li>[^#<>]*?# *)([\d\-]+)\b
and get the result in $2.

Sorry to be stupid, but how do I get the result in $2? – Rich Apr 21 at 5:59
Try: result = subject.replace(/(<li>[^#<>]*?# *)([\d\-]+)\b/g, "$2"); – Cylian Apr 21 at 7:46
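
In `replace`, `$2` in the replacement string stands for the text captured by the second pair of parentheses, but `replace` returns the whole input with only the matched portion substituted. To read just the captured value, `exec` is one option. A minimal sketch, assuming a made-up `subject` string:

```javascript
// Hypothetical input; any string containing the list item would do.
var subject = '<li>YM# 1234-234234         </li>';

// Group 1 is the "<li>...# " prefix, group 2 is the run of digits and dashes.
var pattern = /(<li>[^#<>]*?# *)([\d\-]+)\b/;

// replace swaps the matched part for group 2 but keeps the unmatched remainder.
var replaced = subject.replace(pattern, '$2');

// exec exposes the groups by index; it returns null if there is no match.
var m = pattern.exec(subject);
var result = m ? m[2] : null;

console.log(result); // 1234-234234
```

Here `replaced` still carries the trailing spaces and the closing `</li>`, which is why reading `m[2]` is the more direct route to the number alone.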