Oddly enough, I haven't found anywhere that has answered this question specifically; all the other Stack Overflow answers I've found aren't exactly right.

I have a body of text I need to search through for image URLs. Nothing complex, basically things like:

http://www.google.com/logo.png

http://reddit.com/idfaiodf/test.jpg

NOT

http://reddit.com/sadfasdf/test.jpgMORECONTENTHERE

All the regex I've used will include the "MORECONTENTHERE" in the results. It's frustrating as hell. I just want the URL with nothing appended after or added on before!

Also I don't want anything that does HTML image link extracting - I'm not pulling these from HTML.

Any regex to do this?

EDIT:

So here is what I'm using as a source: http://pastebin.com/dE2s1nHz

It's HTML but I didn't want to mention that because I didn't want people to do

If you're not pulling these from HTML please post an example of where you are getting them from. Without that it's going to be very difficult to avoid either trapping your third example, or not trapping your first two. –  Mike W Aug 7 at 3:14
 
Ok, adding an example now –  Mandatory Programmer Aug 7 at 3:50
 
possible duplicate of PHP: Regular Expression to get a URL from a string –  Andy Lester Aug 7 at 4:15

4 Answers

https?://[a-zA-Z0-9.]+/[a-zA-Z0-9-&.]+\.(jpg|png|gif|tif|exf|svg|wfm)

I picked some arbitrary image types, and possibly missed a couple of special characters allowed in URLs. Feel free to customize for your needs.

I think that will miss images that are not in the root directory. And domains with a dash. –  ironcito Aug 7 at 3:17
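
Following up on that comment, one way the pattern might be widened is sketched below. The character sets, the shortened extension list, and the sample URLs are assumptions for illustration only; they are not from the original answer.

```php
<?php
// Hypothetical sample input, not taken from the question.
$text = "see http://my-site.example.com/img/2013/photo.png and http://www.google.com/logo.png";

// Host: letters, digits, dots and dashes.
// Path: one or more "/segment" groups, so subdirectories are allowed.
// The extension list is shortened here and is only an example.
$pattern = '~https?://[a-zA-Z0-9.-]+(?:/[a-zA-Z0-9&._-]+)+\.(?:jpg|png|gif)~';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]);
```

With this input, both URLs should come back, including the one with a dash in the host and a nested path.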

Try the following code:

$text = <<< EOD
http://www.google.com/logo.png
http://reddit.com/sadfasdf/test.jpgMORECONTENTHERE
http://reddit.com/idfaiodf/test.jpg
EOD;

preg_match_all('/\bhttps?:\/\/\S+(?:png|jpg)\b/', $text, $matches);
var_dump($matches[0]);
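
A note on the trailing `\b`: it requires a word boundary right after the extension, so `test.jpgMORECONTENTHERE` is rejected outright rather than truncated. The pattern does not require a dot before the extension, though. The sketch below compares it with a variant that adds one; the third sample line is a hypothetical URL added to show the difference.

```php
<?php
$text = <<<EOD
http://www.google.com/logo.png
http://reddit.com/sadfasdf/test.jpgMORECONTENTHERE
http://example.com/notanimagejpg
EOD;

// Pattern as in the answer: no dot required before the extension.
preg_match_all('~\bhttps?://\S+(?:png|jpg)\b~', $text, $loose);

// Variant: require a literal dot before the extension.
preg_match_all('~\bhttps?://\S+\.(?:png|jpg)\b~', $text, $strict);

print_r($loose[0]);   // includes the "notanimagejpg" URL
print_r($strict[0]);  // only the logo.png URL
```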

This matches from "http" through the last known image extension on a line, so anything trailing after the extension is dropped.

<?php

    $string = "Oddly enough I haven't found anywhere that has answer this question specificly, all the other stack overflow things I've found aren't exactly right.

    I have a body text I need to search through for image urls, this doesn't mean anything complex but basically things like:

        http://www.google.com/logo.png

        http://reddit.com/idfaiodf/test.jpg

    NOT

        http://reddit.com/sadfasdf/test.jpgMORECONTENTHERE
    ";

    $pattern = '~(http.*\.)(jpe?g|png|[tg]iff?|svg)~i';

    $m = preg_match_all($pattern,$string,$matches);

    print_r($matches[0]);

?>

Output

Array
(
    [0] => http://www.google.com/logo.png
    [1] => http://reddit.com/idfaiodf/test.jpg
    [2] => http://reddit.com/sadfasdf/test.jpg
)
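
One caveat worth sketching: `.*` is greedy and only stops at a line break, so two URLs on the same line would be merged into one match. A possible variant, assuming URLs never contain whitespace, swaps in a lazy `\S*?`. The sample line below is hypothetical.

```php
<?php
// Hypothetical input: two image URLs on a single line.
$line = "http://a.com/x.png http://b.com/y.jpg";

// Pattern from the answer: greedy ".*" runs to the last extension on the line.
preg_match_all('~(http.*\.)(jpe?g|png|[tg]iff?|svg)~i', $line, $greedy);

// Variant: lazy "\S*?" stops at the first extension and never crosses a space.
preg_match_all('~http\S*?\.(?:jpe?g|png|[tg]iff?|svg)~i', $line, $lazy);

print_r($greedy[0]); // one match spanning both URLs
print_r($lazy[0]);   // two separate matches
```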
https?://[^/\s]+/\S+\.(jpg|png|gif)
  1. https? is "http" or "https"
  2. :// is literal
  3. [^/\s]+ is one or more characters other than "/" or whitespace
  4. / is literal
  5. \S+ is one or more non-whitespace characters
  6. \. is "."
  7. (jpg|png|gif) is image extensions, delimited by |
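
The breakdown above can be tried out in PHP with a small sketch like the following. The `~` delimiter is an assumption chosen so the slashes need no escaping; the sample text is the three URLs from the question.

```php
<?php
// Sample text from the question; the third URL has trailing text glued on.
$text = <<<EOD
http://www.google.com/logo.png
http://reddit.com/idfaiodf/test.jpg
http://reddit.com/sadfasdf/test.jpgMORECONTENTHERE
EOD;

// Same expression as above, wrapped in "~" delimiters for preg_*.
$pattern = '~https?://[^/\s]+/\S+\.(jpg|png|gif)~';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches, $matches[1] the captured extensions.
print_r($matches[0]);
```

Here the third URL is returned as http://reddit.com/sadfasdf/test.jpg, with the trailing text cut off.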

Result: [RegexBuddy screenshot of the matches, not reproduced here]

The above is taken from RegexBuddy, running under Wine on a Mac. "PCRE" is the regex flavor used by PHP's preg_* functions. The expression should work in most regular expression flavors.
