Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I am in the midst of writing a simple job board and I have the URL's filtering and working how I want, everything works - I just want to know if I can improve it, if there are any security concerns, and if it is horribly inefficient.

Take the following URL section - foo.com/in/london/at/google/do/design

I assign each section a variable and work out the location, company, and category.

Other use cases that work:

  • Switching the order - foo.com/at/google/in/london/do/design
  • Having less parameters - foo.com/in/london/at/google

My code to figure out all these variables is:

$regex = "[^a-zA-Z0-9,-]";
$a = isset($_GET["a"]) ? preg_replace("/".$regex."/", "", $_GET["a"]) : "";
$aa = isset($_GET["aa"]) ? preg_replace("/".$regex."/", "", $_GET["aa"]) : "";
$b = isset($_GET["b"]) ? preg_replace("/".$regex."/", "", $_GET["b"]) : "";
$bb = isset($_GET["bb"]) ? preg_replace("/".$regex."/", "", $_GET["bb"]) : "";
$c = isset($_GET["c"]) ? preg_replace("/".$regex."/", "", $_GET["c"]) : "";
$cc = isset($_GET["cc"]) ? preg_replace("/".$regex."/", "", $_GET["cc"]) : "";

if ($a == "in" && $aa != null) { $searchLocation = $aa; }
if ($b == "in" && $bb != null) { $searchLocation = $bb; }
if ($c == "in" && $cc != null) { $searchLocation = $cc; }

if ($a == "at" && $aa != null) { $searchCompany = $aa; }
if ($b == "at" && $bb != null) { $searchCompany = $bb; }
if ($c == "at" && $cc != null) { $searchCompany = $cc; }

if ($a == "do" && $aa != null) { $searchCategory = $aa; }
if ($b == "do" && $bb != null) { $searchCategory = $bb; }
if ($c == "do" && $cc != null) { $searchCategory = $cc; }

if ($a == "missinghtml") { $errorUrl = true; }

I'm looking at this thinking there must be a better to do this. Is this section secure? Any thoughts on this are much appreciated. Like I say it works, but can it be better?

share|improve this question
1  
Welcome to Code Review! I have rolled back the last edit. Please see what you may and may not do after receiving answers. –  Vogel612 16 hours ago

3 Answers 3

Your regex could be improved:

$regex = "[^a-zA-Z0-9,-]";
$a = isset($_GET["a"]) ? preg_replace("/".$regex."/", "", $_GET["a"]) : "";

The character , shouldn't be in a URL, but _ can be.
You don't really need to add / to the beginning and end of your regex every time.
I see a lot of people do a-zA-Z when they could really just do a-z with a case insensitive search.

Becomes:

$regex = "/[^a-z0-9-_]/i"
          ^           ^^
$a = isset($_GET["a"]) ? preg_replace($regex, "", $_GET["a"]) : "";

Using if ($a == "in" && $aa != null) could probably be improved also as if $aa is not null, then it evaluates to true anyway:

if ($a == "in" && $aa)

With if ($a == "missinghtml") { $errorUrl = true; }, you should make it into an else if loop

like:

if ($a == "in" && $aa) { $searchLocation = $aa; }
else if ($a == "missinghtml") { $errorUrl = true; }
share|improve this answer
    
That's much appreciated, thanks! I will be using commas in the URL, the use case being to search in multiple locations: - foo.com/do/design/in/london,liverpool/ Everything else I've updated and it's still working the same. Thank you. –  Craig 20 hours ago
    
+1 for the tip about /i, never thought about using that before –  Dan Pantry 15 hours ago

In addition to the changes already proposed by Quill I strongly hope you reconsider the approach you're taking here.

I assign each section a variable ...

You can strongly simplify your code by just extracting a group match out of the URL:

$regex = '$/(in|at|do)/([^/]+)/$i';

Matching this regex will get you your $search... as the second group to extract.

under the assertion that a search-identifier will always be followed by it's value and not another identifier you can thus extract your values from the URL as follows:

$URL // here's your URL
$locationRegex = '$/in/([^/]+)/$i';
$resultcontainer;
if(preg_match($locationRegex, $URL, $resultcontainer)) {
     $searchLocation = $resultcontainer[1];
}
// and similarly for the other variables

now we should put these two approaches together...

$url // here's your URL
$pattern = '$/(in|at|do)/([^/]+)/$gi';
$resultContainer;
if (preg_match($pattern, $url, $resultContainer)) {
    // at this point your result container contains up to 6 elements:
    for ($i = 0; $i < count($resulContainer) - 1; $i = $i + 2) {
       switch ($resultContainer[$i]) {
            case "in":
               $searchLocation = $resultContainer[$i + 1];
               break;
            case "at":
               $searchCompany = $resultContainer[$i + 1];
               break;
            case "do":
               $searchCategory = $resultContainer[$i + 1];
               break;
        }

This approach basically parses the URL tokens into "key-value-pairs" which might be better stored into a different array structure, but that depends on what you need to do with this later on ;)

share|improve this answer
    
Shouldn't the regex be (in|at|do)/([^/]+)? –  prakharsingh95 17 hours ago
    
@prakharsingh95 my PHP is not the best.. isn't $ a valid regex delimiter? –  Vogel612 17 hours ago
    
I meant the added +. Otherwise the second capture group would be l only for foo/in/london –  prakharsingh95 17 hours ago
    
ohhh crap, yes you're completely right –  Vogel612 17 hours ago
    
Thanks guys, I've edited my original question with the updated code I'm using. –  Craig 16 hours ago

Thanks so much for your combined suggestions, my code looks like this:

$regex = "/[^a-z0-9,-\/]/i";
$queryURL = $_SERVER['REQUEST_URI']."/";
$cleanQueryURL = preg_replace("".$regex."", "", $queryURL);
if(preg_match('/in\/(.+?)\//i', $cleanQueryURL, $resultcontainer)) {
    $searchLocation = $resultcontainer[1];
} else { $searchLocation = "all"; }
if(preg_match('/at\/(.+?)\//i', $cleanQueryURL, $resultcontainer)) {
    $searchCompany = $resultcontainer[1];
} else { $searchCompany = "all"; }
if(preg_match('/do\/(.+?)\//i', $cleanQueryURL, $resultcontainer)) {
    $searchCategory = $resultcontainer[1];
} else { $searchCategory = "all"; }

I had to modify the regex slightly, but I used this - https://regex101.com/r/iZ8pV4/1 - and got there in the end!

You guys rock :)

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.