Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I have created this PHP script in Laravel. I have database table with items and in my app I can create tags with different regexes. If regex matches text in description or epoch. I will insert relation into ItemTag table.

My current solution is really slow (few hours to run). I have about 200 000 items and will increase fast. And I have hundreds of tags and each tag uses up to 10 regexes.

<?php

namespace App;

use Illuminate\Support\Facades\DB;

class Tagger
{
    private $sqlStart;
    private $sql;
    private $sqlCount;

    const MAX_INSERTS_IN_QUERY = 500;


    public function __construct()
    {
        $this->sqlStart = "INSERT IGNORE INTO `item_tag`(`id_item`, `id_tag`) VALUES ";
        $this->sql = "";
        $this->sqlCount = 0;
    }


    public function start()
    {
        $tags = Tag::where('active', 1)->get();
        $items = Item::all();

        // Clear all old tags
        ItemTag::truncate();

        $result = '';
        foreach ($items as $item) {

            foreach ($tags as $tag) {
                $regexes = $tag->regexes;

                foreach ($regexes as $regex) {
                    if (preg_match('/' . base64_decode($regex->regex) . '/i', $item->description)
                        || preg_match('/' . base64_decode($regex->regex) . '/i', $item->epoch)
                    ) {

                        if ($this->sql != "") {
                            $this->sql .= ",";
                        }

                        $this->sqlCount++;
                        $this->sql .= "(" . $item->id_item . "," . $tag->id_tag . ")";
                        break;
                    }
                }

            }

            if ($this->sqlCount >= self::MAX_INSERTS_IN_QUERY) {
                $this->executeQuery();
            }
        }

        if ($this->sqlCount > 0) {
            $this->executeQuery();
        }

        return $result;
    }


    private function executeQuery()
    {
        DB::connection()->getPdo()->exec($this->sqlStart . $this->sql);
        $this->sqlCount = 0;
        $this->sql = "";
    }
}

EDIT: Is this solution faster? It shouldn't be. There are many more selects.

public function start()
{
    // Clear all old tags
    ItemTag::truncate();

    $tags = Tag::where('active', 1)->get();

    foreach ($tags as $tag) {
        $regexes = $tag->regexes;

        foreach ($regexes as $regex) {
            $items = Item::where("description", "REGEX", '/' . base64_decode($regex->regex) . '/i')
                ->where("epoch", "REGEX", '/' . base64_decode($regex->regex) . '/i');

            foreach ($items as $item) {
                if ($this->sql != "") {
                    $this->sql .= ",";
                }

                $this->sqlCount++;
                $this->sql .= "(" . $item->id_item . "," . $tag->id_tag . ")";

                if ($this->sqlCount >= self::MAX_INSERTS_IN_QUERY) {
                    $this->executeQuery();
                }
            }
        }
    }

    return "";
}
share|improve this question
    
1- What is your PHP version? 2- Is case-insensitivity (i flag) a real need? – revo Jan 11 at 17:26
    
1 - I am using PHP 5.6, but I can upgrade to PHP 7, if it helps. 2 - I need case-insensitivy flag, but I can rewrite regexes like this: /test/i -> /[tT][eE][sS][tT]/ – user2351471 Jan 12 at 9:30
    
It's hard to affect performance without seeing actual regular expressions. So you have to show us something and if you can upgrade, do it right now. PCRE JIT is a feature that comes with PHP 7.0 and its effects would be incomparable. – revo Jan 12 at 11:31

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.