I have created this PHP script in Laravel. I have database table with items and in my app I can create tags with different regexes. If regex matches text in description or epoch. I will insert relation into ItemTag table.
My current solution is really slow (few hours to run). I have about 200 000 items and will increase fast. And I have hundreds of tags and each tag uses up to 10 regexes.
<?php
namespace App;
use Illuminate\Support\Facades\DB;
class Tagger
{
private $sqlStart;
private $sql;
private $sqlCount;
const MAX_INSERTS_IN_QUERY = 500;
public function __construct()
{
$this->sqlStart = "INSERT IGNORE INTO `item_tag`(`id_item`, `id_tag`) VALUES ";
$this->sql = "";
$this->sqlCount = 0;
}
public function start()
{
$tags = Tag::where('active', 1)->get();
$items = Item::all();
// Clear all old tags
ItemTag::truncate();
$result = '';
foreach ($items as $item) {
foreach ($tags as $tag) {
$regexes = $tag->regexes;
foreach ($regexes as $regex) {
if (preg_match('/' . base64_decode($regex->regex) . '/i', $item->description)
|| preg_match('/' . base64_decode($regex->regex) . '/i', $item->epoch)
) {
if ($this->sql != "") {
$this->sql .= ",";
}
$this->sqlCount++;
$this->sql .= "(" . $item->id_item . "," . $tag->id_tag . ")";
break;
}
}
}
if ($this->sqlCount >= self::MAX_INSERTS_IN_QUERY) {
$this->executeQuery();
}
}
if ($this->sqlCount > 0) {
$this->executeQuery();
}
return $result;
}
private function executeQuery()
{
DB::connection()->getPdo()->exec($this->sqlStart . $this->sql);
$this->sqlCount = 0;
$this->sql = "";
}
}
EDIT: Is this solution faster? It shouldn't be. There are many more selects.
public function start()
{
// Clear all old tags
ItemTag::truncate();
$tags = Tag::where('active', 1)->get();
foreach ($tags as $tag) {
$regexes = $tag->regexes;
foreach ($regexes as $regex) {
$items = Item::where("description", "REGEX", '/' . base64_decode($regex->regex) . '/i')
->where("epoch", "REGEX", '/' . base64_decode($regex->regex) . '/i');
foreach ($items as $item) {
if ($this->sql != "") {
$this->sql .= ",";
}
$this->sqlCount++;
$this->sql .= "(" . $item->id_item . "," . $tag->id_tag . ")";
if ($this->sqlCount >= self::MAX_INSERTS_IN_QUERY) {
$this->executeQuery();
}
}
}
}
return "";
}
i
flag) a real need? – revo Jan 11 at 17:26