MySQL, optimize update query

Question

My query is this:

UPDATE `phrases`
SET `phrases`.`count`=(SELECT COUNT(*) FROM `strings` WHERE `string` LIKE CONCAT('%', `phrases`.`phrase`, '%'))

My tables look like this:

CREATE TABLE `phrases` (
    `hash` varchar(32) NOT NULL,
    `count` int DEFAULT 0,
    `phrase` text NOT NULL,
     PRIMARY KEY (`hash`),
     KEY(`count`)
)

And

CREATE TABLE `strings` (
    `string` text NOT NULL,
)

phrases has 18,000 rows. strings has 1500 rows.

So the query works as is? Then this should be on codereview.SE.
It might be more efficient to have a separate table where you would store the counts per phrase, and then only update this table once a new string is added. Since the number of strings is low in comparison to the phrases, I figure this wont happen that often. So you would not perform the whole count again, just add 1 if the new string matches that phrase.

Bill Karwin · Accepted Answer · 2011-12-20 22:04:52Z

Since you're using a LIKE with wildcards, you're going to do a table-scan against both tables, running a total of 18000*1500 = 27000000 substring comparisons.

To optimize this, you need to use some fulltext index technology. I suggest Sphinx Search or Apache Solr. If you do this, you don't need to keep a count of how many matches there are, because the search index makes it a lot less expensive to get a count on demand.

MySQL also implements a FULLTEXT index type, but it is only supported in the MyISAM storage engine in current versions (up to 5.5). I don't recommend using MyISAM for important data.

MySQL 5.6 is developing a fulltext index for InnoDB.

+1 because this answer just makes more sense than any other solution.

RolandoMySQLDBA · Answer 2 · 2011-12-20 22:07:34Z

You should drop the index and collect the counts.

This will speed up the updating of the count column.

When done, put the index back.

ALTER TABLE phrase DROP INDEX `count`;
UPDATE phrase SET COUNT=0;
UPDATE phrases INNER JOIN string
ON ( LOCATE(strings.string,phrases.phrase) > 0 )
SET phrase.`count`=phrase.`count`+1;
ALTER TABLE phrase ADD INDEX `count` (`count`);

This INNER JOIN is nothing more than a Cartesian Product (pointed out by Bill Karwin's answer as 27,000,000 rows being examined in a temp table).

If the time to process is something can live with, all well and good.

If the time to process is disastrously slow, you must try Bill Karwin's answer.

asked	1 year ago
viewed	175 times
active	1 year ago

MySQL, optimize update query

migrated from stackoverflow.com Dec 22 '11 at 5:18

2 Answers

Your Answer

MySQL, optimize update query

migrated from stackoverflow.com Dec 22 '11 at 5:18

2 Answers

Your Answer

Sign up or log in

Post as a guest

Related