Take the 2-minute tour ×
Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I have a partition table on basis of Primary key having 1.5 Billion rows in it. But when i check ibd files only 80 ibd show in GB's whereas 120ibd shows 1MB file. I would like to know how does the distribution work when i limit Partition with PARTITIONS 200. My structure is as below. Given that id & checksum both are unique all the time.

CREATE TABLE `mapping` (
  `id` varchar(50) NOT NULL DEFAULT '',
  `sha_checksum` varchar(50) NOT NULL DEFAULT '',
  `path_id` varchar(90) DEFAULT NULL,
  `file_id` varchar(35) DEFAULT NULL,
  `server_id` CHAR(6) DEFAULT NULL,
  PRIMARY KEY (`id`,`sha_checksum`)
) ENGINE=InnoDB 
/*!50100 PARTITION BY KEY (id,sha_checksum)
PARTITIONS 200 */

Please help me to understand where im wrong.

share|improve this question
    
A query like SELECT partition_name, table_rows FROM INFORMATION_SCHEMA.PARTITIONS WHERE table_name='mapping'; may help you understand the row distribution among partitions. –  jynus Feb 9 at 13:48
    
thanks jynus, i can see 50% of my partition is empty. This explain 1M file. But it doesn't explain as to why it is empty. –  Rohan Feb 10 at 8:16

1 Answer 1

Rather than trying to explain why most of the partitions seem empty, let me argue against that flavor of PARTITIONing.

To put it bluntly, PK & BY KEY(id, sha1) gains no performance, nor any other benefit that I can imagine.

Note that to get "partition pruning", you have to specify both the id and the sha_checksum. Performance will be essentially the same as having a non-partitioned table with the same PRIMARY KEY.

Since the records are randomly scattered around, I can think of no use for "transportable tablespaces".

What version of MySQL are you using?

What character set are you using? (I'm thinking of space considerations, especially for the CHAR column.)

Please describe id and the application; I can help you devise a better PARTITION scheme, or continue to argue against it.

Another thing to note: More than about 50 partitions slows things down for a variety of reasons, one of which is that all partitions are opened when an INSERT occurs. (Very new versions have 'fixed' this failure to 'prune'.)

share|improve this answer
    
Thanks Rick, Im currently running Percona 5.6.21-70.1 with utf8 Charterset, rows in CHAR column is Fixed width with exact 6 char data. Based on id and sha_checksum we get the data location. Also from parition info table i can see nearly 50% partition in empty; ` mysql> SELECT count() FROM INFORMATION_SCHEMA.PARTITIONS WHERE table_name='mapping' and table_rows<>0; +----------+ | count() | +----------+ | 100| +----------+ ` –  Rohan Feb 10 at 6:48
    
CHAR(6) with utf8 always takes 18 bytes, even if you have only ascii characters. CHAR(6) CHARSET ascii might be more suitable for server_id and save 12 bytes per row. –  Rick James Feb 10 at 17:39
    
Id + sha --> data location. Yes, but partitioning does not help at all (I claim). Why are many empty? I don't know. –  Rick James Feb 10 at 17:40

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.