
I am trying to find the best way to estimate the size of a table. I have studied a lot of blogs and forums but could not find an accurate answer.

For example, we have a table City using the InnoDB engine. Let's say that within the next year it will have 1 million records. What will be the estimated data size and index size of that table at that point?

mysql> desc City;
+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name        | char(35) | NO   |     |         |                |
| CountryCode | char(3)  | NO   | MUL |         |                |
| District    | char(20) | NO   |     |         |                |
| Population  | int(11)  | NO   |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+
5 rows in set (0.03 sec)

UPDATE

What will be the estimated upper bound (maximum size of the table) with 1 million records, and how can we estimate it?

This is great, but is it possible to get the index size column-wise? What I mean is: if you have a table with (say) 5 indexed columns, can we get the index size of each one of them? I will ask this as another question. Thanks. – Sushil Sep 11 at 10:04

3 Answers

Given the table description, I see

  • 66 bytes per row of data (4 + 35 + 3 + 20 + 4 bytes, assuming a single-byte character set such as latin1)
  • 4 bytes per row for the primary key
  • 7 bytes per row for country code index
    • 3 bytes for the country
    • 4 bytes for Clustered Key attached to the country code
  • Total of 77 bytes of data and keys
  • This does not factor in housekeeping for BTREEs or tablespace fragmentation

For a million rows, that would be 77,000,000 bytes (about 73.43 MB).
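Spelled out as arithmetic, here is a quick sanity check you can run in any MySQL client (the 1024-based conversion matches the queries below):

-- 66 bytes of data + 4 bytes PRIMARY KEY + 7 bytes CountryCode index
SELECT
    66 + 4 + 7                                       AS est_bytes_per_row,
    (66 + 4 + 7) * 1000000                           AS est_total_bytes,
    ROUND((66 + 4 + 7) * 1000000 / POWER(1024,2), 2) AS est_total_mb;  -- 73.43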

As for measuring the table: for a given table mydb.mytable, you can run this query

SELECT 
    CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',SUBSTR(units,pw1*2+1,2)) DATSIZE,
    CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',SUBSTR(units,pw2*2+1,2)) NDXSIZE,
    CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',SUBSTR(units,pw3*2+1,2)) TBLSIZE
FROM
(
    SELECT DAT,NDX,TBL,IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
    FROM 
    (
        SELECT data_length DAT,index_length NDX,data_length+index_length TBL,
        FLOOR(LOG(IF(data_length=0,1,data_length))/LOG(1024)) px,
        FLOOR(LOG(IF(index_length=0,1,index_length))/LOG(1024)) py,
        FLOOR(LOG(IF(data_length+index_length=0,1,data_length+index_length))/LOG(1024)) pz
        FROM information_schema.tables
        WHERE table_schema='mydb'
        AND table_name='mytable'
    ) AA
) A,(SELECT 'B KBMBGBTB' units) B;

To measure all tables, grouped by database and storage engine:

SELECT
    IF(ISNULL(DB)+ISNULL(ENGINE)=2,'Database Total',
    CONCAT(DB,' ',IFNULL(ENGINE,'Total'))) "Reported Statistic",
    LPAD(CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',
    SUBSTR(units,pw1*2+1,2)),17,' ') "Data Size",
    LPAD(CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',
    SUBSTR(units,pw2*2+1,2)),17,' ') "Index Size",
    LPAD(CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',
    SUBSTR(units,pw3*2+1,2)),17,' ') "Total Size"
FROM
(
    SELECT DB,ENGINE,DAT,NDX,TBL,
    IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
    FROM 
    (SELECT *,
        FLOOR(LOG(IF(DAT=0,1,DAT))/LOG(1024)) px,
        FLOOR(LOG(IF(NDX=0,1,NDX))/LOG(1024)) py,
        FLOOR(LOG(IF(TBL=0,1,TBL))/LOG(1024)) pz
    FROM
    (SELECT
        DB,ENGINE,
        SUM(data_length) DAT,
        SUM(index_length) NDX,
        SUM(data_length+index_length) TBL
    FROM
    (
       SELECT table_schema DB,ENGINE,data_length,index_length FROM
       information_schema.tables WHERE table_schema NOT IN
       ('information_schema','performance_schema','mysql')
       AND ENGINE IS NOT NULL
    ) AAA GROUP BY DB,ENGINE WITH ROLLUP
) AAA) AA) A,(SELECT ' BKBMBGBTB' units) B;

Run these queries and you can track changes in database/engine disk usage.
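If you want to track that growth over time, one option is to snapshot the same information_schema numbers into a history table on a schedule. This is a minimal sketch, assuming a utility schema named dbstats (the schema and table names here are hypothetical):

-- Hypothetical history table for periodic size snapshots
CREATE TABLE IF NOT EXISTS dbstats.size_history (
    captured_at DATETIME NOT NULL,
    db          VARCHAR(64) NOT NULL,
    engine      VARCHAR(64) NOT NULL,
    data_bytes  BIGINT UNSIGNED NOT NULL,
    index_bytes BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (captured_at, db, engine)
);

-- Run periodically (cron, or the MySQL event scheduler)
INSERT INTO dbstats.size_history
SELECT NOW(), table_schema, ENGINE,
       SUM(data_length), SUM(index_length)
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','performance_schema','mysql')
  AND ENGINE IS NOT NULL
GROUP BY table_schema, ENGINE;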

Give it a Try !!!

+1, thanks for your explanation. – Abdul Manaf Jul 11 at 4:28
SELECT  table_name    "tablename",
        data_length   "table data_length in Bytes",
        index_length  "table index_length in Bytes",
        data_free     "Free Space in Bytes"
FROM    information_schema.tables
WHERE   table_schema = 'databasename';

By executing this query you can get the data and index size used by each table in the schema. You can check that size against the current number of rows and extrapolate to 1 million rows.
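For instance, the extrapolation can be done directly in SQL. A rough sketch, assuming the table already holds a representative sample of rows (note that table_rows is only an estimate for InnoDB):

SELECT  table_name,
        (data_length + index_length) / GREATEST(table_rows, 1)           AS bytes_per_row,
        (data_length + index_length) / GREATEST(table_rows, 1) * 1000000 AS est_bytes_at_1m_rows
FROM    information_schema.tables
WHERE   table_schema = 'databasename'
  AND   table_name   = 'City';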

I am not sure, but will this give somewhat accurate results? Have you ever tested this? – Abdul Manaf Jul 10 at 12:18
Actually, I am testing this query result periodically to see growth (%) with respect to size. – Peter Venderberghe Jul 10 at 12:40

It's tedious. But the details are in the docs.

To be as accurate as possible, which is rarely necessary, you'll need to read about the table structure and index structure, too.

If I were in your shoes, I'd build the table, populate it with a million rows of test data, and measure the change in size. Depending on your application, you might need to take the size of transaction log files into account, too.
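A minimal sketch of that experiment, assuming a scratch copy of the City table; the digits cross-join below is just one convenient way to generate a million rows in stock MySQL:

-- Build a 0-9 helper table, then cross-join it to generate 1,000,000 rows
CREATE TABLE digits (d INT PRIMARY KEY);
INSERT INTO digits VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

INSERT INTO City (Name, CountryCode, District, Population)
SELECT CONCAT('City', n), 'USA', CONCAT('District', n MOD 100), n
FROM (
    SELECT a.d + b.d*10 + c.d*100 + d.d*1000 + e.d*10000 + f.d*100000 AS n
    FROM digits a, digits b, digits c, digits d, digits e, digits f
) seq;

-- Refresh the statistics, then measure the real footprint
ANALYZE TABLE City;
SELECT data_length, index_length
FROM information_schema.tables
WHERE table_schema = DATABASE() AND table_name = 'City';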

