HomeContributionsMySQL OPTIMIZE TABLE: How to Slash Disk I/O and Say Goodbye to...

MySQL OPTIMIZE TABLE: How to Slash Disk I/O and Say Goodbye to Database Bloat

-

This article was contributed by Edwin Sanchez who is a brand ambassador of Devart, software developer and project manager.

There is this one problem that's killing your MySQL database performance. This killer causes your to cry for help because of immense I/O. It's a kind of database bloat you need to fix. So, what is it? And how can you optimize a MySQL table to kill this I/O monster?

Don't despair. That's the thing about this article. You will learn about its secrets. And knowing its secrets will make this monster easy to kill. And your secret weapon? The MySQL OPTIMIZE TABLE!

Here's how we will tackle this article:

  • What's This I/O Monster, and Why MySQL OPTIMIZE TABLE Can Stop It?
  • When You Should Use MySQL OPTIMIZE TABLE (And When You Should Not)
  • How to Find MySQL Tables for Optimization
  • How to Use MySQL OPTIMIZE TABLE
  • Optimizing Tables With dbForge Studio for MySQL

So, brace yourself and let's begin.

What's This I/O Monster, And Why MySQL OPTIMIZE TABLE Can Stop It?

Behold the I/O monster – fragmentation.

Let me explain by giving you some MySQL database file organization first. This assumes you're using MySQL 8 and the InnoDB engine. When you make a new table, MySQL creates a file with your table name and a .ibd file extension. And this is the default behavior. Unless you turn off the innodb_file_per_table system variable. MySQL also organizes data and indexes in fixed-size pages in the file (16KB by default). Over time, you do some INSERTs, UPDATEs, and DELETEs on your tables. Because of these frequent operations, your table becomes fragmented. The .ibd file becomes bloated with unused space. And the more you delay fixing it, the gaps will increase.

And it becomes an I/O monster when the pages are farther apart. This can lead to slower performance. Aside from I/O, the file gets larger, consuming more storage space. So, the secret is to keep these pages closer together and remove the bloat.

MySQL OPTIMIZE TABLE will reorganize the data and indexes. Then, it removes the gaps in the file, and reclaim unused space. And when this happens, you say goodbye to this I/O monster – for the time being.

Will you stop adding, updating, and deleting rows? Of course, you won't. If so, fragmentation will reappear in time. It's like battling an enemy that keeps resurrecting. But you always have MySQL OPTIMIZE TABLE at your disposal. So, it's a war you can always win. Note that there are other ways to fight this. But in this article, we'll focus on OPTIMIZE TABLE.

Later, you will see how to do it in the command line or using a GUI tool with a wizard.

When You Should Use MySQL OPTIMIZE TABLE (And When You Should NOT)

Should you jump in and optimize all the tables now?

If your database is running fine, should you?

Now, hold your horses! When your database is running fine, it may not need optimization now. But after a while, you may experience slow queries that were fast before. That's the time you plan to optimize tables.

Even then, you should know there's a dark side when you optimize MySQL tables. MySQL performs an exclusive table lock during its prepare and commit phases. This means that it will hold other transactions until MySQL releases the lock. So, other database operations need to wait. You might lose your patience when optimizing a very large table. Or your users will throw their frustrations at you because of their lagging app.

So, the best time to optimize MySQL tables is during off-hours. Or when there's little to no usage. Otherwise, you will announce a downtime period if it's busy 24/7.

Note also that not all tables are candidates for optimization. So, you need to find fragmented tables first.

How to Find MySQL Tables for Optimization

Instead of guessing which tables to optimize, it's wise to query them. Here's a sample query when you need to find fragmented tables in a WordPress MySQL database:

SELECT table_name, data_length, data_free
FROM information_schema.tables
WHERE table_schema='wp_blogdb'
ORDER BY data_free DESC;

The name of the WordPress database is wp_blogdb. You can change the schema to point to your database name in question. Here's the output using the MySQL CLI:

DATA_LENGTH is the size in pages multiplied by the InnoDB page size (16KB by default). And DATA_FREE is the number of allocated but unused bytes. So, when it shows a big number in bytes, the tables are candidates for MySQL OPTIMIZE TABLE. But note that a non-zero value in DATA_FREE does not mean that OPTIMIZE TABLE will remove all of it. More on this later.

Note that the value of DATA_FREE may not be accurate for partitioned tables. So, instead of using INFORMATION_SCHEMA.TABLES, use INFORMATION_SCHEMA.PARTITIONS. You may want to SUM the DATA_FREE to get unused bytes for all table partitions.

Here's an example:

SELECT table_name, SUM(data_free)
FROM information_schema.partitions
WHERE table_schema = 'db_name'
GROUP BY table_name;

Another alternative to finding fragmented tables is using SHOW TABLE STATUS. But there are many columns you don't need to see. So, you can stick to the queries above.

You can also check the entire MySQL server for fragmentation. Here's a sample:

SELECT table_schema AS database_name, table_name, data_length, data_free
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql','performance_schema','sys','information_schema')
AND DATA_FREE > 0
ORDER BY table_schema, table_name, data_free DESC;

It displays the database name, table name, data length, and data free. It also avoids system databases like sysinformation_schema, and more. Moreover, it avoids tables with zero DATA_FREE or non-fragmented tables.

Here's a sample output in my machine using dbForge Studio for MySQL.

Now that you know how to find fragmented tables, it's time to optimize them.

How to Use MySQL OPTIMIZE TABLE

The syntax for MySQL OPTIMIZE TABLE is this:

OPTIMIZE TABLE , [table2], [table3], [tableN]

You can optimize one or more tables in one go. But you need at least 1 table to make it work.

For InnoDB, OPTIMIZE TABLE calls ALTER TABLE FORCE. Using more tables calls this ALTER TABLE for all the tables in succession. OPTIMIZE TABLE is the shorter option, especially for more than 1 table to optimize.

OPTIMIZE TABLE Example

Let's use the students3 table from the image in the previous section. And let's optimize it using OPTIMIZE TABLE command. The DATA_FREE is 5,242,880 bytes.

Observe How DATA_FREE Updates First

Let's try deleting some data first to fragment the table even more. Then, observe what will happen to DATA_FREE.

DELETE FROM students3
WHERE student_id BETWEEN 500 AND 1500;

This deleted 1000 rows. That's 50% of all rows. After this, I re-run the query below.

SELECT table_schema AS database_name, table_name, data_length, data_free
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql','performance_schema','sys','information_schema')
AND DATA_FREE > 0
ORDER BY table_schema, table_name, data_free DESC;

Guess what? Nothing has changed. What does this mean?

INFORMATION_SCHEMA.TABLES does not update in real time. To update this at any given time, use ANALYZE TABLE. So, that's what I did:

ANALYZE TABLE students3;

Then, when I view the DATA_LENGTH and DATA_FREE, here's the result:

There's a significant jump in the DATA_FREE value. That's from more than 5 million to more than 36 million. And the DATA_LENGTH lowers from more than 66 million to more than 35 million. That's understandable since we deleted 50% of all rows. The pages occupied by the deleted records are now free. Notice also that DATA_FREE is more than DATA_LENGTH.

Optimize the Table

Now, let's optimize it.

OPTIMIZE TABLE students3;

And the result? Check out below:

There's nothing wrong with it. This is the expected result for InnoDB tables. For MyISAM tables, expect only the output in Line 2 above.

Then, after running ANALYZE TABLE, here's the new result for DATA_FREE for the students3 table.

DATA_FREE is not zero but significantly lowered from more than 36 million to more than 4 million.

But why not zero? According to the official documentation, InnoDB only fills pages 93% full. This will allow room for updates without having to split pages. So, expect some level of fragmentation.

So, in the end, you cannot fully defragment a fragmented MySQL table. As long as the pages where data and indexes live don't have gaps, queries will run faster. MySQL will use the remaining small unused space for other write operations.

Optimizing Tables with dbForge Studio

The code completion in dbForge Studio for MySQL is already amazing. But there's also a wizard to optimize and analyze tables in MySQL. Under the Table Maintenance tool, you can do 5 things to a table or tables. Two of them are OPTIMIZE and ANALYZE, which does the same thing we did earlier but easier.

STEP 1: Access the Table Maintenance Tool

Right-click your desired table in Database Explorer. Then, select Table Maintenance. See it below:

Or, from the drop-down menu by clicking Database -> Table Maintenance.

The next step uses the Table Maintenance coming from the Database Explorer.

STEP 2: Select More Tables (Optional)

If you want to maintain more tables, mark checked your desired tables from the list. Then click Next. See the screenshot below.

STEP 3: Select Operation

From here, you can select Optimize or Analyze. Then, click Execute. See the screenshot below with Optimize selected.

Then, a SQL window will appear and optimize the selected tables. See the sample below.

So, the wizard is a shortcut to typing all those texts, and running it.

You can try using Analyze with the same steps.

Takeaways

Fragmentation causes a table to have gaps in the pages occupied by data and indexes. So, running a query on a fragmented table becomes slow. OPTIMIZE TABLE removes these gaps by reorganizing the pages. When the pages are closer together, query performance becomes faster again.

This is easy to do in the command line. And even easier when using GUI tools like dbForge Studio for MySQL.

Would you like to try Devart's trusty tool dbForge Studio for MySQL? Its top-notch code completion, query profiler, and more makes this my tool of choice for MySQL. It can be for you too.

About the author

Edwin Sanchez  is a brand ambassador of Devart, software developer and project manager with a total of 20+ years of software development. His most recent technology preferences include C#, SQL Server BI Stack, Power BI, and Sharepoint. Edwin combines his technical knowledge with his most recent content writing skills to help new breed of technology enthusiasts.

Last Updated on June 21, 2023 1:39 pm CEST by Markus Kasanmascheff

Recent News