General best practices

This page provides best practices for getting the best performance, durability, and availability from Cloud SQL.

If issues occur with your Cloud SQL instance, review the following during troubleshooting:

Instance configuration and administration

Best practice More information
Read and follow the operational guidelines to ensure that your instances are covered by the Cloud SQL SLA.
Configure a maintenance window for your primary instance to control when disruptive updates can occur. See Maintenance window.
If you delete and recreate instances regularly, use a timestamp in the instance ID to increase the likelihood that new instance IDs are usable. You cannot reuse the ID of a deleted instance for a few days after the instance is deleted.
Don't start an administrative operation before the previous operation has completed.

Cloud SQL instances do not accept a new operation request until they have completed the previous operation. If you attempt to start a new operation prematurely, the operation request fails. This includes instance restarts.

The instance status in the Cloud Console does not reflect whether an operation is running. The green check mark denotes only that the instance is in the RUNNABLE state. To see whether an operation is running, go to the Operations tab and check the status of the most recent operation.

Set SQL Server settings so that they work optimally for Cloud SQL. See SQL Server settings.
Tune the instance optimally for test runs. The following table lists configuration values suitable for test runs.
  • vCPU: 40
  • Memory: 262144 MB
  • MAXDOP: 8
  • Cost threshold for parallelism: 120
  • tempdb files: 8. Pre-sized to prevent autogrowth.
  • User database files: Autogrow set in 64-128 MB. Presized to prevent autogrowth.
  • Storage: >= 4TB for the best IOPS
Determine the capacity of the I/O subsystem before you deploy SQL Server.

Test a variety of I/O types and sizes. The size of the I/O issued to the persistent disk storage coming from SQL Server affects the IOPS and throughput. The SQL Server workload is throttled when it reaches the IOPS limit or the throughput limit. The storage type being used in Cloud SQL is PD SSD, which is suitable for high-performance enterprise-level workloads.

Customize the VM to maximize performance as follows:

  • A disk size of 4TB or greater provides more throughput and IOPS.
  • Higher vCPU provides more IOPS and throughput. When using higher vCPU, monitor the DB waits for parallelism, which might also increase.
  • For optimal performance, issue I/O in parallel to achieve a higher I/O queue depth.
Prevent index fragmentation. Reorganize your index or set up a schedule to rebuild your index depending on how often your data is changing.
Update statistics regularly. If statistics are outdated, SQL query optimizer might generate suboptimal query plans. Update statistics especially after large amounts of data have been changed.
Prevent database files from becoming unnecessarily large.

Set autogrow in MBs rather than as a percentage, using increments appropriate to the requirement. Also, proactively manage the growth before autogrowth kicks in.

Additionally, ensure that the Cloud SQL Enable automatic storage increases feature is enabled so that Cloud SQL can add storage space if the database and the instance run out of space.

Data architecture

Best practice More information
Shard your instances where possible. When possible, using many smaller Cloud SQL instances is better than one large instance. Managing a large, monolithic instance presents challenges not posed by a larger number of smaller instances.
Don't use too many database tables.

Too many database tables can impact instance response time.

Application implementation

Best practice More information
Use good connection management practices, such as connection pooling and exponential backoff. Using these techniques improves your application's use of resources and help you stay within Cloud SQL connection limits. For more information and code samples, see Managing database connections.
Test your application's response to maintenance updates, which can happen at any time during the maintenance window. Changing the machine type of an instance is the closest approximation of a maintenance update. Make sure the application attempts to reconnect to the database, preferably using exponential backoff, for at least 10 minutes to ensure the application will resume operation after a maintenance event. For more information, see Managing database connections.
Test your application's response to failovers, which can happen at any time. You can manually initiate a failover using the Cloud Console, the gcloud, or the API. See Initiating failover.
Avoid large transactions. Keep transactions small and short. If a large database update is needed, do it in several smaller transactions rather than one large transaction.
If you are using the Cloud SQL Auth proxy, make sure you are using the most up-to-date version. See Keeping the Cloud SQL Auth proxy up-to-date.

Data import and export

Best practice More information
Speed up imports for small instance sizes. For small instances, you can temporarily increase the CPU and RAM of an instance to improve performance when importing large datasets.
If you are exporting data for import into Cloud SQL, be sure to use the proper procedure. See Exporting data from an externally managed database server.

Backup and recovery

Best practice More information
Protect your data with the appropriate Cloud SQL functionality.

Backups and exports are ways to provide data redundancy and protection. They each protect against different scenarios and complement each other in a robust data protection strategy.

Backups are lightweight; they provide a way to restore the data on your instance to its state at the time you took the backup. However, backups have some limitations. If you delete the instance, the backups are also deleted. You can't back up a single database or table. And if the region where the instance is located is unavailable, you cannot restore the instance from that backup, even in an available region.

Exports take longer to create, because an external file is created in Cloud Storage that can be used to recreate your data. Exports are unaffected if you delete the instance. In addition, you can export only a single database or even table, depending on the export format you choose.

When using the export backup feature on an Enterprise or Standard SQL Server instance, avoid creating a GZ archive file because it tries to compress a backup that's already natively compressed by SQL Server.

SQL Server settings

Some SQL Server settings are recommended for Cloud SQL. The following topics describe some recommendations.

Global configuration setting

Setting Recommendation
max worker threads Retain the default value of 0. This setting defines the number of threads available to SQL Server based on the number of CPUs. The value is automatically calculated by the SQL Server engine at startup.

Database settings to modify

For optimal performance of the SQL Server database, set the following SQL Server settings as suggested below.

Setting Recommendation
cost threshold for parallelism

This is the threshold at which the SQL optimizer executes a query using parallelism. The default value of 5 can cause too many queries to run in parallel, thereby increasing database wait time on parallel threads. To reduce this type of contention, increase the value.

The value is ignored when maxdop is set to 1.

max degree of parallelism (MAXDOP)

To reduce database waits due to parallelism, adjust this value based on specific recommendations pertaining to the number of logical processors available.

optimize for ad hoc workloads

Avoid having a large number of single-use plans in the plan cache. To improve the efficiency of the plan cache for workloads that contain many single use ad hoc batches, set this option to 1.

tempdb

Pre-size tempdb so that it does not need to autogrow. All files in tempdb should be equally sized and have the same file growth set.

The database wait type for tempdb contention appears as PAGELATCH_UP. To reduce the contention, add more files.

If the number of processors is less than or equal to 8, use the same number of files as logical processors. If the number of processors is greater than 8, use 8 data files. If contention continues, increase the number of files by multiples of 4 until there is no further contention.

Depending on your workload, you might want to modify the following settings as well.

Setting Recommendation
Close Cursor on Commit Enabled The default value is off, which means that cursors are not closed automatically when you commit a transaction.
Default Cursor This option controls the scope of a cursor that is used in T-SQL code. If you change this setting, evaluate the application code for any adverse effects.
Page Verify This option allows SQL Server to calculate a checksum for a database page before it is written to disk and store the checksum in the page header. When a page is read again, the checksum is recomputed to verify the integrity of the page. The recommended value is checksum.
Parameterization The default value is simple. Simple parameterization allows SQL Server to replace literal values in a query with parameters. Microsoft provides guidelines about how to change this value and use it with plan guides.

Database settings to retain

For optimal performance of the SQL Server database, retain the default values of the following SQL Server settings.

Setting Default value to be retained
Auto Close False. This setting, when on, opens and closes connections and flushes the procedure after each connect. This can cause performance degradation in databases that are frequently accessed.
Auto Shrink False. Turning it on can lead to database and index fragmentation and other performance issues, some of which are discussed in this SQL Server blog.
Date Correlation Optimization Enabled False. Enabling it can let the optimizer find and optimize relationships between dates across two related tables. Tracking this in SQL Server comes with some performance overhead.
Legacy Cardinality Estimation False. In some cases, SQL Server cannot accurately calculate cardinalities when this setting is enabled.
Parameter Sniffing ON. Parameter sniffing from database tables can help create execution plans for reuse. If the tables have unevenly distributed data, the resultant execution plans might lead to performance issues. With such data, use other options from the Query Store rather than modify this setting.
Query Optimizer Fixes False. When enabled, it can affect the performance of the SQL Server cardinality estimator. If you choose to enable it, test to ensure that there is no query regression.
Auto Create Statistics True. This option allows SQL Server to create single-column statistics that can improve cardinality estimates for query plans.
Auto Update Statistics True. This option allows SQL Server to update out-of-date statistics using a re-compilation threshold that is based on table cardinality.
Auto Update Statistics Asynchronously False. This option, when enabled, directs the SQL query optimizer to use the stale statistics for the current query execution, while updating the statistics asynchronously to benefit future workloads.

However, if you expect predictable response time for a frequently executed query or if your application frequently experiences client request timeouts while awaiting statistics updates, consider enabling this option and disabling Auto Update Statistics.

Target Recovery Time (Seconds) 60. This setting establishes an upper bound on the recovery time for a database by flushing dirty pages more or less frequently to the disk from the buffer pool. For highly transactional workloads, a lower value for this setting, combined with the storage IOPS near the maximum value, can contribute to a performance bottleneck.

Trace flag settings

Trace flags in SQL Server are used to set certain characteristics, alter the behavior of SQL Server databases, or debug issues in SQL Server.

Some SQL Server trace flags are supported in Cloud SQL and can be set using database flags. The recommended settings are as follows.

Trace flag Recommended
1204 Yes, except for workload-intensive servers that generate a lot of deadlocks.
1222 Yes, except for workload-intensive servers that generate a lot of deadlocks.
1224 No. This can result in more memory usage and cause memory pressure on the database.
2528 No. Parallel checking of objects is the default and is recommended. The degree of parallelism is automatically calculated by the database engine.
3205 No. Tape drives for backups is a feature of Cloud SQL for SQL Server.
3226 No, unless you need frequent backups, such as TLOG backups.
3625 No. Because the root account does not have system administrator access, it might not be able to see all error messages.
4199 No. This affects the cardinality estimator and can lead to query regression.
4616 No. This restriction lowers the security around application roles. It needs to be validated based on application requirements.
7806 Yes. If the database server becomes unresponsive, the dedicated admin connection (DAC) might be the only way to make a connection for diagnostics.