10 sql tips to speed up your database
Design your database with caution
This first tip may seems obvious, but the fact is that most database problems come from badly-designed table structure.
For example, I have seen people storing information such as client info and payment info in the same database column. For both the database system and developers who will have to work on it, this is not a good thing.
When creating a database, always put information on various tables, use clear naming standards and make use of primary keys.
Source: http://www.simple-talk.com/sql/database-administration/ten-common-database-design-mistakes/
Know what you should optimize
If you want to optimize a specific query, it is extremely useful to be able to get an in-depth look at the result of a query. Using the EXPLAIN statement, you will get lots of useful info on the result produced by a specific query, as shown in the example below:
EXPLAIN SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column;
Source: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
The fastest query… Is the one you don’t send
Each time you’re sending a query to the database, you’re using a bit of your server resources. This is why, on high traffic sites, the best thing you can do in order to speed up your database is to cache queries.
There’s lots of solutions to implement a query cache on your server. Here are a few:
- AdoDB: AdoDB is a database abstraction library for PHP. It allows you to use the database system of your choice (MySQL, PostGreSQL, Interbase, and way much more) and it is designed for speed. AdoDB provides a simple, yet powerful caching system. And last but not least, AdoDB is licenced under the BSD, which means that you can use freely on your projects. A LGPL licence is also available for commercial projects.
- Memcached: Memcached is a distributed memory caching system which is often used to speed up dynamic database-driven websites by alleviating database load.
- CSQL Cache: CSQL Cache is an open-source data caching infrastructure. Never tested it personally, but it seems to be a great tool.
Don’t select what you don’t need
A very common way to get the desired data is to use the * symbol, which will get all fields from the desired table:
SELECT * FROM wp_posts;
Instead, you should definitely select only the desired fields as shown in the example below. On a very small site with, let’s say, one visitor per minute, that wouldn’t make a difference. But on a site such as Cats Who Code, it saves a lot of work for the database.
SELECT title, excerpt, author FROM wp_posts;
Use LIMIT
It’s very common that you need to get only a specific number of records from your database. For example, a blog which is showing ten entries per page. In that case, you should definitely use the LIMIT parameter, which only selects the desired number of records.
Without LIMIT, if your table has 100,000 different records, you’ll extract them all, which is unnecessary work for your server.
SELECT title, excerpt, author FROM wp_posts LIMIT 10;
Avoid queries in loops
When using SQL along with a programming language such as PHP, it can be tempting to use SQL queries inside a loop. But doing so is like hammering your database with queries.
This example illustrates the whole “queries in loops” problem:
foreach ($display_order as $id => $ordinal) { $sql = "UPDATE categories SET display_order = $ordinal WHERE id = $id"; mysql_query($sql); }
Here is what you should do instead:
UPDATE categories SET display_order = CASE id WHEN 1 THEN 3 WHEN 2 THEN 4 WHEN 3 THEN 5 END WHERE id IN (1,2,3)
Use join instead of subqueries
As a programmer, subqueries are something that you can be tempted to use and abuse. Subqueries, as show below, can be very useful:
SELECT a.id, (SELECT MAX(created) FROM posts WHERE author_id = a.id) AS latest_post FROM authors a
Although subqueries are useful, they often can be replaced by a join, which is definitely faster to execute.
SELECT a.id, MAX(p.created) AS latest_post FROM authors a INNER JOIN posts p ON (a.id = p.author_id) GROUP BY a.id
Source: http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/
Be careful when using wildcards
Wildcards are very useful because they can substitute for one or more characters when searching for data in a database. I’m not saying that you shouldn’t use them, but instead, you should use them with caution and not use the full wildcard when the prefix or postfix wildcard can do the same job.
In fact, doing a full wildcard search on a million records will certainly kill your database.
#Full wildcard SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%'; #Postfix wildcard SELECT * FROM TABLE WHERE COLUMN LIKE 'hello%'; #Prefix wildcard SELECT * FROM TABLE WHERE COLUMN LIKE '%hello';
Source: http://hungred.com/useful-information/ways-optimize-sql-queries/
Use UNION instead of OR
The following example use the OR statement to get the result:
SELECT * FROM a, b WHERE a.p = b.q or a.x = b.y;
The UNION statement allows you to combine the result sets of 2 or more select queries. The following example will return the same result that the above query gets, but it will be faster:
SELECT * FROM a, b WHERE a.p = b.q UNION SELECT * FROM a, b WHERE a.x = b.y
Source: http://www.bcarter.com/optimsql.htm
Use indexes
Database indexes are similar to those you can find in libraries: They allow the database to find the requested information faster, just like a library index will allow a reader to find what they’re looking for without loosing time.
An Index can be created on a single column or a combination of columns in a database table. A table index is a database structure that arranges the values of one or more columns in a database table in specific order.
The following query will create an index on the Model column from the Product table. The index is called idxModel:
CREATE INDEX idxModel ON Product (Model);
Source: http://www.sql-tutorial.com/sql-indexes-sql-tutorial/
Have you checked out the highly recommended Digging into WordPress book by Chris Coyier and Jeff Starr?