SQL performance is a key issue for every DBA. This article introduces five ways to improve it.
1. When the argument is a subquery, use EXISTS instead of IN
When you pass a subquery as the argument, prefer EXISTS to IN. The two forms return the same result, but EXISTS is often faster.
- Use IN
SELECT * FROM Class_A WHERE id IN (SELECT id FROM Class_B);
- Use EXISTS
SELECT * FROM Class_A A WHERE EXISTS
(SELECT * FROM Class_B B WHERE A.id = B.id);
Why EXISTS tends to be faster:
- If the join key (id) is indexed, the subquery can look up Class_B through the index instead of reading the table itself.
- EXISTS stops as soon as it finds one matching row, whereas IN may have to evaluate the subquery over the whole table.
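Before swapping IN for EXISTS, it helps to confirm that both forms return the same rows. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the column `name` and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Class_A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Class_B (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO Class_A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Class_B VALUES (2, 'Bob'), (3, 'Cid'), (4, 'Dee');
""")

# Form 1: IN with a subquery.
in_rows = sorted(conn.execute(
    "SELECT * FROM Class_A WHERE id IN (SELECT id FROM Class_B)"
).fetchall())

# Form 2: EXISTS with a correlated subquery.
exists_rows = sorted(conn.execute(
    "SELECT * FROM Class_A A WHERE EXISTS "
    "(SELECT * FROM Class_B B WHERE A.id = B.id)"
).fetchall())

print(in_rows)
print(exists_rows)
```

Equal results only show that the rewrite is safe. Whether it is actually faster depends on the DBMS and the data, so compare the execution plans on your own system.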
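2. Avoid sorting
The DBMS often sorts rows internally even though the query never asks for it explicitly, so you need to know which operations trigger a sort (typically GROUP BY, ORDER BY, DISTINCT, aggregate functions, and set operators such as UNION) and avoid them where you can.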
- Use the ALL option
UNION removes duplicate rows, which typically requires a sort. If duplicates are acceptable, or you know there cannot be any, use UNION ALL instead of UNION.
1.
SELECT * FROM CLASS_A
UNION
SELECT * FROM CLASS_B
2.
SELECT * FROM CLASS_A
UNION ALL
SELECT * FROM CLASS_B
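The difference between the two statements shows up as soon as the tables share rows. This is a small sketch using Python's sqlite3 module as an assumed test bed; the `name` column and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Class_A (id INTEGER, name TEXT);
    CREATE TABLE Class_B (id INTEGER, name TEXT);
    -- Rows 2 and 3 appear in both tables.
    INSERT INTO Class_A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Class_B VALUES (2, 'Bob'), (3, 'Cid'), (4, 'Dee');
""")

# UNION removes the duplicated rows.
union_rows = conn.execute(
    "SELECT * FROM Class_A UNION SELECT * FROM Class_B"
).fetchall()

# UNION ALL simply concatenates both result sets.
union_all_rows = conn.execute(
    "SELECT * FROM Class_A UNION ALL SELECT * FROM Class_B"
).fetchall()

print(len(union_rows), "rows with UNION")
print(len(union_all_rows), "rows with UNION ALL")
```

UNION ALL is only a drop-in replacement when the extra rows do not matter or cannot occur, which is why the row counts are worth checking first.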
- Use EXISTS rather than DISTINCT
DISTINCT removes duplicate rows, which typically requires a sort. When you join two tables only to check that a matching row exists, use EXISTS instead of DISTINCT to avoid the sort.
For example,
1. SELECT DISTINCT I.item_no FROM Items I INNER JOIN SalesHistory SH ON I.item_no = SH.item_no;
2. SELECT item_no FROM Items I WHERE EXISTS
(SELECT * FROM SalesHistory SH WHERE I.item_no=SH.item_no);
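To see why DISTINCT is needed in the join version but not in the EXISTS version, the sketch below runs both against made-up data. It assumes Python's sqlite3 module; the columns other than `item_no` and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (item_no INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE SalesHistory (sale_date TEXT, item_no INTEGER, quantity INTEGER);
    INSERT INTO Items VALUES (10, 'pen'), (20, 'ink'), (30, 'pad');
    -- Item 10 was sold twice, item 30 never.
    INSERT INTO SalesHistory VALUES
        ('2010-10-1', 10, 4), ('2010-10-2', 10, 6), ('2010-10-2', 20, 7);
""")

# A plain join repeats item 10 once per sale.
join_rows = conn.execute(
    "SELECT I.item_no FROM Items I "
    "INNER JOIN SalesHistory SH ON I.item_no = SH.item_no"
).fetchall()

# DISTINCT removes the repeats after the join.
distinct_rows = sorted(conn.execute(
    "SELECT DISTINCT I.item_no FROM Items I "
    "INNER JOIN SalesHistory SH ON I.item_no = SH.item_no"
).fetchall())

# EXISTS never produces the repeats in the first place.
exists_rows = sorted(conn.execute(
    "SELECT item_no FROM Items I WHERE EXISTS "
    "(SELECT * FROM SalesHistory SH WHERE I.item_no = SH.item_no)"
).fetchall())

print(join_rows)
print(distinct_rows)
print(exists_rows)
```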
3. Let MAX and MIN use an index
The MAX and MIN functions both require a sort, but if the argument column is indexed, the DBMS can seek the value in the index instead of scanning the whole table.
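One way to observe this effect is to compare the query plan before and after creating an index. The sketch below assumes Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the index name `idx_quantity` and the sample rows are hypothetical, and the plan wording differs between SQLite versions and between DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesHistory (sale_date TEXT, item_no INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesHistory VALUES (?, ?, ?)",
    [("2010-10-1", 10, 4), ("2010-10-2", 10, 6), ("2010-10-2", 20, 7)],
)

def plan(sql):
    """Return the 'detail' text of each step in SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT MAX(quantity) FROM SalesHistory"

# Plan while the column has no index.
plan_without_index = plan(query)

# Hypothetical index on the aggregated column.
conn.execute("CREATE INDEX idx_quantity ON SalesHistory (quantity)")
plan_with_index = plan(query)

max_qty = conn.execute(query).fetchone()[0]
min_qty = conn.execute("SELECT MIN(quantity) FROM SalesHistory").fetchone()[0]

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
print(max_qty, min_qty)
```

The printed plans are the part to read: a plan that mentions the index suggests the DBMS no longer walks the whole table to find the extreme value.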
4. Write conditions in WHERE rather than HAVING
1.
SELECT sale_date, SUM(quantity) FROM SalesHistory
GROUP BY sale_date
HAVING sale_date = '2010-10-2';
2.
SELECT sale_date, SUM(quantity) FROM SalesHistory
WHERE sale_date = '2010-10-2' GROUP BY sale_date;
The second query (WHERE) performs better, for two reasons. First, a condition in WHERE can use an index on sale_date, whereas HAVING is applied to the aggregated result, where the index no longer helps. Second, GROUP BY sorts or hashes its input, so filtering rows out beforehand reduces the work of that step.
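The two queries can be checked for equivalence on a small data set. This is a sketch under assumptions: it uses Python's sqlite3 module, stores sale_date as text in the same format as the article's literal, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesHistory (sale_date TEXT, item_no INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesHistory VALUES (?, ?, ?)",
    [
        ("2010-10-1", 10, 3),
        ("2010-10-2", 10, 4),
        ("2010-10-2", 20, 6),
        ("2010-10-3", 20, 7),
    ],
)

# Filter after grouping: every date is aggregated, then all but one is dropped.
having_rows = conn.execute(
    "SELECT sale_date, SUM(quantity) FROM SalesHistory "
    "GROUP BY sale_date HAVING sale_date = '2010-10-2'"
).fetchall()

# Filter before grouping: only the rows for one date reach GROUP BY.
where_rows = conn.execute(
    "SELECT sale_date, SUM(quantity) FROM SalesHistory "
    "WHERE sale_date = '2010-10-2' GROUP BY sale_date"
).fetchall()

print(having_rows)
print(where_rows)
```

This rewrite applies only to conditions on the grouping columns. A condition on an aggregate such as SUM(quantity) still has to stay in HAVING.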
5. Reduce intermediate tables
An intermediate table consumes memory, and indexes on the original table are hard to use against it. Minimizing intermediate tables is therefore a key to better query performance.
- Use HAVING
1. Filtering the aggregate through a derived table (TMP) builds an intermediate table:
SELECT * FROM (SELECT sale_date, MAX(quantity) AS max_qty FROM SalesHistory
GROUP BY sale_date) TMP WHERE max_qty >= 10;
2. Filtering with HAVING avoids the intermediate table:
SELECT sale_date, MAX(quantity) FROM SalesHistory
GROUP BY sale_date HAVING MAX(quantity) >= 10;
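As a quick check that dropping the derived table does not change the answer, the sketch below runs both versions side by side. It assumes Python's sqlite3 module, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesHistory (sale_date TEXT, item_no INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesHistory VALUES (?, ?, ?)",
    [
        ("2010-10-1", 10, 3),
        ("2010-10-1", 20, 8),
        ("2010-10-2", 10, 12),
        ("2010-10-3", 20, 10),
    ],
)

# Version with a derived table (TMP) wrapped around the aggregate.
derived_rows = sorted(conn.execute(
    "SELECT * FROM (SELECT sale_date, MAX(quantity) AS max_qty "
    "FROM SalesHistory GROUP BY sale_date) TMP WHERE max_qty >= 10"
).fetchall())

# Version that filters the aggregate directly with HAVING.
having_rows = sorted(conn.execute(
    "SELECT sale_date, MAX(quantity) FROM SalesHistory "
    "GROUP BY sale_date HAVING MAX(quantity) >= 10"
).fetchall())

print(derived_rows)
print(having_rows)
```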
These five tips can improve overall SQL performance, but the most important step is to find the root cause of a slow query and fix that.