One of the keys to database performance if keeping your transactions as short as possible. In this article we will look at a couple of tricks using the CASE statement to perform multiple updates on a table in a single operation.
Multiple updates to a single column
This example uses the pubs database to adjust book prices for a sale by different amounts according to different criteria. In the example I am going to knock 25% off all business books from any publisher, and 10% off any non-business books from a particular publisher. You might be tempted to wrap two separate update statements into one transaction like this:
update titles set ...
update titles set ...
The down side of this technique is that it will read through the table twice, once for each update. If we code our update like the example below, then the table will only need to be read once. For large tables, this can save us a lot of disk IO, especially if the query requires a table scan over a long table
set price =
when type = "business"
then price * 0.75
when pub_id = "0736"
then price * 0.9
where pub_id = "0736" OR
type = "business"
Note that there is a definite "top-down" priority involved in the CASE statement. For business books from publisher 0736 the "business" discount will apply because this is the first condition in the list to be fulfilled. However, we will not give a further 10% publisher discount, even though the criteria for the second "when" clause is satisfied, because the CASE statement only evaluates criteria until it finds the first one that fits.
We can use the CASE statement to update multiple columns in a table, even using separate update criteria for each column. This example updates the publishers table to set the state column to "--" for non-USA companies, and changes the city for one particular publisher, all in one table read operation.
state = case
when country <> "USA"
city = case
when pub_id = "9999"
where country <> "USA" OR
pub_id = "9999"
The same format will work for updates across three or more rows with different update criteria.
You may come across fewer opportunities to use this second technique efficiently. This query will almost invariably result in a table scan because we are selecting on multiple columns that are unlikely to all be in a covering index. If each column is updated only a small number of times, and is indexed, it may still be more efficient to do separate updates.
A good place to use this technique might be in cleaning up multiple columns in a long interface file from another system.
Because we are using two separate case statements, one for each test criteria/update, each case statement will be evaluated for every row, and updates applied where required. Therefore if more than one column in the row requires an update, they will all be updated.
Two things are particularly important to remember in this example:
- The else [column] clause is required for each case statement used, otherwise you will end up nulling-out data you do not want to.
- The where clause at the end must be used to restrict the update to rows that require at least one column updating, otherwise every column in the table will be updated, increasing both execution time and pressure on the transaction log.