Wednesday, November 26, 2008

[mssql] Performing a case-sensitive search and replace in SQL 2000/2005

If you need to perform a case sensitive search and/or replace in an SQL 2000 or SQL 2005 database you need to use the correct collation method. In the situation I had today I needed to change some copy throughout a website, all of which is in a string resource table, but I had to be careful to maintian case used, i.e: ‘Shipping’ had to change to ‘Delivery’, but ’shipping’ had to change to ‘delivery’.

Your database may have been set up to use a case-sensitive collation method, or it may not have, or you may not have been involved in setup. I don’t know which collation method is the smartest in the world to use - I’m not a DBA - but here’s how to find out which collation you’re working with.

Execute this query in a query analyser:

select charindex('If the result is 0 you are in a case-sensitive collation mode', 'RESULT')

A 0 return result means you’re working with a case-sensitive collation mode. This means you can write your search/replace query (I’ll give an example below) without specifying the collation method you want to use and your query will consider ‘RESULT’ different to ‘result’.

If you DO have to specify a collation method you just have to declare it after the column name you’re interested in. Here’s an example:

update StringResource
set ConfigValue = replace(ConfigValue COLLATE Latin1_General_BIN, 'Shipping', 'Delivery')
from StringResource
where charindex('Shipping', configvalue COLLATE Latin1_General_BIN) > 0

update StringResource
set ConfigValue = replace(ConfigValue COLLATE Latin1_General_BIN, 'shipping', 'delivery')
from StringResource
where charindex('shipping', configvalue COLLATE Latin1_General_BIN) > 0

This query replaces the word ‘Shipping’ with ‘Delivery’ in the ConfigValue column in the StringResource table. There are two seperate statements, one for each case I’m replacing, because I need to specify the casing I’m concerned with explicitly. The important part of the collation type is the _BIN part, which specifies that I want to compare strings as binary data. More information about aspects of collation sorting here. This method may not be the smartest in the world so I would appreciate any comments.