Monday, November 16, 2009

[mssql] REPLACE Multiple Spaces with One

Introduction

Replacing multiple spaces with a single space is an old problem. If you Google the problem, you find that most folks still resort to While Loops in functions or maybe even a Tally table or (ugh!) XML in a function to solve this seemingly complex problem. The truth is that you don't need the RBAR of a User Defined Function at all.

This article explains the "puzzle solving" behind a common set based method to replace unknown numbers of adjacent spaces with a single space by establishing and replacing simple patterns in a set based fashion.

The Problem

You have a column of data that looks something like the following...

--===== Create and populate a test table.
     -- This is NOT a part of the solution.
DECLARE @Demo TABLE(OriginalString VARCHAR(8000))
 INSERT INTO @Demo (OriginalString)
 SELECT '  This      has multiple   unknown                 spaces in        it.   ' UNION ALL
 SELECT 'So                     does                      this!' UNION ALL
 SELECT 'As                                does                        this' UNION ALL
 SELECT 'This, that, and the other  thing.' UNION ALL
 SELECT 'This needs no repair.'
The goal is to convert the sections of multiple spaces of unknown length to a single space each as quickly as possible. There are some additional requirements. You can't make the data any bigger during the process because it might get too big for VARCHAR(8000) (or whatever size the column is) or you can't use VARCHAR(MAX) because you're using SQL Server 2000.

The Method Explained

I'll use the letter "O" to act as a "visible" space so you can see what I'm talking about... consider the following sets of spaces, please....
O
OO
OOO
OOOO
OOOOO
OOOOOO
OOOOOOO
OOOOOOOO

Remember the goal is to convert all of those sets of spaces to just a single space without the use of RBAR even if the RBAR is through the simple use of a UDF. I'll also state that the goal is to remove the extra spaces without making the original string any larger in the process because it may already be as large as it can be for the given datatype.

So... STEP 1 is to identify pairs of spaces. This is done by modifying the second space in each pair of spaces to be an "unlikely" character. In this case, I'll use the visible character of "X" (which isn't unlikely but serves this visual example) to represent a space that has been changed in a pair of spaces.

When we replace all pairs of space "OO" with "OX", we get the following
O
OX
OXO
OXOX
OXOXO
OXOXOX
OXOXOXO
OXOXOXOX

STEP 2 is to replace all occurrences of "XO" with NOTHING...
O
OX
OXO
OXOX
OXOXO
OXOXOX
OXOXOXO
OXOXOXOX

... and that leaves us with ...
O
OX
O
OX
O
OX
O
OX

STEP 3 is to replace "X" with NOTHING...
O
OX
O
OX
O
OX
O
OX

... and that leaves us with just singles spaces everywhere...
O
O
O
O
O
O
O
O

Again... the "O"s represent unmodified spaces and the "X"s represent spaces changed to some "unlikely character" like a special non printable, almost non type-able character like ASCII 7 (the "Bell" character). It's just a matter of 3 nested REPLACE functions to handle ANY number of spaces to accomplish the puzzle we solved above. It can all be done in a single set-based query without loops or even UDF's.

Be careful which "unlikely character" you pick, though. We'll talk more about that when we get to the "Unlikely Characters and Collation" section of this article further below.

The Code

Ok... now that you know how it works, here's the code that accomplishes the 3 steps as 3 nested REPLACE's. I've included the test table I previously covered in "The Problem" section of this article just to make things easy to run. Notice that I've also added an LTRIM/RTRIM to take out any leading or trailing spaces, as well...

--===== Create and populate a test table.
     -- This is NOT a part of the solution.
DECLARE @Demo TABLE(OriginalString VARCHAR(8000))
 INSERT INTO @Demo (OriginalString)
 SELECT '  This      has multiple   unknown                 spaces in        it.   ' UNION ALL
 SELECT 'So                     does                      this!' UNION ALL
 SELECT 'As                                does                        this' UNION ALL
 SELECT 'This, that, and the other  thing.' UNION ALL
 SELECT 'This needs no repair.'

--===== Reduce each group of multiple spaces to a single space
     -- for a whole table without functions, loops, or other
     -- forms of slow RBAR.  In the following example, CHAR(7)
     -- is the "unlikely" character that "X" was used for in 
     -- the explanation.
 SELECT REPLACE(
            REPLACE(
                REPLACE(
                    LTRIM(RTRIM(OriginalString))
                ,'  ',' '+CHAR(7))  --Changes 2 spaces to the OX model
            ,CHAR(7)+' ','')        --Changes the XO model to nothing
        ,CHAR(7),'') AS CleanString --Changes the remaining X's to nothing
   FROM @Demo
  WHERE CHARINDEX('  ',OriginalString) > 0

Because of the equivalent treatment of the Thorn (รพ) character and the "th" digraph, you end up with a most undesirable result...

Results from Code Above
CleanString
is has multiple unknown spaces in it.
So does is!
As does is
is, at, and e oer ing.
... which is nothing like what we wanted.

There's also a danger in selecting the wrong "control character" (ASCII characters 0 through 31) as the "unlikely character". I chose CHAR(7) which is a very benign character in today's world of electronics. It was designed to literally ring the bell on old Tele-Type machines and is just passed through by today's electronics.

If you chose CHAR(0), you've just chosen the "NULL" character and anything that follows it's appearance will simply disappear. Using the current test table in this article, you'd get an output that looks like the following...

Results from Code Above
CleanString
This
So
As
This,that,andtheotherthing.

If you chose CHAR(1), you've just chosen the ASCII "Start of Header" character which is still in use today. When certain electronics see this character, it can cause some very strange behavior usually resulting in failure of your code.

Another good "unlikely character" is CHAR(8) which is the ASCII "Backspace" character. It's normally never included in any type of assembled text now adays. CHAR(127) also works well because it's a left over for "Delete" from the paper tape world. It actually punches all the holes in a paper tape (7 of them) to quite literally delete a character.

For more information on the ASCII "Control Characters", please see the following URL:
http://www.lammertbies.nl/comm/info/ascii-characters.html#cont

Conclusion

To borrow a phrase from R. Barry Young's series of articles on the subject, "There must be 15 ways to lose your cursors". Even the seemingly complex task of condensing multiple adjacent spaces to a single space can be done without a loop.

When you run into a problem where you're absolutely sure there's no way to do it without a loop, have another look. There's usually a high performance way to avoid the loop.
Thanks for listening, folks.

--Jeff Moden

Wednesday, October 21, 2009

[picture] Ctrl+Alt+Del


Have you ever thought of the person who invented "CTRL + ALT + DEL " key combination.

"David Bradley"

He is the One who spent 1 minute and 23 seconds in writing the source code that rescues

The world's PC users for decades This extraordinary IBM employee is retiring on Friday, 25th March 2005 after a prolong service of 29 years.

His formula forces obstinate computers to restart when they no longer follow other commands.

By 1980, Bradley was one of 12 people working to create the debut.

The engineers knew they had to design a simple way to restart the computer When it fails to respond the user Bradley wrote the code to make it work.

Bradley says. "I did a lot of other things than Ctrl-Alt-Delete, but I'm famous for that one."

His fame and success is achieved each time a PC user fails.

He Commented His relationship with Bill gates by saying

"I may have invented it, but Bill gates made it famous by applying my formula When ever any Microsoft's Windows operating system made by him CRASHES,

Thus I win when ever he looses..........."

Thursday, October 8, 2009

[picture] Mouse Inside Cat



[picture] Honesty On The Internet



Tuesday, September 29, 2009

[picture] What Happens to IT Professionals after Death




Wednesday, September 2, 2009

[mssql] How To Get Table Row Counts Quickly And Painlessly

At some point in time we've all had to find out how many rows are in a table. The first answer you'll usually get when you ask someone how to do it is select count(*) from [table], however there are two problems with this approach: First, a table scan is required to figure out the answer; do a count(*) against a million row table and you're looking at hundreds of thousands of reads (and likely several minutes waiting for the result). Second, the count(*) method doesn't work well if you want to know how many rows are in every table in your database.

It just so happens there's a system function in SQL Server that can help solve both of these problems: sp_spaceused (BOL entry here) . When run without any parameters it returns usage information about the current database its being run in; when provided a specific object name (e.g. a table name) it returns the number of rows along with the amount of space used by\allocated for the table and its indexes. Looking under the covers of sp_spaceused reveals that the rowcount information is coming from the sysindexes table on SQL 2000 and the sys.dm_db_partition_stats DMV on SQL 2005\2008. Since the counts are coming from system objects there's no table scan involved - Problem #1 solved!

To solve problem #2 you could use a cursor to iterate through all tables (or the undocumented stored procedure sp_foreachtable), calling sp_spaceused for each table and storing the output in a temporary table...or just query the system objects directly.

Row Counts Using sysindexes
If you're using SQL 2000 you'll need to use sysindexes like so:
-- Shows all user tables and row counts for the current database
-- Remove OBJECTPROPERTY function call to include system objects
SELECT o.NAME,
       i.rowcnt
  FROM sysindexes AS i
   INNER JOIN sysobjects AS o 
     ON o.id = i.id
  WHERE i.indid < 2 AND
        OBJECTPROPERTY(o.id, 'IsMSShipped') = 0
  ORDER BY o.NAME
Row Counts Using DMVs
If you're using SQL 2005 or 2008 querying sysindexes will still work but Microsoft advises that sysindexes may be removed in a future version of SQL Server so as a good practice you should use the DMVs instead, like so:
-- Shows all user tables and row counts for the current database
-- Remove is_ms_shipped = 0 check to include system objects
-- i.index_id < 2 indicates clustered index (1) or hash table (0)
SELECT o.name, ddps.row_count
  FROM sys.indexes AS i
  INNER JOIN sys.objects AS o 
    ON o.OBJECT_ID = i.OBJECT_ID
  INNER JOIN sys.dm_db_partition_stats AS ddps 
    ON ddps.OBJECT_ID = i.OBJECT_ID AND 
       ddps.index_id = i.index_id
  WHERE i.index_id < 2 AND
        o.is_ms_shipped = 0
  ORDER BY o.NAME
Are The Counts Accurate?
Some system objects are only as accurate as the current statistics and occasionally statistics get outdated and need to be refreshed. Fortunately row count information in sysindexes\DMVs does not depend on updated statistics. To put this to the test I disabled the Auto Update Statistics option on a database that sees several thousand updates each day. After several days I compared the counts returned by the select count(*) method and the system objects and they matched perfectly.

Obviously you'll need to revert to the select count(*) method if you need to filter out rows (using a where clause), but for unfiltered row count information there's no excuses not to use the system objects!

Wednesday, August 12, 2009

[vb6] Open Any Type Of File With Its Associated Window's Program

You simply want to be able to press a command button, which will trigger a file (e.g. "c:\media\mysong.mp3"), to be opened with by its associated application (e.g. "Media Player") just as what would happen if I went to the file in windows explorer, and double clicked it.

1. Create module modTools as following:
Attribute VB_Name = "modTools"
'*******************************************************************************
' Name : modTools
' Author : Chandra Gunawan
' Date : 22-Jul-2009
' Description : Tools Library
'
' Maintenance Log
' ==============================================================================
' Date ID Description
' ------------------------------------------------------------------------------
'*******************************************************************************

Option Explicit

'==============================================================================
' CONSTANTS & VARIABLES DEFINITION
'==============================================================================
Private Declare Function ShellExecute Lib "shell32.dll" Alias "ShellExecuteA"_

(ByVal hwnd As Long, ByVal lpOperation As String, ByVal lpFile As String, _
ByVal lpParameters As String, ByVal lpDirectory As String, _
ByVal nShowCmd As Long) As Long

Private Const SW_SHOWNORMAL = 1

'------------------------------------------------------------------------------
' Name : OpenFile
' Author : Chandra Gunawan
' Date : 13-Aug-2009
' Description : Open any type of file with its associated window's program
'------------------------------------------------------------------------------
Public Function OpenFile(ByVal pForm As Form, ByVal pFilePath$) As Long

OpenFile = ShellExecute(pForm.hwnd, "open", pFilePath, vbNullString, vbNullString, SW_SHOWNORMAL)
End Function 'OpenFile

2. Execute OpenFile function inside your button's click-event
Private Sub Command1_Click()
Call OpenFile(Me, "c:\media\mysong.mp3")
End Sub