Monday, February 16, 2009

[t-sql] Can I create an index on a BIT column?

Try this. In Enterprise Manager, right-click the Tables view, and select New Table. Create a BIT column named "MyBitColumn", and then click on the "Manage Indexes / Keys..." button to create a new index:


When you do this, you are offered a "New" button. Click it, and under "Column name" try to select the "MyBitColumn" column. It's not there! Why? Because in this interface, Enterprise Manager does not offer you the ability to create an index on a BIT column. When you try to type the name yourself, you are rewarded with an annoying barrage of error dialogs. Each time you click "No" the first dialog reappears, and every time you try to click OK on the second, the same thing happens.


So, can you create an index on a BIT column? Of course. You can even still do it in Enterprise Manager. Save your table, and then right-click it in the Tables view and choose All Tasks > Manage Indexes... give your index a name, check the "MyBitColumn" column, and hit OK. Voila!


Another way is to create the index using T-SQL, which is the preferred way of generating DDL anyway:
    CREATE TABLE dbo.blat
(
MyBitColumn BIT
)
GO

CREATE INDEX MyBitIndex ON dbo.blat(MyBitColumn)
GO

Now, the question is, do you really WANT an index on a BIT column? We're going to run some experiments, but in general, it is highly unlikely that you will get much use out of such an index. The exception is when the data is heavily weighted towards, say, 1 (e.g. 95-99% of the table), and you are searching for 0. Or vice-versa.

What I'd like to test is the effect on execution time and the execution plan if you run different queries against large tables with a BIT column whose values are evenly distributed, or weighted heavily (in this case 97% - 3%), and in both cases, compare clustered to nonclustered to no index.

So, let's create six different tables in their own database:
    CREATE DATABASE Splunge
GO

USE Splunge
GO

-- 50/50, no index
CREATE TABLE dbo.Test1
(
myBit BIT NOT NULL
)

-- 50/50, nonclustered index
CREATE TABLE dbo.Test2
(
myBit BIT NOT NULL
)

CREATE INDEX bitIndex ON dbo.Test2(myBit)

-- 50/50, clustered index
CREATE TABLE dbo.Test3
(
myBit BIT NOT NULL
)

CREATE CLUSTERED INDEX bitIndex ON dbo.Test3(myBit)

-- 97/3, no index
CREATE TABLE dbo.Test4
(
myBit BIT NOT NULL
)

-- 97/3, nonclustered index
CREATE TABLE dbo.Test5
(
myBit BIT NOT NULL
)

CREATE INDEX bitIndex ON dbo.Test5(myBit)

-- 97/3, clustered index
CREATE TABLE dbo.Test6
(
myBit BIT NOT NULL
)

CREATE CLUSTERED INDEX bitIndex ON dbo.Test6(myBit

And let's populate each with 100,000 rows, the first three will have 50,000 of each value (0,1), and the second three will have 97,000 0's and 3,000 1's.

DECLARE
@i INT,
@ff BIT, -- 50/50 flag
@nf BIT -- 97/3 flag

SELECT
@i = 1,
@ff = 0,
@nf = 0

WHILE @i <= 100000
BEGIN
IF @i > 50000
SET @ff = 1

IF @i > 97000
SET @nf = 1

INSERT dbo.Test1(myBit) SELECT @ff
INSERT dbo.Test2(myBit) SELECT @ff
INSERT dbo.Test3(myBit) SELECT @ff

INSERT dbo.Test4(myBit) SELECT @nf
INSERT dbo.Test5(myBit) SELECT @nf
INSERT dbo.Test6(myBit) SELECT @nf

SET @i = @i + 1
END

On my system, this took roughly seven minutes. Your mileage may vary.

So now that we have the data in there, let's run the following sets of queries.
    SELECT COUNT(*) FROM dbo.Test1 
SELECT COUNT(*) FROM dbo.Test2
SELECT COUNT(*) FROM dbo.Test3
SELECT COUNT(*) FROM dbo.Test4
SELECT COUNT(*) FROM dbo.Test5
SELECT COUNT(*) FROM dbo.Test6

    SELECT COUNT(*) FROM dbo.Test1 WHERE MyBit=0 
SELECT COUNT(*) FROM dbo.Test1 WHERE MyBit=1
SELECT COUNT(*) FROM dbo.Test2 WHERE MyBit=0
SELECT COUNT(*) FROM dbo.Test2 WHERE MyBit=1
SELECT COUNT(*) FROM dbo.Test3 WHERE MyBit=0
SELECT COUNT(*) FROM dbo.Test3 WHERE MyBit=1
SELECT COUNT(*) FROM dbo.Test4 WHERE MyBit=0
SELECT COUNT(*) FROM dbo.Test4 WHERE MyBit=1
SELECT COUNT(*) FROM dbo.Test5 WHERE MyBit=0
SELECT COUNT(*) FROM dbo.Test5 WHERE MyBit=1
SELECT COUNT(*) FROM dbo.Test6 WHERE MyBit=0
SELECT COUNT(*) FROM dbo.Test6 WHERE MyBit=1

If you observe the execution plan and statistics, you will see that those with the table scan (Test1 and Test4) require the least amount of reads and percentage of work.

However, these results change ever so slightly if you are performing grouping and aggregates in the same query:
    SELECT MyBit, COUNT(*) FROM dbo.Test1 GROUP BY MyBit 
SELECT MyBit, COUNT(*) FROM dbo.Test2 GROUP BY MyBit
SELECT MyBit, COUNT(*) FROM dbo.Test3 GROUP BY MyBit
SELECT MyBit, COUNT(*) FROM dbo.Test4 GROUP BY MyBit
SELECT MyBit, COUNT(*) FROM dbo.Test5 GROUP BY MyBit
SELECT MyBit, COUNT(*) FROM dbo.Test6 GROUP BY MyBit

Here we see that the clustered index has a slight edge in query cost, but slightly higher I/O cost:


And in SQL Server 2005, the clustered index scan has an even greater advantage:

So, the answer to this one is yes, you can create a clustered index on a BIT column.

However, as for whether you SHOULD—as with so many other choices—it depends.

Jeff Gray correctly points out that the optimizer will do a little bit better if you explicitly tell it that you are dealing with a BIT (since the engine assumes INT), e.g.:
    WHERE MyBit = CONVERT(BIT, 0) 
--or
WHERE MyBit = CAST(0 AS BIT)

Though that can come with a trade-off as well; namely, remembering to explicitly convert values on every statement. Whereas, if you use a CHAR(1) constrained to 'T'/'F', for example, no explicit conversion is necessary for the proper index to be used.



Now clean up, because you probably don't want this data to hang around:
    DROP TABLE 
dbo.Test1, dbo.Test2, dbo.Test3,
dbo.Test4, dbo.Test5, dbo.Test6

Or, just:
    DROP DATABASE Splunge

Friday, February 6, 2009

[vb6] Get object reference by its name in a form

In some cases you might want to activate an object in a form by given its name. To get it, you can use the controls collection of the form object:
  Private Sub Form_Load()
MsgBox GetCtlByName(Me, "Label1").Caption
End Sub

Private Function GetCtlByName( _
ByVal pForm As Form, _
ByVal pName$, _
Optional ByVal pIndex% = -1) As Control

On Error GoTo PROC_ERR

If pIndex < 0 Then
Set GetCtlByName = pForm.Controls(pName)
Else
Set GetCtlByName = pForm.Controls(pName, pIndex)
If pIndex <> GetCtlObj.Index Then GoTo PROC_ERR
End If

Exit Function

PROC_ERR:
Set GetCtlByName = Nothing
MsgBox "Error in procedure GetCtlByName for item name = '" & pName & "'.", vbExclamation
End Function 'GetCtlByName

Tuesday, February 3, 2009

[mssql] A Deadlock Occurence and Resolution

Introduction

An upgrade was performed at my company recently to the application code on the SQL Server, which involved basically the addition of columns to few tables. After the upgrade, the customer started facing severe deadlocks. Around 8 out of 10 transactions were getting deadlocked and were failing.

Steps Taken in resolving the deadlocks:

I will walk through the steps we took to determine what was causing the deadlock and resolve the issue.

1. First I enabled the trace flag 1204 on the server to capture the deadlock details in the error log with the command listed below:
  DBCC traceon(1204,-1)

Next I confirmed that the trace was enabled with the below command:
  DBCC Tracestatus(-1)

The output shown below indicated that the trace flag 1204 was enabled.
  TraceFlag Status Global Session
--------- ------ ------ -------
1204 1 1 0

2. I then gathered the deadlock details from the SQL error log with the command 'xp_readerrorlog'. One instance of deadlock occurred was given below:
  Deadlock encountered .... Printing deadlock information
Wait-for graph
Node:1 PAG: 9:1:18061 CleanCnt:2 Mode: SIU Flags: 0x2
Grant List 1::
Grant List 2::
Owner:0x27c007e0 Mode: S Flg:0x0 Ref:0 Life:00000001 SPID:84 ECID:0
SPID: 84 ECID: 0 Statement Type: UPDATE Line #: 11
Input Buf: RPC Event: dbo.Example_Stored_proc
Requested By: 0
ResType:LockOwner Stype:'OR' Mode: IX SPID:78 ECID:0 Ec:(0x44AA55F0) Value:0x3affcd00 Cost:(0/0)
Node:2 PAG: 9:1:18134 CleanCnt:2 Mode: SIU Flags: 0x2
Grant List 1::
Owner:0x28e6f060 Mode: S Flg:0x0 Ref:0 Life:00000001 SPID:78 ECID:0
SPID: 78 ECID: 0 Statement Type: UPDATE Line #: 11
Input Buf: RPC Event: dbo. Example_Stored_proc
Grant List 2::
Requested By:
ResType:LockOwner Stype:'OR' Mode: IX SPID:84 ECID:0 Ec:(0x239955F0) Value:0x3affc940 Cost:(0/0)
Victim Resource Owner:
ResType:LockOwner Stype:'OR' Mode: IX SPID:84 ECID:0 Ec:(0x239955F0) Value:0x3affc940 Cost:(0/0)

3. From the captured details, it was found that the deadlocks were occurring due to UPDATE statements in the stored procedure 'Example_stored_proc' while trying to get an Intent exclusive lock on the pages.

4. My next step was to read the contents of the stored procedure 'Example_stored_proc' through sp_helptext.

5. The procedure was calling around 5 procedures and each of those were calling 4 more procedures.

6. As going through all the procedures to identify the deadlock statements was difficult, I was able to get to the correct update statement through the Page numbers in the deadlock details captured earlier. The deadlock details captured had the page numbers that are involved in the deadlock as shown below:
  Node:1 PAG: 9:1:18061 CleanCnt:2 Mode: SIU Flags: 0x2

7. I gathered the Page numbers that were being deadlocked and used the undocumented DBCC command 'DBCC Page' to get the table name associated with the page. The details of the command syntax are in this link:http://support.microsoft.com/kb/83065

Though the article refers to earlier SQL versions, the command is still valid in SQL Server 2000 and provides the page information.

I then used this command to get the page details:
  DBCC page(9,1,18061,0)

The parameters were taken from the deadlock details shown above. The output obtained was pasted below. It contains the object name associated with that page. Under the PAGE HEADER, the value m_objId gives the table name associated with the deadlock:
  PAGE: (1:18061)
---------------
BUFFER:
-------
BUF @0x01665900
---------------
bpage = 0x1DF58000 bhash = 0x00000000 bpageno = (1:18061)
bdbid = 9 breferences = 1 bstat = 0xb
bspin = 0 bnext = 0x00000000
PAGE HEADER:
------------
Page @0x1DF58000
----------------
m_pageId = (1:18061) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x8000
m_objId = 1013578649 m_indexId = 0 m_prevPage = (0:0)
m_nextPage = (0:0) pminlen = 52 m_slotCnt = 82
m_freeCnt = 3075 m_freeData = 5009 m_reservedCnt = 0
m_lsn = (2689:87968:2) m_xactReserved = 0 m_xdesId = (0:0)
m_ghostRecCnt = 0 m_tornBits = 0
.................

Once the object id was obtained, table name was found out with the command:
  Use DBNAME
Select object_name(OBJECT_ID)

8. I then reviewed the code of the dependent stored procedures on the table and found one update procedure among the list.

9. The code of the update procedure was similar to that shown below:
  UPDATE {table}
SET {column1} = @C1,
{column2} = @C2,
{column2} = @C3,
WHERE ID = @id

10. I then verified the indexes on the table and found that the table did not have an index on the ID column. This was forcing the update to perform a table scan causing it to take an INTENT exclusive lock on all the pages thereby causing the deadlock in the process.

11. As there was only one non-clustered index existing on that table, I created a clustered index on the table on the ID column with this command:
  IF NOT EXISTS (SELECT name FROM sysindexes
WHERE name = 'IX_Clustered_index_column')
CREATE CLUSTERED INDEX IX_Clustered_index_column
ON dbo.TABLE(ID) ON [PRIMARY]

12. After the index creation, the intent exclusive lock was not requested on all the pages of the update statement and as a result the deadlocks got resolved.

Conclusion

On further discussion with the application team about the reason for the deadlock occurrence after the upgrade performed by them, I came to know the following:

* Before the upgrade, for every insertion, one record was inserted or updated in the table identified in the deadlock
* As part of the upgrade, the application team performed modifications to the stored procedure code such that every insertion may cause up to 10 updates in the table.

This confirmed the reason why the deadlocks started occurring after the upgrade. As the table access increased heavily due to table scans, transactions took a longer time for each update thereby causing the deadlock scenario.