Sunday, December 16, 2007

[mssql] Choosing SQL Server 2000 Data Types

Introduction

Choosing an appropriate data type is very important, because errors made in table design can result in large performance degradation. These problems often surface later, when a large amount of data has been inserted. In this article, I want to describe built-in and user-defined data types, how SQL Server 2000 stores data on a data page, and give some general tips for choosing an appropriate data type.

Built-in data types

In Microsoft SQL Server 2000, each object (such as column, variable, or parameter) has a related data type, which is an attribute that specifies the type of data that the object can hold.

SQL Server 2000 ships with 27 built-in (system) data types. They are:

Data Type         Description
bigint            Integer data from -2^63 through 2^63 - 1
int               Integer data from -2^31 through 2^31 - 1
smallint          Integer data from -2^15 through 2^15 - 1
tinyint           Integer data from 0 through 255
bit               Integer data with either a 1 or 0 value
decimal           Fixed precision and scale numeric data from -10^38 + 1 through 10^38 - 1
numeric           Fixed precision and scale numeric data from -10^38 + 1 through 10^38 - 1
money             Monetary data values from -2^63 through 2^63 - 1
smallmoney        Monetary data values from -214,748.3648 through +214,748.3647
float             Floating precision number data from -1.79E + 308 through 1.79E + 308
real              Floating precision number data from -3.40E + 38 through 3.40E + 38
datetime          Date and time data from January 1, 1753, through December 31, 9999, with an accuracy of 3.33 milliseconds
smalldatetime     Date and time data from January 1, 1900, through June 6, 2079, with an accuracy of one minute
char              Fixed-length character data with a maximum length of 8,000 characters
varchar           Variable-length data with a maximum of 8,000 characters
text              Variable-length data with a maximum length of 2^31 - 1 characters
nchar             Fixed-length Unicode data with a maximum length of 4,000 characters
nvarchar          Variable-length Unicode data with a maximum length of 4,000 characters
ntext             Variable-length Unicode data with a maximum length of 2^30 - 1 characters
binary            Fixed-length binary data with a maximum length of 8,000 bytes
varbinary         Variable-length binary data with a maximum length of 8,000 bytes
image             Variable-length binary data with a maximum length of 2^31 - 1 bytes
cursor            A reference to a cursor
sql_variant       A data type that stores values of various data types, except text, ntext, timestamp, and sql_variant
table             A special data type used to store a result set for later processing
timestamp         A database-wide unique number that gets updated every time a row gets updated
uniqueidentifier  A globally unique identifier

Some of these data types (bigint, sql_variant, and table) are only available in SQL Server 2000, while the others were also supported under previous SQL Server versions.

User-defined data types

SQL Server 2000 also supports user-defined data types. User-defined data types provide a mechanism for applying a name to a data type that is more descriptive of the values to be held in the object. Using user-defined data types can make it easier for a programmer or database administrator to understand the intended use of any object defined with the data type. User-defined data types are based on the system data types and can be used to predefine several attributes of a column, such as its data type, length, and whether it supports NULL values. To create a user-defined data type, you can use the sp_addtype system stored procedure, or you can add one using Enterprise Manager. When you create a user-defined data type, you specify the following three properties:

  • Data type's name.
  • Built-in data type upon which the new data type is based.
  • Whether it can contain NULL values.

The following example creates a user-defined data type named cursale, based on the money data type, that cannot be NULL:

EXEC sp_addtype cursale, money, 'NOT NULL'
GO
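
To show how such a type might be used, here is a minimal sketch (the SalesOrder table and its columns are made up for illustration); a column declared with the user-defined type picks up its base type and, if nullability is not specified, the NOT NULL rule defined for the type:

CREATE TABLE SalesOrder
(
    OrderID   int NOT NULL,
    TotalSale cursale          -- money base type; NOT NULL comes from the type definition
)
GO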

Both system and user-defined data types are used to enforce data integrity. It is worth putting real effort into table design: the better you design your tables, the longer you can work without running into performance problems. In the ideal case, you will never need to change the structure of your tables.

Tips to choose the appropriate data types

SQL Server 2000 stores data in structures called data pages, which are 8 KB (8,192 bytes) in size. Some space on each data page is used to store system information, which leaves 8,060 bytes for user data. So, if a table's row size is 4,040 bytes, only one row fits on each data page. If you can decrease the row size to 4,030 bytes, you can store two rows within a single page. The less space used, the smaller the table and its indexes, and the less I/O SQL Server has to perform when reading data pages from disk. So, you should design your tables to maximize the number of rows that fit on one data page, which means specifying the narrowest columns you can. The narrower the columns, the less data is stored, and the faster SQL Server can read and write it.
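
As a quick sketch of the arithmetic above (the row sizes are the same illustrative numbers used in the paragraph):

-- Rows per page for two hypothetical row sizes
DECLARE @PageBytes int, @RowSize int
SET @PageBytes = 8060                          -- usable space on one data page
SET @RowSize = 4040
SELECT @PageBytes / @RowSize AS RowsPerPage    -- 1 row per page
SET @RowSize = 4030
SELECT @PageBytes / @RowSize AS RowsPerPage    -- 2 rows per page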

Try to use the following tips when choosing data types (a small example table definition follows the list):

  • If you need to store integer data from 0 through 255, use the tinyint data type. Columns with the tinyint data type use only one byte to store their values, compared with the two, four, and eight bytes used by smallint, int, and bigint columns respectively. For example, if you design tables for a small company with 5-7 departments, you can create the departments table with a DepartmentID tinyint column to store the unique number of each department.
  • If you need to store integer data from -32,768 through 32,767, use the smallint data type. Columns with the smallint data type use only two bytes to store their values, compared with the four and eight bytes used by int and bigint columns respectively. For example, if you design tables for a company with several hundred employees, you can create an employee table with an EmployeeID smallint column to store the unique number of each employee.
  • If you need to store integer data from -2,147,483,648 through 2,147,483,647, use the int data type. Columns with the int data type use only four bytes to store their values, compared with the eight bytes used by bigint columns. For example, to design tables for a library with more than 32,767 books, create a books table with a BookID int column to store the unique number of each book.
  • Use the smallmoney data type instead of the money data type if you only need to store monetary values from -214,748.3648 through +214,748.3647. Columns with the smallmoney data type use only four bytes to store their values, compared with the eight bytes used by money columns. For example, if you need to store monthly employee payments, it might be possible to use a smallmoney column instead of a money column.
  • Use the smalldatetime data type instead of the datetime data type if you only need to store date and time data from January 1, 1900 through June 6, 2079, with accuracy to the minute. Columns with the smalldatetime data type use only four bytes to store their values, compared with the eight bytes used by datetime columns. For example, if you need to store an employee's hire date, you can use a smalldatetime column instead of a datetime column.
  • Use varchar/nvarchar columns instead of text/ntext columns whenever possible. Because SQL Server stores text/ntext values on separate Text/Image pages, apart from the other data stored on the Data pages, it can take more time to retrieve text/ntext values.
  • Use char/varchar columns instead of nchar/nvarchar if you do not need to store Unicode data. A char/varchar value uses one byte per character, while an nchar/nvarchar value uses two bytes per character, so char/varchar columns use half the space of nchar/nvarchar columns.
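
Here is that small example table applying several of the tips above; the table and column names are invented purely for illustration:

CREATE TABLE Departments
(
    DepartmentID  tinyint       NOT NULL,  -- 5-7 departments easily fit in 0-255
    DeptName      varchar(50)   NOT NULL,  -- varchar instead of nvarchar: no Unicode needed
    CreatedOn     smalldatetime NOT NULL,  -- minute accuracy is enough here
    MonthlyBudget smallmoney    NOT NULL   -- values stay well below 214,748.3647
)
GO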

See SQL Server Optimization Tips for more Microsoft SQL Server optimization tips.

Source: http://www.databasejournal.com/features/mssql/article.phpr/2212141

Monday, November 26, 2007

[mssql] Cannot resolve the collation conflict between "Latin1_General_CI_AI" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.

I'm trying to get a match between the sysobjects.name and the objname returned by ::fn_listextendedproperty from the extended properties table (or view or wherever they hide that stuff in SQL 2005):

SELECT o.[id] AS tbl_id, o.[name] AS tbl_nm,
       x.value AS col_desc
  FROM sysobjects o (NOLOCK)
  INNER JOIN ::fn_listextendedproperty(
               N'MS_Description',
               N'user', N'dbo',
               N'table', N'MyTable',
               NULL, default) x
  ON x.objname = o.name
  WHERE o.name = 'MyTable'

The script returns an error message like this:

Msg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict between "Latin1_General_CI_AI" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.



Solution: Rewrite your join like this:

ON o.[name] = x.objname COLLATE Latin1_General_CI_AS
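
Applied to the query above, the whole statement would look something like this (a sketch; any single explicit collation both columns can be converted to will resolve the conflict, and the one from the original answer is kept here):

SELECT o.[id] AS tbl_id, o.[name] AS tbl_nm,
       x.value AS col_desc
  FROM sysobjects o (NOLOCK)
  INNER JOIN ::fn_listextendedproperty(
               N'MS_Description',
               N'user', N'dbo',
               N'table', N'MyTable',
               NULL, default) x
  ON o.[name] = x.objname COLLATE Latin1_General_CI_AS
  WHERE o.[name] = 'MyTable'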



Source: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=177141&SiteID=1

Sunday, November 25, 2007

[mssql] Use ADO, getrows, or a Stored Procedure?

Introduction

Presenting a large set of data in readable pages is a standard task for Web-application developers. Applications that need to produce search results in a digestible manner need to apply paging algorithms. This article will compare the three ways in which paging can be achieved for an ASP/SQL Server 7 application. The first uses ActiveX Data Object (ADO) paging; the second uses getrows and an array; and the third uses a SQL Server 7 stored procedure. Of those, the third is the most efficient because it returns the minimum set of data from the database server to the Web server.


Solution 1: Paging via ADO

To use ADO's paging system, the cursor location of either the recordset or connection object must be set to adUseClient, or a client-side cursor, before the recordset is filled.

rstProjects.CursorLocation = adUseClient

Next, execute a command or query and obtain a recordset. Set the page size property of the recordset to the number of records that a page should show. And set the absolute page property of the recordset to the current page number. These two properties enable ADO to find and present the subset of the recordset that is relevant for the current page. Note that with these properties set, ADO will still return the entire recordset to the client; only the BOF and EOF markers (ADO properties) will be different.

To maintain state information about the current page, either use a query string or a hidden field. In either case, ensure that if the user chooses to move to the next page, the current page variable increases by one, and if the user chooses to move to the previous page, the current page variable decreases by one.

If (Len(Request.QueryString("page")) = 0) Or (IsNumeric(Request.QueryString("page")) = False) Then
    intCurrentPage = 1
Else
    intCurrentPage = CInt(Request.QueryString("page"))
End If

rstProjects.PageSize = 25
rstProjects.AbsolutePage = intCurrentPage

The next step is to construct the loop that will display the records for the current page. As per sound ADO practice, when looping through a recordset to display data, set up local variables that reference the various fields and then access those local variables.

If rstProjects.RecordCount > 0 Then

    Dim fldProject, fldBuyer, fldBidder, fldAverageBid
    Set fldProject = rstProjects.Fields("Project")
    Set fldBuyer = rstProjects.Fields("Buyer")
    Set fldBidder = rstProjects.Fields("Bidder")
    Set fldAverageBid = rstProjects.Fields("AverageBid")

    Dim intCurrentRecord
    intCurrentRecord = 1

    Do
        Response.Write("<p>" & fldProject.Value & "</p>" & vbCrLf)
        Response.Write("<p>" & fldBuyer.Value & "</p>" & vbCrLf)
        Response.Write("<p>" & fldBidder.Value & "</p>" & vbCrLf)
        Response.Write("<p>" & fldAverageBid.Value & "</p>" & vbCrLf)

        rstProjects.MoveNext
        intCurrentRecord = intCurrentRecord + 1
        If rstProjects.EOF Then Exit Do
    Loop While intCurrentRecord <= rstProjects.PageSize   ' <= so a full page of records is written

End If

Any kind of loop could achieve the same effect, but I've presented a fairly verbose Do-Loop-While loop for heuristic purposes.

Solution 2: Paging via getrows and an Array

The getrows approach mimics the same algorithm as ADO paging but uses an array instead of an ADO recordset. By putting all the data into an array shortly after the recordset is filled, the recordset object can be closed and set to nothing, thereby freeing up system resources. As a result, this approach should be more efficient than the ADO approach.

If rstProjects.RecordCount > 0 Then

    Dim arrProjects
    arrProjects = rstProjects.GetRows

    rstProjects.Close
    Set rstProjects = Nothing

    Dim intTotalRecords
    intTotalRecords = UBound(arrProjects, 2)   ' index of the last record

    Dim intPageSize
    intPageSize = 25

    Dim intStart
    intStart = ((intCurrentPage - 1) * intPageSize)

    Dim intEnd
    intEnd = intStart + intPageSize - 1

    Dim fldProject, fldBuyer, fldBidder, fldAverageBid

    Dim intCurrentRecord
    intCurrentRecord = intStart

    Do
        fldProject = arrProjects(0, intCurrentRecord)
        fldBuyer = arrProjects(1, intCurrentRecord)
        fldBidder = arrProjects(2, intCurrentRecord)
        fldAverageBid = arrProjects(3, intCurrentRecord)

        Response.Write("<p>" & fldProject & "</p>" & vbCrLf)
        Response.Write("<p>" & fldBuyer & "</p>" & vbCrLf)
        Response.Write("<p>" & fldBidder & "</p>" & vbCrLf)
        Response.Write("<p>" & fldAverageBid & "</p>" & vbCrLf)

        intCurrentRecord = intCurrentRecord + 1
        If intCurrentRecord > intTotalRecords Then Exit Do   ' stop after the last record
    Loop While intCurrentRecord <= intEnd                    ' <= so a full page is written

End If

When using getrows, be careful with the two-dimensional array it builds. In particular, it is not uncommon to reference the wrong column index when setting up the fields to display.

Solution 3: Paging via a SQL Server 7 Stored Procedure

The third and final approach involves a stored procedure. This is the most efficient approach because, unlike ADO and getrows which both return the entire set of records to the Web server, the stored procedure returns only the records that are needed for the current page.

Once again, the paging algorithm is roughly the same as ADO, but here the current page and the page size are passed into the stored procedure as input parameters. The stored procedure then selects the set of records needed for the current page by setting up a temporary table with an identity field and using the identity field to determine which records should be returned, given the current page and page size.


CREATE PROCEDURE "sprocInformationTechnologyProjects"
@Page int,
@Size int
AS

DECLARE @Start int, @End int

BEGIN TRANSACTION GetDataSet

SET @Start = (((@Page - 1) * @Size) + 1)
IF @@ERROR <> 0
GOTO ErrorHandler

SET @End = (@Start + @Size - 1)
IF @@ERROR <> 0
GOTO ErrorHandler

CREATE TABLE #TemporaryTable
(
Row int IDENTITY(1,1) PRIMARY KEY,
Project varchar(100),
Buyer int,
Bidder int,
AverageBid money
)
IF @@ERROR <> 0
GOTO ErrorHandler

INSERT INTO #TemporaryTable
SELECT ...
-- Any kind of SELECT statement is possible, with however many joins,
-- as long as the data selected fits into the temporary table.
IF @@ERROR <> 0
GOTO ErrorHandler

SELECT Project, Buyer, Bidder, AverageBid
FROM #TemporaryTable
WHERE (Row >= @Start) AND (Row <= @End)
IF @@ERROR <> 0
GOTO ErrorHandler

DROP TABLE #TemporaryTable

COMMIT TRANSACTION GetDataSet
RETURN 0

ErrorHandler:
ROLLBACK TRANSACTION GetDataSet
RETURN @@ERROR
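
A hypothetical call fetching the third page of 25 rows would then look like this:

EXEC sprocInformationTechnologyProjects @Page = 3, @Size = 25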

With SQL Server 2000, the paging stored procedure described above can be enhanced by using a table variable (a local table) and a user-defined function.
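
A minimal sketch of that variation, replacing the temporary table with a table variable (the Projects source table below is hypothetical, and the user-defined function part is omitted):

DECLARE @Page int, @Size int, @Start int, @End int
SELECT @Page = 3, @Size = 25
SET @Start = ((@Page - 1) * @Size) + 1
SET @End = (@Start + @Size - 1)

DECLARE @PageTable TABLE
(
    Row int IDENTITY(1,1) PRIMARY KEY,
    Project varchar(100),
    Buyer int,
    Bidder int,
    AverageBid money
)

INSERT INTO @PageTable (Project, Buyer, Bidder, AverageBid)
SELECT Project, Buyer, Bidder, AverageBid
FROM Projects                              -- hypothetical source table

SELECT Project, Buyer, Bidder, AverageBid
FROM @PageTable
WHERE (Row >= @Start) AND (Row <= @End)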

Conclusion

In summary, programming within a stored procedure is always superior because network traffic is kept to a minimum. Even with ASP.NET's DataGrid with in-built paging around the corner, the stored procedure approach to paging is the best when dealing with large sets of data.

About the Author

Stephen Lian is the principal of Active Data Online Pty Ltd, http://www.activedataonline.com.au.

Source: http://www.15seconds.com/Issue/010308.htm

Wednesday, November 21, 2007

[mssql] Count Number of Certain Character in A String

Sometimes we need to know how many times a certain character appears in a string, for example to know:
  • How deep a directory path is, by counting the number of backslashes in the path.
  • How many items are in a delimited string, by counting the number of commas in the string.
Following are the statements for that purpose:
DECLARE @cPath varchar(100)

SET @cPath = 'D:\TEMP\test.txt'
SELECT len(@cPath) - len(replace(@cPath, '\', '')) --> returns 2, the number of backslashes
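
The same technique counts any delimiter; for example, counting commas in a made-up list:

DECLARE @cList varchar(100)

SET @cList = 'apple,orange,banana'
SELECT len(@cList) - len(replace(@cList, ',', '')) --> returns 2, the number of commas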

Tuesday, November 13, 2007

[mssql] Uniqueidentifier vs. IDENTITY

Kim writes: "with an identity column you can SELECT @@identity after inserting a row. How do you get the value of a uniqueidentifier column after an insert? thanks!"

Uniqueidentifiers are new to SQL 7.0, and behave a little differently than our friend the IDENTITY column. Read on for the answer to Kim's question along with the SQLGuru's opinions on which is better as a key value.

First off, for those of you not familiar with the uniqueidentifier datatype, here's the lowdown:
  • Uniqueidentifiers are also referred to as GUIDs (Globally Unique IDentifiers).
  • The API call that returns a GUID is guaranteed to always return a unique value across space and time. I don't know the full mechanics of creating a GUID, but I seem to remember that it has something to do with the MAC address on your network card and the system time.
  • To get a GUID in SQL Server (7.0+), you call the NEWID() function.
  • The uniqueidentifier data type in SQL Server is stored natively as a 16-byte binary value.
  • This is an example of a formatted GUID: B85E62C3-DC56-40C0-852A-49F759AC68FB.
Now, unlike an IDENTITY column, a uniqueidentifier column doesn't automagically get an assigned value when a row is inserted into a table. You either need to place a default on the uniqueidentifier column (DEFAULT NEWID()), or do something like the following:

DECLARE @GUID uniqueidentifier
SET @GUID = NEWID()
INSERT Item VALUES (@GUID,'Yak Hoof')


So, to answer Kim's question: Sorry, there isn't a way to get the value of a uniqueidentifier column after an insert. You can get it before the insert, however, by using the above code.
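
If you want the column to populate itself, the DEFAULT NEWID() approach mentioned above looks roughly like this (the Item table definition here is a made-up sketch):

CREATE TABLE Item
(
    ItemID uniqueidentifier NOT NULL DEFAULT NEWID(),
    ItemName varchar(50) NOT NULL
)

INSERT Item (ItemName) VALUES ('Yak Hoof')  -- ItemID is filled in by the default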

The major advantage of using GUIDs is that they are unique across all space and time. This comes in handy if you're consolidating records from multiple SQL Servers into one table, as in a data warehousing situation. GUIDs are also used heavily by SQL Server replication to keep track of rows when they're spread out among multiple SQL Servers.

The main disadvantage to using GUIDs as key values is that they are BIG. At 16 bytes a pop, they are one of the largest datatypes in SQL Server. Indexes built on GUIDs are going to be larger and slower than indexes built on IDENTITY columns, which are usually ints (4 bytes).

Not only that, but they're just plain hard to read. Unless you need a truly globally unique identifier, you're probably better off sticking with an IDENTITY.

Source: http://www.sqlteam.com/article/uniqueidentifier-vs-identity

Sunday, November 4, 2007

[mssql] Get Login Name

DECLARE @cLogin varchar(25)     --> declare variable
SELECT @cLogin = suser_sname() --> store to variable
SELECT @cLogin AS login_name --> show result

Friday, November 2, 2007

[mssql] Get Client IP Address

DECLARE @cHost varchar(25)     --> Declare variable
SELECT @cHost = host_name()    --> store to variable
SELECT @cHost AS host_name     --> show result

Source:
http://www.dbforums.com/archive/index.php/t-1604725.html
http://msdn2.microsoft.com/en-us/library/ms174427.aspx

[mssql] Bit Manipulations

Introduction

When you use a value in your database or application, the value must be stored somewhere in the computer memory using a certain amount of space. A value occupies space that resembles a group of small boxes. In our human understanding, it is not always easy to figure out how a letter such as B is stored in seven small boxes when we know that B is only one letter.

Bit manipulation, or a bit-related operation, allows you to control how values are stored in bits. This is not an operation you will need to perform very often, especially not in the early stages of your database. Nevertheless, bit operations (and the related operators) are present in all or most programming environments, so much so that you should be aware of what they do and what they offer.


Bits Operators: The Bitwise NOT Operator ~

One of the operations you can perform on a bit consists of reversing its value. That is, if a bit holds a value of 1, you may want to change it to 0 and vice-versa. This operation can be taken care of by the bitwise NOT operator that is represented with the tilde symbol ~

The bitwise NOT is a unary operator that must be placed on the left side of its operand as in

~Value

Here is an example:

PRINT ~158

To perform this operation, the Transact-SQL interpreter considers each bit of the operand and inverts its value from 1 to 0 or from 0 to 1, depending on the value the bit is holding. This operation can be summarized in the following table:

Bit ~Bit
1 0
0 1

Consider a number with a byte value such as 248. In our study of numeric systems, we define how to convert numbers from one system to another. Based on this, the binary value of decimal 248 is 1111 1000 (and its hexadecimal value is 0xF8). If you apply the bitwise NOT operator on it to reverse the values of its bits, you would get the following result:

 Value  1 1 1 1 1 0 0 0
~Value  0 0 0 0 0 1 1 1


Bits Comparison: The Bitwise AND Operator &

The bitwise & is a binary operator that uses the following syntax

Operand1 & Operand2

This operator considers two values and compares each bit of one with the corresponding bit of the other value. If both corresponding bits are 1, the comparison produces 1. Otherwise, that is, if either bit is 0, the comparison produces 0. This comparison is summarized as follows:

Bit1 Bit2 Bit1 & Bit2
0 0 0
1 0 0
0 1 0
1 1 1

Imagine you have two byte values represented as 187 and 242. Based on our study of numeric systems, the binary value of decimal 187 is 1011 1011 (and its hexadecimal value is 0xBB). The binary value of decimal 242 is 1111 0010 (and its hexadecimal value is 0xF2). Let’s compare these two values bit by bit, using the bitwise AND operator:


Binary Decimal
N1 1 0 1 1 1 0 1 1 187
N2 1 1 1 1 0 0 1 0 242
N1 & N2 1 0 1 1 0 0 1 0 178

Most of the time, you will want the interpreter to perform this operation and use the result in your program. This means that you can get the result of this operation and possibly display it to the user. The above operation can be performed by the following statement:

PRINT 187 & 242

This would produce 178


Bits Comparison: The Bitwise OR Operator |

You can perform another type of comparison on bits using the bitwise OR operator that is represented by |. Its syntax is:

Value1 | Value2

Once again, the interpreter compares the corresponding bits of each operand. If at least one of the corresponding bits is 1, the comparison produces 1. The comparison produces 0 only if both bits are 0. This operation is summarized as follows:

Bit1 Bit2 Bit1 | Bit2
0 0 0
1 0 1
0 1 1
1 1 1

Once again, let’s consider decimals 187 and 242. Their bitwise OR comparison would render the following result:


Binary Decimal
N1 1 0 1 1 1 0 1 1 187
N2 1 1 1 1 0 0 1 0 242
N1 | N2 1 1 1 1 1 0 1 1 251

You can also let the compiler perform the operation and produce a result. Here is an example:

PRINT 187 | 242

This would produce 251

Bits Comparison: The Bitwise-Exclusive XOR Operator ^

Like the previous two operators, the bitwise exclusive OR operator performs a bit comparison of two values. Its syntax is:

Value1 ^ Value2

The compiler compares each bit of one value to the corresponding bit of the other value. If one of the bits is 0 and the other is 1, the comparison produces 1. In the other two cases, that is, if both bits have the same value, the comparison produces 0. This operation is summarized as follows:

Bit1 Bit2 Bit1 ^ Bit2
0 0 0
1 0 1
0 1 1
1 1 0

We will again consider decimals 187 and 242. Their bitwise XOR comparison would render the following result:


Binary Decimal
N1 1 0 1 1 1 0 1 1 187
N2 1 1 1 1 0 0 1 0 242
N1 ^ N2 0 1 0 0 1 0 0 1 73

If the interpreter performs this operation, it can produce a result as in the following example:

PRINT 187 ^ 242;

This would produce 73.

Monday, October 29, 2007

[windows] Shutdown Windows Automatically

Following are the steps to create a scheduled task that shuts down Windows at a specified time:
1. Click Start, and then click Control Panel.
2. Click Performance and Maintenance, and then click Scheduled Tasks.
3. Double-click Add Scheduled Task. The Scheduled Task Wizard starts.
4. Click Next.
5. Under Click the program you want Windows to run, click Browse.
6. In the Select Program to Schedule dialog box, locate the %SystemRoot%\System32 folder, locate and click the Shutdown.exe file, and then click Open.
7. Under Perform this task, specify a name for the task and how frequently you want this task to run, and then click Next.
8. Under Select the time and day you want this task to start, specify a start time and date for the task, and then click Next.
9. Type the user name and password to run this task under, and then click Next.
10. Click to select the Open advanced properties for this task when I click Finish check box, and then click Finish.
11. Click the Task tab.
12. In the Run box, specify any additional parameters that you want to use with Shutdown.exe.
13. Click OK.

Tuesday, October 23, 2007

Recommended Add-ons for Mozilla Thunderbird

Following are the recommended add-ons for Mozilla Thunderbird:
  • XNote
    Persistent sticky notes for Thunderbird associated to mails.

  • Clippings
    Saves and manages frequently-entered text for later pasting into forms and input fields without the hassle of retyping or repetitive copying and pasting. Perfect for forms, web mail, blogging, wiki editing and forum posts.

  • ReminderFox
    An extension that displays and manages lists of date-based reminders and ToDo's. ReminderFox does not seek to be a full-fledged calendar system. In fact, the target audience is anybody that simply wants to remember important dates (birthdays, anniversaries, etc) without having to run a fat calendar application.

  • Auto Zip Attachments
    Allows you to easily compress (zip) attachments. A new button named "AutoZip" is added to the compose toolbar.
    Upon pressing the button all the attachments would be compressed into a single file. Then the attachments are replaced with the zipped file.

  • Attachment Extractor
    Just select the messages containing the attachments you want to extract and select 'Extract Attachments' and this extension does the rest. No more having to go through each message to extract the attachments!

  • ConfirmFolderMove
    Introduces a preference that allows users to be prompted before moving folders.

  • Enigmail
    Enigmail adds OpenPGP message encryption and authentication to your email client. It features automatic encryption, decryption and integrated key management functionality. Enigmail requires GnuPG (www.gnupg.org) for the cryptographic functions.
    Note: GnuPG is not part of the installation.

  • NotToForward
    Enables you to avoid sending mail to specific recipients.

  • SenderFace
    Enables you to associate images (faces) with Email senders (from field).

Recommended Add-ons for FireFox

Following are the recommended Add-ons for FireFox:
  • WebDeveloper
    Adds a menu and a toolbar with various web developer tools.

  • FireBug
    Firebug integrates with Firefox to put a wealth of development tools at your fingertips while you browse. You can edit, debug, and monitor CSS, HTML, and JavaScript live in any web page.

  • GSpace
    This extension allows you to use your Gmail Space (2.8 GB and growing) for file storage. It acts as an online drive, so you can upload files from your hard drive and access them from every Internet capable system. The interface will make your Gmail account look like a FTP host.

  • FlashGot
    Download one link, selected links or all the links of a page together at the maximum speed with a single click, using the most popular, lightweight and reliable external download managers. Supported download tools are dozens, see http://flashgot.net for details. This extension offers also a Build Gallery functionality which helps to collect in a single page serial movies and images scattered on several pages, for easy and fast "download all".

  • Clipmarks
    With Clipmarks, you can clip the best parts of web pages. Whether it's a paragraph, sentence, image or video, you can capture just the pieces you want without having to bookmark the entire page.
    You can save your clips privately or publicly, email them to friends, or post them directly to your blog. Our new ClipSearch feature lets you search based on matching keywords inside the content you clip, as well as any tags, titles and descriptions you add. In effect, by clipping the best parts of Web pages, you build a personal search engine for the things that matter to you.

  • iMacros
    The Web browser is probably the most frequently used software today, and many tasks are highly repetitious. iMacros for Firefox relieves the repetition of checking on the same sites every day, remembering passwords and filling out web forms. Web professionals can use the software for functional testing and regression web testing of web applications. Another use is to automate web scraping, for example online stock quotes or web store prices.

  • VideoDownloader
    Download videos from Youtube, Google, Metacafe, iFilm, Dailymotion and other 60+ video sites ! And all embedded objects on a webpage (movies, mp3s, flash, quicktime, etc) ! Directly.

Friday, October 5, 2007

[windows] Shutdown Windows Using Script

Sub ShutDown()
Dim Connection, WQL, SystemClass, System

'Get connection To local wmi
Set Connection = GetObject("winmgmts:root\cimv2")

'Get Win32_OperatingSystem objects - only one object In the collection
WQL = "Select Name From Win32_OperatingSystem"
Set SystemClass = Connection.ExecQuery(WQL)

'Get one system object
'I think there is no way To get the object using URL?
For Each System In SystemClass
System.Win32ShutDown (2)
Next
End Sub

The flag passed to Win32ShutDown determines the type of shutdown:

Value Meaning
0 Log Off
0 + 4 Forced Log Off
1 Shutdown
1 + 4 Forced Shutdown
2 Reboot
2 + 4 Forced Reboot
8 Power Off
8 + 4 Forced Power Off


A simple extension to shut down a remote computer:
Sub ShutDownEx(Server, User, Password)
Dim Connection, WQL, SystemClass, System

'Get connection To remote wmi
Dim Locator
Set Locator = CreateObject("WbemScripting.SWbemLocator")
Set Connection = Locator.ConnectServer(Server, "root\cimv2", User, Password)

'Get Win32_OperatingSystem objects - only one object In the collection
WQL = "Select Name From Win32_OperatingSystem"
Set SystemClass = Connection.ExecQuery(WQL)

'Get one system object
'I think there is no way To get the object using URL?
For Each System In SystemClass
System.Win32ShutDown (2)
Next
End Sub

Source: http://www.motobit.com/tips/detpg_wmi-windows-system-shutdown/

Wednesday, October 3, 2007

[mssql] 4 Simple Rules for Handling NULLs

Introduction

"As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.”- Donald Rumsfeld

In a recent article by James Travis, the handling of NULLs in SQL Server was explored. I discovered some inaccuracies in the article; some caused by inaccuracies in SQL Server Books Online, others rooted in trying to apply the two-valued logic of other standard 3GL and 4GL programming languages to the three-valued logic employed by ANSI SQL.

This article was written to explain ANSI SQL three-valued logic and to correct the inaccuracies presented.

What is NULL?

First, let's start by defining what a NULL is in SQL-speak. ANSI-SQL defines NULL as "a special value, or mark, that is used to indicate the absence of any data value." (ANSI-92) NULL in SQL is not a "data value", but rather an indicator that we have missing data. As an example, we’ll look at the following table which contains last name, first name and middle name.

Last_Name First_Name Middle_Name
'Smith' 'Linda' 'Anita'
'Washington' 'Larry' ''
'Jones' 'John' NULL

In this example, we know that Linda Smith has a middle name, and we know that it is 'Anita'. We have populated Larry Washington's Middle_Name column with a Zero-Length String ('') indicating that we know for a fact Larry has no middle name. These are the "known knowns" in Rumsfeld-speak.

In the case of John Jones, however, we have set the Middle_Name column to NULL. This indicates that we do not know whether John has a middle name or not. We may discover, in the future, that John has a middle name; likewise, we may discover he has no middle name. In either case, we will update his row accordingly. So, NULLs represent our "known unknowns"; the only thing we know for a fact is that we do *not* know whether John has a middle name or not.

Because of the inherent issues surrounding NULLs and data integrity, three-valued logic, scalar arithmetic and performance, DBA's and Database Designers tend to try to minimize the use of NULLABLE columns in their tables.

Rule #1: Use NULLs to indicate unknown/missing information only. Do not use NULLs in place of zeroes, zero-length strings or other "known" blank values. Update your NULLs with proper information as soon as possible.

Three-Valued Logic

Three-valued logic is always in the Top 10 list of the most confusing aspects of ANSI SQL for those coming from other 3GL and 4GL languages. Anyone coming from a background in most other computer languages will instantly recognize the two-valued logic table:

Two-Valued Logic Table
p q p AND q p OR q
True True True True
True False False True
False True False True
False False False False

And of course the logic for NOT is fairly simple:

Two-Valued NOT Logic Table
p NOT p
True False
False True

So what is Three-Valued Logic? In SQL three-valued logic, we introduce the concept of Unknown into our logic, which is closely associated with NULL. Our SQL Logic Table looks like this:

Three-Valued Logic Table
p q p AND q p OR q
True True True True
True False False True
True Unknown Unknown True
False True False True
False False False False
False Unknown False Unknown
Unknown True Unknown True
Unknown False False Unknown
Unknown Unknown Unknown Unknown

And in SQL logic, the NOT logic table looks like this:

Three-Valued NOT Logic Table
p NOT p
True False
False True
Unknown Unknown

So how do we arrive at Unknown in these logic tables? Simply put, we try to compare NULLs to data values, or other NULLs. Per the ANSI-92 SQL Standard, comparisons with NULL always result in Unknown.

Try the following SQL Query in Query Analyzer (requires the Northwind Sample Database):

SET ANSI_NULLS ON
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] IS NULL

This query returns four employee records. Now try this query:

SET ANSI_NULLS ON
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] = NULL
This query returns no records. The reason for this is that '[Region] = NULL' evaluates to Unknown every time. Since it never evaluates to True for any rows, it returns no records.

NOTE: It has been widely misstated (even in SQL Server Books Online) that "the result of a comparison is false if one of the operands is NULL." In fact, the ANSI-92 SQL Standard states that if one of the operands is NULL, the result of the comparison is "UNKNOWN". (ANSI-92 SQL Standard) SQL Books Online has recently been updated to reflect the accurate ANSI Standard concerning NULL comparisons (See the update here: Microsoft ANSI_NULLS).

NULL Versus NULL

NULLs are not equal to other NULLs. In his article, Understanding the Difference Between "IS NULL" and "=NULL", James Travis states that a comparison to a variable initialized to NULL returns False. And it would appear so by his example:
DECLARE @val CHAR(4)
SET @val = NULL
SET ANSI_NULLS ON
If @val = NULL
PRINT 'TRUE'
ELSE
PRINT 'FALSE'

Mr. Travis' example prints 'FALSE', which seems to indicate that the result of the comparison '@val = NULL' is False. This is not the case, however. The result of '@val = NULL' is Unknown. The IF statement will print 'TRUE' if, and only if, the result of the comparison is True; otherwise it falls through to the ELSE statement and prints 'FALSE'. For three-valued logic, we need three comparisons to figure out what the actual result of the '@val = NULL' comparison is:

SET ANSI_NULLS ON
DECLARE @val CHAR(4)
SET @val = NULL
SET ANSI_NULLS ON
If @val = NULL
PRINT 'TRUE'
ELSE IF NOT(@val = NULL)
PRINT 'FALSE'
ELSE
PRINT 'UNKNOWN'

Now we see that the result of the comparison is neither True nor False. It is the third value in our three-valued logic, Unknown. Try this statement to demonstrate that NULL is not equal to other NULLs:

SET ANSI_NULLS ON
IF NULL = NULL
PRINT 'TRUE'
ELSE IF NOT (NULL = NULL)
PRINT 'FALSE'
ELSE
PRINT 'UNKNOWN'

This statement prints 'UNKNOWN' because 'NULL = NULL' evaluates to Unknown. Since 'NULL = NULL' evaluates to Unknown, if we look back at the NOT Logic table for three-valued logic, we see that 'NOT(Unknown)' also evaluates to Unknown. This means that 'NOT(NULL = NULL)' is Unknown as well. So what do we expect when we run this statement?

SET ANSI_NULLS ON
IF NULL <> NULL
PRINT 'TRUE'
ELSE IF NOT (NULL <> NULL)
PRINT 'FALSE'
ELSE
PRINT 'UNKNOWN'
If you expect to see ‘TRUE’, guess again! ‘FALSE’? Not a chance. Remember what we said earlier – any comparisons with NULL evaluate to Unknown (neither True nor False). Since the comparisons in the IF statements evaluate to Unknown, ‘UNKNOWN’ is printed in this instance.

The same is true for any comparisons with NULL under the ANSI Standard; whether you are comparing NULL to a CHAR, INT or any other value, variable or table column.

Rule #2: In ANSI SQL, NULL is not equal to anything, even other NULLs! Comparisons with NULL always result in UNKNOWN.

NULL Variables

Variables that have been declared, but not yet initialized via SET statement default to NULL. They are treated exactly like any other NULL values. We can modify the previous queries to see this for ourselves:
SET ANSI_NULLS ON
DECLARE @MyRegion NVARCHAR(15)
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] = @MyRegion

This returns no rows, just like in the previous query with '=NULL' in the WHERE clause. This has exactly the same effect as the following:

SET ANSI_NULLS ON
DECLARE @MyRegion NVARCHAR(15)
SET @MyRegion = NULL
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] = @MyRegion

NOTE: You cannot put a variable in place of NULL in the IS NULL clause. The following query will *not* work:

SET ANSI_NULLS ON
DECLARE @MyRegion NVARCHAR(15)
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] IS @MyRegion
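
If the goal is a query that matches rows whose [Region] equals the variable, and also matches NULL rows when the variable itself is NULL, a common ANSI-compliant pattern is to spell out the NULL case explicitly (a sketch using the same Northwind table):

SET ANSI_NULLS ON
DECLARE @MyRegion NVARCHAR(15)
SET @MyRegion = NULL
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] = @MyRegion
   OR (@MyRegion IS NULL AND [Region] IS NULL)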

ANSI Versus Microsoft

As we described above, the ANSI SQL Standard describes the result of any comparison with NULL as Unknown. So NULL = NULL, NULL <> 10 and NULL < ‘Hello’ are all Unknown.

Presumably to make the transition smoother for 3GL and 4GL programmers from other languages moving over to SQL, Microsoft implemented the SET ANSI_NULLS OFF option, which allows you to perform = and <> comparisons with NULL values. To demonstrate this in action, we can perform the following query:

SET ANSI_NULLS OFF
SELECT * FROM [Northwind].[dbo].[Employees]
WHERE [Region] = NULL
This allows you to perform basic equality comparisons to NULL; however, SQL written using this option may not be portable to other platforms, and might confuse others who use ANSI Standard NULL-handling. Also, if you try to perform this query without turning ANSI_NULLS OFF first, you will get no rows returned. The safe bet is to use SET ANSI_NULLS ON.

Rule #3: Use SET ANSI_NULLS ON, and always use ANSI Standard SQL Syntax for NULLs. Straying from the standard can cause problems including portability issues, incompatibility with existing code and databases and returning incorrect results.

NULL Math

One final point concerns math and string operations with NULLs. If you perform scalar math operations and string concatenation functions with NULL, the result is always NULL. For instance, this query returns NULL:
SELECT 1 + NULL
A numeric value plus, minus, divided by, or multiplied by NULL always equals NULL. A string concatenation with NULL equals NULL as well:
SELECT 'Joe' + NULL
So what if you want to treat NULL as a zero in a math operation? Use SQL's COALESCE() function:
SELECT 1 + COALESCE(NULL, 0)
The COALESCE() function returns the first non-NULL value in its list of values. So in the statement above, "COALESCE(NULL, 0)" returns 0. The SELECT statement above returns 1. Similarly you can use COALESCE() in string concatenations:
SELECT 'Joe' + COALESCE(NULL, '')
The COALESCE() function in this case returns a Zero-Length String, which is appended to 'Joe'. The end result is that 'Joe' is returned instead of NULL. SQL Server provides an additional function called ISNULL() which performs similarly to COALESCE(). Use the COALESCE() function (or alternatively use CASE), as it is ANSI standard syntax.
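
For reference, the CASE alternative mentioned above can be written like this (a sketch with a made-up @Bonus variable):

DECLARE @Bonus money
SET @Bonus = NULL
SELECT 1 + CASE WHEN @Bonus IS NULL THEN 0 ELSE @Bonus END  -- returns 1, same as 1 + COALESCE(@Bonus, 0)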

Rule #4: The ANSI Standard COALESCE() and CASE syntaxes are preferred over ISNULL() or other proprietary syntax.

Conclusions

SQL NULLs and three-valued logic can be confusing for a new SQL Database Designer; particularly if you are coming from a background in another 3GL or 4GL language that uses two-valued logic (i.e., C++, VB, etc.) Learning when and how to use NULLs in your databases, how SQL handles NULLs, and the results of comparisons and math/string operations with NULLs in SQL is very important in order to obtain consistent and optimized results.


What is the Best Method to Back up Data?

This being the age of the internet, chat rooms, blogging and innovative hackers, companies that lose data and hope to keep the incident quiet never get their wish. The news always leaks out or is methodically publicised by a rival or a gleeful hacker. Thereafter, insult is added to injury as irate customers and the eager press demand explanations about every unsavoury detail of the incident. But data loss is no longer a problem affecting only the big players. All of us deal with some amount of data and almost all of it is important to us. Albums have been replaced by .jpeg files, our music lives on the hard drive of our PC or laptop, and our emails are stored in offline folders or software like Microsoft Outlook. Students keep their study material on their machines, and a doctoral thesis is usually a file on the hard disk instead of a thick volume bound in leather covers. It is, therefore, strange that we keep losing data because we have not taken the precaution of backing it up elsewhere. Several start-up home businesses have never seen the light of day again after an initial incident of data loss. All of this underlines the importance of backup, which really cannot be overstressed.

Some Methods of Obtaining Backups

The method, or rather, the medium chosen for backing up data will depend on the volume of data to be stored, as well as the nature of the data. Some common methods are listed below:

CD and Floppy: The floppy has largely gone out of use by now, due to its many limitations, small capacity and security problems. The CD is a good choice for obtaining backups, and is one of the favourite choices in both domestic and corporate spheres. It is cheap, readily available, portable and compatible with several file formats. Storing a CD and sharing information from it are also quite simple.

DVD: The DVD is a good choice for storing audio-visual information. It may not be as cheap as the CD, but it offers better quality, often lasts longer, and has far more space. DVD-RWs can be used in such a way that literally every bit is used to store information. A DVD also offers the same facilities when it comes to sharing and easy transporting. However, many market observers feel the DVD-RW may turn out to be an intermediate technology, replaced by something superior but very similar in function, or it may evolve in such a way that its problems get solved. It must be noted that the constant research in this area, and inventions like the Blu-ray disc, suggest we can hope for a better technology soon.

USB Devices: These are the new favourites all over the world. The greatest proof of the popularity of this technology lies in two facts. Firstly, the storage space of pen drives / flash drives / memory sticks is constantly increasing. Secondly, their prices are plummeting. Almost all kinds of files can be stored on these; they are very easy to carry, do not require separate software or a reboot of the machine to start functioning, and it is very convenient to share the data stored on them. For storing smaller music libraries, an iPod may be an option too.

External Hard Drives: For larger data storage requirements, external hard drives are excellent solutions. Arrays or stacks of hard drives are available for corporate purposes and are provided by all good hard drive manufacturers. Seagate, Maxtor, ASB, LaCie all make external hard drives of varying capacities and prices.

Online and Offline Storage Spaces: These are ‘spaces’ provided by professional firms, and are extremely secure for sensitive data storage.

Tapes and Printouts: Traditional storage should not be left out of the list. There is some information that one simply needs to see on the good old piece of paper. Tape has been in use for a long time now, and will probably continue for some time.

By: Jammesh Wallsh
http://saveav.blogspot.com/2007/09/what-is-best-method-to-back-up-data.html

Ensuring Data Protection

All of us require computers be it for personal use or work, or for some commercial purpose. For the same reason, it is important to take precautions because data is prone to leakage. The situation is most risky when you use the Internet as well.

Think about what you experience when you lose important data. You cannot even sleep properly if your data is lost. It is thus essential to have regular backups done.

Nobody wants their personal and confidential information to be disclosed over the Internet, but unfortunately, with the growing use of computers these days, data is no longer secure. The size of your organization or business is hardly a concern, and it doesn't matter whether you are using your computer at home or in the office. The only concern is data protection. Protecting confidential data is a great challenge in itself. It is also important to be aware of the exact reasons behind data loss.

Various tools are available in the market these days in order to protect your valuable data. Some of these are backups, firewalls and various protection software packages for viruses.

Negligence by an employee is another reason for data leakage. It is therefore considered that employees can prevent a major portion of this data loss. If employees are equipped with basic knowledge about computers and are honest while performing their tasks, a lot of important data can be prevented from getting lost. If a strong security system is not implemented, employees tend to take this for granted and work carelessly, which may mean that confidential data is lost.

Here are some methods that would help to protect your data:

Encryption: You can protect your data using a technique called encryption. In this technique, data is encrypted or is converted into some code, which can be decoded by only those who know the key or the related password. These days with the availability of so many data encryption software in the market, you can choose any one of them to ensure safe transaction of data.

Backups: Backing up your data is one of the tools that you should always use in order to avoid any sudden data loss. It is most effective in the case of data loss due to a sudden power failure. If you are performing regular backups, these situations can easily be avoided. There is no prescribed interval at which you are expected to perform a backup; however, once a week is the minimum, and three times a week is normally advised. The best thing would be to back up on a daily basis. The confidentiality and importance of your data is the criterion for setting the frequency of these backups.

Firewalls: A layered type of system is followed when it comes to security. An operating system may also serve as protector of data or else a firewall may be used. Through firewalls, nobody can access your system without your permission. This way, your data becomes much safer. It is recommended to update your computer at regular intervals to maintain security in a more consistent manner.

With this helpful information, you can ensure that your data is protected and not lost.

http://saveav.blogspot.com/2007/09/ensuring-data-protection.html

Top 10 Ways To Make A Computer Run Faster

Today I am going to be talking about different ways you can make a computer run faster. There is actually a lot you can do yourself that will boost your PC's performance. So let's take a look at different ways we can remedy that slow computer of yours!

1. Free Up Disk Space

A great way to make a computer run faster is to free up some disk space. Windows comes with a disk cleanup tool that helps you free up space on your hard disk. The utility identifies files that you can safely delete without messing anything up, and then enables you to choose whether you want to delete some or all of the identified files.

How to use disk cleanup tool.

* Click Start, point to All Programs, point to Accessories, point to System Tools, and then click Disk Cleanup. If several drives are available, you might be prompted to specify which drive you want to clean.

* In the Disk Cleanup for dialog box, scroll through the content of the Files to delete list. (If you have multiple hard drives you will have to select which one to scan.)

* Clear the check boxes for files that you don't want to delete, and then click OK.

* When prompted to confirm that you want to delete the specified files, click OK.

2. Use Defrag Tool

When you add a file or a new program to a new computer, the hard drive is relatively empty, so new data is saved to the hard drive in one whole block. When you need to use that information, the computer can quickly access it because it is all in one place. As the drive fills up and files are deleted and rewritten, the pieces of a file end up scattered (fragmented) across the disk, so the drive has to work harder to read them. Defragging your system puts those pieces back together and is definitely a way to make a computer run faster.

How to use defrag tool.

* From the start menu point to "all programs"
* Point to "Accessories"
* Point to "System Tools"
* Click on "Disk Defragmenter"
* The disk defragmenter will display the hard drives on your computer. Just select and click Defragment.

3. Get rid of spyware

Definition

Spyware is software with malicious intent - by design; it does something bad to your computer. Usually, it gets installed without your knowledge. It sometimes takes advantage of the fact that most people click "I Agree" to software licenses without reading them.

The only way to get rid of spyware is to have some kind of scanner, sort of like an anti-virus. In order to make a computer run faster and keep it safer, you're going to need an anti-spyware program. This is tricky: a lot of the "anti-spyware" programs out there are themselves malicious and will actually add viruses and spyware to your computer under the guise of a "spyware scanner and blocker". So please remember to be very careful when choosing your anti-spyware program.

4. Find and repair disk errors

Detect and Repair Disk Errors

In addition to running Disk Cleanup and Disk Defragmenter to optimize the performance of your computer, you can check the integrity of the files stored on your hard disk by running the Error Checking utility.

How to run the error checking utility.

* Click Start, and then click My Computer.
* In the My Computer window, right-click the hard disk you want to search for bad sectors, and then click Properties.
* In the Properties dialog box, click the Tools tab.
* Click the Check Now button.
* In the Check Disk dialog box, select the Scan for and attempt recovery of bad sectors check box, and then click Start.
* If bad sectors are found, choose to fix them.
Tip: Only select the Automatically fix file system errors check box if you think that your disk contains bad sectors.

5. Get an external hard drive

Adding an External Hard Drive to your computer will allow you to clear out a lot of the things you don't need to be storing on your normal hard drive.

A great example would be your music or pictures. All those songs and pictures are sitting on your computer making it go slower. You will see a decline in your computer's performance as soon as you use over half of your hard drive space.

Moving all your songs or pictures to your external hard drive will free up a lot of space on your main hard drive. You can even move applications such as photoshop or games.

An external hard drive is also handy because it is portable. You can unplug it from your computer and take it to a friend's house and plug it in there, sharing photos and pictures very easily.

This is truly a great way to make a computer run faster.

By : Brooks Carver
http://saveav.blogspot.com/2007/09/top-10-ways-to-make-computer-run-faster.html

[vb6] InputBox is OK or Cancel?

You display an InputBox. But did the user press Cancel - or OK without entering anything? If you thought there was no way to tell the difference, you're wrong.

You see, if the user presses cancel, vbNullString is returned. However if they press OK, the empty string ("") is sent back. But in Visual Basic, vbNullString equates to an empty string, so you can't compare the two - even though in reality, they're completely different.

However you can use the StrPtr ('string pointer') function to determine whether the return string is indeed a vbNullString - as by definition, a vbNullString 'pointer' is zero.
Private Sub Form_Load()
    Dim strInput

    strInput = InputBox("Enter something:")

    If StrPtr(strInput) = 0 Then
        MsgBox "You pressed the Cancel button!"
    ElseIf strInput = "" Then
        MsgBox "You pressed the OK button with a blank value!"
    End If
End Sub

Tuesday, October 2, 2007

[mssql] Getting directory path

Following are the statements to get the directory portion of a full path.
DECLARE @cPath varchar(100),
        @cDir varchar(100)

SET @cPath = 'D:\TEMP\test.txt'
SET @cDir = (CASE WHEN charindex('\', @cPath) = 0
                  THEN ''
                  ELSE left(@cPath, len(@cPath) - charindex('\', reverse(@cPath)))
             END)
SELECT @cDir --> returns 'D:\TEMP'
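
A companion snippet (using the same made-up path) to extract just the file name portion:

DECLARE @cPath varchar(100),
        @cFile varchar(100)

SET @cPath = 'D:\TEMP\test.txt'
SET @cFile = (CASE WHEN charindex('\', @cPath) = 0
                   THEN @cPath
                   ELSE right(@cPath, charindex('\', reverse(@cPath)) - 1)
              END)
SELECT @cFile --> returns 'test.txt'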

[vb6] Get the Random Number

The following function returns a random numeric string of the given length.
Private Function GetRandomVal(ByVal pLength%, Optional ByVal pSequence&) As String
If pSequence = 0 Then Randomize Else Randomize pSequence
GetRandomVal = _
Right(String(pLength - 1, "0") & _
Format(CSng(((10 ^ pLength) - 1 + 1) * Rnd + 1) + pSequence, _
String(pLength, "#")), pLength)
End Function

Private Sub Form_Load()
MsgBox GetRandomVal(4) & vbNewLine & _
GetRandomVal(5, 1) & vbNewLine & _
GetRandomVal(5, 2) & vbNewLine
End Sub

Monday, October 1, 2007

[mssql] Indexes and Performance

October 18, 2000
SQL Server - Indexes and Performance
By Steve Jones

Introduction

One of the keys to SQL Server performance is ensuring that you have the proper indexes on a table so that any queries written against this table can run efficiently. There are more articles written about designing indexes, choosing columns, etc for optimizing performance, so I will refrain from repeating most of what is written elsewhere. I have included a few resources at the end of this article for this topic.

However once you have built the indexes, there is still work to be done. As your data sets grow over time, SQL Server will continue to rebuild indexes and move data around as efficiently as possible. This happens in a number of ways, but the result is that you may need to perform maintenance on your indexes over time despite all of the automatic tools built into SQL Server. This article will discuss some of the issues with data growth over time as well as a technique to find tables in need of maintenance and how to perform this maintenance.

NOTE: I am including some features of SQL 2000 in this article, but the techniques are still useful in SQL v7. and v6.5 with minor modifications.

What happens over time?

If SQL Server includes automatic statistics updating, a query optimizer that can learn to be more efficient with your queries, etc., why do we need to perform maintenance? Well, let's examine what happens over time.

When you build an index on a table (let's assume a clustered index), SQL Server parcels the data across pages and extents. With v7.x and above, extents can be shared between objects (with v6.5, extents contain a single object). As a result, let's assume you create a table with rows that are <>

If we expand this example over time, we may grow to 100 pages of data. These require (at a minimum) 13 extents if this object does not share any extents. Each page within the extents links to another page with a pointer. The next page in the chain, however, may not be in the same extent. Therefore as we read the pages, we may need to "switch" to another extent.

The simplest example is to assume we take three consecutive pages of data laid out in the following order:

Extent 1          Extent 2
Page n            Page n + 1
Page n + 2

These are any three pages where page n links to page n+1 next, then to page n+2, and so on. To read these three pages we read extent 1, then switch to extent 2, then switch back to extent 1. These "switches" do not necessarily entail physical I/O, but all of these switches add up. They may not be a big deal on your local server or even a lightly loaded server, but a web application that has hundreds or thousands of users could see a large performance impact from repeated scans of this table.

Why does the table end up looking like this? This is how the table is designed to function over time. SQL Server will allocate space for each row based on the space available at that time. As a result, while a clustered index stores the data in physical order on a page, the pages may not be in physical order. Instead each page has a linkage to the next page in the sequence. Just as your hard disk can become fragmented over time as you delete and insert files, the allocations of pages for a table can be fragmented over time across extents as the data changes.

So why doesn't SQL Server just rebuild the indexes? I am not sure if I would even want it to do so. I would hate for this to occur right after a large web marketing campaign! Instead the engineers in Redmond have left it up to the DBA to track this fragmentation and repair it as necessary. How do we remove this fragmentation? Read on...

Running DBCC SHOWCONTIG

Prior to SQL Server 2000, you first had to get the object ID using the following command:

select object_id('object-name')

For the user table, I ran

select object_id('user')

This returned me some long number (from sysobjects) that means nothing to me, but the SQL team in Redmond must use this often and did not feel like including the join in their code. I guess someone complained long and loud enough because in SQL 2000 you can use the name of the object in dbcc showcontig like this:

dbcc showcontig ('user')
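
Before 2000, the equivalent call took the numeric id instead. A sketch of that older form, using the id that shows up for this table in the output below:

dbcc showcontig (962102468)   -- pre-2000 style: pass the object id from sysobjects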

Either form produces the following statistics on your indexes:

DBCC SHOWCONTIG scanning 'User' table...
Table:'User' (962102468); index ID: 1, database ID: 7
TABLE level scan performed.
-Pages Scanned................................: 899
-Extents Scanned..............................: 121
-Extent Switches..............................: 897
-Avg. Pages per Extent........................: 7.4
-Scan Density [Best Count:Actual Count].......: 12.58% [113:898]
-Logical Scan Fragmentation ..................: 99.89%
-Extent Scan Fragmentation ...................: 80.99%
-Avg. Bytes Free per Page.....................: 2606.5
-Avg. Page Density (full).....................: 67.80%

Let's decode this output:

Pages Scanned - Gives the number of physical pages in the database scanned for this index. Not really relevant on its own, but it gives you the total size occupied by this index (each page is 8 KB).

Extents Scanned - An extent is 8 pages, so this should be pretty close to Pages Scanned / 8. In this example we have 121 extents, which is 968 pages. Since the index is only 899 pages, we have a number of shared extents. Not necessarily a bad thing, but this gives you an idea that you are slightly fragmented. Of course, you do not know how much of this is physical fragmentation, which can contribute to longer query times. The minimum number of extents for the 899 pages above would be 113 (899 / 8, rounded up).

Extent Switches - The number of times the scan was forced to switch from one extent to another. As this gets close to the number of pages, fragmentation is pretty high and you may want to rebuild the index.

Average Pages/Extent - Gives the math of Pages Scanned / Extents Scanned. Not of any great value other than you don't have to run Calculator to get the number. Fully populated extents would give a value of 8 here, so I guess 7.4 is good enough for me.

Scan Density [Best Count:Actual Count].......: 12.58% [113:898]

This is the tough one. This shows a percentage and two numbers separated by a colon. I will explain this, as I missed it the first two times around. The percentage is the result of dividing number 1 (113) by number 2 (898). So what are the two numbers?

The first number is the ideal number of extent changes if everything were linked in a contiguous chain. The second number is the number of extents moved through, which is one more than the number of extent switches (by definition). This is really another view of fragmentation. 100% would be minimal (I hate to say zero) fragmentation. As you can see, this table is fairly fragmented. The scan is constantly switching back and forth from one extent to another instead of finding a link from one page to another within an extent.

Logical Scan Fragmentation ..................: 99.89%

The official definition from Books Online (v7.x and 2000 Beta 2) is:

"Percentage of out-of-order pages returned from scanning the leaf pages of an index. This number is not relevant to heaps and text indexes. An out of order page is one for which the next page indicated in an IAM is a different page than the page pointed to by the next page pointer in the leaf page."

I am still not sure what this means. I have not gotten a good explanation of this anywhere, so here is my best interpretation. This shows how many pages (as a percentage) in the index have a next-page pointer in the leaf page that differs from the next page indicated in the IAM. This is only relevant for clustered indexes, as the data (leaf pages) should be physically in the order of the clustered index.

So how do you use this? If you figure it out, let me know. Since this number is high for me and other items lead me to think this index is fragmented, I think this is bad. So try for a low number in OLAP systems and a medium number in OLTP systems.

Extent Scan Fragmentation ...................: 80.99%

Again, here is the official BOL explanation (v7.x and 2000 Beta 2).

Percentage of out-of-order extents in scanning the leaf pages of an index. This number is not relevant to heaps. An out-of-order extent is one for which the extent containing the current page for an index is not physically the next extent after the extent containing the previous page for an index.

This shows the percentage of extents where the next extent used by the index is not physically located next to the current one. This tells me the I/O system must move fairly often (about 80% of the time) when scanning the index to find the next page.

Avg. Bytes Free per Page.....................: 2606.5

This tells you (on average) how many bytes are free per page. Since a page is 8096 bytes, it appears that I have, on average, filled about 68% of the pages. This can be good or bad. If this is an OLTP system with frequent inserts to this table, then with more free space per page, there is less chance of a page split when an insert occurs. You want to monitor this on tables with heavy activity and periodically rebuild the index to spread out the data and create free space on pages. Of course, you do this during periods of low activity (read: 3 a.m.) so that there is free space and page splits are minimal during periods of high activity (when everyone can yell at you for a slow database). Since this is an OLTP system, I am in pretty good shape.

If this were an OLAP system, then I would rather have this be closer to zero, since most of the activity would be read-based and I would want the reads to grab as much data as possible from each page (to reduce the time it takes to read the index). As your OLAP table grows, this becomes more critical and can substantially impact the time it takes a query to complete.
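
One way to control this free space is to rebuild with an explicit fill factor. A sketch only: 'User' is the example table from above, and the empty string means all of its indexes.

-- Rebuild all indexes on the User table, leaving roughly 30% free space
-- per page for an insert-heavy OLTP table
DBCC DBREINDEX ('User', '', 70)

-- For a read-mostly OLAP table, pack the pages full instead
DBCC DBREINDEX ('User', '', 100)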


Avg. Page Density (full).....................: 67.80%

This gives the percentage based on the previous number; I calculated it as 1 - (2606.5 / 8096), which works out to the 67.80% shown.

So what does this all mean?

Well, to me this says I need to defragment this table. There are a large number of extent switches that occur, each of which could potentially cause a large I/O cost to queries using this table and index.
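
To hunt for other tables in the same shape, SQL 2000 can also return these statistics as a result set for every index in the current database. A minimal sketch; from there you can sort or filter on the scan density column:

-- Report on every table and index in the current database as rows
-- rather than text, so candidates with low scan density stand out
DBCC SHOWCONTIG WITH TABLERESULTS, ALL_INDEXES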

Defragmenting Indexes

In previous versions (and you can still do this in SQL 2000), you could rebuild the clustered index, which causes the server to read the clustered index and then begin moving the data to new extents and pages. This should put everything back in physical order and reduce fragmentation. There is another way:

In SQL 2000, the SQL developers added another DBCC option which is INDEXDEFRAG. This can defragment both clustered and nonclustered indexes which (according to BOL) should improve performance as the physical order will match the logical order and (theoretically) reduce the I/O required by the server to scan the index.

A couple of caveats about this: if your index spans multiple files, it defragments each file separately and does NOT move pages between files. Not a good thing if you have added a new filegroup and allowed objects to grow across files.

A good thing that is way, way, way, extremely, absolutely, without-a-doubt long overdue is the reporting of progress by DBCC INDEXDEFRAG as it works. Every 5 minutes this will report the estimated progress back to the user. Of course many of us who have installed software with a feedback progress bar often wonder why the bar moves quickly to 99% and remains there for 80% of the total install time. So time will tell whether this is of any use, but I think some feedback is better than none.

Another addition that is way, way, way, (you get the idea) overdue is the ability to stop the DBCC. I cannot tell you how many late nights I wished I could do this in v6.5. In fact I often held off on running DBCC until the latest possible time since I could not stop it once it started. (well, there was that O-N-O-F-F switch.)

Still one further addition, one that ranks above the other two, is that this is an online operation. Let me repeat that: it's an ONLINE operation. It does not hold locks on the table, since it operates as a series of short transactions to move pages. It also operates more quickly than a rebuild of a new index, and the time required is related to the amount of fragmentation for the object. Of course, this means that you must have extensive log space if this is a large index. Something to keep in mind; watch the log growth when you run this to see how much space it eats up. Everything's a trade-off though.
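
For reference, a minimal sketch of the call against the example table above: 0 means the current database, 'User' is the table, and 1 is the clustered index id reported by SHOWCONTIG.

-- Defragment the clustered index (index id 1) of the User table online
DBCC INDEXDEFRAG (0, 'User', 1)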

Conclusion

Maintaining indexes still requires DBA intervention for optimal performance of your database server. How often do you have to maintain these indexes? That depends (like everything else in the RDBMS world) on your system. This is still an art and requires some practice, testing, and careful notes over time.

While Microsoft continues to improve this process over time and automate some of the work, it will probably still require some DBA intervention for the foreseeable future. Practice with tuning indexes is an often overlooked maintenance item, but one which will become more important to your users as their numbers swell.

Good luck and as always, please send me an email with comments, questions, or suggestions.

References

Here are some of the references for Indexing on the Internet and in publication. I have used these in the past, though not necessarily for this article. These are the ones I recommend.

http://www.databasejournal.com/features/mssql/article.php/1467801