Wednesday, September 10, 2008

[vb6] Why ADO's Find method is the devil

Takeaway: How do you use ADO's Find method with multiple criteria? Easy. You don't. But don't fret, you should be using Filter anyway. Take a look at this superior alternative.

Q: How can I get ADO's Find method to find records based on the values in two different columns?

—Ali Siasifar


A: Well, the short answer is that you can't. ADO's Recordset.Find method works on one column, and one column only. Having said that, you could use something like the code in Listing A, which finds all records in a hypothetical table where the City and Country fields are "London" and "UK," respectively. It does this by finding the first record where Country is "UK" and checking the City field to see if it equals "London." If it does, we have found one of the records we are looking for. If not, we look for the next record where Country is "UK" and try again.

I have to ask, though, why are you using Find? Although Find is appropriate in some situations, using it to locate records is generally very inefficient, both in terms of speed and memory use. Find works by examining each record in a Recordset for the criteria you give it after you have created the Recordset and retrieved all the data from your database. Retrieving lots of unwanted records, particularly down the wire from a server, when you are really interested in only a handful—or even worse, just one bit of data—creates a lot of unnecessary overhead. Unless you have a very compelling reason for using Find, I'd recommend using an alternative approach, like filtering your Recordset or using SQL, for serious performance gains.

Filtering
The Recordset.Filter method filters out records that don't match one or more criteria. You can specify multiple conditions in a filter by using and or or, so using Filter would allow you to check multiple columns. For example, suppose we open a Recordset and execute the following code:
 
rs.Filter = "LastName='Adams' and FirstName='Lamont'"

The Recordsetrs would then contain only records where the LastName and FirstName fields were Adams and Lamont, respectively. If no records existed that match these criteria, rs would be empty, with rs.EOF and rs.BOF both true. To remove a filter from a Recordset, set the Filter property to an empty string ("") or the constant adFilterNone.

Filter does have its drawbacks, of course. Filtering a Recordset requires the creation of a new cursor and makes the "topmost" record in the filtered set the current record. Like Find, a Filter occurs on the client side after the Recordset has been created and data retrieved from the database. So once again, you could be retrieving more data that you need. All these factors contribute to make filtering a Recordset fairly expensive.

Unlike Find, however, filtering does makes sense in some circumstances, such as when you’re working with a large set of data on which you expect to make quite a few filter operations. In situations such as this, the overhead involved in creating multiple Recordsets retrieving small amounts of data may be less than that involved in one trip and multiple filters.

Restricting the results with SQL
Your last option would be to use some kind of a SQL SELECT statement to ensure that the Recordset contained only records with the values you are looking for. SQL processing is done by the data provider and is generally the fastest way to retrieve a restricted set of data because the only real overhead is the opening of the Recordset, which you have to do anyway. Most database systems interpret SQL with blinding speed, at least until you throw multiple joins or subqueries into the mix, so any extra processing time is negligible. Also, by retrieving only the records you want, you cut the waste and overhead of retrieving an entire table of data and filtering or using Find to locate the desired records.

I don’t have space in this column for a class in SQL, but every developer should have at least a basic understanding of it. I will explain, though, just for clarity, that assuming a table name of Customers, the SQL equivalent of the Filter we performed above would be:

rs.Open "SELECT * FROM Customers " _
& "WHERE LastName='Adams' AND " _
& "FirstName='Lamont';", cn

where cn is an open connection object.

A picture is worth 50 records
Throughout this column, I've been stressing that ADO’s Find and Filter methods are inefficient compared to restricting your result set with SQL. But how inefficient are they, exactly? I did a little informal benchmarking on the performance differences of locating multiple records using Find, Filter, and a SQL statement. The results, shown in Figure A, provide a little concrete data to back me up.

Figure A
SQL outperformed Find and Filter in my informal benchmarking.


The time results represent the average number of seconds needed to use a client-side, static cursor to open the Customers table in SQL Server's sample Northwind catalog and locate all records belonging to customers in London. I ran 50 trials using ADO version 2.6. These results will be a bit faster than real-world performance, since the data provider cached the same Recordsets that were opened, closed, and reopened in rapid succession for this rough test. But the differences in speed are still telling, particularly when operating over a network. Keep these results in mind the next time you need to find a few records and you're working with ADO.