Paging Data with Different Cursor Types

This post is a follow up to a post I did last week: Paging Data with the SQL Server Drivers for PHP: Simplified.  In that post, I showed how to leverage scrollable cursors to simplify code for paging data. However, my investigation used only one type of scrollable cursor: a static cursor. In this post, I’ll investigate the behavioral differences of paging data with a static cursor and the other two scrollable cursor types: keyset and dynamic cursors. I did write a high-level comparison of the different cursor types last year, but what I really want to focus on in this post are the practical differences between the different cursor types in the  paging scenario.

I’ll start off with a summary of what I learned. Keep in mind that functionality common to all scrollable cursors is, by definition, the ability to scroll back and forth within a result set. Also keep in mind that the behavior of cursors is affected by the transaction isolation level set on the connection. To keep things simple here, I’ve used the default value of SQLSRV_TXN_READ_COMMITTED.

  • Static cursors: When you execute a query with a static cursor, the server copies the result set into a temporary table, but it doesn’t make any updates to the temp table while the cursor is open. Practically, this means that any UPDATEs, INSERTs, or DELETEs made on another connection will not be visible to the cursor.
  • Keyset cursors: When you execute a query with a static cursor, the server copies the result set into a temporary table AND it maintains references to data in the original table while the cursor is open. Practically, this means that UPDATEs made on another connection to the server are visible to the cursor, but INSERTs are not. DELETEs appear as “holes” in the result set, which results in some interesting behavior in the SQL Server Drivers for PHP. (Hopefully, the behavior of DELETEs with keyset cursors will be clearer in my example below.)
  • Dynamic cursors: When you execute a query with a dynamic cursor, the server copies the result set into a temporary table AND it keeps data in the temp table in sync with all changes that occur in the original table. Practically this means that all UPDATEs, INSERTs, and DELETEs made on another connection are visible to the cursor.

Enough summary. What really drove home the differences in behavior was seeing what happened when I executed some code, which I’ll share here…

To test the behavior of each cursor type, I wrote a script that mimics a paging scenario while an UPDATE, INSERT, or DELETE takes place on a different connection (the complete script is attached to this post). Basically, the script opens a scrollable cursor and retrieves 3 rows of data (a “page” of data). Before retrieving the next 3 rows of data (the next page), a second connection is opened and an UPDATE, INSERT, or DELETE is performed. When the second page of data is retrieved from the cursor, we can see what happens for each type of cursor.

As part of the script, I create a table for testing purposes, insert some data, run my queries with different cursor types, then drop the table:

$conn1 = sqlsrv_connect($serverName, $connOptions);
if (!$conn1)
    die( print_r( sqlsrv_errors(), true));
$sql_createTbl = "CREATE TABLE [dbo].[Table_1](
                [id] [int] NOT NULL,
                [data] [varchar](50) NULL,
                 CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED
                (
                    [id] ASC
                ))

                DECLARE @i INT;
                SET @i=2;
                WHILE(@i<=12)
                BEGIN
                    INSERT INTO Table_1 (id, data) VALUES (@i, 'data');
                    SET @i = @i+2;
                END";

if(!sqlsrv_query($conn1, $sql_createTbl))
    die( print_r( sqlsrv_errors(), true));

    /* Queries with scrollable cursors here. */

sqlsrv_query($conn1, "DROP TABLE Table_1");

For each cursor type, here is the code I execute. (it is actually executed inside two nested foreach loops, one for each cursor type and one for each query type.):

$sql_select = "SELECT * FROM Table_1";
$stmt2 = sqlsrv_query($conn1,
                      $sql_select,
                      array(),
                      array( "Scrollable" => $cursor_type ));

// Get the first 3 rows (the first "page").
$i = 0;
while(($row = sqlsrv_fetch_array($stmt2,
                                 SQLSRV_FETCH_NUMERIC,
                                 SQLSRV_SCROLL_ABSOLUTE, $i))
        && $i < 3)
{
    echo $row[0]." : ".$row[1]."<br />";
    $i++;
}

// Simulate another user by creating a new connection.
// Insert, update, or delete data
$conn2 = sqlsrv_connect($serverName, $connOptions);

// $sql_query (below) is loops through each of these
// queries for each cursor type:
// INSERT INTO Table_1 (id, data) values (7, 'new_data')
// UPDATE Table_1 SET data='changedData' WHERE id = 8
// DELETE FROM Table_1 WHERE id = 10
if ( !sqlsrv_query($conn2, $sql_query) )
    die( print_r( sqlsrv_errors(), true));

// Get the remaining rows from the cursor.   
$i = 0;
while($row = sqlsrv_fetch_array($stmt2,
                                SQLSRV_FETCH_NUMERIC,
                                SQLSRV_SCROLL_ABSOLUTE,
                                3 + $i) )
{
    echo $row[0]." : ".$row[1]."<br />";
    $i++;
}

What were the results? Keep in mind that the original table has 6 entries with id’s that are even numbers 2 through 12 and that the first page of data in each case was the same since no changes were made to any data before getting the first 3 rows. So, the first page has this data in all cases:

Page 1 Data 
id data
2 data
4 data
6 data

Now, for each cursor type we see different results for the second page of data after an INSERT, UPDATE, or DELETE query is executed:

Static cursor: Because a static cursor works from a temp table that is unaware of any changes to the original data, the second page of data is the same for all 3 queries (and appears as if we are querying against the original data):

Page 2 Data – Static cursor, after all query types
id data
8 data
10 data
12 data

Keyset cursor: In this case, page 2 data looked exactly as it did for a static cursor in the case of an INSERT query (i.e. no data changes were reflected in the 2nd page of data). However, when the UPDATE statement (UPDATE Table_1 SET data='changedData' WHERE id = 8), was executed after retrieving the 1st page, the second page looked like this:

Page 2 Data – Keyset cursor, after UPDATE query
id data
8 changedData
10 data
12 data

In the case of a keyset cursor, what happened after a DELETE statement was executed was interesting:

Page 2 Data – Keyset cursor, after DELETE query
id data
8 data

Only the first row of the second page was returned! This is because there was a  “hole” in the dataset. A keyset cursor holds a reference to the deleted data, but when we try to retrieve the data with sqlsrv_fetch_array, the reference is empty, so the API returns false. So, the while loop exited and no other rows are returned (even though they are there). This makes it very difficult (impossible?) to distinguish between a “hole” in a result set and the end of a result set. (Note: I’ve pointed this out the the SQL Server Driver for PHP team. They are investigating the behavior. They would be interested in what YOU think the behavior of the driver should be when it encounters a “hole” when using a keyset cursor.)

Dynamic cursor: In this case, ALL changes made after we retrieve the 1st page are reflected in the second page:

Page 2 Data – Dynamic cursor, after UPDATE query
id data
8 data
10 data
12 data
Page 2 Data – Dynamic cursor, after INSERT query
id data
7 new_data
8 data
10 data
12 data
Page 2 Data – Dynamic cursor, after DELETE query
id data
8 data
12 data

So, I like to think of cursors along a spectrum. At one end of the spectrum are static cursors, which are least expensive on the server but do not provide any visibility into changed data. On the other end of the spectrum are dynamic cursors, which are the most expensive on the server but give you visibility to all changes to your data. In between these two are keyset cursors, which have medium cost on the server while giving you insight into updated data in your result set. Which of these you choose to use will, obviously, depend on the needs of your application. e.g. How important is it for users to always see updates, inserts, and or deletes to data?

If you like to play with this behavior on your own, the attached script may be helpful.

Thanks.

-Brian

Share this on Twitter

No Comments