Suppose you have a stored procedure that does wonderful things but returns a lot of data. You want to page that data, and you want the paging to happen as far back in the data-retrieval cycle as possible. If your data comes from a SQL Server database, you can apply the pattern described below:
To keep the paging minimally invasive to stored procedures that already work, create a new stored procedure that returns the paged record set along with the total number of records. Here it is:
create proc usp_SkipTake(
@columnDefs varchar(1000), --the column names and dataTypes in the exact same order as the stored procedure which will be paged.
--example: firstName varchar(50), lastname varchar(50)
@sprocWithInputParams varchar(1000), --this is the stored procedure which will be paged.
@skip int, --the number of records to skip
@take int, --the number of records to return
@orderBy varchar(255)=null --allows you to specify the sort order of the data. Supporting a sort parameter is the only new feature your old stored procedure will need to implement to fit this design pattern.
)
as Begin
declare @p varchar(8000)
declare @lastRecord varchar(9)
set @lastRecord = isnull(@skip,0) + case when isnull(@take,0)=0 then 1000000 else @take end
set @p = 'create table #tmptbl (' + @columnDefs + ', _rowNo int identity)
insert into #tmptbl exec ' + @sprocWithInputParams + '
select * from #tmptbl
where (_rowNo > ' + convert(varchar(9), isnull(@skip,0)) + ' AND _rowNo <= ' + @lastRecord + ')
select recordCount=count(_rowNo) from #tmptbl
drop table #tmptbl'
exec(@p)
End
Note that if you supply a @take value of 0, it defaults to returning the first million records. That number is arbitrary; I assumed it is equivalent to retrieving all records for the vast majority of use cases.
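To illustrate, here is a hypothetical call. The procedure name usp_GetOrders, its @customerId parameter, and the column list are made-up examples; substitute your own:

```sql
-- Page rows 21-30 of a stored procedure that returns two columns.
-- usp_GetOrders and @customerId are assumptions for illustration only.
exec usp_SkipTake
    @columnDefs = 'orderId int, orderDate datetime',
    @sprocWithInputParams = 'usp_GetOrders @customerId = 42',
    @skip = 20,
    @take = 10
-- First result set: rows 21-30; second result set: the total record count.
```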
A note about performance: I found that this approach will likely double the time it takes to return data from the database. Next, we will examine how caching can help improve performance.
With caching, the underlying stored procedure runs only when the data changes. Otherwise, the data persists in a real table, and different segments of it are fetched based on the skip and take parameters.
Two new parameters are introduced to the stored procedure:
- @cacheTableName: the table name used to cache the data. I made it so that it must contain tblCache in the name, as a safeguard against accidentally dropping critical tables.
- @dependencyTableName: the table which will be monitored for changes. It must contain a timestamp/record-version column named ts.
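As a sketch, preparing a hypothetical dependency table and calling the cached version might look like this (tblOrders, usp_GetOrders, and the column list are illustrative assumptions):

```sql
-- The dependency table must carry a rowversion/timestamp column named ts:
alter table dbo.tblOrders add ts rowversion
-- Then call the cached wrapper (names here are hypothetical):
exec usp_SkipTake
    @cacheTableName = 'tblCache_Orders',
    @dependencyTableName = 'tblOrders',
    @columnDefs = 'orderId int, orderDate datetime',
    @sprocWithInputParams = 'usp_GetOrders @customerId = 42',
    @skip = 20,
    @take = 10
```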
--#region create tblCacheMeta if it doesn't exist. This table is used to track whether the cached data is still valid.
if object_id('tblCacheMeta') is null Begin
CREATE TABLE dbo.[tblCacheMeta](
[cacheId] [int] IDENTITY(1,1) NOT NULL,
[cacheTable] [nvarchar](100) NOT NULL,
[dependencyTable] [nvarchar](1000) NOT NULL, --must have a timestamp/record-version column named ts
[columnDefs] [varchar](1000) NULL,
[command] [varchar](1000) NULL,
[maxTS] [int] NULL,
[records] [int] NULL,
lastCreateTime dateTime null,
reasonForRebuild varchar(100),
CONSTRAINT [PK_tblCacheMeta] PRIMARY KEY CLUSTERED ([cacheId] ASC),
CONSTRAINT [IX_Table_1_cacheTable] UNIQUE NONCLUSTERED ([cacheTable] ASC)
)
End
GO
create proc usp_SkipTake(
@cacheTableName varchar(100),
@dependencyTableName varchar(100),
@columnDefs varchar(1000), --the column names and dataTypes in the exact same order as they appear in the stored procedure which will be paged.
--example: firstName varchar(50), lastName varchar(50)
@sprocWithInputParams varchar(1000), --this is the stored procedure which will be paged.
@skip int, --the number of records to skip
@take int, --the number of records to return
@orderBy varchar(255)=null --allows you to specify the sort order of the data. Supporting a sort parameter is the only new feature your old stored procedure will need to implement to fit this design pattern.
)
as Begin
declare @p varchar(8000)
declare @lastRecord varchar(9)
set @lastRecord = isnull(@skip,0) + case when isnull(@take,0)=0 then 1000000 else @take end
declare @errStr varchar(255)
if lower(@cacheTableName) not like '%tblcache%'
Begin
set @errStr = 'The cache table name must contain tblCache;
You specified: [' + @cacheTableName + '].
This is a rule of thumb to help you avoid deleting critical tables by accident!'
raiserror(@errStr,16,1)
return
End
declare @createTable bit, @cacheId int
--get the cacheId from the cacheTable
--set the temp table path to the right database.
set @cacheTableName = 'tempdb..' + @cacheTableName
select @cacheId = (select cacheId from tblCacheMeta where cacheTable=@cacheTableName)
if @cacheId is null
Begin
insert into tblCacheMeta (cacheTable, dependencyTable) values(@cacheTableName, @dependencyTableName )
set @cacheId = SCOPE_IDENTITY()
End
--drop the cacheTable and recreate it if out of date (record added, modified, or deleted in the dependency table) or the column defs have changed
--store the current max time stamp and record-count into a temp
create table #tmp1 (maxTS int, records int)
set @p = 'insert into #tmp1 (maxTS, records)
select maxTS=max(ts+0), records=count(*) from ' + @dependencyTableName --ts is a rowversion; adding 0 converts it to int so it can be stored in maxTS
exec(@p)
declare @reasonForRecalc varchar(255) set @reasonForRecalc = ''
print 'object_id of [' + @cacheTableName + '] is ' + isnull(convert(varchar(99), object_id(@cacheTableName)), 'NULL')
if object_id(@cacheTableName) is not null Begin
--check if the column defs have changed
declare @columnDefsChanged bit, @outOfDate bit
set @columnDefsChanged = isnull((select case when [columnDefs]=@columnDefs then 0 else 1 end from tblCacheMeta where cacheId=@cacheId),0)
--set the out-of-date flag
set @outOfDate = isnull((select case when cm.maxTS=m.maxTS and cm.records=m.records then 0 else 1 end from
#tmp1 m, tblCacheMeta cm where cm.cacheId=@cacheId), 0)
--print '@outOfDate: ' + convert(varchar, @outOfDate) + ', @columnDefsChanged: ' + convert(varchar, @columnDefsChanged)
if @outOfDate=1 or @columnDefsChanged=1
Begin
if @columnDefsChanged=1 set @reasonForRecalc = @reasonForRecalc + ' columndefs have changed'
if @outOfDate=1 set @reasonForRecalc = @reasonForRecalc + '; the data is out of date'
set @p='drop table ' + @cacheTableName
exec(@p)
set @createTable = 1
End
End
Else Begin
set @reasonForRecalc = @reasonForRecalc + '; Cache table ' + @cacheTableName + ' does not exist so it will be created.'
set @createTable=1 --turn on the createTable flag so that the table will be created
End
if @createTable=1
Begin
--create and insert into the cacheTable
set @p = 'create table ' + @cacheTableName + '(' + @columnDefs + ', _rowNo int identity)
insert into ' + @cacheTableName + ' exec ' + @sprocWithInputParams
exec(@p)
--update the cache metadata table
update tblCacheMeta set dependencyTable=@dependencyTableName, columnDefs=@columnDefs, command=@sprocWithInputParams, lastCreateTime=getDate(), maxTS=m.maxTS, records=m.records,
reasonForRebuild= @reasonForRecalc
from #tmp1 m
where cacheId=@cacheId
End
set @p = 'select * from ' + @cacheTableName + '
where (_rowNo > ' + convert(varchar(9), isnull(@skip,0)) + ' AND _rowNo <= ' + @lastRecord + ')
select recordCount=count(_rowNo), cacheTable=''' + @cacheTableName + ''' from ' + @cacheTableName
exec(@p)
End
Todo: add a way of deleting stale tables from the tempdb database, possibly by moving the tblCacheMeta table into tempdb. That would let a clean-up script drop the stale cache tables which haven't been referenced in a while.
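A minimal sketch of such a clean-up script, assuming tblCacheMeta's lastCreateTime is a good-enough proxy for last use, might look like this:

```sql
-- Drop cache tables not rebuilt in the last 7 days, then forget them.
declare @p varchar(8000), @tbl nvarchar(100)
declare c cursor for
    select cacheTable from tblCacheMeta
    where lastCreateTime < dateadd(day, -7, getdate())
open c
fetch next from c into @tbl
while @@fetch_status = 0
Begin
    if object_id(@tbl) is not null
    Begin
        set @p = 'drop table ' + @tbl
        exec(@p)
    End
    delete from tblCacheMeta where cacheTable = @tbl
    fetch next from c into @tbl
End
close c
deallocate c
```

Note that lastCreateTime tracks rebuilds, not reads, so a table that is read often but rarely rebuilt would also be dropped; recording a last-accessed time in tblCacheMeta would be more precise.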