Suppose you have a stored procedure that does wonderful things but returns a lot of data. You want to page that data, and you want the paging to happen as far back in the data-retrieval cycle as possible. If your data comes from a SQL Server database, you can apply the pattern described below.
To make the paging minimally invasive to already-working stored procedures, create a new stored procedure that returns the paged record set along with the total number of records. Here it is:
create proc usp_SkipTake(
    @columnDefs varchar(1000),           --the column names and data types, in the exact same order as returned by the stored procedure which will be paged.
                                         --example: firstName varchar(50), lastName varchar(50)
    @sprocWithInputParams varchar(1000), --the stored procedure (with its input parameters) which will be paged.
    @skip int,                           --the number of records to skip
    @take int,                           --the number of records to return
    @orderBy varchar(255) = null         --allows you to specify the sort order of the data. This is the only new feature your old stored procedure will need to implement to fit this design pattern.
)
as
Begin
    declare @p varchar(8000)
    declare @lastRecord int
    set @lastRecord = isnull(@skip, 0) + case when isnull(@take, 0) = 0 then 1000000 else @take end

    set @p = 'create table #tmptbl (' + @columnDefs + ', _rowNo int identity) '
           + 'insert into #tmptbl exec ' + @sprocWithInputParams + ' '
           + 'select * from #tmptbl where (_rowNo > ' + convert(varchar(9), isnull(@skip, 0))
           + ' AND _rowNo <= ' + convert(varchar(9), @lastRecord) + ')'
           + case when @orderBy is not null then ' order by ' + @orderBy else '' end + ' ' --apply the optional sort to the returned page
           + 'select recordCount = count(_rowNo) from #tmptbl '
           + 'drop table #tmptbl'
    exec(@p)
End
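For example, assuming an existing stored procedure named usp_GetCustomers (a hypothetical name) that returns firstName and lastName columns, the third page of 10 records could be fetched like this:

```sql
--hypothetical example: page usp_GetCustomers,
--skipping the first 20 records and returning the next 10 (page 3 at a page size of 10).
exec usp_SkipTake
    @columnDefs = 'firstName varchar(50), lastName varchar(50)',
    @sprocWithInputParams = 'usp_GetCustomers',
    @skip = 20,
    @take = 10
--returns two result sets: the requested page, and the total record count.
```

Note that @columnDefs must list the columns in exactly the order the inner procedure returns them, because the insert into the temp table is positional.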
Note that if you supply a @take value of 0, it defaults to fetching the first million records. That number is arbitrary; I assumed it would amount to retrieving all records in the vast majority of use cases.
A note about performance: I found that this approach will likely double the time it takes to return data from the database. Coming up next, we will examine how caching can help recover that performance.
The underlying stored procedure is run only when the data changes. Otherwise, the data persists in a real table, and different segments of it are fetched according to the skip and take parameters.
Two new parameters are introduced to the stored procedure:
- @cacheTableName: the name of the table used to cache the data. I made it so that it must contain tblCache in the name, for the sake of clarity.
- @dependencyTableName: the table which will be monitored for changes. It must contain a timestamp/record-version column named ts.
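If the dependency table does not already have such a column, one can be added. A minimal sketch, assuming a hypothetical table named tblCustomers:

```sql
--hypothetical example: give the dependency table a record-version column named ts.
--SQL Server maintains a rowversion (formerly called timestamp) column automatically,
--bumping it on every insert or update.
alter table tblCustomers add ts rowversion
```

Because a delete alone does not raise the maximum ts value, the procedure below checks both max(ts) and the record count to decide whether the cache is stale.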
--#region create tblCacheMeta if it doesn't exist. This table is used to track whether the cached data is still valid.
if object_id('tblCacheMeta') is null
Begin
    CREATE TABLE dbo.[tblCacheMeta](
        [cacheId] [int] IDENTITY(1,1) NOT NULL,
        [cacheTable] [nvarchar](100) NOT NULL,
        [dependencyTable] [nvarchar](1000) NOT NULL, --must have a timestamp column named ts
        [columnDefs] [varchar](1000) NULL,
        [command] [varchar](1000) NULL,
        [maxTS] [int] NULL,
        [records] [int] NULL,
        lastCreateTime datetime null,
        reasonForRebuild varchar(100),
        CONSTRAINT [PK_tblCacheMeta] PRIMARY KEY CLUSTERED ([cacheId] ASC),
        CONSTRAINT [IX_Table_1_cacheTable] UNIQUE NONCLUSTERED ([cacheTable] ASC)
    )
End
GO
create proc usp_SkipTake(
    @cacheTableName varchar(100),        --the table used to cache the data; must contain tblCache in its name
    @dependencyTableName varchar(100),   --the table monitored for changes
    @columnDefs varchar(1000),           --the column names and data types, in the exact same order as returned by the stored procedure which will be paged.
                                         --example: firstName varchar(50), lastName varchar(50)
    @sprocWithInputParams varchar(1000), --the stored procedure (with its input parameters) which will be paged.
    @skip int,                           --the number of records to skip
    @take int,                           --the number of records to return
    @orderBy varchar(255) = null         --allows you to specify the sort order of the data. This is the only new feature your old stored procedure will need to implement to fit this design pattern.
)
as
Begin
    declare @p varchar(8000)
    declare @lastRecord int
    set @lastRecord = isnull(@skip, 0) + case when isnull(@take, 0) = 0 then 1000000 else @take end

    declare @errStr varchar(255)
    if lower(@cacheTableName) not like '%tblcache%'
    Begin
        set @errStr = 'The cache table name must contain tblCache; you specified: [' + @cacheTableName + ']. This is a rule of thumb to help you avoid deleting critical tables by accident!'
        raiserror(@errStr, 16, 1)
        return
    End

    declare @createTable bit, @cacheId int

    --set the temp table path to the right database
    set @cacheTableName = 'tempdb..' + @cacheTableName

    --get the cacheId from the cache-metadata table, registering this cache table if it is new
    select @cacheId = (select cacheId from tblCacheMeta where cacheTable = @cacheTableName)
    if @cacheId is null
    Begin
        insert into tblCacheMeta (cacheTable, dependencyTable) values (@cacheTableName, @dependencyTableName)
        set @cacheId = SCOPE_IDENTITY()
    End

    --drop the cacheTable and recreate it if out of date (record added, modified, or deleted
    --in the dependency table) or if the column defs have changed.
    --store the current max timestamp and record count in a temp table
    create table #tmp1 (maxTS int, records int)
    set @p = 'insert into #tmp1 (maxTS, records) select maxTS = max(ts+0), records = count(*) from ' + @dependencyTableName
    exec(@p)

    declare @reasonForRecalc varchar(255)
    set @reasonForRecalc = ''
    print 'object_id of [' + @cacheTableName + '] is ' + isnull(convert(varchar(99), object_id(@cacheTableName)), 'NULL')

    if object_id(@cacheTableName) is not null
    Begin
        --check whether the column defs have changed
        declare @columnDefsChanged bit, @outOfDate bit
        set @columnDefsChanged = isnull((select case when [columnDefs] = @columnDefs then 0 else 1 end
                                         from tblCacheMeta where cacheId = @cacheId), 0)

        --set the out-of-date flag
        set @outOfDate = isnull((select case when cm.maxTS = m.maxTS and cm.records = m.records then 0 else 1 end
                                 from #tmp1 m, tblCacheMeta cm where cm.cacheId = @cacheId), 0)

        if @outOfDate = 1 or @columnDefsChanged = 1
        Begin
            if @columnDefsChanged = 1 set @reasonForRecalc = @reasonForRecalc + ' columndefs have changed'
            if @outOfDate = 1 set @reasonForRecalc = @reasonForRecalc + '; the data is out of date'
            set @p = 'drop table ' + @cacheTableName
            exec(@p)
            set @createTable = 1
        End
    End
    Else
    Begin
        set @reasonForRecalc = @reasonForRecalc + '; Cache table ' + @cacheTableName + ' does not exist so it will be created.'
        set @createTable = 1 --turn on the createTable flag so that the table will be created
    End

    if @createTable = 1
    Begin
        --create and fill the cacheTable
        set @p = 'create table ' + @cacheTableName + ' (' + @columnDefs + ', _rowNo int identity) '
               + 'insert into ' + @cacheTableName + ' exec ' + @sprocWithInputParams
        exec(@p)

        --update the cache-metadata table
        update tblCacheMeta
        set dependencyTable = @dependencyTableName,
            columnDefs = @columnDefs,
            command = @sprocWithInputParams,
            lastCreateTime = getdate(),
            maxTS = m.maxTS,
            records = m.records,
            reasonForRebuild = @reasonForRecalc
        from #tmp1 m
        where cacheId = @cacheId
    End

    set @p = 'select * from ' + @cacheTableName
           + ' where (_rowNo > ' + convert(varchar(9), isnull(@skip, 0)) + ' AND _rowNo <= ' + convert(varchar(9), @lastRecord) + ')'
           + case when @orderBy is not null then ' order by ' + @orderBy else '' end --apply the optional sort to the returned page
           + ' select recordCount = count(_rowNo), cacheTable = ''' + @cacheTableName + ''' from ' + @cacheTableName
    exec(@p)
End
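Calling the cached version looks like this (the table and procedure names are hypothetical):

```sql
--hypothetical example: page usp_GetCustomers through a cache table in tempdb.
--the first call runs usp_GetCustomers and fills tempdb..tblCacheCustomers;
--subsequent calls serve pages from the cache until tblCustomers changes.
exec usp_SkipTake
    @cacheTableName = 'tblCacheCustomers',
    @dependencyTableName = 'tblCustomers',
    @columnDefs = 'firstName varchar(50), lastName varchar(50)',
    @sprocWithInputParams = 'usp_GetCustomers',
    @skip = 20,
    @take = 10
```

On a cache hit, the only work done is the two selects against the cache table, which is where the performance win over the uncached version comes from.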
Todo: Have a way of deleting stale tables from the tempdb database, possibly by moving the tblCacheMeta table into tempdb. This would allow a clean-up script to drop stale cache tables which haven't been referenced in a while.
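One possible sketch of such a clean-up script, assuming staleness is measured from tblCacheMeta.lastCreateTime (the 24-hour threshold and the scheduling mechanism are assumptions, not part of the pattern above):

```sql
--hypothetical clean-up sketch: drop cache tables not rebuilt in the last 24 hours
--and remove their metadata rows.
declare @tbl varchar(100), @p varchar(200)
declare c cursor for
    select cacheTable from tblCacheMeta
    where lastCreateTime < dateadd(hour, -24, getdate())
open c
fetch next from c into @tbl
while @@fetch_status = 0
Begin
    if object_id(@tbl) is not null
    Begin
        set @p = 'drop table ' + @tbl
        exec(@p)
    End
    delete from tblCacheMeta where cacheTable = @tbl
    fetch next from c into @tbl
End
close c
deallocate c
```

Note this only handles tables that were rebuilt long ago; tracking the last *read* time would require adding a lastAccessTime column to tblCacheMeta and updating it on every cache hit.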