Pages

Sunday, 20 July 2014

Sybase Temp DB Thoughts

I was asked by a friend of mine to write up my thoughts on configuring tempdb in Sybase.

Tempdb is a scratchpad database within Sybase ASE. It is used for storing temporary results from queries that are running, such a order by or group. It is also used for storing the results of queries that will be re-used in subsequent queries. Typically these interim results do not need to be persisted beyond the life of the database session or stored procedure.

Sizing

I don't think to much about this at the start, it is all dependent on the queries being executed. It is rarely dependent on the size of the data being stored on the server. There is no reason why a database server storing 100GB of data would need 10GB of tempdb for the application to function. The correct size of tempdb is determined by the applications use. Tempdb is used to support the following operations:

order by
group by
distinct
union
in addition to any temporary (#) or permanent tables that are created.

Probably a reasonable starting size is 500MB and then resize as necessary.

Placing tempdb

Sybase requires a tempdb to start and as such it is created during the install. As usual I'm creating these using the command line and here's the resource file for the ASE on this laptop. I specify tempdb to have its own device and a size of 500MB.

sybase@laptop:/opt/sybase-ase-15_7/ASE-15_0/init/sample_resource_files$ sdiff -w 200 laptop_sql.rs srvbuild.adaptive_server.rs | grep tempdb
sqlsrv.tempdb_device_physical_name: /syb_devices/laptop_sql/temp01.dev               |    sqlsrv.tempdb_device_physical_name: PUT_THE_PATH_OF_YOUR_TEMPDB_DEVICE_HERE_OR_REMOVE_THIS_LINE
sqlsrv.tempdb_device_size: 500                                                           |    sqlsrv.tempdb_device_size: USE_DEFAULT
sqlsrv.tempdb_database_size: 500                                                         |    sqlsrv.tempdb_database_size: USE_DEFAULT

If the defaults are accepted, tempdb is placed on the master device. This increases the risk to the master device because there is a potential that tempdb writes might corrupt the device at some point in the future. If the defaults have been accepted, it is impossible to completely move tempdb from the master device, although damage can be restricted by removing segments once a tempdb device has been created.

Consider the following:

Tempdb has been created on the master device. To restrict the damage to the master device create a tempdb device, alter the database so that the data and long segments are on the device and then remove the default, system and logsegment segments from the master device.

disk init name="tempdb01", physname="/syb_devices/laptop_sql/tempdb01.dev", size=256000
go
alter database tempdb on tempdb01 = 500
go


Once altered, the segments can be removed from the master device

use tempdb
go
sp_dropsegment 'default', tempdb, master
go
sp_dropsegment 'system', tempdb, master
go
sp_dropsegment 'logsegment', tempdb, master
go



Default has to be in quotes because it clashes with the default keyword.

After the last segment has been dropped the following is shown.
 
DBCC execution completed. If DBCC printed error messages, contact a user with System Administrator (SA) role.
DBCC execution completed. If DBCC printed error messages, contact a user with System Administrator (SA) role.
The last-chance threshold for database tempdb is now 16 pages.
Segment reference to device dropped.
WARNING: There are no longer any segments referencing device 'master'.  This device will no longer be used for space allocation.
(return status = 0)
1>


This can be confirmed by running sp_helpdb...

1> sp_helpdb tempdb
2> go
 name         db_size                    owner      dbid     created                  durability             lobcomplvl           inrowlen       
     status                                                                                                                                                                    ; ; ; ;         
 ------------ -------------------------- ---------- -------- ------------------------ ---------------------- -------------------- ----------------
     ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 tempdb            504.0 MB              sa            2     Jul 12, 2014             no_recovery                     0               NULL       
     select into/bulkcopy/pllsort, trunc log on chkpt, mixed log and data, allow wide dol rows                                                                                         

(1 row affected)
 device_fragments                                             size                       usage                                    created                                          
     free kbytes                     
 ------------------------------------------------------------ -------------------------- ---------------------------------------- --------------------------------------------------
     --------------------------------
 master                                                              4.0 MB              data only                                Dec 25 2013  7:23PM                              
                 2344                
 tempdbdev                                                         500.0 MB              data and log                             Dec 25 2013  7:24PM                              
               509846                
 device             segment                                                   
 ------------------ ----------------------------------------------------------
 master              -- unused by any segments --                             
 tempdbdev          default                                                   
 tempdbdev          logsegment                                                
 tempdbdev          system                                                    
(return status = 0)

It can seen the master device is unused by any segments.

Physical Considerations

The physical devices used for temp should be the fastest devices available. Tempdb receives frequent updates and being able to write the updates to disc quickly can have a significant impact on performance.

Simply starting the ASE shows how the underlying storage affects the performance, tempdb_hd takes 2 minutes 25 seconds to initialise where tempdb_ssd takes just over 1.5 seconds. The two databases are the same size and configuration in every respect, except one is on a rotating disk and the other is on a solid state disk.

00:0006:00000:00001:2014/07/13 22:16:10.34 server  Clearing temporary database 'tempdb_hd'.
00:0006:00000:00001:2014/07/13 22:16:21.34 server  Processed 1800 allocation unit(s) out of 18000 units (allocation page 460544). 10% completed.
00:0006:00000:00001:2014/07/13 22:16:30.26 server  Processed 3600 allocation unit(s) out of 18000 units (allocation page 921344). 20% completed.
00:0006:00000:00001:2014/07/13 22:16:30.80 server  Processed 5400 allocation unit(s) out of 18000 units (allocation page 1382144). 30% completed.
00:0006:00000:00001:2014/07/13 22:16:31.27 server  Processed 7200 allocation unit(s) out of 18000 units (allocation page 1842944). 40% completed.
00:0006:00000:00001:2014/07/13 22:16:31.75 server  Processed 9000 allocation unit(s) out of 18000 units (allocation page 2303744). 50% completed.
00:0006:00000:00001:2014/07/13 22:16:32.23 server  Processed 10800 allocation unit(s) out of 18000 units (allocation page 2764544). 60% completed.
00:0006:00000:00001:2014/07/13 22:17:08.94 server  Processed 12600 allocation unit(s) out of 18000 units (allocation page 3225344). 70% completed.
00:0006:00000:00001:2014/07/13 22:17:18.07 server  Processed 14400 allocation unit(s) out of 18000 units (allocation page 3686144). 80% completed.
00:0006:00000:00001:2014/07/13 22:17:27.38 server  Processed 16200 allocation unit(s) out of 18000 units (allocation page 4146944). 90% completed.
00:0006:00000:00001:2014/07/13 22:17:36.42 server  Processed 18000 allocation unit(s) out of 18000 units (allocation page 4607744). 100% completed.



00:0006:00000:00001:2014/07/13 22:17:36.44 server  Clearing temporary database 'tempdb_ssd'.
00:0006:00000:00001:2014/07/13 22:17:36.81 server  Processed 1800 allocation unit(s) out of 18000 units (allocation page 460544). 10% completed.
00:0006:00000:00001:2014/07/13 22:17:36.95 server  Processed 3600 allocation unit(s) out of 18000 units (allocation page 921344). 20% completed.
00:0006:00000:00001:2014/07/13 22:17:37.10 server  Processed 5400 allocation unit(s) out of 18000 units (allocation page 1382144). 30% completed.
00:0006:00000:00001:2014/07/13 22:17:37.24 server  Processed 7200 allocation unit(s) out of 18000 units (allocation page 1842944). 40% completed.
00:0006:00000:00001:2014/07/13 22:17:37.38 server  Processed 9000 allocation unit(s) out of 18000 units (allocation page 2303744). 50% completed.
00:0006:00000:00001:2014/07/13 22:17:37.54 server  Processed 10800 allocation unit(s) out of 18000 units (allocation page 2764544). 60% completed.
00:0006:00000:00001:2014/07/13 22:17:37.68 server  Processed 12600 allocation unit(s) out of 18000 units (allocation page 3225344). 70% completed.
00:0006:00000:00001:2014/07/13 22:17:37.83 server  Processed 14400 allocation unit(s) out of 18000 units (allocation page 3686144). 80% completed.
00:0006:00000:00001:2014/07/13 22:17:37.97 server  Processed 16200 allocation unit(s) out of 18000 units (allocation page 4146944). 90% completed.
00:0006:00000:00001:2014/07/13 22:17:38.11 server  Processed 18000 allocation unit(s) out of 18000 units (allocation page 4607744). 100% completed.


Implementing tempdb on solid state disks has it's own issues. Solid state disks have a "write life", at some point the disk will become unusable and will need to be replaced. This should be considered when calculating the Total Cost of Ownership for the database server. In the examples above the spinning disk is a Segate Momentus XT 750 GB spinning at 7,200 rpm with a SATA 6Gb/s interface. The solid state disk is an OCZ Agility 3, 240GB.

Enterprise hard disks start at around 7,200 rpm and can spin as fast as 15,000 rpm. If there is a choice, choose the fastest storage available.

In the past there was a temptation to place tempdb in a RAM disk. Under Solaris, it was simply a case of placing tempdb in the /tmp filesystem. The /tmp filesystem is a tmpfs file system that is located in virtual memory. While there is free memory, the tempdb will be in RAM and as the memory is used up, the tempdb will be transferred to swap. On paper this sounds like it's a way of finding better performance for tempdb. It works, provided the system administrator and database administrator agree on the size of the database server and the RAM requirements. However in a more varied environment, tempdb's performance may become erratic as increasing memory demands are placed on the system.

Also consider this, RAM allocated for use as a RAM disk cannot be used as data cache by ASE. Adding the extra memory to the default data cache might benefit all the tables and indexes, rather than just the ones in tempdb. There is also a good chance the data in tempdb remains in cache for the duration of it's requirement, effectively being stored in memory twice.

Dsync and DirectIO

These are are not applicable on raw devices because Sybase ASE writes directly devices. They were introduced because users were creating databases on Unix filesystems, which while flexible is not recommended for production use. It's worth understanding what these options are and why they were introduced.

Creating devices on filesystems used to be filled with the potential for catastrophe. I have been called out to rescue database servers from events where the databases have been corrupted by file system devices. If a device is created within the unix filesystem, unix is managing the writes to the device. ASE requests the operating system to write to an address within a file and passes the data to the operating system to perform the write. Before the dsync and direct io options, the operating system would return a notification saying the write had happened, even when the write had only been written to the operating system cache. This could cause a problem in the event of a sudden outage because the writes may not have been flushed to disk, regardless of what ASE had been informed.

The "dsync" option prevents this caching of the write. Sybase uses dsync to open the device and any subsequent writes happen synchronously. This means Sybase physically waits for the write to be performed before processing the next write. Normally disk IO is performed asynchronously in ASE and once completed ASE receives the notification of the success or failure of the IO, with dsync, ASE waits for the IO to complete. This significantly impacts the performance of writes to those devices experiencing writes. With tempdb being supported by filesystem devices, I recommend dsync be disabled because the durability of the writes is unimportant and there is a significant performance overhead.

The directio option allows ASE to by-pass the operating system's write cache. This option clearly solves the issue of the operating system caching writes and preventing recovery from happening. DirectIO has a small performance impact, but guarantees the updates to the device will be durable, even in the event of an ASE failure. The IO happens happens asynchronously and performance is very similar to raw disk IO. For tempdb in supported by files system devices I recommend directio be disabled, because it adds no benefit and introduces a small performance degradation.

Tuning dsync and directio can be done when creating the tempdb devices or subsequently using sp_deviceattr:

exec sp_deviceattr tempdb_hd, directio, false
exec sp_deviceattr tempdb_ssd, directio, false
go


The setting can be confirmed in sp_helpdevice

1> sp_helpdevice
2> go
 device_name            physical_name                                                                              
     description                                                                                                                                                                  & ; ;nbs p;                                         
     status       cntrltype          vdevno       vpn_low        vpn_high        
 ---------------------- --------------------------------------------------------------------------------------------
     ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     ------------ ------------------ ------------ -------------- ----------------
 data01                 /syb_devices/laptop_sql/data01.dev                                                     
     file system device, special, dsync off, directio on, physical disk, 1000.00 MB, Free: 400.00 MB                                                                                                                        
          2               0               4             0          511999        
 data02                 /syb_devices/laptop_sql/data02.dev                                                     
     file system device, special, dsync off, directio on, physical disk, 6000.00 MB, Free: 0.00 MB                                                                                                                          
          2               0               6             0         3071999        
 logs01                 /syb_devices/laptop_sql/logs01.dev                                                     
     file system device, special, dsync off, directio on, physical disk, 1000.00 MB, Free: 400.00 MB                                                                                                                        
          2               0               5             0          511999        
 master                 /syb_devices/laptop_sql/master.dev                                                     
     file system device, special, dsync on, directio off, default disk, physical disk, 200.00 MB, Free: 177.00 MB                                                                                                           
          3               0               0             0          102399        
 sysprocsdev            /syb_devices/laptop_sql/sybsystemprocs.dev                                             
     file system device, special, dsync off, directio on, physical disk, 172.00 MB, Free: 0.00 MB                                                                                                                           
          2               0               1             0           88063        
 systemdbdev            /syb_devices/laptop_sql/sybsystemdb.dev                                                
     file system device, special, dsync off, directio on, physical disk, 3.00 MB, Free: 0.00 MB                                                                                                                             
          2               0               2             0            1535        
 tempdb_hd              /tempdb_hd/tempdb_hd_1.dev                                                                 
     file system device, special, dsync off, directio off, physical disk, 9000.00 MB, Free: 0.00 MB                                                                                                                         
          2               0               7             0         4607999        
 tempdb_ssd             /tempdb_ssd/tempdb_ssd_1.dev                                                               
     file system device, special, dsync off, directio off, physical disk, 9000.00 MB, Free: 0.00 MB                                                                                                                         
          2               0               8             0         4607999        
 tempdbdev              /syb_devices/laptop_sql/temp01.dev                                                     
     file system device, special, dsync off, directio off, physical disk, 500.00 MB, Free: 0.00 MB                                                                                                                          
          2               0               3             0          255999 
       

(9 rows affected)
(return status = 0)

Splitting the segments

Tempdb typically has data and log in the same segments, this OK from a transaction recovery point of view, but does mean that data and transaction IO is competing for the same physical resources. For the highest performance, tempdb should be separated into it's three segments; logsegment, default and system. The logsegment is the same as other databases, this is where the database's transaction log is recorded and every ddl and dml statement affects it. The default segment is where "permanent" temporary tables are placed. Those are tables that defined in the "model" database and created in every new database and recreated in tempdb with each restart. They are also the tables created by the users where the table name does not have a # at the start of the name. The system segment contains the system tables and the traditional temporary tables starting with #.

By splitting the DB by segment across a number of devices, you may reduce the amount of IO contention. On a database server with a lot of tempdb activity this may be a worthwhile configuration change. For most cases it simply introduces an additional level of management overhead because there are three times as many devices to manage for tempdb and anyone of them can fill. Recommendation, probably not worth the hassle for most database servers.

Use sp_sysmon to detect large number of writes to tempdb devices.

Cache binding

Binding objects to caches helps keep those objects in cache, similarly binding a database to a cache keeps all the objects in that database in cache, however cache binding reduces the flexibility of the database server. Consider the following. If the application requires high speed access to a set of tables, the natural action is to bind those tables to a cache. While those tables will be cached in they are limited to the size of the cache they are bound to. An alternative solution is create a smaller named cache and bind the less important tables to this cache. This prevents them from pushing the important data from cache. This solution is harder to tune because it easy to incorrectly size the named cache. Application testing and monitoring with sp_sysmon should be done to ensure acceptable cache hit rates. The benefit is the less important tables keep to their area of the cache and are prevented from flushing the more important data from its cache.

Log IO Size

Tempdb is fairly log intensive, lots of updates to the devices. Sybase recommends a log IO size of 4K to prevent log cache flushes. When a transaction log page is written, it is written to the user log cache. A user log cache flush is fairly expensive during the middle of the transaction. Sybase lists the events as:
  • When a transaction ends, either a commit or a rollback has happened.
  • When the user log cache is full. The transaction has written a large amount of data and has filled the cache associated with the user
  • When the transaction is changing tables in another database
  • When another process needs to write to a page referenced in the user log cache. This is a reason for not making the ULC too large
The user log cache can fill quickly in an environment with large transactions or high transaction rates, using a log IO size of 4K reduces this.

To ensure the user log cache uses 4K IOs, create a 4K buffer pool in the data cache used by the database.

If the database uses 2K pages or the number of the number of log writes per second is high or the number of log writes per transaction is above 1, consider 4K IO for the user log cache.

sp_poolconfig can be used to specify a 4K IO buffer pool.

exec sp_poolconfig 'default data cache', '20M', '4K'
go

The log IO size can be specified for each database and 4K is a recommendation for all, not just tempdb. The stored procedure sp_logiosize can be used to specify the log IO size for each database independently.

To confirm the size of the log IO, use the stored procedure sp_logiosize as follows:

exec sp_logiosize 'all'
go

Configuring 16K buffer pool

A 16K buffer pool can significantly help some classes of queries, queries that table or index scan, select into queries. A 16K pool allows larger amounts of data to processed at one time. with 2K pages a 16K pool can load 8 pages at a time and save on the physical IO. A 16K IO pool can be used to help all the tables, not just the ones in tempdb.