What is Compression
- By minimizing the disk space that is used by your data and indexes, it’s easy to save money
- Helps improving I/O
- Ability to store data rows in compressed format on disk
- Saves up to 90% of row storage space
- Ability to estimate possible compression ratio
- Fits more data onto a page
- Fits more data into buffer pool
- Reduces logical log usage
How IDS Storage Optimization works!
- By considering the entire row and all its columns
- IDS looks for repeating patterns and stores those patterns as symbols in a compression dictionary
- By considering the entire row and all its columns
- IDS looks for repeating patterns and stores those patterns as symbols in a compression dictionary
Creating a compression dictionary
Compressing the data in table
- After creating the dictionary, IDS starts a background process that goes through the table or fragment and compresses the table or fragment rows.
- The process compresses each row and leaves it in the page where it was compressed. Any new rows that are inserted or updated are also compressed.
- This compress operation runs while other transactions and queries are occurring on the table.
- Therefore, IDS performs the operation in small transactions and holds locks on the rows being actively compressed for only a short duration.
Reclaiming free space
- After all the rows have been repacked, the shrink operation removes the unused table or fragment space and returns free space to the dbspace that contains the table or fragment.
What we are using behind the scene!
- Lempel-Ziv (LZ) based algorithm – static dictionary, built by random sampling
- Frequently repeating patterns replaced with 12-bit symbol numbers
- Any byte that does not match a pattern is also replaced with a 12-bit reserved symbol number
- Patterns can be up to 15 bytes long
- Max possible compression = 90% (15 bytes replaced with 1.5 bytes = 12 bits)
Performance Impact of Compression
- IO-bound workloads
- Compression may improve performance by reducing IOs (both data page and logical log)
- More data fits on a page, therefore more in buffer pool
- Log records are smaller, therefore less logging
- For CPU-bound workloads
- Additional CPU used to compress and expand rows
- Should not be a large impact
What cannot be compressed
- You cannot compress data in rows in the following database objects: tables or fragments that are in the sysmaster, sysutils, sysuser, syscdr, and syscdcv1 databases
- Catalogs
- Temporary tables
- The tblspace tblspace
- Internal partition tables
- Dictionary tables (these tables, one per dbspace, hold compression dictionaries for the fragments or tables that are compressed in that dbspace and metadata about the dictionaries.)
- You cannot compress a table while an online index build is occurring on the table
- Encrypted data, data that is already compressed by another algorithm
- Small tables are not good candidates for compression, because you might not be able to gain back enough space from compressing the rows to offset the storage cost of the compression dictionary.
Compression is a 3 Phase Operation
- Compress
- Repack
- Shrink
And it has some prerequisites:
- There must be at least 2,000 rows in each fragment of the table, not just a total of 2,000 rows in the whole table.
- You must be able to connect to the sysadmin database (by default only user OneDB), and you must be a DBSA.
- Logical and physical logs are large enough to handle normal processing and compression operations. Compression, repacking, and shrinking, operations can use large amounts of logs.
Storage optimization methods
- You can use the COMPRESSED option in the CREATE TABLE statement to enable automatic compression of the table when the table has at least 2000 rows.
- You can use the COMPRESSED option in the CREATE INDEX statement to enable automatic compression of the index if the index has 2000 or more keys. Compression is not enabled if the index has fewer than 2000 keys.
- You can use the SQL administration API task or admin function to perform any type of storage optimization on a table, fragment, or index.
Compression with replication
- All are supported on compressed tables
- HDR/RSS/SDS
- Tables will be compressed on secondary if they are compressed on primary
- ER
- Compression status of tables is independent between source and target, specified by user
Compression Benefits
- Reduce the space occupied by the row
- Compressing a table can be done online
- Compress either a table or fragment
- Smaller Archives
- More data in the buffer pool
- Fewer long/forwarded rows
- Few I/O for same amount of data read/written
- You have saved your money
Uncompressed data
- You can uncompress a previously compressed table or fragment.
- Uncompressing a table or fragment deactivates compression for new inserts and updates, uncompress all compressed rows, and marks the compression dictionary as inactive.
- Once a dictionary is marked as inactive, IDS will no longer use it to compress new or changed rows.
- The uncompress operation normally needs to allocate new pages for rows that no longer fit on their original pages
- You can uncompress online or offline, using the uncompress or uncompress_offline argument.
- An uncompress_offline operation is the same as the uncompress operation, except this operation is performed while holding an exclusive lock on the fragment, preventing all other access to the fragment data until the operation is completed
Uncompress Data- API’s
- EXECUTE FUNCTION task(“table uncompress”, “table_name”, “database_name”, “owner_name”);
- EXECUTE FUNCTION task(“table uncompress_offline”, “table_name”, “database_name”, “owner_name”);
Monitoring Interface
- System-Monitoring Interface (SMI)
- IDS stores compression dictionaries, both active and inactive, in a dictionary repository called rsccompdict.
- Each dbspace that contains a compressed table or fragment has a dictionary repository.
- The syscompdicts view in the sysmaster database brings together the information from all dictionary repositories. Each row represents a compression dictionary associated with a table or fragment by its part number (partnum).
- Listing 3 shows an active dictionary that only has creation information (such as a creation timestamp, the log unique id, and the log). An inactive dictionary would have additional drop information.
Monitoring Interface : SMI/Oncheck
Compression Magic
How DBA Can get a good appraisal!
- Reducing the number of extents, a table contains
- Move all rows to the beginning of a table
- Return unused space at the end of a table to the system
- Shrink a partial used extent at the end of a table
Comment wrap
Start a Conversation with Us
We’re here to help you find the right solutions and support you in achieving your business goals.