Data Warehouse Compression

Dear Pros

I have a large scale data warehouse, that is been loaded with 100+ GB every day (spreaded throughout "Seconds" of the 24 hours). This results in a huge table size (TBs), and of course  degraded performance both in querying and loading.

I am reading about DWH optimization, and I came across the term Compression. I've seen a remarkable improvement in querying. My questions are:

1) Can I load data into a compressed table? In other words, can compression be applied on a LIVE table?

2) If no. 1 is valid, does compression improves or negatively affect loading process? Is there any issues with data consistency?

3) Can compression be applied on a partitioned table? What are the implications?

4) I would ask same above questions, but this time for index compression (for 2: index rebuild or update)

I appreciate your kind support.

Regards

April 26th, 2015 6:26am

In general large Data Warehouse tables should be compressed with PAGE compression or (2014+) be Clustered Columnstore Indexes (CCI).  The CCI is not only has the highest level of compression, but it's designed to be very fast to scan, supporting data warehouse queries.

1) Yes.  Compression on a table or index can be changed on REBUILD.

2) Typically compression makes loading slower, though often not by much.  It depends on the kind of compression.

3) Yes.  Compression can be specified on a per-partition basis.  Except for Columnstore compression, which is only available for secondary Columnstore indexes and Columnstore Clustered Indexes.

4) Table and index compression are the same thing.  Non-leaf pages are not PAGE compressed, but can benefit from ROW compression.

April 26th, 2015 11:56am

Thank you Visakh and David, it was really helpful from you guys :D :D

Free Windows Admin Tool Kit Click here and download it now
April 27th, 2015 2:59am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics