Partition index and table

Hello people and thank you for reading this question. 

lets say I have a table A with historical sales, this table is partitioned by year, each year on a different filegroup/partition, each filegroup on a different hardisk on another NAS server. index is also clustered and partitioned with same function.

90 % of my reports get information using the whole year.

Now I have a table B with same data, but partitioned by number of the month, so 12 files, 12 partitions on 12 hardisk, but cluster index is by year. 

Using table A, when I got information, it will load the information from 1 disk, so I am limited to access speed of only one disk.

Using table B, I got the speed of all disks, but the index need to look all over all disks.

with my logic, table B should be faster? or not? or I am damaging the

August 6th, 2013 5:39pm

Each partition of the table (clustered table or a heap table) or partition of the non-clustered index is stored on a single filegroup. Several or all partitions or even all objects can share the same filegroup. The key points are:

1) single filegroup can consist of more than one files. And should, in most of the cases! Think about 4 equal files per filegroup. Too much files in filegroup is also not good, but single file per filegroup is often not good.

2) you hold a database file on a single disc? That is not just bad for performance (because multiple discs running in parallel will easily outperform that) but it is also dangerous. If the disk is damaged, you don't have redundancy and you have downtime and lower availability. Consider RAID10 arrays. You need at least 4 disks, even number of disks. Capacity is half of the disks, but it is the king in speed and reliability, for both reads and writes. If RAID10 is too expensive for the available budget for disks, consider RAID5. RAID5 is much much slower in writes, several times slower than RAID10. Reads are ok, comparable to RAID10, sometimes slightly exceeding.

3) build your clustered and non-clustered indexes so data you read and write is physically close together. Typically, it is by time value followed by id. E.g. (DATE, INT) would be the typical clustered key for such tables. INT is identity, ever-growing column, and DATE is also growing most of the time.

4) Consider combining partitioning with partitioned views. It is a view that combines tables that are partitioned. Partitioned view enables you independent indexing and independent online rebuilds. In SQL2008R2 single partition rebuild is offline operation.

5) Compressing the older (less used) partitions can help you reduce space occupies

IMHO, the biggest problem you have are disks, not the partitions. Build the RAID10 array for current data that is frequently read and written, and build RAID5 array for the older data that are mostly used for reading.

Free Windows Admin Tool Kit Click here and download it now
August 6th, 2013 7:09pm

>>>partitions on 12 hardisk, but cluster index is by year.

What do you mean, you have  yearid col?  

Can you compare the execution plan? I would see that using table A might be faster than using table B but it seems there are many differences between two approaches.

August 7th, 2013 5:15am

Yes, there is a YEAR column and a Month Column (int value). - Note, there is also a date type column, just to maintain integrity of the date. 
Free Windows Admin Tool Kit Click here and download it now
August 7th, 2013 5:17pm

Generally if your query touches one partition it would more efficiently rather touching all 12 partitions....

August 10th, 2013 2:37am

Hello, the thing is that right now, when somebody want to make a report anual report, that specific HD goes to 100% ussage making any work on the same disk a disaster on performance because its a bottleneck.  For this reason I was thinking on this kind of approach. 

Free Windows Admin Tool Kit Click here and download it now
August 12th, 2013 10:10pm

Try with B. If the query execution plan is paralellized, you should see improvement. If it is not paralellized, you will still have one disk at 100% and the rest 11 disks idle. Much finer level of paralellization, at stripe level, you will get by putting your disks in to RAID array. Backup the database, reconfigure storage to raid 10 or 5 with stripe size 64KB, format LUNs with 64KB allocation unit size, restore the database. You should see all the disks work in paralell now, regardless of the execution plan.
August 13th, 2013 4:54am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics