Hello everyone. I am pretty confused on whether I should be adding an autoIncremeted Primary key (surrogate key) to my table, or simply defining a primary Key on my 6 column Natural key in my specific instance. I want to give some background as to why the confusion and some information on my data-base table and I am hoping people can comment.
Background on Table Purpose:
My company receives files from clients that have ~200 columns. I created one very large data-table (I know I could theoreticallysplit it out into multiple tables, let's ignore that for now). The table now has 14 million + records. The Natural Key on the table are the following columns.
LoanNumber, Module, Data-month, LoanServicer,ClientName,LoadedDate. (Note LoadedDate IS party of the key, since I can get the same record on multiple days, and I want to store all records so the LoadedDate is included to make the row unique).
Every single query that is going to be run against the database will use 3-5 of the Natural Key columns. They will be queries such as "select * from population where Datamonth='8/31/2015' and module='ABC' and ClientName='Client #1' and LoanServicer='Servicer1".
The table gets updated with ~1 million rows a month, which happens in 100-200k records submitted per transaction. Meaning I have 5-10 transactions a month generating new records.
Because of this fact I thought I should define a PRIMARY KEY (NON clustered) on these columns. It is my understanding that a Primary Key constraint will require the columns to be unique and Not Null (which I need) and also add indexes to the columns. I then added an Auto-incremented (Identity) Column to the database and made that the clustered index. (Since it is an IDENTITY column it is always increasing and a good candidate for the Clustered Index). However from reading online, I have gotten the impression that your Clustered index should also Be your Primary key.... Is this true? I don't see why they have to relate to each-other at all?
My Questions are....
#1 Should I make my Identity column my primary key and remove it from my Natural Keys? (Aka should I have a surrogate key by definition)
#2 If I do this I still want to maintain Uniqueness on my natural keys... How should I do this? Should I simply make a Not Null Unique constraint?
#3 Zero queries will access this database through the now "surrogate" key (the auto-incremented natural key).... They will all use the natural key columns... So do I need to add regular indexes back onto the Natural Key columns? If so what type of indexes should I include on them?
#4 What's the point of a clustered index on an auto-incremented column if that column will Never be used in queries? I know it is "not good practice..." but what if I put a clustered index on the 6 natural key columns (5 of which are large varchars() ).... Since a clustered index helps reads from the table, wouldn't this be the best option? It would slow down inserts some (and cause index fragmentation), but if i'm only doing inserts 5-10 times a month is this really a problem? Couldn't I just re-build the index nightly during system off hours?
I appreciate all of your help and guidance!
- Edited by Bloodofgods 2 hours 37 minutes ago