Delta (CRL) Force? (Network Steve Forum)

Folks, I'd like to open a discussion about why delta CRLs have such a seemingly significant role in Microsoft PKI documentation - such that most people would infer that they are generally appropriate (in much the same way that three tiers appear to have become the perceived norm).

I've worked on countless PKI engagements and have never had a compelling reason to implement D-CRLs. Sure, in the very biggest PKIs with monster CRLs I could see how there might be some advantage. However, given that much of the Microsoft PKI documentation suggests things like one hour deltas (gulp), I'm imagining that most people implement D-CRLs on the misconception (in my mind) that they improve the freshness of revocation status in their solutions. I suggest this is a misconception because D-CRLs are critical in just the same way that a base CRL is. So, if you had one hour D-CRLs and for whatever reason they were not available... phaaaarp!

A general principle I've worked to is designing base CRL validity periods around SLAs to recover broken CAs, generally a day or two - given that, I've not really had much use for D-CRLs. I know that there are other means to keep things ticking along in the event of broken CAs, like CRL re-signing, but I like to keep these in reserve for real disaster scenarios rather than design them in as a factor influencing CRL validity periods.

Anyhows, this post isn't intended as a criticism, a rant, or even a question. It's simply an invitation for anyone to share any experiences they've had that justify D-CRLs, as I'd like to be convinced ;-)

Cheers, Dave

July 12th, 2010 5:40pm

To me, there are 2 primary reasons for using deltas:

1) As you already pointed out - size. A larger base CRL can be cumbersome to download for some clients, so a delta offers a much smaller update on a more regular basis.

2) Again, alluded to in your posting above - an answer to the formulaic debate of update frequency vs. recovery time. You want to update your CRL often enough that it is meaningful - so that if you had to revoke something right now, relying parties would definitely start distrusting it within an acceptable period of time. However, if you published a base CRL every hour and there were an issue with publishing the CRL, or something dramatic happened to the CA, then you would have 1 hour to recover from it - this is not realistic for most recovery scenarios. If a delta publishing fails, clients do not err the same way that they do when a base CRL fails. So a day or a week for a base CRL is more common for recovery flexibility, with a much more frequent delta CRL for the update frequency.

All this being said - as you are debating, delta CRLs are not the end-all be-all of CRL greatness. For most non-commercial PKIs, base CRLs are not very large. We've got hundreds of thousands of machine certs annually here, with a few thousand manual issuances, and tens of thousands (and quickly growing) of various user certs. Our largest CRL is currently a mere 14 KB.

Delta CRLs are not always recognized by all PKI-aware software - particularly handhelds/appliances. This is getting better, but it is still an issue.

For most smaller companies, a day or a few days is an acceptable turnaround time for recognition of a revoked certificate. You can always keep your base CRL recovery time at a day or two (or whatever), and instead of issuing every half-life you can issue hourly. Clients will still refresh on the same cycle - if anything they will probably be more spread out, reducing potential network congestion. If there's something big enough, you could push a log-on script or an email to update the CRL manually instead of waiting for the cached copy to expire.

Another aspect to throw into the discussion is OCSP. OCSP gives real-time or near real-time responses of very tiny size, eliminating the CRL in the first place for 99.99% of all revocation checks in OCSP-enabled environments. Note that some OCSP installations are unfortunately CRL based (*ahem*), so in these cases it is more important to issue an updated CRL more frequently so that the OCSP responses are appropriately valid. In this case the CRLs should be issued in harmony with the OCSP response validity period.

Large base CRLs are only partially addressed by deltas - the client does still need to download the base occasionally, and where OCSP is not available you may need a different and even less supported method, such as CRL partitioning, mini-CRLs, etc., to address those issues.
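To put rough numbers on the update frequency vs. recovery time debate, here is a minimal sketch. It is not how any particular CA computes its schedule: the function name is made up for illustration, and it assumes that clients cache a CRL until it expires and that the CA breaks just before its next scheduled publication.

```python
from datetime import timedelta


def crl_schedule_tradeoff(validity, publish_interval):
    """Rough worst-case figures for one CRL publishing schedule.

    Assumptions (illustrative only): clients cache a CRL until its
    validity period ends, and the CA fails just before it would have
    published the next CRL.
    """
    if publish_interval > validity:
        raise ValueError("publishing less often than the validity period leaves gaps")
    return {
        # A client that fetched a CRL just before a revocation keeps
        # trusting the certificate until that cached CRL expires.
        "worst_case_revocation_lag": validity,
        # Time left on the newest published CRL at the moment of failure.
        "min_recovery_window": validity - publish_interval,
    }


# Two-day CRL issued every half-life versus the same CRL issued hourly.
for interval in (timedelta(days=1), timedelta(hours=1)):
    figures = crl_schedule_tradeoff(timedelta(days=2), interval)
    print(interval, figures["worst_case_revocation_lag"], figures["min_recovery_window"])
```

Under these assumptions, issuing the same two-day CRL more often widens the guaranteed recovery window but does nothing for clients that already hold a cached copy, which matches the point that clients still refresh on the same cycle.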

July 12th, 2010 7:22pm

As with any industry, the mark of an expert is being able to recognize the correct scale for their clients. You need to determine what constitutes an improper solution - with regard to both overdoing it and not doing enough. It sounds like you are following the SLA for determining the base CRL frequency - that sounds logical to me.

Many of the guidelines are there to advise non-experts quickly on decisions that they are not equipped to make on their own, so the blanks tend to be filled in. In my PKI guides I try to discuss each area that requires it, but still provide generalized recommendations for a more-secure typical installation. I have issues with many of the guides out there on the web that merely advise their users to 'just click next' to install their enterprise root CA with web services on their DC.

It sounds like you have a solid grasp of the issues at hand - a 3-tier PKI is generally only needed if you expect to do cross-signing. Many companies don't bother going to that level, much less even know that they can. Heck, it's a good day when you can walk into a larger company and only find one group managing a single enterprise PKI! A 2-tier is important so you can keep the root offline for maximum protection and have the ability to revoke the issuing CA, just in case. And if you ever did need a 3rd tier, just drop it in there under the common root, as it probably shouldn't be mixed with your internal PKI anyways, even at the 2nd/policy level.

July 12th, 2010 7:38pm

Steve, Thanks for taking the time to respond. I'm curious about your comment regarding delta CRL failures not causing clients to err in the same way - I paraphrase. In my (admittedly limited) testing of D-CRLs, once the D-CRL became stale, revocation checking failed (testing was done with smart card logon). I think your comment on CRL based OCSP is rather prescient... it seems the MS Online Responder (oops, I said it) caches retrieved CRLs for their authoritative period, thus ignoring over-issued CRLs - behaviour that lots of us seem to be disappointed in! Cheers, Dave
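The behaviour seen in that testing can be modelled with a small sketch. This is only a model of the observation above, not a documented description of Windows client logic; the `Crl` type, the `check_revocation` function and the three result strings are all hypothetical names, and delta entries that remove a certificate from the base CRL are ignored for simplicity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Crl:
    next_update: datetime          # moment the CRL stops being current
    revoked_serials: frozenset     # serial numbers listed as revoked


def check_revocation(serial: int, now: datetime, base: Crl, delta: Optional[Crl]) -> str:
    """Return "good", "revoked" or "unknown" for one certificate serial.

    Assumption: when a delta CRL is in use, it must be current as well as
    the base, so a stale delta makes the whole check fail.
    """
    crls = [base] if delta is None else [base, delta]
    if any(now >= crl.next_update for crl in crls):
        return "unknown"   # status cannot be determined from stale data
    if any(serial in crl.revoked_serials for crl in crls):
        return "revoked"
    return "good"


now = datetime(2010, 7, 12, 18, 0)
base = Crl(datetime(2010, 7, 13, 12, 0), frozenset({1001}))
stale_delta = Crl(datetime(2010, 7, 12, 17, 0), frozenset({1002}))
print(check_revocation(2000, now, base, None))
print(check_revocation(2000, now, base, stale_delta))
```

If clients really do behave this way, a one hour delta shrinks the effective recovery window to one hour, however long the base CRL remains valid.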

July 12th, 2010 8:23pm

This topic is archived. No further replies will be accepted.