DNSSEC in Windows Server 2012 signature refresh on secondary

Hi,

I am running 2 Windows Server 2012 DNS servers and I noticed something odd regarding the DNSSEC signature refresh on the secondary server.

On the primary server I've set up a few DNSSEC-signed zones. Everything runs fine, and the automated features work without any problems.

However, I noticed that for some reason the other server, which hosts the secondary zone, does not update when the primary automatically refreshes the DNSSEC records. In short, the records on the secondary expire, and the only way around this is to transfer the zone again from the primary.

Notify and transfer have been set up, and this works flawlessly for creating/updating normal DNS entries.

Did anyone encounter a similar problem or has someone figured out a workaround for this particular problem?

December 17th, 2012 4:04pm

Hi,

It sounds like the zone serial number is not being incremented on DNSSEC signature refresh. The zone would not be transferred automatically if the serial number is unchanged. I will check this myself and submit a bug assuming I can reproduce the problem. Thanks for bringing up the issue.
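
To illustrate why an unchanged serial blocks the transfer, here is a minimal Python sketch of the serial comparison a secondary performs, following the serial number arithmetic of RFC 1982. The function name is hypothetical, and this is not the Windows DNS implementation, only the general rule.

```python
# Sketch of RFC 1982 serial comparison for 32-bit SOA serials.
# serial_is_newer is an illustrative name, not a real DNS server API.
MOD = 2 ** 32
HALF = 2 ** 31

def serial_is_newer(primary_serial: int, secondary_serial: int) -> bool:
    """Return True if the primary's serial counts as newer than the secondary's."""
    diff = (primary_serial - secondary_serial) % MOD
    # Equal serials (diff == 0) and "older" serials (diff >= HALF) do not trigger a transfer.
    return 0 < diff < HALF

# A secondary only requests a transfer when the primary's serial is newer.
print(serial_is_newer(5, 3))  # serial went from 3 to 5
print(serial_is_newer(3, 3))  # signatures refreshed but serial unchanged
```

With an unchanged serial the second call returns False, which matches the symptom described here: the new signatures exist on the primary, but the secondary never asks for them.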

-Greg


December 17th, 2012 7:22pm

Hi,

If you need further information, please let me know. I can dig in deep, as I have full control over these servers.

Is there a possible workaround that uses PowerShell to initiate a new zone transfer from primary to secondary, so I don't have to do this by hand?
  • Edited by AlwindB Tuesday, December 18, 2012 10:53 AM added question for workaround
December 18th, 2012 1:13pm

Hi,

I'm back in the office (was out for a while) and am trying to reproduce this problem. I initiated a manual key rollover on my primary server and after the rollover completed a new version of the zone was automatically transferred to the secondary server. The serial number was actually incremented twice (from 3 to 5).

Is this something that is only happening on automatic rollover?

Thanks,

-Greg

December 22nd, 2012 1:06am

Hi Greg,

Confirmed: a manual rollover does update the zone serial on the primary, and so the secondary picks up the new serial as well.

It really seems that the automatic rollover on the primary does not update the zone serial, and so does not trigger the transfer.

In short: this problem only occurs during automatic rollover.


December 24th, 2012 10:55am

Hello,

This is a major issue that must be resolved!

When the validity period expires, it breaks the DNSSEC chain for every secondary server if left on automatic. Unfortunately, you cannot set the validity period as high as in previous versions, so you need to manually transfer the zone every couple of weeks.

EDIT: Just to clarify, it's the RRSIG refresh for DNSKEY records that does not trigger a SOA serial update. 

Can Microsoft please acknowledge that this will be fixed?

Regards,

Eric Sjöberg


  • Edited by Eric Sjöberg Thursday, June 27, 2013 9:18 AM clarify
June 27th, 2013 11:48am

Eric,

The workaround we implemented is a PowerShell command that performs the zone transfer.
It's ridiculous that Microsoft did not catch this in time, because it really delays the adoption of DNSSEC on the Windows platform.

powershell start-dnsserverzonetransfer -name "domainname" -FullTransfer

June 27th, 2013 3:10pm

Thank you, good workaround! It's a high impact, but obvious, bug. I agree that they really should have spotted this early, and that a fix should be published by now. Unless we're the first to actually implement WinSrv2012-based DNSSEC in production. :)

We have cloud based secondary public dns servers, and we cannot initiate a manual zone transfer.

I made this PS one-liner for updating SOA serial on the master server (using WMI, because DNSServer PS module is also just half done), which initiates an automatic zone transfer. I have scheduled this to run an hour after RRSIG expires.

"thezone.tld" | %{$zone = $_; $soa = get-wmiObject -class "MicrosoftDNS_SOAType" -namespace "root\MicrosoftDNS" | where-object {$_.ContainerName -eq $zone} ; $soa.Modify($null,++$soa.SerialNumber,$null,$null,$null,$null,$null,$null)}

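The logic behind the scheduled serial bump above can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, and the idea of bumping only when the observed RRSIG expiration has changed is a possible refinement, not something the one-liner does.

```python
# Sketch: bump the SOA serial only when the signatures were actually refreshed.
# next_serial and bump_if_resigned are hypothetical helper names.

def next_serial(current: int) -> int:
    """Increment a 32-bit SOA serial, wrapping around at 2**32."""
    return (current + 1) % 2 ** 32

def bump_if_resigned(serial: int, last_seen_expiration: str, current_expiration: str):
    """Return the (serial, expiration) pair to remember after one scheduled run."""
    if current_expiration != last_seen_expiration:
        # The RRSIG was regenerated, so a new serial is needed for secondaries to notice.
        return next_serial(serial), current_expiration
    return serial, last_seen_expiration

# Illustrative values: the expiration stamp changed, so the serial is bumped.
print(bump_if_resigned(2013061435, "20131113000000", "20131207000000"))
```

Bumping only on a detected refresh avoids needless transfers, at the cost of having to store the last seen expiration between runs.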
June 27th, 2013 3:56pm

Well we must be the second to implement DNSSEC on WinSrv2012 because we are experiencing the same issue.

An accidental test using the extremely useful tools offered at http://dnsviz.net/ and a retest with http://dnssec-debugger.verisignlabs.com/ had us stumble across this issue.

This renders the DNSSEC feature incapable of functioning correctly out of the box and should have a fix made available as soon as possible. Anyone aware of a related KB?

October 5th, 2013 11:18am

Then I guess we are third.

MS: fix please!

October 9th, 2013 12:18pm

Hi,

Sorry but this dropped off my radar completely until the last couple posts.

As I understand the problem, it only occurs with automatic key rollover. The issue is that the serial number isn't updated when automatic key rollover occurs. It does get updated if you initiate a manual rollover. The serial number incrementing is needed so that secondary DNS servers are notified that a zone transfer is needed.

This is a little difficult to reproduce because the minimum ZSK rollover period is 7 days. I think I managed to get around that by just setting the time on the primary server to a week later. This changed the ZSK status to queued, and it was necessary for me to restart the DNS service to actually get the automatic rollover to start. I know this isn't reproducing the exact scenario but it's the best I was able to do without waiting a week.

When I did this with two DNS servers running Windows Server 2012 R2, the zone incremented from 12 to 15 and was automatically transferred to the secondary. So, it looks like I'll need to check that this problem does actually happen in Server 2012 but is fixed in 2012 R2, then see if there is a patch for Server 2012.

I'll let you know what I find.

Thanks,

-Greg

P.S. I probably don't need to say this, but don't try the procedure I described above of changing the primary server's time settings to cause an automatic key rollover. When you restore the time settings to normal the zone keys are unusable as might be expected :-)

October 10th, 2013 10:15pm

OK it looks like my repro method is no good because I'm seeing the serial number increment on automatic rollover on Server 2012 also.  After the serial number increments the zone transfers automatically. I see the same amount of incrementing also - the number goes up by 3, but I have to change the time on the server and restart the DNS service.

I'll have to see if there is another method to get the automatic rollover to begin other than restarting the DNS service. I think maybe this is what causes the zone to increment whereas it wouldn't normally.

-Greg

October 11th, 2013 12:13am

Hello Greg, thank you for looking into this issue!

I have successfully reproduced the issue by setting the server time to, say, an hour before rollover and restarting DNS. It's important to let the server initiate the rollover automatically. Using PowerShell (Invoke-DnsServerSigningKeyRollover) or restarting DNS post-rollover seems to increase the serial.

EDIT: I am using 2012 servers as R2 is not yet generally available.
October 11th, 2013 9:17am

Hi Eric,

I'm seeing that if I unsign the zone, this also does not cause it to transfer to secondary. Do you see this? I have an unsigned zone on my primary server but the secondary still has the signed version.

Thanks,

-Greg

October 12th, 2013 12:23am

Greg,

I have done some testing with unsigning the zone. It updates the serial properly on the primary, and the secondary eventually transfers the zone successfully. My secondary was unable to transfer it for a long time, logging repeated event 6522 entries (new version detected) along with some timeouts (error event 6527). I think that's what you are experiencing in your environment as well. I am not sure what made it work again, though: I just left it alone for a while, and when I looked again it had transferred (event 3150, wrote new version).

October 14th, 2013 3:35pm

Hi,

I just wanted to let you know we are actively looking at this. The product team knows about it and we're working on a dependable, reproducible demonstration to isolate the bug. More information to follow.

Thanks

-Greg

October 17th, 2013 12:36am

Greg,

We have one DNS master running Windows 2012 where we manage all Internet zones with DNSSEC, and four DNS slaves running Linux/BIND that effectively provide the DNS service for the Internet. We hit a similar issue with RRSIG signature expiration for MX records. New RRSIGs were automatically generated by the Windows 2012 DNS service,  but the zone SERIAL field wasn't incremented. Consequently the DNS slaves didn't refresh the zone data and only answered the expired RRSIG. After we manually incremented the SERIAL, the DNS slaves updated the zone data and started to answer the new RRSIG for the MX record.

Additional information: 

  • RRSIG validity time: 720 hours;
  • Windows 2012 Server configured with Brazilian Time Zone; 
  • Slaves: CentOS with Bind 9.8.2

November 4th, 2013 8:27pm

Hi,

So far I haven't been able to replicate this. I'm not using a BIND secondary, but that shouldn't matter as the issue is a serial # on the zonefile on the primary server.

I was focusing mostly on Server 2012 R2 but I'll try again on Server 2012.

-Greg

November 5th, 2013 1:10am

Hi,

It looks like we'll need to examine this further. I'm not able to reproduce the problem on either Server 2012 R2 or Server 2012 using a test zone.

With this server and zone (see the image below), I've accelerated the rollover process and set all the ZSK to minimum values. The KSK settings are default. The ZSK settings are:

DNSKEY signature validity period: 6 hours

DS signature validity period: 6 hours

Zone record validity period: 6 hours

Automatic rollover enabled, rollover frequency: 7 days

The "rollover process complete" event is 7670, and the zone transfer to secondary server with new version (serial) number is event 6001. I did note that with these settings not every new serial number is transferred, but that might be expected since it is incrementing very fast.

I will try slowing this down by using default settings, and watch it again to see what happens. If you can share your settings this might help me to duplicate what is happening on your server. I would need to know both the ZSK and KSK settings if they are different from default.

-Greg

November 5th, 2013 10:15pm

My DNSSEC zones are all 720 hour validity with a 5 / 1 year rollover frequency for KSK / ZSK resp.

I have created a test zone and set it to 6 hour validity, I'll report back once it has refreshed.

(I doubt it has any effect on anything, but my production zones have KSK 4096 key length and ZSK 2048 key length (SHA256) and NSEC3.)

November 6th, 2013 6:47pm

Thanks for the info. I've duplicated your settings under accelerated conditions so I should get a ZSK rollover about 3 times each day.

When I was using default settings I noticed the zone was incrementing, but not every new version transferred. Again, this is probably due to the fact that it was incrementing so fast. Hopefully these slower settings will give me a good view into the problem.

-Greg

November 6th, 2013 10:46pm

Unfortunately (?), the SOA on my test zone with 6 hour refresh has updated properly each interval, and the secondary has transferred the zone as it should. I suspect that the issue may be caused by some additional parameter, possibly a timing issue when the refresh frequency is longer than a certain value or maybe even a service/server restart between refreshes. With a 30 day frequency, there is pretty much always a maintenance window within that period.

I notice another probably related issue with the secondary zone - the old RRSIG are not removed from the zone. Please take a look at the zone hjo.se@ns3.invid.se (primary) and compare the amount of RRSIGs on the secondary hjo.se@ns.invid.se. Both servers are Windows Server 2012. This zone is coming up on the 30 day refresh (November 13th), will be interesting to see what happens then.

November 7th, 2013 11:17am

Hi,

The zone with longer rollover settings than default, matching what you have, is rolling over and updating fine. Slowing the rollover down has actually given a 1:1 ratio of event 7670 (rollover process ... is complete) to event 6001 (successful transfer of zone). Previously the accelerated rollover was happening so fast it would sometimes rollover twice before the zone was transferred.

I think you are right that the issue is probably related to a timing issue. Are you certain that the rollover has completed and the serial number is not incremented? Or, is it that you are getting a signature expiration before the serial number increments? If the latter, then maybe we are getting signature expiration prior to the rollover completing.

The rollover does have certain wait timers. The following Windows PowerShell  command should show you the state of all signing keys, including if they are waiting for expiration of a validity period:

Get-DnsServerSigningKey -ZoneName test.com | fl *

I'm not sure offhand why there are so many RRSIG on your secondary. I count 11, with just one on the primary. I'll think about this. You're right that it could be related to the problem.

-Greg

November 8th, 2013 2:57am

We upgraded to 2012 R2 and haven't been able to see this problem at all since (and yes, it has been a while and automatic rollovers have been done). With Windows 2012 we had a lot of problems with secondary zones not being updated. My conclusion is that whatever it is, it was fixed with 2012 R2. We upgraded to R2 on both primary and secondary servers.
November 8th, 2013 10:39am

Hi Greg,

just to add to the troubleshooting, in 2012 we activated the DNSSEC feature by using the Microsoft default settings (in other words NEXT>NEXT>NEXT). So far, unfortunately, no fix for the updating of the secondary.

Just to show you the default settings Microsoft provides:

[Key signing key (KSK): 1]
Algorithm: RSA/SHA-256
Key length: 2048 bits
KSP: Microsoft Software Key Storage Provider
DNSKEY signature validity: 168 hours
Replication: Disabled
Rollover state: Enabled
Rollover frequency: 755 days
Initial rollover offset: 0 days


[Zone signing key (ZSK): 1]
Algorithm: RSA/SHA-256
Key length: 1024 bits
KSP: Microsoft Software Key Storage Provider
DNSKEY signature validity: 168 hours
DS Signature validity: 168 hours
Zone resource record signature validity: 240 hours
Replication: Disabled
Rollover state: Enabled
Rollover frequency: 90 days
Initial rollover offset: 0 days


[Authenticated Denial of Existence]
Authenticated Denial of Existence: NSEC3
NSEC3 Hash algorithm: RSA/SHA-1
NSEC3 iterations: 50
NSEC3 User provided salt length: 8
NSEC3 User provided salt: -
NSEC3 opt/out: No

[Trust anchor and polling configuration]
Distribute trust anchor: No
Rollover trust anchor (RFC 5011): Yes
DS hash algorithm: SHA-1 and SHA-256
DS record TTL: 3600 seconds
DNSKEY record TTL: 3600 seconds
Delegation polling period: 12 hours
Signature inception: 1 hours

November 8th, 2013 10:39am

Hi AlwindB,

I've tested the default parameters and have never been able to duplicate this problem. In fact, I've never been able to duplicate this problem at all using any parameters. I'm not saying I don't believe we have a problem - just that I cannot figure out what is causing it because I can't get my system to behave this way.

One thing that I do think is important is that we check the status of the keys. Please use the command:

Get-DnsServerSigningKey -ZoneName test.com | fl *

..and verify that the rollover is not in a queued or other waiting state.

Also verify that event 7670 is occurring in Event Manager. This event should be thrown each time rollover is complete. After 7670, if there is a secondary DNS server (with notify activated), you should see event 6001 corresponding to a successful zone transfer. If you see multiple occurrences of event 7670 and event 6001 is missing, this tells us that the notify process is broken. If you don't see event 7670, or the cmdlet output from Get-DnsServerSigningKey shows that rollover is in a wait state, then one of the signing key parameters, probably a validity period, is causing the problem.
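
The decision rule in the paragraph above can be written out as a small Python sketch. It assumes you have already exported the event IDs from the primary's DNS Server log, oldest first; the function name and the result labels are hypothetical.

```python
# Sketch of the 7670 / 6001 diagnosis described above.
# diagnose and its return labels are illustrative, not part of any Windows tooling.
ROLLOVER_COMPLETE = 7670   # "rollover process complete" on the primary
ZONE_TRANSFERRED = 6001    # successful zone transfer to a secondary

def diagnose(event_ids):
    """Classify a chronological list of event IDs from the primary."""
    seen_rollover = False
    pending = 0  # rollovers not yet followed by a transfer
    for event_id in event_ids:
        if event_id == ROLLOVER_COMPLETE:
            seen_rollover = True
            pending += 1
        elif event_id == ZONE_TRANSFERRED:
            # One transfer can cover several quick rollovers.
            pending = 0
    if not seen_rollover:
        return "no-rollover"       # check key state for a queued or waiting rollover
    if pending > 0:
        return "transfer-missing"  # rollover finished but notify/transfer did not follow
    return "ok"

print(diagnose([7670, 7670]))
print(diagnose([3150, 3150]))
print(diagnose([7670, 7670, 6001]))
```

A single 6001 clears all pending rollovers because, as noted earlier in the thread, a fast rollover schedule can produce two rollovers before one transfer.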

Thanks,

-Greg

November 9th, 2013 12:25am

Olav,

Thanks very much for the information about the fix for your issue when you upgraded to Server 2012 R2. This is good news.

I am wondering now if there is still a bug in Server 2012, or if it is something related to a BIND secondary.

Thanks,

-Greg

November 9th, 2013 12:28am

Eric,

With respect to the multiple RRSIGs, I see that these are all expired except for the last one. They seem to be generated every 8 days with a validity period of 10 days, except for the last three, which are different (validity of 30 days, generated every 24 days). Since it is impossible for an RRSIG to be generated on a secondary server, they must be coming from the primary. However, the primary deleted the expired RRSIGs at some point, and even deleted the last unexpired RRSIG when it created the newest RRSIG.

Since the RRSIGs had to come from the primary (they can't be 'added' to the secondary), the primary must have had multiple expired RRSIGs at some point. Assuming I'm looking at this correctly, the primary has a single RRSIG with an expiration date of Dec. 7 that was created on Nov 7. The secondary has several RRSIGs with the most recent one expiring on Nov 13.

It's interesting that the expiration for this RRSIG is the same date as the refresh you referred to. This might be just a coincidence.

Did you change some parameters in August to cause RRSIGs to have a 30 day validity period?

Thanks,

-Greg

November 9th, 2013 12:54am

There is another possibility other than the RRSIGs being scavenged on the primary but not the secondary.

Zone transfers can be incremental (IXFR) which means only some records (the updated ones) are transferred, not the entire zone. This might be what is happening - with the new RRSIG being transferred each time but not then being removed later from the secondary, whereas on the primary it is replaced.
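
To make this hypothesis concrete, here is a toy Python model of applying an incremental transfer. It is only a sketch of the suspected failure mode, with hypothetical names and record tuples, and makes no claim about what the Windows DNS server actually does internally.

```python
# Toy model: an IXFR carries records to delete and records to add.
# apply_ixfr and the honor_deletes flag are illustrative assumptions.

def apply_ixfr(zone, deleted, added, honor_deletes=True):
    """Return the secondary's zone contents after one incremental transfer."""
    result = set(zone)
    if honor_deletes:
        result -= set(deleted)
    result |= set(added)
    return result

old_sig = ("RRSIG", "A", "expires 2013-11-13")
new_sig = ("RRSIG", "A", "expires 2013-12-07")

# Correct behavior: the old signature is replaced by the new one.
print(len(apply_ixfr({old_sig}, [old_sig], [new_sig])))

# Suspected behavior: additions are applied, deletions are lost, signatures pile up.
print(len(apply_ixfr({old_sig}, [old_sig], [new_sig], honor_deletes=False)))
```

If deletions were being dropped on each refresh, the secondary would accumulate one extra RRSIG per refresh, which would look like the several expired signatures counted on the secondary.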

-Greg

November 11th, 2013 7:59pm

So my hjo.se zone has refreshed the RRSIGs on primary, and it is now inconsistent on the secondary. Please have a look at the zone quickly as I will need to force a refresh tomorrow (before it is noticed...).

PS C:\> Get-DnsServerSigningKey -ZoneName hjo.se | fl *

KeyId                         : 54fc417e-9601-405f-9a45-0e2af05c2a6c
IsRolloverEnabled             : True
ActiveKey                     : Microsoft Software Key Storage Provider;10 f2 4a 5f 6d 4f 9c b7 4a b4
CryptoAlgorithm               : RsaSha256
CurrentRolloverStatus         : NotRolling
CurrentState                  : Active
DnsKeySignatureValidityPeriod : 30.00:00:00
DSSignatureValidityPeriod     : 7.00:00:00
InitialRolloverOffset         : 00:00:00
KeyLength                     : 4096
KeyStorageProvider            : Microsoft Software Key Storage Provider
KeyType                       : KeySigningKey
LastRolloverTime              :
NextKey                       :
NextRolloverAction            : Normal
NextRolloverTime              : 2018-06-08 09:53:44
RolloverPeriod                : 1820.00:00:00
RolloverType                  : DoubleSignature
StandbyKey                    : Microsoft Software Key Storage Provider;1d 4d a8 b4 d8 d2 ef 96 42 1f
StoreKeysInAD                 : False
ZoneName                      : hjo.se
ZoneSignatureValidityPeriod   : 10.00:00:00
PSComputerName                :
CimClass                      : root/Microsoft/Windows/DNS:DnsServerSigningKey
CimInstanceProperties         : {ActiveKey, CryptoAlgorithm, CurrentRolloverStatus, CurrentState...}
CimSystemProperties           : Microsoft.Management.Infrastructure.CimSystemProperties

KeyId                         : 8ac74cf6-bb21-4f07-ae40-981d7ccb8c6f
IsRolloverEnabled             : True
ActiveKey                     : Microsoft Software Key Storage Provider;3f 48 6e 90 c8 55 b1 b4 4d ef
CryptoAlgorithm               : RsaSha256
CurrentRolloverStatus         : NotRolling
CurrentState                  : Active
DnsKeySignatureValidityPeriod : 30.00:00:00
DSSignatureValidityPeriod     : 30.00:00:00
InitialRolloverOffset         : 00:00:00
KeyLength                     : 2048
KeyStorageProvider            : Microsoft Software Key Storage Provider
KeyType                       : ZoneSigningKey
LastRolloverTime              :
NextKey                       : Microsoft Software Key Storage Provider;40 22 38 01 33 10 b0 97 45 83
NextRolloverAction            : Normal
NextRolloverTime              : 2014-06-19 09:53:44
RolloverPeriod                : 370.00:00:00
RolloverType                  : PrePublish
StandbyKey                    :
StoreKeysInAD                 : False
ZoneName                      : hjo.se
ZoneSignatureValidityPeriod   : 30.00:00:00
PSComputerName                :
CimClass                      : root/Microsoft/Windows/DNS:DnsServerSigningKey
CimInstanceProperties         : {ActiveKey, CryptoAlgorithm, CurrentRolloverStatus, CurrentState...}
CimSystemProperties           : Microsoft.Management.Infrastructure.CimSystemProperties

I do not have any 7670 events, but it's not doing any rollovers, just signature refresh. I'm also missing all the 6001s, but that may be caused by some notify setting on the secondary?

I do have two 3150 events showing that the DNS server wrote the same version twice:

Log Name:      DNS Server
Source:        Microsoft-Windows-DNS-Server-Service
Date:          2013-11-06 04:06:05
Event ID:      3150
Level:         Information
Computer:      DNS04.jkp.invid.se
Description:
The DNS server wrote version 2013061435 of zone hjo.se to file hjo.se.dns.
  <EventData Name="DNS_EVENT_ZONE_WRITE_COMPLETED">
    <Data Name="param1">2013061435</Data>
    <Data Name="param2">hjo.se</Data>
    <Data Name="param3">hjo.se.dns</Data>
  </EventData>

Log Name:      DNS Server
Source:        Microsoft-Windows-DNS-Server-Service
Date:          2013-11-11 19:37:17
Event ID:      3150
Level:         Information
Computer:      DNS04.jkp.invid.se
Description:
The DNS server wrote version 2013061435 of zone hjo.se to file hjo.se.dns.
  <EventData Name="DNS_EVENT_ZONE_WRITE_COMPLETED">
    <Data Name="param1">2013061435</Data>
    <Data Name="param2">hjo.se</Data>
    <Data Name="param3">hjo.se.dns</Data>
  </EventData>

Also, you are correct that I made a change to the zone in August. The scheduled task I use to script a SOA serial increase had stopped working over the summer, and I activated it again in August.

Incremental zone transfers sound interesting. Are there any settings that can be made regarding this? Such as disabling it? :)

Thanks Greg for helping us with this, it's much appreciated!

November 12th, 2013 10:35am

Hello again!

We have now encountered our first problem in 2012 R2. A zone on the secondary server has not received the updated RRSIG records, and they have now expired. The SOA records are the same on the primary and the secondary. Only the RRSIG records are wrong (but every RRSIG record has expired on the secondary). We have the same number of RRSIG records in both zones.

A look in Event Viewer reveals that no event ID 6522 (a more recent version of the zone was found, zone transfer in progress) can be found on the secondary for the latest zone version. The version before that shows up with a lot of 6522s.

On the primary I can see two event ID 3150 entries followed by an event ID 6001 (four days later!) for the zone version prior to the latest version. For the latest version of the zone there are only two 3150s (two days apart!).

Two things seem to be odd to me:

1. Why are there two events with id 3150 (wrote new version to file...) for the same SOA serial on the primary?
2. It seems that the automatic rollover does not trigger the secondary to check for a new version of the zone (we have notify on). The fact that the last zone transfer happened four days after the new zone had been written to the secondary makes me wonder if we have another type of problem with zone transfers. On the other hand, both servers have the same SOA serial but not the same zone data. Perhaps problems with partial (incremental) zone transfers?

I don't know if any of this information is helpful, but since we have a problem, I'm happy to help. Please tell me what more info I could give to help dig out the problem.

November 12th, 2013 12:15pm

Hm, found another interesting thing:

The only thing needed to correct the problem is to restart the DNS service on the secondary. No events of zone transfers of the zone or anything regarding the zone is recorded on primary or secondary, but the zone data is now correct and exactly the same as on the primary!

So the zone data must have been transferred to the secondary, but the new RRSIGs have somehow not been made available through the DNS service?

Someone with more insight in how the Microsoft DNS Service works can perhaps make something out of this.

Also remember that we are running 2012 R2.

November 12th, 2013 12:32pm

Hi Eric,

I am going to be busy today and won't have time to troubleshoot this a lot, but I've taken time to look at the servers.

The RRSIG and DNSKEY entries look the same to me as before (is there something different?). Essentially, there are valid RRSIGs on both the primary and secondary for the root A record, but the secondary has several expired ones and the one that is still valid expires tomorrow (11/13) whereas the primary only has a single record that is valid from 11/7 to 12/7.

Based on the output you provided, I see that the zone is not currently rolling over. That is not necessarily a problem because it might not be scheduled to roll over yet. This is a helpful way to see what is preventing a zone from rolling over when it is in queued status.

I should have mentioned that the event 6001 and 7670 will only occur on the primary. On the secondary you'll see 6522 and 3150.

You said the zone is not rolling over. Do you have it enabled for automatic rollover? I assume you do, but I wanted to make sure.

Notify is a setting on the primary. In DNS Manager you can see it in the zone properties dialog, Zone Transfers tab, click the Notify button. If you add the secondary server here to the list of servers to notify when the zone changes it will enable the primary to "push" notifications to the secondary. If you don't do this, then the SOA refresh interval is used instead - which in your case looks to be 20 minutes (1200 seconds).

I will continue looking at this. I'm also talking to the product team about this issue.

Thanks,

-Greg

November 12th, 2013 11:13pm

Hi Olov,

When you are looking at the RRSIGs on the secondary, are you using DNS Manager or the command line? If you are using DNS Manager, have you right-clicked the zone and refreshed the view? I just wonder about this because the console can definitely have a stale view of records that would get refreshed if you restart the DNS service, and there would be no zone transfer events. I assume you are refreshing the view, but I need to make sure.

The event IDs you are seeing do seem strange. If the RRSIG has expired, there should also be an event 1524 (DNSSEC signatures with key tag #####, .....in zone test.com, have expired). Do you see this?

Thanks,

-Greg

November 12th, 2013 11:38pm

Hi Greg!

It was a DNS test that alerted me to the problem. You can see the test result here:

http://dnscheck.iis.se/?time=1384246106&id=3729480&view=basic&test=standard

It's in Swedish, but I think you can understand most of it.

After I ran this test, I started a remote desktop session to the secondary server and opened the management console. I guess that means it showed me the latest data.

Yes, there were a lot of 1524 events on the secondary server. I'm sorry I forgot to tell you that.

I have now also found something that makes the previous oddities less odd. I found events 6522 regarding the zone, shortly after I restarted the service earlier today: "A more recent version, version 2012112053 of zone arjeplogshus.se was found at the DNS server at 80.88.126.12. Zone transfer is in progress." Note that the version it is talking about is the same version it had of the SOA record when the above test failed, but that is perhaps normal if incremental transfers are used.

If there is anything more you need from me, please let me know. I have lots of zones with these problems now, so I could easily find more examples if needed.

November 13th, 2013 12:19am

I want to make clear that the problem is in no way related to rollovers and never was.

It is when the RRSIG expires (validity period) and is refreshed, that SOA serial is not increased. This will cause problems with replications to secondary servers.

I realize now that the RRSIG refresh is probably the only operation that only affects the extended dnssec records. Every other operation, like adding a record or updating an existing, will affect a "standard" record, and that will cause SOA serial to be updated.
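
The failure mode described here (same SOA serial on both servers, but only the primary holds a currently valid signature) can be checked with a short Python sketch. The function names and the dictionary layout are hypothetical, the sample timestamps are illustrative, and the time parsing is a simplification of RFC 4034, which uses the YYYYMMDDHHmmSS presentation format in UTC.

```python
from datetime import datetime, timezone

# Sketch: detect a secondary that is "in sync" by serial but serving expired RRSIGs.
# All names and the data layout below are illustrative assumptions.

def parse_rrsig_time(stamp: str) -> datetime:
    """Parse an RRSIG inception/expiration stamp such as 20131201223305 (UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

def rrsig_is_valid(inception: str, expiration: str, now: datetime) -> bool:
    """True if 'now' falls inside the signature's validity window."""
    return parse_rrsig_time(inception) <= now <= parse_rrsig_time(expiration)

def secondary_is_stale(primary: dict, secondary: dict, now: datetime) -> bool:
    """True if serials match, the primary has a valid RRSIG, and the secondary does not."""
    def has_valid(server):
        return any(rrsig_is_valid(inc, exp, now) for inc, exp in server["rrsigs"])
    return (primary["serial"] == secondary["serial"]
            and has_valid(primary)
            and not has_valid(secondary))

# Illustrative data: each rrsigs entry is (inception, expiration).
primary = {"serial": 2013061435, "rrsigs": [("20131107000000", "20131207000000")]}
secondary = {"serial": 2013061435, "rrsigs": [("20131014000000", "20131113000000")]}
now = datetime(2013, 11, 14, tzinfo=timezone.utc)
print(secondary_is_stale(primary, secondary, now))
```

Because the serials are equal, a normal refresh check sees nothing to transfer, which is why this condition has to be detected by looking at the signatures themselves.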

November 13th, 2013 3:18pm

Interesting to see that there are other people having this problem.

There is a big push for DNSSEC from the government here in Sweden, so it's critical that this be solved shortly (or else people will deploy it on BIND).

We use two Server 2012 R2 DNS servers and store the records in files.

We have also noticed the problem with expired RRSIGs in the secondary zone.

The problem was not related to the SOA serial for us.

We incremented the serial and forced a transfer, then checked that the serial numbers matched, but the secondary server still served expired RRSIGs.

We solved the problem by restarting the DNS service on the primary server and then doing a zone transfer.

This is what it looked like when the error occurred:

Notice that the query against the first server (dns.net.umea.se) gets two RRSIG answers but the query to the secondary server (secdns.net.umea.se) got four RRSIG answers including two expired. 

C:\WINDOWS\system32>dig @dns.net.umea.se umea.se +dnssec DNSKEY

; <<>> DiG 9.3.2 <<>> @dns.net.umea.se umea.se +dnssec DNSKEY
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 739
;; flags: qr aa rd; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4000
;; QUESTION SECTION:
;umea.se.                       IN      DNSKEY

;; ANSWER SECTION:
umea.se.                3600    IN      DNSKEY  257 3 8 AwEAAafgSe8H1wfgKQNP36ZLfL4nIc4FOKWeECOoRlby
oRESZYIsJwV/ RB9W59trAXmoxlDomW3nh9p1hSdIRWEyvVrsQEGljk5pcpDichBpkGH7 0Ys0Cc4ghE4attUCfucLy2Z3QfmqtJ
gIjlXCGQjf7ElBd8lSI53rDTDG SZ8Meu3DBuLTfHtefXZQAgxuXm5RMl1U69VwkKRAlYGtSwdJQ6WGpTxu ja5/L8OsH5tqB6Sc
g60Q6NbGG8xcCiawRgnhbNRzV29eEuSgIBodS6AC FS8WRfRFbJNqsfLgEOBAp7pfNJtsomw0mrDe61+fdl4c0cVz2oJG6iPa wF
5ul+z0v2E=
umea.se.                3600    IN      DNSKEY  256 3 8 AwEAAcJi54HwXURYIhE+KhpnA9+0lKT385LosvH2YJqG
WYn2RJ0OcXhl WTpdrT4GYZo36fabbNEfpeCngwNEB/M2cMnK6FYLRBwRrBrEtP1ScD5f r1d09SwGnCm+16tz+7uMoO6As49iO8
Mk6dV3XHnzHNBeJ990r6wWDjGR v2mGtnCd
umea.se.                3600    IN      DNSKEY  257 3 8 AwEAAcB1qkSET+TtMzljtP3LeuUSEwGAlr/MLsM79DFG
KqdwZRGZRIvb 9Ye10YYGepLZLmAwsHnwVNarriTqkcb2RcpO95q0uEGlMkSi/zC3aQwn 5OAg/A/3UV5ye0hlb88JEH2qb6PLMZ
hdoCuAYiIZ+GzFcVGZ4WBRA9eO pU06ctNtnP9IH/UjgHnLtefrLeDwC3aql9nHxH2lDEpMHEBFSGyYYylY 76xDDGbviGVRQrNN
Og8SCzhfH5gB9RfhLnjgC6NgfGPwhax6O49B9HG5 xyUfJF/FHM48UrQ7wffCjCivNvykysl7rtN+r+xf4ngDIWTWV+v7tiE2 Zw
RamjmC7ps=
umea.se.                3600    IN      DNSKEY  256 3 8 AwEAAciq3aN4fHe80Xsd48+NkX7Ark4xOfDMV53+MaVR
RjMm5OaU87Ds QxIoydFCVwhkyD0m8TJRiPMpjXOhYucUVQWua/aX8JIbw4SbnSv9s2rr mg+KVtjHeWDHypqi3uuTkYxyFC2IXA
RIW4q2k0NkrrY2c2nVjzJiSTQB DUGxXxgb
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131201223305 20131124213305 18180
umea.se. oslqv0LlKYzvpuUBJYWX5C3T9oMitRuD6NESICNpXMrkCWnQSVbZwqt5 u5M89AgEFNxuo/rLjWlshrdoyzy1CS2K5f
dsWO9DWxlXhr7Kj9ZCLeFX oFQrKkmCx7eFqySeDSGgwPjNndBbdXF/RaFwIIIOrax1QZ+7ONAhlY8x BX4=
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131201223305 20131124213305 1816 u
mea.se. heUGVnKkWnB//9kuhAePh7bXMf/1IRdEKzCuf6IIOR4BpOySUwewVq+Y 8ADzMWKpBRGUBnmXyp1KdQzoTTgqPYFZ/BX
AVhk3U9ylyuP2MU49vKGl juZ5jZ+4oqnI18XRyaKdMlLTrEXKIX2mW/qrmdhB7KDzfSEJlhlcko3e vAlkAXJn7DUkS9FkBejtI
NUUNs0YGeU6DrdO1Cy1cT+CvopufIyIAmh/ byd7+eANcSy1onSse75MUKZ9+DH8grXKFehCPHAGJydkAEOOnlIOqYCN 1UyR4Ex
fQKYpc5Gusur2KfSuc4VFTSGGB0vMbPmFcC2pgkNQFGoV3OaP QaeWQw==

;; Query time: 8 msec
;; SERVER: 193.254.4.46#53(193.254.4.46)
;; WHEN: Wed Nov 27 18:01:40 2013
;; MSG SIZE  rcvd: 1346

C:\WINDOWS\system32>dig @secdns.net.umea.se umea.se +dnssec DNSKEY

; <<>> DiG 9.3.2 <<>> @secdns.net.umea.se umea.se +dnssec DNSKEY
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1560
;; flags: qr aa rd; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4000
;; QUESTION SECTION:
;umea.se.                       IN      DNSKEY

;; ANSWER SECTION:
umea.se.                3600    IN      DNSKEY  257 3 8 AwEAAafgSe8H1wfgKQNP36ZLfL4nIc4FOKWeECOoRlby
oRESZYIsJwV/ RB9W59trAXmoxlDomW3nh9p1hSdIRWEyvVrsQEGljk5pcpDichBpkGH7 0Ys0Cc4ghE4attUCfucLy2Z3QfmqtJ
gIjlXCGQjf7ElBd8lSI53rDTDG SZ8Meu3DBuLTfHtefXZQAgxuXm5RMl1U69VwkKRAlYGtSwdJQ6WGpTxu ja5/L8OsH5tqB6Sc
g60Q6NbGG8xcCiawRgnhbNRzV29eEuSgIBodS6AC FS8WRfRFbJNqsfLgEOBAp7pfNJtsomw0mrDe61+fdl4c0cVz2oJG6iPa wF
5ul+z0v2E=
umea.se.                3600    IN      DNSKEY  256 3 8 AwEAAciq3aN4fHe80Xsd48+NkX7Ark4xOfDMV53+MaVR
RjMm5OaU87Ds QxIoydFCVwhkyD0m8TJRiPMpjXOhYucUVQWua/aX8JIbw4SbnSv9s2rr mg+KVtjHeWDHypqi3uuTkYxyFC2IXA
RIW4q2k0NkrrY2c2nVjzJiSTQB DUGxXxgb
umea.se.                3600    IN      DNSKEY  257 3 8 AwEAAcB1qkSET+TtMzljtP3LeuUSEwGAlr/MLsM79DFG
KqdwZRGZRIvb 9Ye10YYGepLZLmAwsHnwVNarriTqkcb2RcpO95q0uEGlMkSi/zC3aQwn 5OAg/A/3UV5ye0hlb88JEH2qb6PLMZ
hdoCuAYiIZ+GzFcVGZ4WBRA9eO pU06ctNtnP9IH/UjgHnLtefrLeDwC3aql9nHxH2lDEpMHEBFSGyYYylY 76xDDGbviGVRQrNN
Og8SCzhfH5gB9RfhLnjgC6NgfGPwhax6O49B9HG5 xyUfJF/FHM48UrQ7wffCjCivNvykysl7rtN+r+xf4ngDIWTWV+v7tiE2 Zw
RamjmC7ps=
umea.se.                3600    IN      DNSKEY  256 3 8 AwEAAcJi54HwXURYIhE+KhpnA9+0lKT385LosvH2YJqG
WYn2RJ0OcXhl WTpdrT4GYZo36fabbNEfpeCngwNEB/M2cMnK6FYLRBwRrBrEtP1ScD5f r1d09SwGnCm+16tz+7uMoO6As49iO8
Mk6dV3XHnzHNBeJ990r6wWDjGR v2mGtnCd
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131126080837 20131119070837 18180
umea.se. NanCcPAaCBortVcT/ewz+EN4fD6q/CRB6KdLyFJqTF6HHg1gNI8NnfX8 pMso2G1+LKbsyyBzjxTh5d+AebVqI+vfvY
jxuiadzK2bJ6avhFLTRx0/ BStLQuQ9b41fF1hMLuGuzXcNkAl0o2+YShccJ2oo+HDnRJ+Ch9W2jj0f ouY=
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131126080837 20131119070837 1816 u
mea.se. FXzJC3OLvj9YN5AQRAzSj6JIkox6r79JemdArnKPk4dndN3F4KTZ3a6B 6QST1ZcTkUveEvMf+A24RmVhBLvxfVubNQh
VLUwY65OKxVh1zto0kKRG g2vmtwdq6cIeedL6gGLxeKMmFcXQYJO2mROG0dHC2pyDTs0vri3iidZ1 Dv8/qSnL3507iCnF72Q4t
PcChRy5rmk5COL1xuuzQB+PLByAFB9k0/Ie DfbG3E2vXlgsskHosCNM0Y4prjJzWZrVTGAj67WHCl16GY6dR1zUwBlQ oZAhQA8
14WFb+W8/iLlZpeQ9dYaAwvLy/StqCLHtCYKO9ezikdPvPPpI I9AzkA==
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131201223305 20131124213305 18180
umea.se. oslqv0LlKYzvpuUBJYWX5C3T9oMitRuD6NESICNpXMrkCWnQSVbZwqt5 u5M89AgEFNxuo/rLjWlshrdoyzy1CS2K5f
dsWO9DWxlXhr7Kj9ZCLeFX oFQrKkmCx7eFqySeDSGgwPjNndBbdXF/RaFwIIIOrax1QZ+7ONAhlY8x BX4=
umea.se.                3600    IN      RRSIG   DNSKEY 8 2 3600 20131201223305 20131124213305 1816 u
mea.se. heUGVnKkWnB//9kuhAePh7bXMf/1IRdEKzCuf6IIOR4BpOySUwewVq+Y 8ADzMWKpBRGUBnmXyp1KdQzoTTgqPYFZ/BX
AVhk3U9ylyuP2MU49vKGl juZ5jZ+4oqnI18XRyaKdMlLTrEXKIX2mW/qrmdhB7KDzfSEJlhlcko3e vAlkAXJn7DUkS9FkBejtI
NUUNs0YGeU6DrdO1Cy1cT+CvopufIyIAmh/ byd7+eANcSy1onSse75MUKZ9+DH8grXKFehCPHAGJydkAEOOnlIOqYCN 1UyR4Ex
fQKYpc5Gusur2KfSuc4VFTSGGB0vMbPmFcC2pgkNQFGoV3OaP QaeWQw==

;; Query time: 58 msec
;; SERVER: 193.254.4.34#53(193.254.4.34)
;; WHEN: Wed Nov 27 18:01:46 2013
;; MSG SIZE  rcvd: 1808
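
The expired signatures can be picked out mechanically from the RRSIG expiration field, which dig prints as YYYYMMDDHHMMSS in UTC. A small Python sketch follows, using the expiration values from the secondary's answer above. The helper names are illustrative, and the query time from the dig footer is treated as UTC for simplicity, which is an assumption.

```python
from datetime import datetime, timezone

def parse_rrsig_time(stamp: str) -> datetime:
    """Parse the YYYYMMDDHHMMSS form that dig prints for RRSIG times (UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

def is_expired(expiration: str, now: datetime) -> bool:
    """A signature is expired once 'now' is past its expiration time."""
    return now > parse_rrsig_time(expiration)

# Expiration fields of the four RRSIG DNSKEY records returned by the secondary.
secondary_expirations = [
    "20131126080837", "20131126080837",
    "20131201223305", "20131201223305",
]

# Query time taken from the dig footer, assumed to be UTC for this sketch.
query_time = datetime(2013, 11, 27, 18, 1, 46, tzinfo=timezone.utc)

expired = [e for e in secondary_expirations if is_expired(e, query_time)]
print(len(expired), "of", len(secondary_expirations), "RRSIGs expired")
```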

November 28th, 2013 10:56am

We have a (poorly written) PowerShell script that finds all zones with RRSIG records that will expire within the next few hours and starts a full zone transfer to the secondary for those zones. The warnings about expired RRSIGs have almost vanished, but we still have problems with some ISPs making DNS requests for signed zones in our environment. Perhaps there are timing problems in the script, which could surely be written better, but for now we have unsigned our most important zones to avoid problems.
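
The selection step of such a script can be sketched as follows. This is a Python illustration of the idea only, not the PowerShell script mentioned above; the zone names, the function name and the 6-hour lead window are all assumptions, and collecting the expiry times and triggering the transfer are left out.

```python
from datetime import datetime, timedelta, timezone

def zones_needing_transfer(earliest_expiry, now, lead=timedelta(hours=6)):
    """Return zones whose earliest RRSIG expiry falls within `lead` of `now`.

    Zones that have already expired are included as well, since they need
    a fresh transfer most urgently.
    """
    cutoff = now + lead
    return sorted(zone for zone, expiry in earliest_expiry.items()
                  if expiry <= cutoff)

now = datetime(2013, 11, 28, 12, 0, tzinfo=timezone.utc)

# Hypothetical zones mapped to the earliest RRSIG expiration found in each.
earliest_expiry = {
    "example.com": now + timedelta(hours=2),   # expires soon
    "example.org": now + timedelta(days=3),    # still comfortable
    "example.net": now - timedelta(hours=1),   # already expired
}

due = zones_needing_transfer(earliest_expiry, now)
```

A lead window larger than the record TTL leaves room for resolvers to drop the old signatures from their caches before they expire.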

November 28th, 2013 6:44pm

Hi,

We are looking closely at the issue.

I see that there are RRSIGs that should be removed from the secondary zone because they are expired, but I need to know if this is causing a failure for DNS queries. Are you seeing failed DNSSEC validation because of this?

Thanks,

-Greg

December 3rd, 2013 8:16pm

For us there are definite problems for resolvers using BIND when there are expired RRSIG records in the zone. A direct result is that customers using ISPs with those resolvers cannot reach their services with us. Tests at dnscheck.iis.se, dnswiz, etc. give errors at those times.

BUT, even when we force zone transfers so that the secondaries do not contain expired RRSIG records, there are intermittent problems. All tests report OK, but customers at certain ISPs still cannot reach their services on a domain with a signed zone in Windows Server DNS.

All experts we ask have only one solution: Do NOT use Windows DNS!

December 3rd, 2013 11:49pm

I don't know if it is of any use, but here is some more information. One of the large ISPs in Sweden is investigating this problem with us, and they can make it all work if they clear their resolver caches.

December 4th, 2013 6:33pm

Hi Olov,

Thanks for confirming there are problems. I've escalated the issue here to get help. When you said the problem continues after the zone is fixed, I suspected the DNS resolver cache, so it's good that this cause is now confirmed. If the zone transfer can be done before the RRSIGs expire, this will probably prevent a bad record from being cached.

Thanks again for providing information about the problem. This helps a great deal!

-Greg

December 4th, 2013 7:28pm

Hi again,

I've been doing some testing and I think I've found a bug.

If someone on this thread is willing to work with Microsoft Support on this it will help to get a bug fix in place faster. Please let me know - you can email me at: greg<dot>Lindsay<at>Microsoft<dot>com.

(replace the <dot> and <at> with . and @)

Thanks,

-Greg

December 6th, 2013 2:04am

Hi,

I haven't heard from anyone on this thread yet via email. We are pursuing the bug, but it will definitely help if we can work with someone to verify that the fix works.

The bug I was able to duplicate is that DNSSEC validation fails when a secondary server has multiple RRSIGs for the DNSKEY record. In other words, multiple records of this type on the secondary are a problem:

<domain>  3600  IN   RRSIG   DNSKEY

If a resolving/caching/recursive DNS server points to the primary DNS server, there are no problems, because expired RRSIG DNSKEY records are not present on the primary. There is also no problem with the secondary if a full zone transfer has happened recently, because this removes the expired RRSIGs.

Another way of fixing this problem is to make the zone primary on both servers which can be done by using Active Directory integrated DNS. For security purposes, one of the servers can be an RODC. Obviously this solution is a major change if the server is not already AD-integrated. A simpler fix in the meantime is to ensure full zone transfers occur on a regular basis.

I've checked and unfortunately there is not a way to configure automatic full zone transfers (AXFR) from the primary to the secondary. It is the incremental zone transfer (IXFR) that actually leads to expired RRSIGs not being deleted.
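
The difference between the two transfer types can be shown with a toy model. The Python sketch below assumes the incremental transfer delivers the new RRSIG as an addition without a matching deletion of the old one; that is an assumption made for illustration, consistent with the symptom described above, and not a statement about the server's internals. Record contents are reduced to a type and an expiration value taken from this thread.

```python
def apply_ixfr(secondary, deleted, added):
    """Incremental transfer: apply only the listed deletions and additions."""
    return (set(secondary) - set(deleted)) | set(added)

def apply_axfr(primary):
    """Full transfer: the secondary's copy is replaced by the primary's zone."""
    return set(primary)

old_sig = ("RRSIG DNSKEY", "20131126080837")  # expired signature
new_sig = ("RRSIG DNSKEY", "20131201223305")  # refreshed signature

primary = {new_sig}      # primary holds only the refreshed signature
secondary = {old_sig}    # secondary still holds the old one

# Assumed faulty delta: the new signature is added, the old one is never deleted.
after_ixfr = apply_ixfr(secondary, deleted=[], added=[new_sig])

# A full transfer discards whatever the secondary had before.
after_axfr = apply_axfr(primary)
```

After the incremental update the secondary holds both signatures, while the full transfer leaves only the current one, which matches the two versus four RRSIG answers seen in the dig output earlier in the thread.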

Thanks,

-Greg

December 11th, 2013 1:54am

I've received one email so far, thanks for this! Customer participation is a critical requirement for a hotfix, so if we can get more participants it will help.

-Greg

December 11th, 2013 8:58pm

Thanks, we have two customers now.
December 11th, 2013 9:28pm

Quick update - I've been working with our support folks to reproduce and analyze the problem. I'm hopeful we'll have a fix to test before long.

-Greg

January 18th, 2014 5:51am

This topic is archived. No further replies will be accepted.