Lync 2010, Multi-Forest: Cannot send IM to/from other domain, TL_ERROR: For Rejected publisher (user@a.com) deserialize succeeded but could not resolve ServiceTagID as ServiceCluster was empty after deserialize.

Hello,

How do you do? This is our first post to the forum. Thank you for providing such great help!

Ok, on to our problem: We have the following:

a.com (DC, LyncFE-a, Edge-a)
auser1@a.com

b.com (DC, LyncFE-b, Edge-b)
buser1@b.com
buser2@b.com

So, buser1 and buser2 of b.com can send IMs and see presence without a problem. But when buser1@b.com opens the Lync client and types the SIP address for auser1@a.com, it shows "Presence Unknown", and when he then types a message, his client shows "This message was not delivered to auser because this person is unavailable or offline".

When using the Lync Logging Tool on LyncFE-b, we notice that SIP messages never leave LyncFE-b for Edge-b at all. The Snooper log shows the error below in red (the same happens in the other direction, from a.com):

Component: UserServices
Level: TL_ERROR
Flag: TL_COMPONENT
Function: CNotifyDocGenerator::Ge...
Source: NotifyDocs.cpp (4582)
Date Time: Now...2015..
Error Message:
(00000000000371AC0) For Rejected publisher (user@a.com) deserialize succeeded but could not resolve ServiceTagID as ServiceCluster was empty after deserialize.

When typing a random-name@a.com, the request does actually go through the edges to LyncFE-a, but of course the user doesn't exist. Why do non-existent usernames go through the edges, while existing usernames do not? CSStaticRoute isn't set up at this point; we are not sure whether it is required.

Does anyone have any guidance? Thank you so much


  • Edited by martKyu Tuesday, August 04, 2015 11:45 PM
August 4th, 2015 11:44pm

Hi martKyu,

As Edwin stated, first of all you have to configure Lync federation, which allows users to communicate with others outside their organization.

If federation is already configured, you may follow the steps below to troubleshoot the federation issue.

1. First of all make sure Federation is enabled in topology.

2. Check the federation route in topology.

3. Check the federation and external access settings in Lync Control Panel. (External Access Policy / Access Edge Configuration / SIP Federated Domains)

4. Make sure you have the _sipfederationtls._tcp.sipdomain.com and _sip._tls.sipdomain.com SRV records, and that they point to the Sip.domain.com A record.

5. (Optional) Check the access rules on firewall, also check if the required ports are opened.

6. Make sure the certificate is valid. (If you use internal CAs, each federation partner should have the other's root certificate installed on their Edge Server.)

 

Best regards,

Eric

August 5th, 2015 3:34am

Hello, Thank you kindly for all of your help :)

I verified all the ideas you suggested, but everything is OK. Do you have any guidance on debugging the database? I strongly believe that the Lync Server Front End SQL database at each site has the wrong information, and that is why it won't send to existing usernames but will send to non-existent ones.


I verified what you suggested, answers inline:

1. First of all make sure Federation is enabled in topology.
At both Lync Front End server sites, if you open Lync Server 2010, download the latest topology, and left-click the site name (the second icon directly underneath Lync Server 2010), the right side shows it is set up correctly and points to the local Edge servers:

LyncServer2010 \ LyncFE-a_site > Site federation route assignment > Federation: Edge-a
LyncServer2010 \ LyncFE-b_site > Site federation route assignment > Federation: Edge-b


2. Check the federation route in topology.
If by this you mean verify the edge routes, then yes via:

LyncServer2010 \ Edge Pools \ Edge-a >> External Settings > Sip, Web, A/v == sip.a.com, and sip port = 5061
LyncServer2010 \ Edge Pools \ Edge-b >> External Settings > Sip, Web, A/v == sip.b.com, and sip port = 5061

3. Check the federation and external access settings in Lync Control Panel. (External Access Policy / Access Edge Configuration / SIP Federated Domains)
All were checked:

LyncFE-a >> External User Access >> External Access Policy / Access Edge Configuration == Global Federated (Checked) Remote (Checked) Public (Checked), and third tab of Federated Domains == Allow b.local, and Provider fourth tab only has defaults AOL/MSN/Yahoo

LyncFE-b >> External User Access >> External Access Policy / Access Edge Configuration == Global Federated (Checked) Remote (Checked) Public (Checked), and third tab of Federated Domains == Allow a.local, and Provider fourth tab only has defaults AOL/MSN/Yahoo


4. Make sure you have the _sipfederationtls._tcp.sipdomain.com and _sip._tls.sipdomain.com SRV records, and that they point to the Sip.domain.com A record.
Yes, there is a split-brain DNS setup, so we have:
_sipfederationtls._tcp.a.com => SRV points to sip.a.com (5061)
_sip._tcl.a.com => SRV points to sip.a.com (443)
_sipfederationtls._tcp.a.local => SRV points to sip.a.local (5061)
_sip._tcl.a.local => SRV points to sip.a.local (443)
sip.a.com => IP Address of External Interface of Edge-a server
sip.a.local => IP Address of Front End Lync-a server

and the other side's DNS server has:
_sipfederationtls._tcp.b.com => SRV points to sip.b.com (5061)
_sip._tcl.b.com => SRV points to sip.b.com (443)
_sipfederationtls._tcp.b.local => SRV points to sip.b.local (5061)
_sip._tcl.b.local => SRV points to sip.b.local (443)
sip.b.com => IP Address of External Interface of Edge-b server
sip.b.local => IP Address of Front End Lync-b server

5. (Optional) Check the access rules on firewall, also check if the required ports are opened.
All Firewalls are disabled for now on all Edges, Front End Lyncs, and Routers.

6. Make sure the certificate is valid. (If you use internal CAs, each federation partner should have the other's root certificate installed on their Edge Server.)
All Certificates are OK, and all edge external certificates have their SAN name of edge.a.com. No certificate errors nor other errors are reported in the Lync Server 2010 Event log on any of the servers.


If you have other things for me to check, feel free to suggest them! I already tested TLS and SIP using custom ping packets, and resolved the Black Hole Router issue. The same Snooper error message, "Rejected publisher (user@a.com) deserialize succeeded but could not resolve ServiceTagID as ServiceCluster was empty after deserialize", leads me to believe it might be a database problem. Any ideas?

Thank you kindly for all of your help :)


  • Edited by martKyu 1 hour 56 minutes ago
August 6th, 2015 1:29am
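
A quick way to double-check the record names listed under step 4 is to compare them with the two SRV names the checklist asks for. The sketch below is only an illustration in Python: `EXPECTED_PREFIXES` and `check_srv_names` are hypothetical names, not part of Lync, and the check is a plain string comparison that does not query DNS.

```python
# Sketch: compare SRV record names as typed in a post against the two
# names the checklist asks for. Nothing here talks to DNS or to Lync;
# EXPECTED_PREFIXES and check_srv_names are illustrative names only.

EXPECTED_PREFIXES = ("_sipfederationtls._tcp", "_sip._tls")


def check_srv_names(domain, reported):
    """Return (missing, unexpected) SRV names for one SIP domain."""
    expected = {f"{prefix}.{domain}" for prefix in EXPECTED_PREFIXES}
    seen = {name.strip().lower() for name in reported}
    missing = sorted(expected - seen)
    unexpected = sorted(seen - expected)
    return missing, unexpected


if __name__ == "__main__":
    # Names exactly as written in the post for a.com.
    reported = ["_sipfederationtls._tcp.a.com", "_sip._tcl.a.com"]
    missing, unexpected = check_srv_names("a.com", reported)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

If a name shows up as unexpected (for example a `_tcl` label where `_tls` was intended), it is worth confirming whether that is only a typo in the post or the name actually created in the DNS zone.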

Hi martKyu,

You may run the Test-CsFederatedPartner command and check what error message it returns.

Also check the event logs on Edge Server.

Best regards,

Eric
August 6th, 2015 5:37am

Hello Eric,

Thank you so much for all of your help with my problems. Here are the results:

From a.com's Lync Front Edge Server:

$ Test-CsFederatedPartner
TargetFQDN: Edge-a.local
Domain: b.local

TargetFQDN: Edge-a.local
Result: Success
Latency: 00:00:00
Error:
Diagnosis:

Ran the same test from b.com's Lync Front Edge Server:

$ Test-CsFederatedPartner
TargetFQDN: Edge-b.local
Domain: a.local

TargetFQDN: Edge-b.local
Result: Success
Latency: 00:00:00
Error:
Diagnosis:

No errors reported. Also no errors in the Event Log on any of the four servers. The only problem is the SIP error I posted in the top message which appears in the Snooper Log.

Any guidance you can offer is gratefully appreciated, as I have been working on this for almost a full month and the customer is waiting for me :(

August 6th, 2015 5:46am
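
The thread uses both @a.com/@b.com and @a.local/@b.local addresses, while the Federated Domains tab and the test above list a.local and b.local. As a rough illustration only, the Python sketch below checks whether the domain part of a SIP address appears in an allow list. `sip_domain` and `is_allowed` are hypothetical helpers, and the exact-match rule is an assumption; it does not model how Lync itself evaluates federated domains.

```python
# Sketch: does the domain part of a SIP address appear in an allow list?
# Exact, case-insensitive matching is an assumption made for illustration;
# it is not a description of Lync's own federation logic.


def sip_domain(address):
    """Return the lower-cased domain part of an address like sip:user@a.com."""
    addr = address.strip()
    if addr.lower().startswith("sip:"):
        addr = addr[4:]
    user, sep, domain = addr.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not a SIP address: {address!r}")
    return domain.lower()


def is_allowed(address, allowed_domains):
    """True if the address's domain is in allowed_domains (exact match)."""
    return sip_domain(address) in {d.lower() for d in allowed_domains}


if __name__ == "__main__":
    allowed_on_b = ["a.local"]  # as listed on LyncFE-b's Federated Domains tab
    for target in ("auser1@a.com", "auser@a.local"):
        print(target, is_allowed(target, allowed_on_b))
```

If the users' SIP addresses really end in .com while only the .local names are allowed, that mismatch may be worth ruling out before digging into the database.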

Hi martKyu

I suggest you first run the cmdlet: Enable-CsTopology -v

Then update the SQL back end for each Lync Server environment with PowerShell. For Standard Edition: Install-CsDatabase -Update -LocalDatabases. For Enterprise Edition: Install-CsDatabase -Update -ConfiguredDatabases -SqlServerFqdn <EnterpriseEdition BackEnd FQDN> -UseDefaultSqlPaths.

After that, run step 2 of the Deployment Wizard on each Front End and Edge Server.

We need to confirm that no errors appear after that. If any error does appear during this process, please let me know what the event viewer shows on the Front End or Edge Server.

Regards.

August 6th, 2015 2:17pm

Hello,

You mentioned that there is a federation trust between a.com and b.com. When a b.com user tries to subscribe to the presence of an a.com user, do you see the request hitting a.com's Edge server?

If not, the buser1 client would send a SUBSCRIBE to b.com's Edge. What response do you get from the b.com Edge?

August 6th, 2015 2:54pm

Thank you guys for your guidance. Nope still having problems.

Question 1)

As you suggested, I ran: Enable-CsTopology -v

Then the Standard Edition command: Install-CsDatabase -Update -LocalDatabases

Then step 2 of the Deployment Wizard on all four servers.

... There were NO CRITICAL or ERROR messages from the PowerShell commands, but there were WARNINGS about ACE groups not being ready. When I searched for this online, people often ran Enable-CsAdDomain and Enable-CsAdForest; doing so clears these warnings, but they usually come back. Otherwise everything was OK.

The warnings, however, got my attention, and this happens in both domains (the same warning sometimes comes from Enable-CsAdDomain):

Detailed information from Microsoft 2010 deploymentlog.html
Action: Process permissions on:"CN=RTCUniversalGlobalReadOnlyGroup; CN=Users, DC=a, DC=local"

Warning Ace: HQTRS\RTCUniversalGlobalReadOnlyGroup; Allow; ReadPropertly; None; None

Warning Ace: HQTRS\RTCUniversalGlobalReadOnlyGroup; Allow; ReadPropertly; None; None

Warning Ace: HQTRS\RTCUniversalGlobalReadOnlyGroup; Allow; ReadPropertly; None; None

...etc...


Question2)

Correct. When sending from auser@a.com to buser@b.com, there are no SIP SUBSCRIBE events even leaving the Lync Front End server. Snooper shows that the a.com user is logged in to the Front End OK, and then when he types buser@b.com to send the message, only the message below is shown on the Front End, and the Edge server shows nothing in the Snooper log. Here is the Snooper log from frontend-a (for when auser@a.com sends to buser@b.com):

Sequence#000005
Component: UserServices
Level: TL_ERROR
Flag: TL_COMPONENT
Function: CNotifyDocGenerator::Ge...
Source: NotifyDocs.cpp (4582)
Date Time: Now...2015..
Error Message:
(00000000000371AC0) For Rejected publisher (buser@b.com) deserialize succeeded but could not resolve ServiceTagID as ServiceCluster was empty after deserialize.

August 6th, 2015 8:45pm

For

Warning Ace: HQTRS\RTCUniversalGlobalReadOnlyGroup; Allow; ReadPropertly; None; None

It looks like not all of the required AD attributes have been updated.

Run Enable-CsAdForest again, or check AD replication.

For "For Rejected publisher (buser@b.com) deserialize succeeded but could not resolve ServiceTagID as ServiceCluster was empty after deserialize."

This seems like an issue with a certificate or a CRL check.

Please check what is in the logs prior to this event.

August 7th, 2015 3:03am

Thanks, not much progress this time.

Enable-CsAdForest works fine, and the warnings disappear until the servers are rebooted, which isn't a problem.

The Rejected Publisher error is definitely our main problem now. There is nothing above it in Snooper, and even when opening the OCSLogger text file in Notepad there is nothing except that same message: deserialize succeeded but could not resolve ServiceTagID.

There are no errors in the Lync Server event log, but if it is a certificate error, how would I know? I think the certificates are valid, as they are from the same CA server as the Edge servers' certificates. Why is the ServiceTagID having a problem? Thanks so much!

August 10th, 2015 8:55pm

Hello

I ran the test again and opened the OCSLogger and added the ADConnect Tag, here are the results of sending a message from auser@a.local to buser@b.local:

ADConnect - TL_WARN TF_COMPONENT (000000000000) Topology version is not good for rebuild. current version = 1, available topo version = 1

ADConnect - TL_INFO TF_COMPONENT (000000000014A)  ADConnectionPool.GetConnection of type 2 by server DC-a.a.local)

ADConnect - TL_INFO TF_COMPONENT (00000000002GA) isServerMatch. DC-a.a.local:389 is a match for DC-a.a.local:389

ADConnect - TL_INFO TF_COMPONENT (00000000003JD) Connection to dc.a.local: has quality 1000

ADConnect - TL_INFO TF_COMPONENT (00000000005FA) Found conn in general pool 2 to dc-a.a.local:389

ADConnect - TL_INFO TF_COMPONENT (00000000005FF) ADSession: find using  dc-a.a.local:389 - LDAP search from CN=topology settings, cn=rtc service, cn=microsoft,cn=system,dc=a,dc=local, scope 1 filter (&(cn=325f3-123f213-f32ff31ea222)(objectclass=msrtcsip-GlobalTopologySetting)(objectcategory=msrtcsip-globaltopologysetting)), sizelimit 2, timeout 0)

ADConnect - TL_INFO TF_COMPONENT (00000000008AE) connection returned to pool, count decremented to 1

UserServices - TL_ERROR   TF_COMPONENT (0000000001G92) For rejected publisher buser@b.local, deserialize suceeded but could not resolve servicetagid as servicecluster was empty after deserialize

What are ServiceTagId and ServiceCluster, and why won't the Front End forward the message for buser@b.local out through the Edge servers?




  • Edited by martKyu Tuesday, August 11, 2015 10:59 AM
August 10th, 2015 10:05pm

To be honest, to me

"for rejected publisher (buser@b.local) deserialize suceeded but could not resolve servicetagid as servicecluster was empty"

means that User Services in Lync is unable to get the matching ServiceTagID because this user's information doesn't exist in the pool database.

Hence this could be the trigger to check whether this is a federated domain.

August 13th, 2015 2:18am

Thank you for your help. Since you said User Services in Lync was unable to find the ServiceTagID, I made the following change, but it did not fix the original error above:

Remove-CsUserReplicatorConfiguration -Identity global

The error above still occurs, and auser@a.local still cannot send a message to buser@b.local. (auser can connect to his local Lync Front End, but it never forwards the message out of the Front End to the Edge -> Edge -> b.local Lync.)

Here is problem which I see on Snooper from the Front End Lync:

@SnooperResults@

ADConnect - TL_WARN TF_COMPONENT (000000000000) Topology version is not good for rebuild. current version = 1, available topo version = 1

Is something wrong with my topology? I published it in each Lync domain, ran Invoke-CsManagementStoreReplication, and Get-CsManagementStoreReplicationStatus confirms True for both the Lync and Edge servers, so I don't think it is the topology. But WHY does Snooper say the topology version is not good for ADConnect? Is this an ADSI Edit problem?

When enabling the Full Debugging Options in Snooper: ABCommon ABServer ABServerHttpHandler  ADConnect Application Server  AppShareOoty, I saw these errors:

@SnooperResults Max Logging@

SIP/2.0 480 Temporarily Unavailable

ms-diagnostics: 4190;reason="Failed to determine the user's pool to process request.";TargetUri="User1@contoso.net";source="server05.contoso.net"
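
When comparing several traces, it can help to split an ms-diagnostics line like the one above into its numeric code and its quoted parameters. The Python sketch below is only an illustration: `parse_ms_diagnostics` is a hypothetical helper, and it assumes the simple `code;name="value";...` layout seen in this post rather than every form the header can take.

```python
import re

# Sketch: split an ms-diagnostics header into (code, parameters).
# Assumes the 'code;name="value";...' layout shown in the post;
# parse_ms_diagnostics is an illustrative helper, not a Lync tool.


def parse_ms_diagnostics(header):
    """Return (int code, dict of quoted parameters) for one header line."""
    name, _, value = header.partition(":")
    if name.strip().lower() != "ms-diagnostics":
        raise ValueError("not an ms-diagnostics header")
    code_text, _, rest = value.strip().partition(";")
    params = dict(re.findall(r'([A-Za-z-]+)="([^"]*)"', rest))
    return int(code_text), params


if __name__ == "__main__":
    line = (
        'ms-diagnostics: 4190;reason="Failed to determine the user\'s pool '
        'to process request.";TargetUri="User1@contoso.net";'
        'source="server05.contoso.net"'
    )
    code, params = parse_ms_diagnostics(line)
    print(code)
    for key, val in params.items():
        print(f"{key}: {val}")
```

Here the interesting fields are the code (4190), the reason text, and the source server that generated the response.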


  • Edited by martKyu Tuesday, August 18, 2015 8:46 AM
August 18th, 2015 8:26am

This topic is archived. No further replies will be accepted.
