LDAP_MATCHING_RULE_IN_CHAIN query 13 times faster on AD LDS than AD DS

Hi all,

I've been busy in a lab environment with an LDAP_MATCHING_RULE_IN_CHAIN query, validating its performance on a Windows 2012 R2 domain controller. The query comes from IBM's PureApps Administration console and is pretty much hardcoded: it uses the root of the directory as base DN and checks which (nested) members a specified group has.

The PureApp guys were complaining that the query took too long to finish in our production environment: anywhere from 75 seconds up to time-outs. We, in turn, were complaining that this query was consuming all CPU, slowing down the LDAP service for other clients.

So I've started digging :) - details in the detail section...

Our conclusion so far:

 In our AD DS setup, the query took 51 seconds on average to complete.
 In our AD LDS setup, the query took 4 seconds on average to complete! Almost 13x faster!

Same ESX host, same number of vCPUs, same amount of memory, same users, groups and membership data (the AD LDS is synchronized from AD DS via FIM).

Does anyone have a clue about this? Sure, a domain controller is not an AD LDS server and has far more tasks and complexity. But that big a difference? My colleagues and I are surprised to see this! We can reproduce this outside our lab in other environments (production, acceptance, test, ...).

Kind regards,
David

Details

Everything about these queries: https://msdn.microsoft.com/en-us/library/aa746475(v=vs.85).aspx

BaseDn: DC=contoso,DC=com
Scope: Subtree
Filter:
(&(objectCategory=person)(memberOf:1.2.840.113556.1.4.1941:=CN=Admins,OU=Groups,DC=contoso,DC=com))

This enumerates the users in the Admins group, including nested members. There are 3 users in that group, all via direct membership (not nested). The AD DS or AD LDS has 40,000 users.
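For anyone who wants to reproduce the filter from a script, here is a minimal sketch that builds the same filter string. The helper names `escape_filter_value` and `nested_member_filter` are made up for illustration; only the OID and the filter shape come from the query above. It uses plain string handling, so no LDAP library is assumed.

```python
# OID of LDAP_MATCHING_RULE_IN_CHAIN, as used in the filter above.
IN_CHAIN_OID = "1.2.840.113556.1.4.1941"


def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    # Replace character by character so an inserted backslash is never re-escaped.
    return "".join(replacements.get(ch, ch) for ch in value)


def nested_member_filter(group_dn: str, object_category: str = "person") -> str:
    """Build a filter matching objects whose memberOf chain reaches group_dn."""
    return (
        f"(&(objectCategory={escape_filter_value(object_category)})"
        f"(memberOf:{IN_CHAIN_OID}:={escape_filter_value(group_dn)}))"
    )


if __name__ == "__main__":
    print(nested_member_filter("CN=Admins,OU=Groups,DC=contoso,DC=com"))
```

The resulting string can be passed as the search filter to whatever LDAP client is in use, with base DN DC=contoso,DC=com and subtree scope as listed above.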

Chapter 1: cpu! add more cpu!

Using perfmon on a dedicated ESX host (32 vCPUs, PE M620 E5-269V2 2.7GHZ 12C (DELL) XEON E5, SAN attached) with only my VM running on it, 8 GB RAM, and a reboot between each test:

W2K12 R2, fully patched, with x vCPUs (margin of error: 2 seconds):

1 vCPU: 62 seconds
2 vCPU: 58 seconds
3 vCPU: 57 seconds
4 vCPU: 48 seconds
6 vCPU: 51 seconds
8 vCPU: 50 seconds
16 vCPU: 59 seconds

Average: 51 sec

Conclusion: a single query is effectively handled by one CPU. On an otherwise idle machine, adding CPUs does not make the query faster. The overall impact on total CPU usage is of course lower with every extra CPU.
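To put a number on that last point, here is a small sketch of the arithmetic, assuming the query keeps exactly one core fully busy for its whole duration (an assumption drawn from the conclusion above, not a measured figure). The function name `total_cpu_share` is hypothetical.

```python
def total_cpu_share(busy_cores: int, vcpus: int) -> float:
    """Percentage of total CPU capacity used when busy_cores cores are saturated."""
    if vcpus < 1 or busy_cores < 0:
        raise ValueError("vcpus must be >= 1 and busy_cores must be >= 0")
    # A query cannot saturate more cores than the VM has.
    return 100.0 * min(busy_cores, vcpus) / vcpus


if __name__ == "__main__":
    # Same vCPU counts as in the test series above.
    for vcpus in (1, 2, 3, 4, 6, 8, 16):
        print(f"{vcpus:>2} vCPU: {total_cpu_share(1, vcpus):5.1f}% of total CPU")
```

So one such query pins a 1 vCPU machine completely, while on 8 vCPUs it accounts for 12.5% of total capacity, even though the query itself does not finish any sooner.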

Chapter 2: There's caching!

In scenario 2 of the above test, we had been repeating the same query. When we reran it within 1-2 minutes of the previous run, we could clearly see an improvement of about 35% in response time, resulting in an average of 33 seconds.

Conclusion: the caching helps, but is practically of no use.
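The percentage is simply the relative drop in response time. A minimal sketch of that calculation, using the 51 second and 33 second averages from this thread (the function name `improvement_percent` is made up for illustration):

```python
def improvement_percent(before_s: float, after_s: float) -> float:
    """Relative reduction in response time, as a percentage of the original time."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return 100.0 * (before_s - after_s) / before_s


if __name__ == "__main__":
    # Cold run average versus warm (repeated) run average on W2K12 R2.
    print(f"{improvement_percent(51, 33):.1f}% faster when repeated")
```

Even with that gain, a warm query still takes over half a minute, which is why the cache does not solve the problem in practice.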

Chapter 3: our production is Windows 2008 R2, the lab is Windows 2012 R2!

Doing tests on the same hardware in the lab with a W2K8 R2 DC, we see these times:

2 vCPU: 75 seconds
(similar for increasing numbers of vCPUs, as in chapter 1)

Let's check whether there's caching: 71 seconds. That's about a 5% improvement.

Conclusion: Windows 2012 R2 is more efficient! :D

Chapter 4: well we also have AD LDS with the same data, let's try that!

Our AD LDS is a setup where we synchronize 2 AD DS environments to AD LDS with FIM, using a userProxyFull user class to make LDAP authentication (and authorization) nicely transparent. So the AD LDS is even 10,000 "users" bigger than the AD DS we've been testing against. Many user attributes such as manager, telephone, name, location and company are included.

We see these times:

2 vCPU: 4 seconds average (fastest was 3, slowest 5)

Surprise, where did this come from? That's almost 13 times faster!
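For reference, this is how the "almost 13 times" figure falls out of the numbers in this thread. It is only a sketch of the arithmetic: it takes the 51 second AD DS average as the baseline and compares it with the slowest, average and fastest AD LDS runs. The function name `speedup` is made up for illustration.

```python
def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is compared with the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate_s must be positive")
    return baseline_s / candidate_s


AD_DS_AVERAGE_S = 51  # average from chapter 1

if __name__ == "__main__":
    for label, lds_seconds in (("slowest", 5), ("average", 4), ("fastest", 3)):
        ratio = speedup(AD_DS_AVERAGE_S, lds_seconds)
        print(f"AD LDS {label} run ({lds_seconds} s): {ratio:.2f}x faster")
```

With only whole-second timings on the AD LDS side, the ratio moves a lot between runs, so the 13x should be read as an order of magnitude rather than a precise factor.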

Chapter 5: let's index "memberof"!

Why didn't Microsoft do that by default in AD DS? Hmm, let's try anyway, knowing that indexing is only efficient if the data differs enough between users, and with group memberships we know it doesn't differ much.

Result: we did not see improvement with every test.
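The reasoning about "different enough" data is the usual index selectivity argument. The sketch below illustrates only that general idea on made-up values; it says nothing about how AD DS actually stores or evaluates memberOf, and both the sample data and the function name `selectivity` are hypothetical.

```python
def selectivity(values) -> float:
    """Fraction of distinct values: 1.0 means every entry is unique."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return len(set(values)) / len(values)


if __name__ == "__main__":
    # Hypothetical data: a unique attribute versus a widely shared one.
    unique_logins = [f"user{i}" for i in range(1000)]
    shared_groups = ["CN=Staff"] * 900 + ["CN=Admins"] * 100
    print(f"login names:       {selectivity(unique_logins):.3f}")
    print(f"group memberships: {selectivity(shared_groups):.3f}")
```

An index pays off when a lookup narrows the candidates to a small fraction of the entries; an attribute where most users share the same few values gives the index little to narrow down.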

Chapter 6: let's go social :)

August 26th, 2015 4:02pm

Hi David Burghgraeve,

Do other query filters give you normal response performance? If you cannot change the LDAP client query, the only thing I know of is indexing the attribute.

For LDAP_MATCHING_RULE_IN_CHAIN, the scope is not limited; it can be base, one-level, or subtree. Some such queries on subtrees may be more processor intensive, such as chasing links with a high fan-out; that is, listing all the groups that a user is a member of. Inefficient searches will log appropriate event log messages, as with any other type of query. On the server side, it will evaluate your filter and perform the query.
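To make "chasing links" concrete, here is a conceptual sketch of a nested membership walk over an in-memory dictionary. It is not how the directory server implements the matching rule; the directory contents and the function name `nested_members` are invented for illustration. It walks from the group down to its members, which yields the same set as matching on memberOf from the user side.

```python
from collections import deque


def nested_members(group: str, members_of: dict) -> set:
    """Return every object reachable from `group` by following member links."""
    seen = {group}  # start group is tracked so cycles back to it are ignored
    queue = deque([group])
    while queue:
        current = queue.popleft()
        for member in members_of.get(current, ()):
            if member in seen:
                continue
            seen.add(member)
            # Only objects that have members of their own need expanding.
            if member in members_of:
                queue.append(member)
    return seen - {group}


if __name__ == "__main__":
    # Hypothetical directory: Helpdesk is nested in Admins and points back at it.
    directory = {
        "Admins": ["alice", "Helpdesk"],
        "Helpdesk": ["bob", "carol", "Admins"],
    }
    print(sorted(nested_members("Admins", directory)))
```

Every link followed is extra work, so the cost grows with the number of links in the chain rather than with the size of the final result, which is the fan-out effect described above.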

You can refer to the following articles to learn about indexing.

Creating Efficient Queries

https://msdn.microsoft.com/en-us/library/ms675874(v=vs.85).aspx

Indexing in Active Directory

http://blogs.technet.com/b/ad/archive/2008/04/01/how-to-create-a-mosiac-of-user-thumbnails-in-aduc-dsa-msc.aspx

MCM: Active Directory Indexing For the Masses

http://blogs.technet.com/b/askpfeplat/archive/2012/11/11/mcm-active-directory-indexing-for-the-masses.aspx

I'm glad to be of help to you!

August 27th, 2015 11:30pm

