Troubleshooting People Picker with Netmon

While helping a customer out with an odd people picker issue a couple months ago, I came across a change in behavior in SharePoint 2013 that I figured I’d document while showing how I got to the solution using (mostly) Netmon to troubleshoot it.

The problem: The customer (running 2013 Foundation, so no UPA) came to me with two shared mailbox objects in AD that his end users were trying to use to secure something in their SharePoint site. One was resolving just fine in People Picker, while the other wasn’t. As far as they could tell, both mailboxes had the same settings in AD and both were in the same domain as the farm. Looking at the base properties for them, I wasn’t seeing anything obviously different either. Both user accounts were also marked as disabled, which made it more confusing. I had him test the same accounts on their WSSv3 farm, and neither account could be resolved there, which I expected since the mailboxes were marked as disabled.

It was at that point I went straight to Netmon to see what the LDAP query and response looked like in both the working and failed case, which helped us locate the cause of the issue within an hour or so. To illustrate it, I’ll take you through this using my SP2013 Foundation test farm. I was able to reproduce this using standard user accounts, which are what you’ll see in the screenshots.

First, my failing / working screenshots using my creatively named test accounts:

[Screenshots: Working, Failing]

Now let’s compare the base settings for each account:

[Screenshots: WorkingAD, FailingAD]

So other than the name, the only difference visible is that WorkingUser has “Password Never Expires” set, while FailingUser doesn’t. That’s the same as the two mailboxes, and tripped me up for a couple of minutes, but played absolutely no part in this.

Once I got the Netmon traces for both the working and failing scenarios, the fun began. Let’s compare the LDAP queries SharePoint was sending for each user:

Working

[Screenshot: Capture]

Failing

[Screenshot: failcapture]

See any difference? Me either. SharePoint sent basically the same query with the same filters, looking for the same attributes. Okay, all good up to this point, so let’s compare the responses:

Working

[Screenshot: WorkingLDAPQueryResult]

Failing

[Screenshot: FailingLDAPQueryResult]

Notice any differences here? The first one that stood out to me was the userAccountControl value. In the failing trace, it’s a blatant “disabled”, being 514 (see Q305144 for possible values), while in the working trace it was 66050. I honestly wasn’t sure what that meant at first, so I assumed some flag was set on the account that kept SharePoint from discarding the result. After a little digging, I realized it just meant a disabled account with “DONT_EXPIRE_PASSWORD” set (as shown in the account settings), which had no influence on the People Picker behavior. That took me back to the traces to see what else in the response was different. Notice anything about the other returned attributes? Maybe one present in the working trace that’s missing in the failing trace?
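To make those two userAccountControl values concrete before moving on, here is a minimal Python sketch that decodes them. The flag values are the documented ones for ACCOUNTDISABLE, NORMAL_ACCOUNT, and DONT_EXPIRE_PASSWORD; the table is deliberately partial, and the function name `decode_uac` is just something I made up for illustration.

```python
# Partial table of userAccountControl flags (documented values).
# Only the three flags relevant to this post are listed.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0200: "NORMAL_ACCOUNT",
    0x10000: "DONT_EXPIRE_PASSWORD",
}


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


print(514, decode_uac(514))
print(66050, decode_uac(66050))
```

Both values carry ACCOUNTDISABLE and NORMAL_ACCOUNT; 66050 is simply 514 plus 65536 (DONT_EXPIRE_PASSWORD).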

[Screenshot: Snip]

It seemed odd to me that msExchMasterAccountSid could be the issue, but I wanted to validate that the failing user didn’t actually have it set. Sure enough, only the working one did:

[Screenshots: WorkingSID, FailingSID]
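Eyeballing two LDAP responses works, but the same comparison can be done mechanically once the attributes are copied out of the trace. The sketch below is a hypothetical helper, not something Netmon provides. The attribute values are placeholders, and the real responses contained more attributes than the few shown here.

```python
def missing_attributes(working, failing):
    """Return attribute names present in the working entry but absent from the failing one."""
    return sorted(set(working) - set(failing))


# Illustrative stand-ins for the two responses (values are placeholders).
working_entry = {
    "sAMAccountName": "WorkingUser",
    "userAccountControl": "66050",
    "objectSid": "S-1-5-21-EXAMPLE-1",
    "msExchMasterAccountSid": "S-1-5-21-EXAMPLE-1",
}
failing_entry = {
    "sAMAccountName": "FailingUser",
    "userAccountControl": "514",
    "objectSid": "S-1-5-21-EXAMPLE-2",
}

print(missing_attributes(working_entry, failing_entry))
```

Comparing attribute names first, and values second, keeps a harmless value difference like userAccountControl from hiding a missing attribute.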

I then moved to the source code to see if we actually used this value, and it turns out that in SP2013, we do. It’s been a while since I looked this up, so I’m going from memory here, but if I recall, we do this to handle cross-forest scenarios. If we see that the account is marked as disabled and msExchMasterAccountSid is populated, we assume it’s a synchronized contact object from another forest and treat it as valid, and resolve it in People Picker. If we see the account is disabled and there’s no msExchMasterAccountSid value, we assume it’s really a disabled object and discard it from the results. To test that, I used ADSI Edit to copy the value of ObjectSID for the failing user account to msExchMasterAccountSid:

[Screenshot: Fixed]

From that point on, that test account started resolving properly. I then hit my customer contact up, had him do the same thing on the non-resolving account, and voila, the account was now resolving in People Picker:

[Screenshot: FixedFailing]

Note that we do let you change the attribute we key off of for the SID value, so they also could have fixed it by telling SharePoint to use the ObjectSID value instead of msExchMasterAccountSid by modifying the PeoplePickerSearchReplicatedMasterSIDPropertyName setting. That would mean all disabled user accounts would begin resolving in People Picker, though, and since this was a one-off issue, we decided to just go with the original workaround on that one object.
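The disabled-account rule and the configurable attribute can be sketched in a few lines of Python. To be clear, this is not SharePoint’s code. It is my own simplified model of the behavior described above, with a made-up function name (`should_resolve`), dictionary-shaped entries, and placeholder SID values.

```python
ACCOUNTDISABLE = 0x0002  # documented userAccountControl flag


def should_resolve(entry, master_sid_attr="msExchMasterAccountSid"):
    """Simplified model: keep enabled accounts, and keep disabled accounts
    only when the configured master-SID attribute is populated."""
    uac = int(entry.get("userAccountControl", 0))
    if not (uac & ACCOUNTDISABLE):
        return True
    return bool(entry.get(master_sid_attr))


# Placeholder entries; the SID strings are not real values.
failing = {"userAccountControl": "514", "objectSid": "S-1-5-21-EXAMPLE-2"}
fixed = dict(failing, msExchMasterAccountSid=failing["objectSid"])

print(should_resolve(failing))                               # disabled, no master SID
print(should_resolve(fixed))                                 # disabled, master SID copied in
print(should_resolve(failing, master_sid_attr="objectSid"))  # keyed off objectSid instead
```

The last call shows why we left the setting alone: every account has an objectSid, so keying off it lets every disabled account through.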

So yeah, I cheated by having source access, but it was Netmon that got me there. It’s one of my favorite tools for troubleshooting things like this, as I like to see exactly what’s being sent in the query and what’s coming back in the response, to see if something about the returned object might be triggering unexpected behavior in People Picker.