After we made a change to the synchronization connection (added an exclusion for userAccountControl BitOn Equals 2), the service is trying to do a DS_FULLSYNC. This fails on step 2 of 6 (six domains in the forest, one import step per domain) with the error stopped-database-connection-lost.
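For anyone unfamiliar with the filter: as far as I understand it, "BitOn Equals 2" tests the ACCOUNTDISABLE flag (0x0002) in userAccountControl, so disabled accounts are excluded. A minimal sketch of that bit test, where the function name and sample values are only illustrative and not taken from the sync service itself:

```python
# ACCOUNTDISABLE is the documented userAccountControl flag for a disabled account.
ACCOUNTDISABLE = 0x0002


def is_excluded(user_account_control: int) -> bool:
    """Hypothetical helper: True when the ACCOUNTDISABLE bit is set."""
    return (user_account_control & ACCOUNTDISABLE) != 0


# Illustrative userAccountControl values (not from our directory).
samples = {
    512: "normal account",
    514: "normal account, disabled",
    66048: "normal account, password never expires",
    66050: "normal account, password never expires, disabled",
}

for value, description in samples.items():
    action = "excluded" if is_excluded(value) else "imported"
    print(f"{value:>6} ({description}): {action}")
```

The filter only narrows what gets imported, so it should reduce the object count rather than add load.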
We have the same issue in our test environment when trying to do a DS_FULLSYNC on two specific domains. It appears to be related to the size of the domain, since the two that fail are the largest by far (about 80,000 and 40,000 objects, I think).
No errors are reported on the SQL server, and the only information in the event logs on the SP admin server is:
Event ID 6322 FIMSynchronizationService
The server encountered an error because the connection to SQL Server failed.
Event ID 6075
The management agent "MOSSAD-***** Profile Synch" failed on run profile "DS_FULLSYNC" because the connection to the server database was lost.
Additional Information
Discovery Errors : "0"
Synchronization Errors : "0"
Metaverse Retry Errors : "0"
Export Errors : "0"
Warnings : "0"
User Action
Verify that SQL Server is running.
Event ID 2004
The FIM Synchronization Service failed to update the timestamp. Verify that SQL Server is running.
Error Code: 0x80230621
Error Message: (A connection to SQL Server could not be established)
My first thought was a timeout on the connection. However, in test I have set the maximum query time on the SQL server to unlimited and the connection timeout for DS_FULLSYNC to 20 minutes; neither change made a difference.
In production, step 1 (importing the first domain) runs for around 25 minutes and completes successfully.
Step 2 (importing the next domain) runs for about 30 minutes and then fails with stopped-database-connection-lost. If I try to 'resume run profile', it runs for about 5 minutes on step 2 and I get the same error. (Note that looking at the "Number of user profiles" in SharePoint, I can see it increment by a few hundred or a few thousand each time I do a resume.)
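To narrow down whether the network path to SQL actually drops around the time of the failure, one option is to run a simple TCP probe from the SP admin server during the import. This is only a rough sketch under assumptions: the host name below is a placeholder, 1433 is the default SQL Server port and may not match our instance, and a successful TCP connect only shows the port is reachable, not that an existing SQL session is healthy.

```python
import socket
import time
from datetime import datetime


def probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host: str, port: int, attempts: int, interval: float = 10.0) -> list:
    """Probe repeatedly, print one line per attempt, return (timestamp, reachable) pairs."""
    results = []
    for i in range(attempts):
        ok = probe(host, port)
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {'reachable' if ok else 'UNREACHABLE'}")
        results.append((stamp, ok))
        if i < attempts - 1:
            time.sleep(interval)
    return results
```

For example, `watch("sqlserver.example.local", 1433, attempts=360)` (placeholder host name) would cover about an hour at the default 10-second interval. Any UNREACHABLE lines can then be compared against the timestamps of event IDs 6322 and 6075.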
Environment: SP 2010 Service Pack 1, CU for July 2011 (KB2536599), OS Server 2008 R2, Dell PowerEdge R710, 12GB RAM, 2x quad-core Intel E5530 (SP upgraded from existing MOSS 2007)
SQL 2005 Service Pack 4, OS Server 2008 R2, Dell PowerEdge R710, 32GB RAM, 2x quad-core Intel E5530
Oh, and we removed the exclusion filter in test and still have the same issues. If anyone has any ideas or knows of any other timeout/connection settings, it would be greatly appreciated.