Over the last three weeks we've seen huge increases in the prevalence of SE_REPL_SLOW_SECONDARY_THROTTLE wait type on our databases. Our workload has not changed, but at many times during the day we are "locked out" of our databases. This wait type relates to replication to a secondary database Azure maintains for availability of the three replicas of every SQL Azure database.
It is not uncommon to have a situation where we need to shut down our application that accesses the database for over and hour before we can insert even a single row into the database. Just to clarify, we need to leave our database with no transactions running at all for over an hour before we can update, delete or insert data. If we try to perform a write operation in that time, it simply times out and the "wait_time" exactly equals the "total_elapsed_time" while it is running.
Our application certainly does some ETL type operations but our workload hasn't changed recently and we are regularly crippled by this wait type, but this only started happening in the last three weeks. We've even sharded our database which has reduced the problem but it is still a huge issue. We've tried using more temp tables as SQL Azure doesn't replicate these to replicas, but this isn't compatible with SqlBulkCopy as per this discussion: http://social.msdn.microsoft.com/Forums/en-US/ssdsgetstarted/thread/51b26995-509f-419e-8545-3a3c16a6fc14
Has Microsoft recently changed the threshold for throttling? Are others experiencing the same problem? Any ideas for reducing this wait type?
Cheers
David
Here's the code I'm running to get info on the wait type:
SELECT TOP 10 r.session_id, r.plan_handle,
r.sql_handle, r.request_id,
r.start_time, r.status,
r.command, r.database_id,
r.user_id, r.wait_type,
r.wait_time, r.last_wait_type,
r.wait_resource, r.total_elapsed_time,
r.cpu_time, r.transaction_isolation_level,
r.row_count, st.text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) as st
GROUP BY r.session_id, r.plan_handle,
r.sql_handle, r.request_id,
r.start_time, r.status,
r.command, r.database_id,
r.user_id, r.wait_type,
r.wait_time, r.last_wait_type,
r.wait_resource, r.total_elapsed_time,
r.cpu_time, r.transaction_isolation_level,
r.row_count, st.text
ORDER BY r.total_elapsed_time desc