SharePoint and SQL Timeout Issues

Introduction

Bit of a sweeping post this time, as intermittent SQL Server connectivity errors can affect all SharePoint products (hence the copious use of Tags) but the audience of this post is generally geared at our older SharePoint product users. While they can occur for a number of reasons, there are several options and tools we can leverage to defend ourselves against, isolate or mitigate such SQL timeout issues within SharePoint products.

Connection Error

The error can be a bit mis-leading and open ended, so best to take the information in this post as exploritory advise more than anything. The error can indicate a serious problem, potential or existing, so don’t ignore them entirely:

Error establishing database connection. SQL Error: -2.: System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

Monitoring

This might sound fairly obvious at first, but answer this honestly; could you provide a history of SharePoint Health on demand? In order to establish a useful pattern and identity potential problems, we must be able to correlate data records such as ULS Logs, Event Logs, etc.
For example, Event ID 3355, 3760 and 5586 are generally linked to database connectivity issues with “Microsoft SharePoint Products and Technologies”. There are some further reading articles at the bottom of this post. Essentially though, you should have some mechanism (manual or automated) that can match these occurances against a ULS entry to trigger an alert system of a potential or current connectivity problem.

Troubleshooting (Network)

TCP Offloading
Intermittent connectivity issues can occur when TCP Offloading is enabled. While disabling it is a fairly simple Registry change on the SharePoint web front-end and application tier servers, you should be 100% certain this is the root cause before making the change in a Production environment. Check out this Microsoft Support article before proceeding.
Scalable Networking Pack (SNP)
This is definatley one for the older environments still among us, specifically Windows Server 2003 SP2, that have TCP Offload-enabled network interfaces, which occasionally cause generic intermittent network connectivity issues (which naturally extends to SQL connectivity woes). Consider disabling the SNP Features or check this Microsoft Support article for more options and instructions. There are some links at the bottom of this post for updates from the Microsoft Download Centre that switch off SNP Features.

Troubleshooting (Database)

Timeout Values
Possibly the most commonly employed defence against regular SQL connectivity timeouts is to increase the default timeout value from 15 seconds. Depending on latency between your SQL database and application server tiers, be very careful what you consider an “acceptable value” here. My advise would be; ensure you can first predict a timeout issue, then incrementally increasing the value until the timeout occurrences cease – then monitor, monitor and monitor some more.
This TechNet article sets out the instructions for changing this value. Although it’s for SharePoint Server 2007, the process is extremely similar for SharePoint 2010 and 2013.
High-Stress
Yes, I know SQL connectivity issues can cause high-stress but I’m talking about your environment, not your blood pressure. If your environment falls into the “high-stress scenario” then have a look at this (retired) Microsoft Support article.
Connection Aliases
I will assume that you are using SQL Alias’ for your SQL Server connections. Something that is rarely utilised when a DBA isn’t present though, is the Dynamically Determine Port settings within the SQL Server Client Network Utility configuration. When this option is skipped, the DBNETLIB will attempt to contact the SQL Server through known UDP ports to establish a connection. In certain circumstances, this can lead to losses of connectivity. Obviously, this requires some configuration changes to both the SharePoint and SQL products, so approach correctly and plan carefully before making any changes.

Other Considerations

I actually have a lot more to say on this subject and may add additional content in the future. Other concepts that should be explored include, but not limited to; TCP Chimney, TCP Filtering, Firewall and IPSEC Policies.
SQL connectivity issues can be something, and nothing. Monitoring can make the difference in your reaction to an incident during an intermittent network glitch or full-on connectivity loss. Please ensure your environment is in a supported and healthy state at all times, maintenance and housekeeping can help reduce the risk of connectivity issues simply caused by “bloated” or un-necessary data transfer.

Further Reading

About the author