Search

sp_testlinkedserver Fails Catastrophically when Linked Server Is Offline by JediSQL

Closed
as Fixed Help for as Fixed

5
0
Sign in
to vote
Type: Bug
ID: 522821
Opened: 12/31/2009 3:07:06 PM
Access Restriction: Public
0
Workaround(s)
4
User(s) can reproduce this bug
I was working on designing a distributed system using SQL Server 2008 Linked Servers. I wanted to engineer in fault tolerance for Linked Server being offline. I wanted to find the least expensive call to determine if a Linked Server is online and available. I searched BOL and I found sp_testlinkedserver, and it looked just like what I need. So I tried it out with a linked server with this script:

» declare @w int;
» exec @w = sys.sp_testlinkedserver N'DATASVR'
» print case @w when 0 then 'Online' else 'Offline' end;

When I ran it I got this result:

» Online

I thought "Wow, that's cool. I could really have used that in SQL 2000."

Then, being a thorough experimenter, I wanted to see it in action when the Linked Server is offline. So I stopped the SQL Server 2000 service on the Linked Server. This is the result I got:

» OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "Login timeout expired".
» OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.".
» Msg 2, Level 16, State 1, Line 0
» Named Pipes Provider: Could not open a connection to SQL Server [2].

Notice that "Offline" was never printed. The call to sp_testlinkedserver fails so catastrophically that the next statement is never executed! What is the point of having something to test a connection if subsequent code cannot recover from the test?

We evaluated the concept of putting the call to sp_testlinkedserver in a try-catch block, but we could just put our code that uses the Linked Server in the try-catch block as well. It seems the code inside of sp_testlinkedserver should have a try-catch block so that the [OLE DB provider "SQLNCLI10" for linked server "XXXXX"] errors don't crash sp_testlinkedserver.
Details (expand)
Product Language
English

Version

SQL Server 2008 - Developer Edition

Category

SQL Engine

Operating System

Windows Server 2008
Operating System Language
US English
Steps to Reproduce
Run this batch in an SSMS 2008 Query Window, replacing "DATASVR" with the name of a Linked Server that is defined in sys.servers, but where the Linked Server is not running:

declare @w int;
exec @w = sys.sp_testlinkedserver N'DATASVR'
print case @w when 0 then 'Online' else 'Offline' end;
Actual Results
OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "Login timeout expired".
OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.".
Msg 2, Level 16, State 1, Line 0
Named Pipes Provider: Could not open a connection to SQL Server [2].
Expected Results
The word "Offline" displayed in the Messages tab of the query results pane.

Platform

X64
File Attachments
0 attachments
Sign in to post a comment.
Posted by Mark Freeman on 11/18/2011 at 9:34 AM
If this has allegedly been fixed, why does http://msdn.microsoft.com/en-us/library/ms189809(v=SQL.90).aspx still show the incorrect information? It says it was last Updated: 14 April 2006.

Return Code Value
--------------------------------------------------------------------------------

0 (success) or 1 (failure)

The 2008 and 2008R2 versions of the page have also not been updated. Only the "Denali" version was updated with the correct information. The versions for the older versions also need to be updated before you can really consider this "fixed".
Posted by Microsoft on 8/8/2011 at 10:48 AM
Removed the incorrect Return Code section. Changed the introduction to say: Tests the connection to a linked server. If the test is unsuccessful the procedure raises an exception with the reason of the failure.

Posted by JediSQL on 6/28/2011 at 10:59 AM
Perhaps we could get an sp_testlinkedserver2 (or something like that) that has the 0/1 return so that both behaviors are available to meet all needs.
Posted by John Paul Cook on 10/11/2010 at 10:37 AM
A 0/1 outcome is needed for people who need to programmatically ascertain the status of a linked server.
Posted by Microsoft on 6/8/2010 at 12:40 PM
Thanks for the feedback. Unfortunately, the documentation and implementation for sp_testlinkedserver are inconsistent, and have been so since the procedure was added in SQL Server 2005. BOL says that the procedure returns 1 on failure. The implementation throws an exception, propagating the error we get from OLE DB.

Both behaviors are useful in different circumstances. If you're writing a script (your example), you'd probably prefer 0/1. If you're using this interactively in SSMS, the exception is more useful because it (may) give details about the problem.
We've discussed this among the team members and concluded that the best thing to do at this point is to fix BOL to match reality.

We will do so in the next release.

Thanks again for your feedback.

Regards,

Joachim Hammer
Program Manager
SQL Server
Posted by SueJen on 1/25/2010 at 12:03 PM
What's worse is that a failure invalidates the transaction. While I could test the linked server outside of the transaction, in some cases it may be 30 seconds between the beginning of the transaction and access to the linked server. Plenty of time to lose the connection. So I'm not finding much use for this procedure either.

declare @ret int
set xact_abort off
begin transaction
begin try
    exec @ret = sp_testlinkedserver N'XX01'
end try
begin catch
    select @ret = -1
end catch
select @@TRANCOUNT TRANCOUNT, XACT_STATE() XAC_STATE, @ret ret
commit transaction

results in:

OLE DB provider "OraOLEDB.Oracle" for linked server "ED01" returned message "ORA-12545: Connect failed because target host or object does not exist".
TRANCOUNT XAC_STATE ret
----------- --------- -----------
1         -1        -1

Msg 3930, Level 16, State 1, Line 12
The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
Msg 3998, Level 16, State 1, Line 1
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Sign in to post a workaround.