Home Dashboard Directory Help
Search

sp_testlinkedserver Fails Catastrophically when Linked Server Is Offline by JediSQL


Status: 

Closed
 as Fixed Help for as Fixed


6
0
Sign in
to vote
Type: Bug
ID: 522821
Opened: 12/31/2009 3:07:06 PM
Access Restriction: Public
0
Workaround(s)
view
5
User(s) can reproduce this bug

Description

I was working on designing a distributed system using SQL Server 2008 Linked Servers. I wanted to engineer in fault tolerance for Linked Server being offline. I wanted to find the least expensive call to determine if a Linked Server is online and available. I searched BOL and I found sp_testlinkedserver, and it looked just like what I need. So I tried it out with a linked server with this script:

» declare @w int;
» exec @w = sys.sp_testlinkedserver N'DATASVR'
» print case @w when 0 then 'Online' else 'Offline' end;

When I ran it I got this result:

» Online

I thought "Wow, that's cool. I could really have used that in SQL 2000."

Then, being a thorough experimenter, I wanted to see it in action when the Linked Server is offline. So I stopped the SQL Server 2000 service on the Linked Server. This is the result I got:

» OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "Login timeout expired".
» OLE DB provider "SQLNCLI10" for linked server "DATASVR" returned message "A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.".
» Msg 2, Level 16, State 1, Line 0
» Named Pipes Provider: Could not open a connection to SQL Server [2].

Notice that "Offline" was never printed. The call to sp_testlinkedserver fails so catastrophically that the next statement is never executed! What is the point of having something to test a connection if subsequent code cannot recover from the test?

We evaluated the concept of putting the call to sp_testlinkedserver in a try-catch block, but we could just put our code that uses the Linked Server in the try-catch block as well. It seems the code inside of sp_testlinkedserver should have a try-catch block so that the [OLE DB provider "SQLNCLI10" for linked server "XXXXX"] errors don't crash sp_testlinkedserver.
Details
Sign in to post a comment.
Posted by A.Butsch on 5/22/2014 at 1:30 AM
Hello,

we use SQL-Server 2008R2 and we have the same problem.
Can we have an Update for SQL-Server 2008R2.
Posted by Mark Freeman on 11/18/2011 at 9:34 AM
If this has allegedly been fixed, why does http://msdn.microsoft.com/en-us/library/ms189809(v=SQL.90).aspx still show the incorrect information? It says it was last Updated: 14 April 2006.

Return Code Value
--------------------------------------------------------------------------------

0 (success) or 1 (failure)

The 2008 and 2008R2 versions of the page have also not been updated. Only the "Denali" version was updated with the correct information. The versions for the older versions also need to be updated before you can really consider this "fixed".
Posted by Microsoft on 8/8/2011 at 10:48 AM
Removed the incorrect Return Code section. Changed the introduction to say: Tests the connection to a linked server. If the test is unsuccessful the procedure raises an exception with the reason of the failure.

Posted by JediSQL on 6/28/2011 at 10:59 AM
Perhaps we could get an sp_testlinkedserver2 (or something like that) that has the 0/1 return so that both behaviors are available to meet all needs.
Posted by John Paul Cook on 10/11/2010 at 10:37 AM
A 0/1 outcome is needed for people who need to programmatically ascertain the status of a linked server.
Posted by Microsoft on 6/8/2010 at 12:40 PM
Thanks for the feedback. Unfortunately, the documentation and implementation for sp_testlinkedserver are inconsistent, and have been so since the procedure was added in SQL Server 2005. BOL says that the procedure returns 1 on failure. The implementation throws an exception, propagating the error we get from OLE DB.

Both behaviors are useful in different circumstances. If you're writing a script (your example), you'd probably prefer 0/1. If you're using this interactively in SSMS, the exception is more useful because it (may) give details about the problem.
We've discussed this among the team members and concluded that the best thing to do at this point is to fix BOL to match reality.

We will do so in the next release.

Thanks again for your feedback.

Regards,

Joachim Hammer
Program Manager
SQL Server
Posted by SueJen on 1/25/2010 at 12:03 PM
What's worse is that a failure invalidates the transaction. While I could test the linked server outside of the transaction, in some cases it may be 30 seconds between the beginning of the transaction and access to the linked server. Plenty of time to lose the connection. So I'm not finding much use for this procedure either.

declare @ret int
set xact_abort off
begin transaction
begin try
    exec @ret = sp_testlinkedserver N'XX01'
end try
begin catch
    select @ret = -1
end catch
select @@TRANCOUNT TRANCOUNT, XACT_STATE() XAC_STATE, @ret ret
commit transaction

results in:

OLE DB provider "OraOLEDB.Oracle" for linked server "ED01" returned message "ORA-12545: Connect failed because target host or object does not exist".
TRANCOUNT XAC_STATE ret
----------- --------- -----------
1         -1        -1

Msg 3930, Level 16, State 1, Line 12
The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
Msg 3998, Level 16, State 1, Line 1
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Sign in to post a workaround.