Log Shipping Stops Working on Clustered Instance if Failed Over to Another Node by Allan Hirt

Active

Type: Bug
ID: 559435
Opened: 5/16/2010 4:39:31 AM
Access Restriction: Public
Workarounds: 4
Users who can reproduce this bug: 1
If you configure log shipping on a clustered instance whose program files are installed on a drive other than the system drive, log shipping stops working once the instance fails over to another node. The reason is that the Add Node operation does not configure the SQL Server instance properly on the other node(s). On the original node, everything is installed in the right place (for example, Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\SqlLogShip.exe). On the other node, setup puts SqlLogShip.exe on the system drive instead. The SQL Server Agent job LSBackup_DBName references the original location in its job step:
"Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe" -Backup 6B81BF42-4AA8-4DE3-8349-5E54EE0C52ED -server KILROY

So the job clearly cannot run after a failover to another node, because SqlLogShip.exe is not where the job step expects it to be.
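
The failure can be illustrated with a small Python sketch. It pulls the executable path out of the job step quoted above and checks it against a per-node file inventory. The inventory, the node names, and the C: drive for the second node are assumptions made for illustration; the report only says that the file ends up on the system drive there.

```python
import ntpath

# Job step command as quoted in the report.
JOB_STEP = (
    r'"Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe" '
    r"-Backup 6B81BF42-4AA8-4DE3-8349-5E54EE0C52ED -server KILROY"
)

# Hypothetical inventory: node names and the C: drive are assumptions.
NODE_FILES = {
    "node1": {r"Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\SqlLogShip.exe"},
    "node2": {r"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\SqlLogShip.exe"},
}


def executable_path(command):
    """Return the executable portion of a job step command."""
    command = command.strip()
    if command.startswith('"'):
        # Quoted path: everything up to the closing quote.
        return command[1:command.index('"', 1)]
    return command.split(" ", 1)[0]


def step_can_run(command, files_on_node):
    """True if the executable named in the command exists on the node.

    Windows paths are case-insensitive, so compare with ntpath.normcase.
    """
    wanted = ntpath.normcase(executable_path(command))
    return any(ntpath.normcase(f) == wanted for f in files_on_node)
```

With this inventory, step_can_run(JOB_STEP, NODE_FILES["node1"]) holds, while the same call for "node2" does not, which mirrors the behavior described in the report.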

See the attached pictures for verification; my blog post has the pictures inline along with instructions.
http://www.sqlha.com/blog/post/2010/05/15/Bug-Combining-Failover-Clustering-Log-Shipping-When-Programs-Installed-On-Another-Drive.aspx
Details
Product Language: English
Version: SQL Server 2008 SP1
Category: Setup
Operating System: Windows Server 2008 R2
Operating System Language: US English
Steps to Reproduce
1. Install a clustered instance of SQL Server 2008 slipstreamed with SP1 on Windows Server 2008 R2, with all of the program files on a drive other than the system drive (such as D:\).
2. Configure log shipping to another SQL Server instance.
3. Verify that log shipping works while the instance is running on the node where it was first installed.
4. Fail the instance over to another node, one that was added with an Add Node operation.
5. Verify that the transaction log backup job fails.
6. Fail the instance back to the first node.
7. Verify that the transaction log backup job works again.
Actual Results
Behaves as described above: log shipping does not work on any node other than the original install node.
Expected Results
Log shipping should work no matter which node it is running on, and the files should be in the right place on all nodes - not just the first.

Platform: X64
File Attachments
File Name            Submitted On    File Size
InstallPath1.jpg     5/16/2010       20 KB
InstallPath2.jpg     5/16/2010       12 KB
InstallPath3.jpg     5/16/2010       67 KB
Node1.jpg            5/16/2010       282 KB
Node2B.jpg           5/16/2010       159 KB
Node2A.jpg           5/16/2010       383 KB
AfterFailover.jpg    5/16/2010       115 KB
FailBack.jpg         5/16/2010       112 KB
Comments
Posted by Microsoft on 5/20/2013 at 4:03 PM
Hi Allan,

Given my previous message, I'm archiving this work item for now.

Best regards
Jean-Yves Devant 

Program Manager Servicing and Lifecycle Experience of High Availability Technologies in SQL Server
Posted by Microsoft on 5/20/2013 at 4:03 PM
Hi Allan,

Thanks for taking the time to share your feedback; it is really important to us.
Unfortunately, this does not meet the bar right now, and we do not plan to address it for now. We will revisit the decision if more customers vote for the issue.

Best regards
Jean-Yves Devant 
Program Manager Servicing and Lifecycle Experience of High Availability Technologies in SQL Server
Posted by Microsoft on 5/25/2010 at 12:06 PM
Hi Allan,

As we discussed offline, this will be considered as a potential DCR (design change request) for a future release.

Here is the detail we have for now:

The issue is that the SQL Server Agent jobs for log shipping have the full path specified in the job step. When the SQL Server installation path differs between cluster nodes, the path to sqllogship.exe is valid on only one node.

Workaround:

Ensure that the directory containing sqllogship.exe is in the OS PATH environment variable on all nodes of the cluster, e.g. c:\program files\microsoft sql server\100\tools\binn

Manually edit the Log Shipping jobs by removing the hardcoded path to sqllogship.exe from the job steps.

The Log Shipping jobs should work on all nodes after this workaround.
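
As a rough illustration of the job step edit, the following Python sketch rewrites a command string so that it refers to sqllogship.exe by name only and relies on PATH to locate it. The function names are hypothetical, and applying the rewritten command to the actual SQL Server Agent job is not shown.

```python
import ntpath


def split_command(command):
    """Split a job step command into (executable, arguments)."""
    command = command.strip()
    if command.startswith('"'):
        # Quoted executable path: take everything up to the closing quote.
        end = command.index('"', 1)
        return command[1:end], command[end + 1:].strip()
    exe, _, args = command.partition(" ")
    return exe, args.strip()


def strip_hardcoded_path(command):
    """Return the command with the executable reduced to its file name.

    ntpath is used so that backslash-separated Windows paths are parsed
    the same way regardless of where this sketch runs.
    """
    exe, args = split_command(command)
    return (ntpath.basename(exe) + " " + args).strip()
```

Applied to the job step from the report, this yields a command that starts with sqllogship.exe and keeps the -Backup and -server arguments unchanged.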

Changing this to a DCR. The shipped log shipping code is by design, but we may consider code changes in the next major release of SQL Server.

Thanks for reporting this issue.

Max Verun
SQL Server
Workarounds
Posted by Allan Hirt on 6/24/2010 at 10:24 PM
sqltom - Not totally a workaround, because it means one of two things:
1. Manually copying files to a location where they were not intended to be ... which may not be possible in some customer environments. The location is only incorrect on nodes added with the Add Node operation.
2. If you do not copy the files as in 1, you would need to change the SQL Server Agent job every time you fail over. That wouldn't be acceptable to most. A workaround? Sure. Practical? Probably not.
Posted by Tom Michaels [MSFT] on 5/19/2010 at 1:13 PM
This issue would be classified as a DCR to Log Shipping.

Workaround:
1) Ensure that the directory containing sqllogship.exe is in the OS PATH environment variable on all nodes of the cluster,
e.g. "c:\program files\microsoft sql server\100\tools\binn"

2) Manually edit the Log Shipping jobs by removing the hardcoded path to sqllogship.exe in the job steps.

3) The Log Shipping jobs should work on all nodes after this.

4) Test the modified configuration.
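
One part of testing the modified configuration is confirming that the tools directory is actually on PATH on each node. A minimal Python sketch of that check follows; the function name is hypothetical, and the PATH value would have to be collected from each node separately (for example from os.environ["PATH"] when run there).

```python
import ntpath


def _normalize(directory):
    """Normalize a Windows directory string for comparison."""
    # Drop surrounding whitespace and quotes, then any space left inside them.
    cleaned = directory.strip().strip('"').strip()
    # normpath removes trailing backslashes; normcase lowercases the path.
    return ntpath.normcase(ntpath.normpath(cleaned))


def dir_on_path(path_value, directory):
    """True if directory is one of the entries in a Windows PATH string."""
    wanted = _normalize(directory)
    for entry in path_value.split(";"):
        if entry.strip() and _normalize(entry) == wanted:
            return True
    return False
```

The comparison ignores case and trailing backslashes, since Windows treats such paths as equivalent.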
Posted by altrstar on 5/18/2010 at 5:20 AM
I hit this nasty thing a couple of weeks back too.

If you can't uninstall, you can also try copying the files and folder structure (should be 3 folders, most importantly Tools) from e.g. "Z:\Program Files\Microsoft SQL Server\100\" to "C:\Program Files\Microsoft SQL Server\100\". Not the best way to do things, but it'll work until the next SP. Just make sure you remember to revert the change before applying that SP.

Cheers,

Kevin
Posted by Allan Hirt on 5/16/2010 at 4:40 AM
The only workaround is to install everything on the system drive, but that may not be desirable or standard practice in some companies.