Home Dashboard Directory Help
Search

Merge replication update procedure does multiple updates??? by jkersemakers


Status: 

Closed
 as Won't Fix Help for as Won't Fix


1
0
Sign in
to vote
Type: Bug
ID: 661905
Opened: 4/15/2011 5:46:44 AM
Access Restriction: Public
1
Workaround(s)
view
0
User(s) can reproduce this bug

Description

My company is using merge replication in combination with Mobile devices.

During merge replication there are situations that a record is updated several times.

While investigating msmerge_update_sp_xxxx stored procedures in combination with SQL Profiler and triggers, I think I discovered a strange behaviour in the way that records are updated during a merge synchronization process. Correct me if I am wrong.

When a synchronised record is marked for update:
Step I -> For every Filter column that is updated an update statement is executed.
Step II -> When an indexing column is updated, a separate update statement is executed.
Step III -> For non indexing and non filter columns finally an update statement is executed.

The multiple update executions can cause troubles. Only after all updates have been transmitted, the server holds a valid record that ever existed at the subscriber.

But during a transaction... triggers are able to react on changes made to records and (wrong) decisions can be taken based on the incomplete updated data. I agree that triggers can be marked NOT FOR REPLICATION but then they aren't fired at all during the replication process and the action will not be performed at all!

An example.
A simple record structure (datatypes int) of table has columns Fx, Fy, Ix, Iy, Nx, Ny

Columns:
- Fx and Fy are Filter Columns,
- Ix and Iy are Indexed Columns
- Nx and Ny are non indexed, non filter columns
Record R1(Fx=1, Fy=1, Ix=1, Iy=1, Nx=1, Ny=1)

Record R1 is updated at the subscriber by statement
update A
set Fx=2, Fy=2, Ix=2, Iy=2, Nx=2, Ny=2
where Guid = @Guid;

The user synchronizes.
Step Ia-> Merge replication sends update statement to update Filter column Fx
-->R1(Fx=2, Fy=1, Ix=1, Iy=1, Nx=1, Ny=1)
Step Ib-> Merge replication sends update statement to update Filter column Fy
-->R1(Fx=2, Fy=2, Ix=1, Iy=1, Nx=1, Ny=1)

Step II-> Merge sends update statement for Indexes columns Ix, Iy
-->R1(Fx=2, Fy=2, Ix=2, Iy=2, Nx=1, Ny=1)

Step III-> Merge replication executes the update statement for the other columns
-->R1(Fx=2, Fy=2, Ix=2, Iy=2, Nx=2, Ny=2)

Only after the update of step III the record holds a consistent state with values that definitely existed.

Updating records in steps breaks the rule that an atomic transfer cannot be subdivided and must be processed in its entirety or not at all. Sure, the update is performed as a whole inside a transaction, but the record update is subdivided in several update statements and that is incorrect!

Vital information is lost this way. Fx, Fy, Ix, Iy, Nx and Ny were updated at the same time, so inside a trigger the:
- inserted table should have values (Fx=2, Fy=2, Ix=2, Iy=2, Nx= 2, Ny=2)
- deleted table should have values (Fx=1, Fy= 1, Ix=1, Iy=1, Nx= 1, Ny=1)
Unfortunately this is not the case. Merge replication splits the updates.

After Step Ia the:
- inserted table has values (Fx=2, Fy=1, Ix=1, Iy=1, Nx=1, Ny=1)
- deleted table has values (Fx=1, Fy=1, Ix=1, Iy=1, Nx=1, Ny=1)

After Step Ib the:
- inserted table has values (Fx=2, Fy=2, Ix=1, Iy=1, Nx=1, Ny=1)
- deleted table has values (Fx=2, Fy=1, Ix=1, Iy=1, Nx=1, Ny=1)
The information of the new values for Ix and Nx aren't available yet. Information that Fx is updated is not available!

After Step II the:
- inserted table has values (Fx=2, Fy=2, Ix=2, Iy=2, Nx=1, Ny=1)
- deleted table has values (Fx=2, Fy=2, Ix=1, Iy=1, Nx=1, Ny=1)
In step II the information that Fx and Fy have been changed is not available and the updated data of Nx and Ny isn't available yet.

After Step III the:
- inserted table has values (Fx=2, Fy=2, Ix=2, Iy=2, Nx=2, Ny=2)
- deleted table has values (Fx=2, Fy= 2, Ix=2, Iy=2, Nx=1, Ny=1)
Step III looses the information that Fx, Fy, Ix and Iy have been changed.

There are more side effects. A trigger could be checking a logical key. Step Ia, Ib and II for example simply could cause a violation of such a check, because Record R1 holds an inconsistent state! Combination Ix=2, Nx=1 doesn't have to exist in the original table!

What if data needs to be updated if Ix has been changed. After step II, Nx isn't updated yet, so that value can't be used. After Step III, the trigger doesn't know column Ix is updated.

I updated the row in one step, why is the update performed in multiple steps?

Marking a trigger not for replication is not an option in some situations, some actions (like updating redundant data columns) need to be done and send back immediately to the client. Running a SQL job simply isn't always an option.

In another closed topic, the response was that replication was acting like it should. I think this conclusion is incorrect.

I think merge update procedure really need a redesign, where an update is performed as exactly one statement.

Greetings,

Jack
Details
Sign in to post a comment.
Posted by Microsoft on 1/26/2012 at 4:24 PM
Hi Jack,

We've looked into this. As mentioned earlier although we would like to improve the product this way we will not have the means to do it in a foreseeable future. As such I'm archiving the discussion.
That said your feedback is very valuable to us, we keep track of the suggestion.


Best regards
Jean-Yves Devant
Program Manager (Replication/CDC/CT)
SQL Server Engine Team
Posted by Microsoft on 8/4/2011 at 4:10 PM
Hi Jack,

Sorry, this item has been prematurely resolved, before I could share feedback/explanation. That should not have happened. For those who are reading this, I wanted to share there are couple other issues in the repl space that have been prematurely resolved in the last couple weeks. I'll get back to each of those issues and share feedback/explanations.

We've looked into this. Indeed the current behavior is a side effect of the current design. The design around this should be better. That said, due to the complexity of the current precomputed-filtering logic, changing this is a major redesign. Although we would like to improve the product we will not have the means to do it in a foreseeable future.
That said your feedback is very valuable to us, we keep track of the issue.

Thanks for sharing the workaround, I have had time to look into it but I will and will get back to this thread.

Best regards
Jean-Yves Devant
Program Manager (Replication/CDC/CT)

SQL Server Engine Team
Posted by jkersemakers on 7/13/2011 at 10:21 AM
I see this is marked as resolved as by design. I really cannot believe what I am reading, this must be some kind of joke.

Just talk with someone who is a real expert in databases, such a person will tell you the design is wrong.

But I already expected this answer. I already implemented a workaround and I am willing to share this.



Posted by Microsoft on 5/5/2011 at 7:57 AM
Hi Jack

unfortunately I have not been able to switch on this yet.
I'll keep you posted.

Best regards
Jean-Yves Devant
Program Manager (Replication/CDC/CT)
SQL Server Engine Team
Posted by jkersemakers on 5/1/2011 at 10:07 PM
Dear Jean-Yves,

Is there any progress at the moment about this issue?

Best regards
Jack Kersemakers
Mobile Developer at FieldAssist
Posted by Microsoft on 4/19/2011 at 10:41 AM
Hi jkersemakers ,
    
thanks for taking the time to share your feedback, this is really important to us.
We will investigate the issue and get back to you.

Best regards
Jean-Yves Devant
Program Manager (Replication/CDC/CT)
SQL Server Engine Team
Sign in to post a workaround.
Posted by jkersemakers on 7/13/2011 at 11:11 AM
Here is a workaround. When the multiple update of a table (Table_X) is causing troubles:
- Add a new table Table_Y to the database. Only a subset of columns will be needed.
- First of all the guid (Marked as RowGuidCol)
    - columns that need to be used during comparison and whose original value must be known.
- Add a new table Table_Z just being able to perform the action after the multiple update has been done. (make a Guid column marked as RowGuidCol and a dummy int column)
- Add a trigger to Table_X. Inside the trigger
    - Add records of deleted to Table_Y for those Guids not yet existing in Table_Y.
    - Check if Replication is active (sessionproperty('replication_agent') = 1) and if it's not then insert a dummy record to Table_Z
- Mark the trigger on Table_X with sp_setTriggerOrder as a first firing trigger.
- Add an insert trigger to Table_Z. Inside the trigger
- Do whatever you normally would do inside the trigger of Table_X. The logic inside the trigger of Table_X should be moved to this trigger. Just use the data from Table_Y (original data) and Table_X (updated data) to make the decisions.
- Add another trigger to Table_Z.
- Trigger should delete all record inside Table_Y and Table_Z
- Mark this trigger as a last firing trigger.
- Set the @processing_order of Table_X to an integer value >= 0.
- Add Table_Z to the replication process and give the @Processing_order a higher value than the value of Table_X.
- Set the Replication option of the Article Table_Z to Bidirectional
- At the subscriber, just before synchronization add a dummy record to Table_Z.
- Synchronize.

Here the description of what is happening.
- When data of Table_X is updated, the new trigger on Table_X that is marked as a first trigger, sends a copy of the data to Table_Y.
- If it's Merge replication (Replication_agent can be the trigger on Table_X will not add a record to Table_Y. (Remember that we added a dummy record to Table_Z at the subscriber). Merge replication will perform it's multiple updates. Later in the process merge replication will synchronize the dummy record of Table_Z. When it is inserted all updates to Table_X will already be done. The triggers on Table_Z will do their job.

- If however it's not merge replication who did the update, then the trigger on Table_X adds a dummy record to Table_Z, causing the triggers on Table_Z to be fired immediately after the update.

Since Merge replication starts a transaction, it is a safe workaround and works just fine.

--> triggers on Table_Y will be fired.
File Name Submitted By Submitted On File Size  
Readme.zip 4/19/2011 64 KB
Readme.zip 4/19/2011 64 KB