Sunday, March 25, 2012

Does someone know when MS plans to resolve this Bug?

Hi,

I have a problem with merge replication again. This is the replication between sql server 2005 standard edition and express edition. non-filtering. after the merge agent ran normally for a couple of days, the merge replication always stops at the uploading point. It seems like the merge agent can't run the 'MSmakegeneration' sp. Here is the output log. I wish someone could give me hand. Thank you very much!

2007-05-27 21:37:19.656 Microsoft SQL Server Merge Agent 9.00.3042.00
2007-05-27 21:37:19.656 Copyright (c) 2005 Microsoft Corporation
2007-05-27 21:37:19.656
2007-05-27 21:37:19.656 The timestamps prepended to the output lines are expressed in terms of UTC time.
2007-05-27 21:37:19.656 User-specified agent parameter values:
-Publication Brown
-Publisher FMSL_SERVER
-Subscriber Brown-WF
-Distributor FMSL_SERVER
-PublisherDB Brown
-SubscriberDB Brown
-PublisherLogin sa
-PublisherPassword **********
-PublisherSecurityMode 0
-OutputVerboseLevel 0
-SubscriberSecurityMode 1
-SubscriptionType 1
-DistributorLogin sa
-DistributorPassword **********
-DistributorSecurityMode 0
-Output C:\OUTPUTFILE.txt
-Outputverboselevel 2
2007-05-27 21:37:19.703 Percent Complete: 0
2007-05-27 21:37:19.703 Connecting to Subscriber 'Brown-WF'
2007-05-27 21:37:19.703 Connecting to OLE DB Subscriber at datasource: 'Brown-WF', location: '', catalog: 'Brown', providerstring: '' using provider 'SQLNCLI'
2007-05-27 21:37:19.812 OLE DB Subscriber: Brown-WF
DBMS: Microsoft SQL Server
Version: 09.00.3042
catalog name: Brown
user name: dbo
API conformance: 0
SQL conformance: 0
transaction capable: 1
read only: F
identifier quote char: "
non_nullable_columns: 0
owner usage: 15
max table name len: 128
max column name len: 128
need long data len:
max columns in table: 1000
max columns in index: 16
max char literal len: 131072
max statement len: 131072
max row size: 131072
2007-05-27 21:37:19.812 OLE DB Subscriber 'Brown-WF': {call sp_MSgetversion }
2007-05-27 21:37:19.828 OLE DB Subscriber 'Brown-WF': set nocount on declare @.dbname sysname select @.dbname = db_name() declare @.collation nvarchar(255) select @.collation = convert(nvarchar(255), databasepropertyex(@.dbname, N'COLLATION')) select collationproperty(@.collation, N'CODEPAGE') as 'CodePage', collationproperty(@.collation, N'LCID') as 'LCID', collationproperty(@.collation, N'COMPARISONSTYLE') as 'ComparisonStyle',cast(case when convert (int,databasepropertyex (@.dbname,'comparisonstyle')) & 0x1 = 0x1 then 0 else 1 end as bit) as DB_CaseSensitive,cast(case when convert (int,serverproperty ('comparisonstyle')) & 0x1 = 0x1 then 0 else 1 end as bit) as Server_CaseSensitive set nocount off
2007-05-27 21:37:19.828 OLE DB Subscriber 'Brown-WF': {?=call sp_helpsubscription_properties (N'FMSL_SERVER', N'Brown', N'Brown')}
2007-05-27 21:37:19.843 Distributor security mode: 0, login name: sa, password: ********.
2007-05-27 21:37:19.843 Percent Complete: 0
2007-05-27 21:37:19.843 Connecting to Distributor 'FMSL_SERVER'
2007-05-27 21:37:19.843 Connecting to OLE DB Distributor at datasource: 'FMSL_SERVER', location: '', catalog: '', providerstring: '' using provider 'SQLNCLI'
2007-05-27 21:37:20.390 OLE DB Distributor: FMSL_SERVER
DBMS: Microsoft SQL Server
Version: 09.00.3042
catalog name:
user name: dbo
API conformance: 0
SQL conformance: 0
transaction capable: 1
read only: F
identifier quote char: "
non_nullable_columns: 0
owner usage: 15
max table name len: 128
max column name len: 128
need long data len:
max columns in table: 1000
max columns in index: 16
max char literal len: 131072
max statement len: 131072
max row size: 131072
2007-05-27 21:37:20.718 OLE DB Distributor 'FMSL_SERVER': {call sp_MSgetversion }
2007-05-27 21:37:20.796 OLE DB Distributor 'FMSL_SERVER': {call sp_helpdistpublisher (N'FMSL_SERVER') }
2007-05-27 21:37:20.984 OLE DB Distributor 'FMSL_SERVER': select datasource, srvid from master..sysservers where upper(srvname) = upper(N'FMSL_SERVER')
2007-05-27 21:37:21.078 OLE DB Distributor 'FMSL_SERVER': {call sp_MShelp_merge_agentid (0,N'Brown',N'Brown',null,N'Brown',90,N'Brown-WF')}
2007-05-27 21:37:21.171 OLE DB Subscriber 'Brown-WF': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:21.187 OLE DB Distributor 'FMSL_SERVER': {call sp_MShelp_profile (91, 4, N'')}
2007-05-27 21:37:21.312 Percent Complete: 0
2007-05-27 21:37:21.312 OLE DB Subscriber 'Brown-WF': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:21.312 Initializing
2007-05-27 21:37:21.312 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:21.312 Connecting to OLE DB Publisher at datasource: 'FMSL_SERVER', location: '', catalog: 'Brown', providerstring: '' using provider 'SQLNCLI'
2007-05-27 21:37:21.421 Percent Complete: 0
2007-05-27 21:37:21.421 Connecting to Publisher 'FMSL_SERVER'
2007-05-27 21:37:21.421 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:21.765 OLE DB Publisher: FMSL_SERVER
DBMS: Microsoft SQL Server
Version: 09.00.3042
catalog name: Brown
user name: dbo
API conformance: 0
SQL conformance: 0
transaction capable: 1
read only: F
identifier quote char: "
non_nullable_columns: 0
owner usage: 15
max table name len: 128
max column name len: 128
need long data len:
max columns in table: 1000
max columns in index: 16
max char literal len: 131072
max statement len: 131072
max row size: 131072
2007-05-27 21:37:22.109 OLE DB Publisher 'FMSL_SERVER': set nocount on declare @.dbname sysname select @.dbname = db_name() declare @.collation nvarchar(255) select @.collation = convert(nvarchar(255), databasepropertyex(@.dbname, N'COLLATION')) select collationproperty(@.collation, N'CODEPAGE') as 'CodePage', collationproperty(@.collation, N'LCID') as 'LCID', collationproperty(@.collation, N'COMPARISONSTYLE') as 'ComparisonStyle',cast(case when convert (int,databasepropertyex (@.dbname,'comparisonstyle')) & 0x1 = 0x1 then 0 else 1 end as bit) as DB_CaseSensitive,cast(case when convert (int,serverproperty ('comparisonstyle')) & 0x1 = 0x1 then 0 else 1 end as bit) as Server_CaseSensitive set nocount off
2007-05-27 21:37:22.234 OLE DB Publisher 'FMSL_SERVER': {call sp_MSgetversion }
2007-05-27 21:37:22.406 Connecting to OLE DB Publisher at datasource: 'FMSL_SERVER', location: '', catalog: 'Brown', providerstring: '' using provider 'SQLNCLI'
2007-05-27 21:37:22.828 OLE DB Publisher: FMSL_SERVER
DBMS: Microsoft SQL Server
Version: 09.00.3042
catalog name: Brown
user name: dbo
API conformance: 0
SQL conformance: 0
transaction capable: 1
read only: F
identifier quote char: "
non_nullable_columns: 0
owner usage: 15
max table name len: 128
max column name len: 128
need long data len:
max columns in table: 1000
max columns in index: 16
max char literal len: 131072
max statement len: 131072
max row size: 131072
2007-05-27 21:37:23.515 OLE DB Subscriber 'Brown-WF': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:23.515 Percent Complete: 0
2007-05-27 21:37:23.515 OLE DB Subscriber 'Brown-WF': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:23.515 Retrieving publication information
2007-05-27 21:37:23.515 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:23.625 Percent Complete: 0
2007-05-27 21:37:23.625 Retrieving subscription information.
2007-05-27 21:37:23.625 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:28.203 OLE DB Subscriber 'Brown-WF': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:37:28.203 Percent Complete: 0
2007-05-27 21:37:28.203 Uploading data changes to the Publisher
2007-05-27 21:37:28.203 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2007-05-27 21:42:28.312 Percent Complete: 0
2007-05-27 21:42:28.312 The process is waiting for a response from the query '{call sys.sp_MSmakegeneration (?)}'
2007-05-27 21:42:28.312 OLE DB Distributor 'FMSL_SERVER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}

In addition, from the replication monitor, I can see the last error message is:

The merge process was unable to create a new generation at the 'Subscriber'. Troubleshoot by restarting the synchronization with verbose history logging and specify an output file to which to write. (Source: MSSQL_REPL, Error number: MSSQL_REPL-2147200994)

And this problem happened on a couple of subscribers on the same day. All the rest are OK so far.
|||now I'm pretty sure that sp_makegeneration doesn't work on the subscriber side properly. I tried to run it manually, it was just keeping running all the time, without any message coming out.

anyone got any idea about this?
|||if the makegeneration proc isn't returning, can you try profiling it to see what's blocking it? Could be a long-running query.|||hi greg,

Glad to hear from you. your reply is always so quickSmile

Actually I did a trace, which shows at that moment, only sp_msmakegeneration was running, nothing else. And the sp was just running, never finish. I have been waitting for about 15 minutes, and tried several times. always got the same situation.

And all the other subscribers, which run properly, just finished the replication within 5 minutes.
|||WII, what I was asking is to do a statement-level trace (add SPTongue TiedtmtStarting event) to see what query inside the proc (or some child proc) is long-running.|||Hi Greg,

I did the statement-level trace as you suggested. But the trace information is too large and unreadable to me.

What I can find is that such kind of queries were keeping running infinitely:

if @.error<>0
update dbo.MSmerge_contents with (rowlock)
set partchangegen = (-@.target_gen )
where partchangegen = (-@.gen) and tablenick = @.art_nick
if @.error<>0
update dbo.MSmerge_metadataaction_request with (rowlock)
set generation = @.target_gen
where generation = @.gen and tablenick = @.art_nick
if @.error<>0
select @.changes_in_target_gen = @.changes_in_target_gen + @.contents_rows_moved
select @.changes_in_gen = @.changes_in_gen - @.contents_rows_moved
update dbo.MSmerge_genhistory with (rowlock) set changecount = @.changes_in_gen where generation = @.gen

if @.@.error<>0
update dbo.MSmerge_genhistory with (rowlock) set changecount = @.changes_in_target_gen where generation = @.target_gen

........

I feel there is some deadlock? Can you see anything from the log? Thx
|||And also, i just learned from some article, that

select tCustomers.* from tCustomers
inner join MSmerge_contents on tCustomers.rowguid = MSmerge_contents.rowguid
inner join MSmerge_genhistory on MSmerge_contents.generation = MSmerge_genhistory.generation
where MSmerge_genhistory.genstatus = 0

I can see all the unuploaded changes on the subscribers that got the problem, which means all the merge triggers were fired normally when any change happened on them.
|||I opened a support case with Microsoft. And this issue is treated as a bug now. So I suppose u guys will see the answer in a future hotfix.

Thanks guysSmile
|||

Hi WII

Just wondered how far your support case had got, we have exactly the same problem and I wondered if you have got a fix?

Thanks,

SMN

|||Same problem here, merge agent gets stuck on sys.sp_MSmakegeneration, which happened after a reinitialization and don't go away even after dropping and creating the subscription.|||Same problem at our system.... Some subsribers do not replicate anymore....

We found the same cause sp_MSmakegeneration (timeout respectively no return value?)

Our system supports salesmen with an application to create orders. We can not reset the subscribtions, as the data of the salesmen would be lost. Our customer looses every day big money due to this bug. There is no way for me to use the workaround as describe above, as we use partially colum level replication. I don't want to create a mess in system tables. If someone would provide a solution for this as well? (Or a comprehensive script, step by step what to do....?)

Otherwhise it would be good, if an experienced programmer could have a look at this "sp_MSmakegeneration" and complete the work and provide a temporarily substitution for this script. It seems to have a hidden dog.

E.g. Comment of MS Programmer in sp_MSmakegeneration Line 265: " -- TODO: possibly add some code for 0 artnick generations..."

Thanks for any help or advice.
|||http://support.microsoft.com/kb/936305

It seems that one of the Hotfixes apply to this timeout problem|||Has anyone determined that this hotfix corrects this problem?

thanks
jg
|||Did NOT fix problem here.sql

2 comments:

Unknown said...
This comment has been removed by the author.
Unknown said...

Don't know if you already solved it, but I did find a workaround that worked on my server environment (Windows Server 2008 R2/SQL Server 2008 R2). I created a new merge replication that replicates a new table (_TEST_REPL) and launched the snapshot agent. As before, the process hang executing sp_MSmakegeneration, but I left if to see what would happen. When it finished, al the production publications started to work again. Hope this help somebody!

Post a Comment