Monday, March 26, 2012

Merge Agents - two-tier question

We've got a SQL Server in the US and another in China, with several merge
agents set up to synchronize between the two boxes. Occasionally the agent
will stop itself, with an error message of "Unable to query subscriber" (or
something similar; I don't have the error in front of me right now
unfortunately). The ironic thing is that although it reports an error, it
doesn't show an error-style "Big Red X" for the agent, and instead shows a
typical "Stopped Agent" icon, as if I had right-clicked it and chosen
"Stop" manually.
My questions are these:
1. Why does it stop itself? My guess would be that packet loss between the
two boxes causes replication to fail occasionally but that's only a guess.
And why does it look as though it was stopped manually?
2. Is there a way to automatically restart stopped agents? Worst-case
scenario, I could write something to stop/restart the SQL Server Agent
process, but ideally I'd like to have it set up so that within a couple of
minutes of it stopping, it restarts itself.
Any help would be greatly appreciated.
Regards,
Scott McNair

Scott McNair <scott.mcnair@.sfmco.takethispartout.com> wrote in
news:Xns959E5CA224D06sfmco@.207.46.248.16:

> My questions are these:
> 1. Why does it stop itself? My guess would be that packet loss
> between the two boxes causes replication to fail occasionally but
> that's only a guess. And why does it look as though it was stopped
> manually?
> 2. Is there a way to automatically restart stopped agents? Worst-case
> scenario, I could write something to stop/restart the SQL Server Agent
> process, but ideally I'd like to have it set up so that within a
> couple of minutes of it stopping, it restarts itself.
Bump.
We've got a large number of them that stopped themselves today. Some of
them are straight-out "Failed" items, and some are listed as "Succeeded",
but are actual failures. Examples:
Succeeded / The process could not query row metadata at the 'Subscriber'.
Failed / The process could not make a generation at the 'Subscriber'.
Succeeded / The process could not enumerate deletions at the 'Subscriber'.
We had six items stop themselves last night, out of 24 items, and this is
typical - around 25% failure rate.
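
Since some runs report "Succeeded" while the message is really an error, a watchdog can't trust the status column alone. Below is a rough sketch in Python of how the check might look, assuming the agent history is available as (agent name, status, message) rows. The agent names, the "could not" marker, and the start_job callback are all assumptions made for illustration; on a real server the callback could wrap a call to msdb.dbo.sp_start_job, but that depends on how the jobs are set up.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: the failure messages seen so far all contain "could not".
# Extend this tuple if other wording shows up in the agent history.
FAILURE_MARKERS = ("could not",)

HistoryRow = Tuple[str, str, str]  # (agent_name, status, message)


def is_real_failure(status: str, message: str) -> bool:
    """True if the run failed, even when the status says 'Succeeded'."""
    if status.strip().lower() == "failed":
        return True
    text = message.lower()
    return any(marker in text for marker in FAILURE_MARKERS)


def agents_to_restart(history: Iterable[HistoryRow]) -> List[str]:
    """Names of agents whose latest rows indicate a failure, in first-seen order."""
    names: List[str] = []
    for name, status, message in history:
        if is_real_failure(status, message) and name not in names:
            names.append(name)
    return names


def restart_all(names: Iterable[str], start_job: Callable[[str], None]) -> int:
    """Call the (hypothetical) start_job callback once per agent; return the count."""
    count = 0
    for name in names:
        start_job(name)
        count += 1
    return count


# Sample rows built from the messages quoted above; agent names are made up.
SAMPLE_HISTORY: List[HistoryRow] = [
    ("merge-01", "Succeeded", "The process could not query row metadata at the 'Subscriber'."),
    ("merge-02", "Failed", "The process could not make a generation at the 'Subscriber'."),
    ("merge-03", "Succeeded", "The process could not enumerate deletions at the 'Subscriber'."),
    ("merge-04", "Succeeded", "No data needed to be merged."),
]

if __name__ == "__main__":
    restarted: List[str] = []
    restart_all(agents_to_restart(SAMPLE_HISTORY), restarted.append)
    print(restarted)
```

Run on a schedule every couple of minutes, something like this would catch both the outright "Failed" agents and the ones that stop while claiming success.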
