Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made initial table synchronization parallel. #198

Open
wants to merge 2 commits into
base: REL2_x_STABLE
Choose a base branch
from

Conversation

insafk
Copy link

@insafk insafk commented Nov 28, 2018

  • Made initial table synchronization parallel.
    (at pglogical_apply.c:1802 in process_syncing_tables())

  • If no worker slots are available to launch bgworker, the process just throws WARNING, and retries again after some time, rather than throwing ERROR and exiting.
    (at pglogical_worker.c:142 and line 207 in pglogical_worker_register())

  • Up to max_sync_workers_per_subscription sync workers will be created per subscription, and that many tables will be sync-ed parallelly.
    (at pglogical_worker.c:124 in pglogical_worker_register())

  • During initial table data copying, we used to create one more backend connection to the target database, with the backend in COPY FROM stdin mode, and the bgworker used to route CopyData from "COPY TO" output from publisher to this backend. Now, this extra backend connection code has been removed. Instead, the sync worker directly writes into the underlying target database. This sync worker is also in replication mode, meaning that FK violation constraint triggers won't get called, and this allows us to sync each table in any arbitrary order.
    (at pglogical_sync.c:516 in copy_table_data())

  • Now the subscription init code just gets the list of tables from the subscribed replication_set and add those table's meta into the local meta and mark them as in INIT state. Then the apply process spawns sync workers for each non-READY tables. In contrast, in the old code, the subscription init process used to get the list of tables and copied them sequentially.
    (at pglogical_sync.c:622 in copy_replication_sets_data() and pglogical_apply.c:1792 in process_syncing_tables())

  • In the previous code, sync worker in CATCHUP state doesn't exit even if it has replayed up to the required LSN. It exits only after receiving at least one logical change (WAL) from the publisher, because the exit code is written only in handle_commit() function. Now the same code(after little modification) from handle_commit() is copied into the process_syncing_tables() as well. And this process_syncing_tables() will get called periodically, and within that, the sync process exits if it has caught up with the apply process.
    (at pglogical_apply.c:1808 in process_syncing_tables())

ringerc and others added 2 commits November 1, 2018 13:37
Initial table synchronization is done parallelly by multiple sync
workers. Some more improvements were made which is explained below.

If no worker slots are available to launch bgworker, the process just
throws WARNING, and retries again after some time, rather than throwing
ERROR and exiting.

Up to max_sync_workers_per_subscription sync workers will be created per
 subscription, and that many tables will be sync-ed parallelly.

During initial table data copying, we used to create one more backend
connection to the target database, with the backend in COPY FROM stdin
mode, and the bgworker used to route CopyData from  "COPY TO" output
from publisher to this backend. Now, this extra backend connection code
has been removed. Instead, the sync worker directly writes into the
underlying target database. This sync worker is also in replication
mode, meaning that FK violation constraint triggers won't get called,
and this allows us to sync each table in any arbitrary order.

Now the subscription init code just gets the list of tables in the
subscribed replication_set and add those table's meta in INIT state.
Then the apply process spawns sync workers for each non-READY tables.
In contrast, in the old code, the subscription init process used to get
the list of tables and copied them sequentially.

In the previous code, sync worker in CATCHUP state doesn't exit even if
it has replayed up to the required LSN. It exits only after receiving at
least one logical change (WAL) from the publisher, because the exit code
is written only in handle_commit() function. Now the same code(after
little modification) from handle_commit() is copied into the
process_syncing_tables() as well. And this process_syncing_tables() will
get called periodically, and within that, the sync process exits if it
has caught up with the apply process.
@ringerc
Copy link
Contributor

ringerc commented Dec 11, 2018

Thanks very much for reaching out.

Sync is astonishingly difficult to get correct and race-free even when running one at a time. Parallel sync has mostly not been implemented for time reasons, because it can increase the risk of deadlocks, and because we've been unsure how to most effectively choose a level of parallelism that improves performance without overloading the system.

There's also another issue that I don't think can be solved sensibly without core postgres changes: managing session replication origins. Right now, a replication origin combines two quite distinct jobs. It keeps track of the remote lsn that corresponds to the current local committed position, so we don't replay a remote change twice or miss one. But it also gets recorded alongside the commit timestamp to identify the node the transaction was replicated from. There's no way to set one differently to the other in a race-free, crash-safe manner.

When you made the change "Instead, the sync worker directly writes into the underlying target database" you broke pglogical's defense against that issue. The replication origin IDs for the sync commits will now differ from commits made during normal streaming. That breaks conflict detection, including the conflict detection we do when dealing with sync overlapping concurrent streamed changes.

Sync gets dramatically harder when you're dealing with concurrent write activity on the tables being synced.

I don't think I'll be able to merge the work directly. In addition to the replication origin issues, pglogical3 (not yet open source) has been heavily restructured and the sync process has changed a lot. We don't expect to release a pglogical 2.3 in which we could add a major feature like parallel sync on the pglogical2 back-branch. I realise that's frustrating when you can't get your hands on the current development trunk, and I don't have any answers for about when it will be open source.

@ringerc
Copy link
Contributor

ringerc commented Dec 14, 2018

Hm, I got email notification that you commented, but nothing here.

To be honest I don't remember where the deadlocks can occur, only that they can in a simple resync + other writes under load.

In any case, the replication origin issue is a showstopper for this approach. I cannot, and will not, merge this change. I'm trying to work on a patch for core postgres that'd make it possible at the moment though.

I'm glad you've patched it to meet your needs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants