Skip to content

Move Tables: throttle based on target credentials#1709

Open
danieljoos wants to merge 4 commits into
feature-move-tablesfrom
danieljoos/move-tables-throttler
Open

Move Tables: throttle based on target credentials#1709
danieljoos wants to merge 4 commits into
feature-move-tablesfrom
danieljoos/move-tables-throttler

Conversation

@danieljoos

Copy link
Copy Markdown
Contributor

A Pull Request should be associated with an Issue.

We wish to have discussions in Issues. A single issue may be targeted by multiple PRs.
If you're offering a new feature or fixing anything, we'd like to know beforehand in Issues,
and potentially we'll be able to point development in a particular direction.

Related issue (internal): https://github.com/github/database-infrastructure/issues/8212
Related issue (public): #1681

Further notes in https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md
Thank you! We are open to PRs, but please understand if for technical reasons we are unable to accept each and any PR

Description

This PR adapts the throttler to use the target credentials in move-tables mode.
Status variables are fetched from the target server and control replicas are expected to be in the target cluster.

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

@danieljoos danieljoos requested a review from meiji163 as a code owner June 12, 2026 08:48
Copilot AI review requested due to automatic review settings June 12, 2026 08:48
@danieljoos danieljoos added the feature-move-tables PRs that are associated with the new move-tables feature label Jun 12, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to make move-tables mode throttle against the target cluster by (a) using target credentials when checking control replicas for lag and (b) reading status variables from the appropriate DB connection in move-tables mode.

Changes:

  • Add a helper in the throttler to choose control-replica connection credentials based on whether move-tables mode is active.
  • Update Applier.ShowStatusVariable to query status variables from the move-tables target DB connection in move-tables mode.
  • Add/extend unit tests for the above behavior and clarify CLI help for --throttle-control-replicas in move-tables mode.
Show a summary per file
File Description
go/logic/throttler.go Chooses control-replica credentials from the move-tables target config when in move-tables mode.
go/logic/throttler_test.go Adds a focused test covering the connection-config selection behavior.
go/logic/applier.go Switches status-variable reads to use the move-tables target DB handle when in move-tables mode.
go/logic/applier_test.go Adds tests validating status-variable reads in normal and move-tables modes.
go/cmd/gh-ost/main.go Updates CLI help text for --throttle-control-replicas regarding move-tables behavior.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 5/5 changed files
  • Comments generated: 3

Comment thread go/cmd/gh-ost/main.go
Comment thread go/logic/applier.go
Comment thread go/logic/throttler.go
Comment thread go/logic/migrator.go
Comment on lines +1643 to 1647
if !mgtr.migrationContext.IsMoveTablesMode() {
<-mgtr.firstThrottlingCollected // replication lag
}
<-mgtr.firstThrottlingCollected // HTTP status
<-mgtr.firstThrottlingCollected // other, general metrics

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why only this first firstThrottlingCollected is gated behind move tables mode? Can't really tell just looking at this diff

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see down below we aren't emitting collectReplicationLag when in move tables. What's the reason for that?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh okay, reading into the code more is this because collectReplicationLag relies on the heartbeat table in some way? And we don't do the heartbeat in move tables

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, some of the collect...Lag goroutines in throttler.go get that firstThrottlingCollected channel handed over and write to it as soon as they're set up.
As we're skipping the collectReplicationLag goroutine for move-tables, we also must not await the channel, here. Otherwise it would block forever at the initializeThrottler func.

Comment thread go/logic/throttler.go
if !thlr.migrationContext.IsMoveTablesMode() {
go thlr.collectReplicationLag(firstThrottlingCollected)
}
go thlr.collectControlReplicasLag()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to gate this one behind move tables mode? Quick glance at the code of collectControlReplicasLag, it looks like it relies on the changelog table which we also don't have in move-tables mode right?

@danieljoos danieljoos Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the collectReplicationLag is disabled for move-tables mode.

In that function, gh-ost typically collects the replication lag from the replica it is attached to (reading the binlog data).
So for migrations it then knows the "impact" of the copy-writes it did on the corresponding primary - and if these are causing too much stress on the replica it is attached to. It then begins to throttle automatically.

For move-tables we can't use that mechanism, as we're only connected to the target primary and not reading the binlog on the target side.

sorry misread your question. I'll look into that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature-move-tables PRs that are associated with the new move-tables feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants