
feat(ci): wire commander e2e end-to-end (kubeconfig connect mode + enable-modules step) #33

Open: duckhawk wants to merge 4 commits into main from feat/commander-e2e-enable-modules

Conversation

@duckhawk duckhawk commented Jun 29, 2026

Member

What

Adds a Deckhouse Commander cluster provider and the wiring to run a module's e2e suite against a Commander-created cluster, without regressing the existing (dvp / other) providers. Rebased on current main and realigned onto the new ssh/v2 + connector abstractions per review.

Provider (internal/provisioning/commander)

Mirrors the DVP provider's shape:

  • config.go: path-or-inline SSH credentials (Validate / Resolve / Credentials), like dvp/config.go. The master host is resolved from the Commander connection info (not configured); the jump key is optional and defaults to the master key.
  • connect.go: a connector built on internal/infrastructure/ssh/v2. It uses Route(jump, master) + NewWithRetry, Exec-fetches the kubeconfig off the master (super-admin.conf/admin.conf), opens an in-process API tunnel (OpenTunnel), and returns (*rest.Config, cleanup). A package-level Connect(ctx, environ, logger) is the shared entry point.
  • kubeconfig.go: buildRestConfig (overrides the server to the tunnel's local addr), same as dvp.
  • provider.go: Bootstrap/Remove now only talk to the Commander API (create + wait Ready / delete). No legacy SSH client, no exportKubeconfig/.sshinfo sidecar.
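
A minimal sketch of the path-or-inline credential shape described for config.go, with the jump key falling back to the master key. The type and method names here (keySource, sshConfig, credentials) are hypothetical and only illustrate the idea; the actual Validate / Resolve / Credentials signatures in this PR may differ.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// keySource is a hypothetical path-or-inline holder: at most one field may be set.
type keySource struct {
	Path   string // local runs: path to a key file
	Inline string // CI: key content taken straight from a secret
}

func (k keySource) empty() bool { return k.Path == "" && k.Inline == "" }

// validate rejects ambiguous input where both forms are set.
func (k keySource) validate(name string) error {
	if k.Path != "" && k.Inline != "" {
		return fmt.Errorf("%s: set either the path or the inline content, not both", name)
	}
	return nil
}

// resolve returns the key bytes, reading the file only when a path was given.
func (k keySource) resolve() ([]byte, error) {
	if k.Inline != "" {
		return []byte(k.Inline), nil
	}
	if k.Path == "" {
		return nil, errors.New("no key configured")
	}
	return os.ReadFile(k.Path)
}

// sshConfig is an assumed shape, not the PR's actual Config struct.
type sshConfig struct {
	Master keySource
	Jump   keySource // optional
}

// credentials resolves both keys; the jump key defaults to the master key.
func (c sshConfig) credentials() ([]byte, []byte, error) {
	if err := c.Master.validate("master key"); err != nil {
		return nil, nil, err
	}
	if err := c.Jump.validate("jump key"); err != nil {
		return nil, nil, err
	}
	master, err := c.Master.resolve()
	if err != nil {
		return nil, nil, fmt.Errorf("master key: %w", err)
	}
	if c.Jump.empty() {
		return master, master, nil
	}
	jump, err := c.Jump.resolve()
	if err != nil {
		return nil, nil, fmt.Errorf("jump key: %w", err)
	}
	return master, jump, nil
}

func main() {
	cfg := sshConfig{Master: keySource{Inline: "inline-key-material"}}
	master, jump, err := cfg.credentials()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("jump key defaults to master key:", string(master) == string(jump))
}
```

Keeping both forms in one small type lets CI pass secret content directly while local runs keep pointing at a file.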

Connect + module enablement

  • In-process connect everywhere. cmd/enable-modules and the test suite both connect through the commander connector — SSH to the master via the bastion, kubeconfig fetched off the master, in-process API tunnel. No kubeconfig artifact, no external SSH tunnel.
  • cmd/enable-modules enables the module-under-test via EnableAndConfigureModules (ModuleConfig + ModulePullOverride from cluster_config), after waiting for the Deckhouse deckhouse.io/v1alpha1 ModuleConfig API.
  • The suite's ClusterCreateModeKubeconfig + ConnectViaKubeconfig (which extended the legacy path) are gone, replaced by ClusterCreateModeCommanderConnect wired to the connector. This is the interim connect side of the provider flow, shaped to fold into a formal clusterprovider Provider.Connect.

Reusable workflow (.github/workflows/e2e.yml)

  • New inputs cluster_provider (dvp default | commander) and module_image_tag.
  • Job graph unchanged: resolve → bootstrap → run-tests → teardown.
  • bootstrap (commander) only calls the Commander API — no kubeconfig artifact.
  • run-tests (commander): a single gated step injects the connection env (TEST_CLUSTER_CREATE_MODE=commanderConnect + E2E_COMMANDER_*, SSH key inline via $GITHUB_ENV); enable-modules and the suite connect in-process. e2e-api-tunnel.sh deleted, no download-artifact, no key-file materialization. The Run E2E tests step stays byte-identical for other providers (dvp).
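
For the inline SSH key passed via $GITHUB_ENV, a multi-line value has to use the NAME<<DELIMITER form of the GitHub Actions environment file. The sketch below renders such entries in Go purely for illustration; the workflow step itself is presumably shell, and the variable name E2E_COMMANDER_SSH_PRIVATE_KEY and the delimiter E2E_EOF are hypothetical (the description only states the E2E_COMMANDER_* prefix).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// githubEnvEntry renders one variable in the GITHUB_ENV file format.
// Multi-line values use the NAME<<DELIMITER ... DELIMITER form.
func githubEnvEntry(name, value, delimiter string) (string, error) {
	if !strings.Contains(value, "\n") {
		return fmt.Sprintf("%s=%s\n", name, value), nil
	}
	if strings.Contains(value, delimiter) {
		return "", fmt.Errorf("delimiter %q occurs in the value of %s", delimiter, name)
	}
	// The format ends the value with a newline before the delimiter,
	// so one trailing newline of the value is dropped to avoid a blank line.
	body := strings.TrimSuffix(value, "\n")
	return fmt.Sprintf("%s<<%s\n%s\n%s\n", name, delimiter, body, delimiter), nil
}

func main() {
	entries := []struct{ name, value string }{
		{"TEST_CLUSTER_CREATE_MODE", "commanderConnect"},
		// Hypothetical variable name and placeholder key material.
		{"E2E_COMMANDER_SSH_PRIVATE_KEY", "-----BEGIN KEY-----\nAAAA\n-----END KEY-----\n"},
	}

	out := os.Stdout // outside GitHub Actions, just print the entries
	if path := os.Getenv("GITHUB_ENV"); path != "" {
		f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
		if err != nil {
			fmt.Fprintln(os.Stderr, "open GITHUB_ENV:", err)
			return
		}
		defer f.Close()
		out = f
	}
	for _, e := range entries {
		entry, err := githubEnvEntry(e.name, e.value, "E2E_EOF")
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		fmt.Fprint(out, entry)
	}
}
```

Rejecting a value that contains the delimiter matters here, because otherwise the rest of the key would be parsed as further environment entries.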

docs/CI.md updated.

Validation

go build / go vet / gofmt / unit tests are green after the rebase + realignment. The commander flow was previously proven end-to-end against sds-object (deckhouse/sds-object#20: bootstrap → enable-modules → System + Lightweight specs, 16/16 → teardown); re-validation against this realigned workflow is in progress.

Follow-ups / coordination (from review)

  • Env-var vocabulary — kept the E2E_COMMANDER_SSH_* prefix mirroring dvp's field shape; happy to rename both providers to a neutral shared set (e.g. E2E_SSH_*) once we agree on names.
  • Provider.Connect — the connector already returns (*rest.Config, cleanup); ready to fold suite connect into a formal clusterprovider.Provider.Connect once its return contract is settled.

🤖 Generated with Claude Code

github-code-quality Bot commented Jun 29, 2026

Code Coverage Overview

Languages: Go

Go / code-coverage/go

The overall coverage in the branch remains at 18%, unchanged from the branch.

Code coverage summary of the most impacted files:

| File | b212665 | 4e419c5 | +/- |
| --- | --- | --- | --- |
| internal/infras.../ssh/v2/conn.go | 89% | 87% | -2% |
| internal/infras...sh/v2/tunnel.go | 84% | 83% | -1% |
| pkg/cluster/cluster.go | 0% | 0% | 0% |
| internal/provis...ander/config.go | 0% | 0% | 0% |
| internal/provis...nder/connect.go | 0% | 0% | 0% |
| internal/provis...r/kubeconfig.go | 0% | 0% | 0% |
| internal/kubern...ander/client.go | 48% | 49% | +1% |
| internal/provis...der/provider.go | 39% | 40% | +1% |
| internal/infras...sh/v2/dialer.go | 39% | 41% | +2% |

Updated July 01, 2026 17:12 UTC

Comment thread .github/workflows/e2e.yml Outdated
duckhawk added a commit that referenced this pull request Jul 1, 2026
Squashed set of the commander e2e-pipeline work (PR #33):
- Commander provider fetches the cluster kubeconfig over SSH (via bastion) when
  the Commander API does not expose it; writes a <kubeconfig>.sshinfo sidecar.
- Raise the commander HTTP client / bootstrap timeouts for slow cluster create.
- kubeconfig cluster-connect mode (no SSH) + cmd/enable-modules + wait for the
  Deckhouse ModuleConfig API.
- Reusable workflow: commander bootstrap uploads kubeconfig; run-tests opens an
  SSH tunnel to the master API and enables the module in-process (no separate
  enable-modules job). All commander wiring is provider-gated so dvp/other modes
  are unchanged; run-tests depends only on [resolve, bootstrap]. go mod download
  (not tidy) is used at run time. docs/CI.md + WORKLOG updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@duckhawk duckhawk force-pushed the feat/commander-e2e-enable-modules branch from 8c2e415 to b1d2126 Compare July 1, 2026 06:22
@duckhawk duckhawk force-pushed the feat/commander-e2e-enable-modules branch from b1d2126 to 0c7365d Compare July 1, 2026 11:47
@duckhawk duckhawk requested a review from fastrapier July 1, 2026 11:49
Comment thread cmd/enable-modules/main.go Outdated
}

func main() {
kubeconfigPath := os.Getenv("KUBE_CONFIG_PATH")

Member

I have already put the kubeconfig into E2E_DVP_BASE_CLUSTER_KUBECONFIG.

Comment thread internal/config/env.go Outdated
// from a kubeconfig file (KUBE_CONFIG_PATH) with no SSH tunnel. Used by the CI
// pipeline's run-tests step, where the cluster was bootstrapped out-of-band
// (e.g. by the Commander provider) and its kubeconfig handed off as an artifact.
ClusterCreateModeKubeconfig = "kubeconfig"

Member

I wanted to remove these legacy modes; this logic should not be developed any further.

ClusterProvider ProviderMode `env:"E2E_TEST_CLUSTER_PROVIDER,required"`

This is the new config that stores the modes.
The modes are defined here: https://github.com/deckhouse/storage-e2e/blob/main/pkg/clusterprovider/mode.go

// key and the (usually required) jump host. SSHUser overrides the
// Commander-reported master user; the jump key defaults to the master key.
SSHPrivateKeyPath string `env:"E2E_COMMANDER_SSH_PRIVATE_KEY_PATH"`
SSHUser string `env:"E2E_COMMANDER_SSH_USER"`

Member

We can unify these variables so that we do not duplicate the same ones. I need most of them for dvp as well, so we can give them neutral names that work for both of us.

return fmt.Errorf("no SSH user for the master of cluster %q (set E2E_COMMANDER_SSH_USER)", name)
}

var sshClient ssh.SSHClient

Member

Could you please move to the new ssh client? I have built it, and the core mechanisms are basically ready. I will push Exec shortly; OpenTunnel is already there.

Member

I plan to get rid of the old version; there is no point in keeping it around.

Comment thread pkg/cluster/cluster.go Outdated
// out-of-band (e.g. by the Commander provider, whose kubeconfig is handed off as
// an artifact). The returned resources carry only the Kubeconfig; teardown only
// needs to release the lock (the separate teardown job removes the cluster).
func ConnectViaKubeconfig(ctx context.Context) (*TestClusterResources, error) {

Member

I am thinking of moving this mechanism into Provider, so that the provider has a Connect method. I have not decided yet what it should return, which is why I have not added it. We can agree on what you would need from it besides rest.Config.

Comment thread pkg/cluster/cluster.go Outdated
ctx, cancel := context.WithTimeout(context.Background(), config.ClusterCreationTimeout)
defer cancel()

GinkgoWriter.Printf(" ▶️ Connecting via kubeconfig (mode: %s, KUBE_CONFIG_PATH=%s)\n", config.TestClusterCreateMode, config.KubeConfigPath)

Member

It is better not to use ginkgo; I am going to remove it from the package. Switch to plain slog instead; there is no point in pulling an extra package in here.

Comment thread .github/scripts/e2e-api-tunnel.sh Outdated
@@ -0,0 +1,82 @@
#!/usr/bin/env bash

Member

We need to discuss why you want to do this with scripts.

Comment thread .github/workflows/e2e.yml Outdated
# The commander provider fetches the new cluster's kubeconfig over SSH
# (the Commander API does not expose it), typically via a jump host, so the
# SSH key secret is materialized to disk here.
- name: Materialize commander SSH key

Member

This should also be handled in the config, following this example:

KubeConfigPath string `env:"E2E_DVP_BASE_CLUSTER_KUBECONFIG_PATH"`

Make two modes: one for CI, where we usually store the content, which can be used directly as well.
For local runs, keep support for PATH.

Comment thread .github/workflows/e2e.yml Outdated
# server is the node-local 127.0.0.1:<port>). ALL commander wiring lives in
# the commander-gated steps below, so for other providers (dvp, …) the
# "Run E2E tests" step runs with exactly its previous environment.
- name: Download cluster kubeconfig

Member

Is the download needed here at all? Or can we simply fetch it? On my side I planned to use the one that was provided for bootstrap, open a tunnel, and fetch the required key from the master as part of connect,
so that CI stays as lightweight as possible.

duckhawk and others added 2 commits July 1, 2026 15:23
Squashed set of the commander e2e-pipeline work (PR #33):
- Commander provider fetches the cluster kubeconfig over SSH (via bastion) when
  the Commander API does not expose it; writes a <kubeconfig>.sshinfo sidecar.
- Raise the commander HTTP client / bootstrap timeouts for slow cluster create.
- kubeconfig cluster-connect mode (no SSH) + cmd/enable-modules + wait for the
  Deckhouse ModuleConfig API.
- Reusable workflow: commander bootstrap uploads kubeconfig; run-tests opens an
  SSH tunnel to the master API and enables the module in-process (no separate
  enable-modules job). All commander wiring is provider-gated so dvp/other modes
  are unchanged; run-tests depends only on [resolve, bootstrap]. go mod download
  (not tidy) is used at run time. docs/CI.md + WORKLOG updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…facts

Address @fastrapier's review by rebuilding the commander provider on the
new abstractions instead of the legacy SSH client and CI plumbing.

- Provider migrated to internal/infrastructure/ssh/v2: a connector
  (connect.go) mirrors the DVP one — Route(jump, master) + NewWithRetry,
  Exec-fetches the kubeconfig off the master, opens an in-process API
  tunnel, returns (*rest.Config, cleanup). No more legacy ssh client,
  no exportKubeconfig/.sshinfo sidecar.
- config.go gains the DVP-style path-or-inline credentials (Validate +
  Resolve + Credentials); jump key optional (defaults to the master key).
- enable-modules and the test suite connect IN-PROCESS via the connector,
  so the CI drops the kubeconfig artifact upload/download, the external
  SSH tunnel and e2e-api-tunnel.sh (deleted); a single gated step injects
  the connection env so "Run E2E tests" stays byte-identical for dvp.
- Remove the legacy-extending bits: ConnectViaKubeconfig and the
  kubeconfig create-mode are replaced by ClusterCreateModeCommanderConnect
  wired to the connector; drops the ginkgo-based kubeconfig case.

Remaining coordination (per review): unify the SSH env-var vocabulary
across dvp/commander, and fold suite connect into a formal
clusterprovider Provider.Connect — the connector here is shaped to slot
into that.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@duckhawk duckhawk force-pushed the feat/commander-e2e-enable-modules branch from 0c7365d to 33c3a52 Compare July 1, 2026 12:44
duckhawk and others added 2 commits July 1, 2026 16:25
LoadConfig (used by the in-process Connect for enable-modules and the
suite) parsed the whole Config, whose TemplateName carried env `required`
— but connect never creates a cluster, so enable-modules failed with
`E2E_COMMANDER_TEMPLATE_NAME is not set`. Drop the struct-tag requirement
and validate TemplateName in createCluster (Bootstrap) instead, where it
is actually needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
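
The fix in the commit message above (drop the struct-tag requirement, validate TemplateName where it is used) can be sketched as follows. Only E2E_COMMANDER_TEMPLATE_NAME, TemplateName, LoadConfig and createCluster come from the commit message; the APIURL field, the E2E_COMMANDER_URL variable and the getenv parameter are hypothetical and exist only to make the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// config is an assumed, trimmed-down shape of the provider config.
type config struct {
	APIURL       string // hypothetical: needed by both connect and bootstrap
	TemplateName string // needed only when a cluster is created
}

// loadConfig is shared by connect and bootstrap, so it must not demand
// values that only bootstrap uses.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		APIURL:       getenv("E2E_COMMANDER_URL"), // hypothetical variable name
		TemplateName: getenv("E2E_COMMANDER_TEMPLATE_NAME"),
	}
	if cfg.APIURL == "" {
		return config{}, errors.New("E2E_COMMANDER_URL is not set")
	}
	return cfg, nil
}

// createCluster validates TemplateName at the point of use.
func createCluster(cfg config) error {
	if cfg.TemplateName == "" {
		return errors.New("E2E_COMMANDER_TEMPLATE_NAME is not set")
	}
	// The Commander API call would go here.
	return nil
}

func main() {
	// Environment of a connect-only step: no template name is provided.
	env := map[string]string{"E2E_COMMANDER_URL": "https://commander.example"}
	cfg, err := loadConfig(func(key string) string { return env[key] })
	fmt.Println("connect path load error:", err)
	fmt.Println("bootstrap path error:", createCluster(cfg))
}
```
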
ConnectToCommanderCluster passed the caller's ClusterCreationTimeout
context to the connector, which ties the SSH client + tunnel serve loop
to it. When the connect dispatch block returned (defer cancel) the tunnel
listener closed, so every later suite API call failed with
"dial tcp 127.0.0.1:<port>: connect: connection refused" and BeforeSuite
timed out. Detach cancellation (context.WithoutCancel) so the tunnel lives
until CleanupTestCluster tears it down; connect setup stays bounded by the
connector's NewWithRetry timeout.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions Bot commented Jul 1, 2026

Code Coverage

| Package | Line Rate | Health |
| --- | --- | --- |
| github.com/deckhouse/storage-e2e/internal/cluster | 0% | |
| github.com/deckhouse/storage-e2e/internal/config | 51% | |
| github.com/deckhouse/storage-e2e/internal/infrastructure/ssh | 0% | |
| github.com/deckhouse/storage-e2e/internal/infrastructure/ssh/v2 | 80% | |
| github.com/deckhouse/storage-e2e/internal/kubernetes/commander | 49% | |
| github.com/deckhouse/storage-e2e/internal/kubernetes/deckhouse | 0% | |
| github.com/deckhouse/storage-e2e/internal/kubernetes/storage | 0% | |
| github.com/deckhouse/storage-e2e/internal/kubernetes/virtualization | 0% | |
| github.com/deckhouse/storage-e2e/internal/logger | 52% | |
| github.com/deckhouse/storage-e2e/internal/provisioning/commander | 16% | |
| github.com/deckhouse/storage-e2e/internal/provisioning/dvp | 63% | |
| github.com/deckhouse/storage-e2e/internal/provisioning/dvp/vm | 78% | |
| github.com/deckhouse/storage-e2e/pkg/cluster | 4% | |
| github.com/deckhouse/storage-e2e/pkg/clusterprovider | 89% | |
| github.com/deckhouse/storage-e2e/pkg/clusterprovider/registry | 100% | |
| github.com/deckhouse/storage-e2e/pkg/kubernetes | 5% | |
| github.com/deckhouse/storage-e2e/pkg/retry | 94% | |
| github.com/deckhouse/storage-e2e/pkg/storage-e2e | 0% | |
| github.com/deckhouse/storage-e2e/pkg/testkit | 4% | |
| Summary | 18% (2589 / 14484) | |
