S3: surface bucket listing failures and fix multi-role object count#5035

Open
shahzadhaider1 wants to merge 2 commits into main from s3-surface-list-errors

@shahzadhaider1 shahzadhaider1 commented Jun 12, 2026

Contributor

Problem

S3 scans reported 0 objects scanned for buckets that contain objects, with no errors logged. Investigation showed ListObjectsV2 was failing with AccessDenied on every configured role, but because the scan ran under an assumed role, each failure was logged only at V(3) and never surfaced at default verbosity. The scans completed "successfully" while scanning nothing.

The suppression exists for list-all-buckets mode (role without a bucket list), where the scanner probes every bucket in the account and denials are expected. Applying it when buckets are explicitly configured hides real failures on targets the user asked to scan. Note that role-assumption (STS) failures also surface on this code path, since role credentials are resolved lazily.

Changes

Commit 1: surface listing failures for explicitly configured buckets

  • Listing failures are now logged at error level whenever the bucket was explicitly configured, regardless of role. Suppression to V(3) remains only for list-all-buckets mode.
  • Decision lives in listErrorsAreExpected, covered by a unit test.
  • New metric bucket_list_errors_total{bucket, role_arn} records every listing failure; previously a failed bucket left no trace in metrics.
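
The decision described above can be sketched as follows. This is a minimal illustration, not the code in this PR: only the name listErrorsAreExpected comes from the description, while its signature, the logLevelForListError helper, and the example role ARN are assumptions made for the sketch.

```go
package main

import "fmt"

// listErrorsAreExpected reports whether a bucket listing failure is expected.
// Assumed signature: the PR only names the function. Per the description,
// denials are expected only in list-all-buckets mode, i.e. a role is set and
// no explicit bucket list is configured.
func listErrorsAreExpected(roleARN string, configuredBuckets []string) bool {
	return roleARN != "" && len(configuredBuckets) == 0
}

// logLevelForListError is a hypothetical helper that names the log level a
// listing failure would be reported at under the rule above.
func logLevelForListError(roleARN string, configuredBuckets []string) string {
	if listErrorsAreExpected(roleARN, configuredBuckets) {
		return "V(3)" // probing every bucket in the account: denials are noise
	}
	return "error" // the user named this bucket: the failure is real
}

func main() {
	role := "arn:aws:iam::123456789012:role/scanner" // illustrative ARN

	fmt.Println(logLevelForListError(role, nil))                   // list-all-buckets mode
	fmt.Println(logLevelForListError(role, []string{"my-bucket"})) // role plus explicit bucket
	fmt.Println(logLevelForListError("", []string{"my-bucket"}))   // no role, explicit bucket
}
```

Keeping the rule in one pure function is what makes it straightforward to cover with a unit test, as the second bullet notes.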

Commit 2: accumulate object count across role passes

  • scanBuckets runs once per configured role and previously reset its object counter on each pass, so the final progress message reported only the last role's count; a multi-role scan could report 0 objects scanned even when earlier roles scanned objects. The counter is now owned by Chunks and shared across passes.
  • Unit test pins the cumulative behavior.
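
A rough sketch of the counter ownership change, under stated assumptions: scanBuckets and chunks below are simplified stand-ins for the real functions, the objectsPerRole map replaces actual S3 listing, and the use of sync/atomic is a guess rather than something the PR confirms.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// scanBuckets is a stand-in for one role pass. It adds to a counter owned by
// the caller instead of starting from its own zero-valued local counter.
func scanBuckets(role string, objectsPerRole map[string]int, objectCount *uint64) {
	for i := 0; i < objectsPerRole[role]; i++ {
		atomic.AddUint64(objectCount, 1) // one increment per scanned object
	}
}

// chunks is a stand-in for the Chunks entry point: it owns the counter and
// passes the same pointer into every role pass, so the total is cumulative.
func chunks(roles []string, objectsPerRole map[string]int) uint64 {
	var objectCount uint64
	for _, role := range roles {
		scanBuckets(role, objectsPerRole, &objectCount)
	}
	return atomic.LoadUint64(&objectCount)
}

func main() {
	roles := []string{"role-a", "role-b"}
	// role-a can read 3 objects; role-b is denied everywhere and scans none.
	visible := map[string]int{"role-a": 3, "role-b": 0}

	// With a per-pass counter the last pass (role-b) would have reported 0.
	fmt.Printf("%d objects scanned\n", chunks(roles, visible)) // prints "3 objects scanned"
}
```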

Impact

Misconfigured access (IAM, bucket policy, or role trust policy) on an explicitly configured bucket is now visible at default log verbosity and in metrics, and the scan completion message reports the true total across all roles.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint; this requires golangci-lint)?

Note

Medium Risk
Changes S3 scan observability and progress reporting for IAM/list failures and multi-role scans; behavior is more visible but scan logic paths are otherwise unchanged.

Overview
S3 scanning now surfaces bucket listing failures when targets are explicitly configured, and reports the correct total object count across multiple assumed roles.

Listing errors used to be downgraded to verbose logs whenever any role was assumed, which hid AccessDenied and STS failures on buckets the user named. listErrorsAreExpected now only treats denials as expected in list-all-buckets mode (role set, no explicit bucket list); otherwise failures log at error level. Every list failure increments bucket_list_errors_total with bucket and role_arn labels.
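
To illustrate what a counter labeled by bucket and role_arn records, here is a standard-library-only sketch. It is not how the PR registers bucket_list_errors_total, which presumably goes through the project's metrics library; the listErrorCounter type and its methods are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// listErrorKey mirrors the two labels named in the PR: bucket and role_arn.
type listErrorKey struct {
	bucket  string
	roleARN string
}

// listErrorCounter is a hypothetical in-memory stand-in for a labeled counter
// metric; each distinct (bucket, role_arn) pair gets its own series.
type listErrorCounter struct {
	mu     sync.Mutex
	counts map[listErrorKey]int
}

func newListErrorCounter() *listErrorCounter {
	return &listErrorCounter{counts: make(map[listErrorKey]int)}
}

// Inc records one listing failure for the given label pair. It is called for
// every failure, whether or not the failure is logged at error level.
func (c *listErrorCounter) Inc(bucket, roleARN string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[listErrorKey{bucket, roleARN}]++
}

// Get returns the current count for the given label pair.
func (c *listErrorCounter) Get(bucket, roleARN string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[listErrorKey{bucket, roleARN}]
}

func main() {
	errs := newListErrorCounter()
	errs.Inc("my-bucket", "role-a")
	errs.Inc("my-bucket", "role-a")
	errs.Inc("my-bucket", "role-b")

	fmt.Println(errs.Get("my-bucket", "role-a")) // 2
	fmt.Println(errs.Get("my-bucket", "role-b")) // 1
}
```

Because the count is kept per label pair, a bucket that fails under one role but not another is distinguishable in the metric.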

Chunks owns a shared object counter passed into scanBuckets on each role pass, so the completion message reflects the cumulative scan instead of resetting to the last role’s pass.

Reviewed by Cursor Bugbot for commit f9b729c.

Listing failures were logged at V(3) whenever a role was assumed, hiding
access and role-assumption errors for buckets the user explicitly asked
to scan. Suppression now applies only in list-all-buckets mode, and a
bucket_list_errors_total metric records every listing failure.
@shahzadhaider1 shahzadhaider1 requested a review from a team June 12, 2026 16:02
@shahzadhaider1 shahzadhaider1 requested a review from a team as a code owner June 12, 2026 16:02
scanBuckets runs once per configured role and previously reset its object
counter on each pass, so the final progress message only reflected the last
role's count. Multi-role scans could report 0 objects scanned even when
earlier roles scanned objects.
@shahzadhaider1 shahzadhaider1 force-pushed the s3-surface-list-errors branch from 6827420 to f9b729c Compare June 12, 2026 16:13